Revisiting the Effect of Item Purification on Differential Item Functioning: Real Data Findings

Year: 2018, Volume 10, Issue 5
Pages: 139-147
Abstract

One of the important issues facing practitioners when determining Differential Item Functioning (DIF) in a test is that the presence of one or more items with DIF may affect the DIF results for the other items. In this case, items that do not function differentially can be falsely identified as showing DIF, which leads to an undesirable increase in the Type I error rate. As a solution to this problem, it has been proposed that items showing DIF be iteratively excluded from the analyses, a process called item purification. The purpose of this study is to compare the results of gender-based DIF analyses conducted with and without item purification. The study group consisted of 655 students who took an undergraduate Measurement and Evaluation course at a state university in İstanbul. The data collection tool consisted of 25 multiple-choice items covering the course curriculum. Data analyses were performed in the R statistical environment using the "difR" package. The Mantel-Haenszel, Standardization, Logistic Regression, Lord's Chi-Square, Raju's area, and Breslow-Day methods were each applied both with and without purification. The findings showed that the DIF results differed depending on whether the iterative purification process was applied, and that the number of items identified as showing DIF changed.
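The purification idea described in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R, it is not the study's analysis or the difR implementation, and it covers only the Mantel-Haenszel method. The function names (`mh_chi2`, `flag_items`, `purify`, `toy_data`), the 0/1 group coding, the fixed 0.05 critical value, and the artificial data set are all assumptions made for illustration.

```python
CHI2_CRIT = 3.841  # chi-square critical value, 1 df, alpha = 0.05


def mh_chi2(data, group, item, anchor):
    """Continuity-corrected Mantel-Haenszel chi-square for one item.

    data   : list of 0/1 response rows (one row per examinee)
    group  : list of 0 (reference) / 1 (focal), aligned with data
    anchor : item indices used for matching; the studied item is always added
    """
    match_items = set(anchor) | {item}
    strata = {}  # score -> [ref right, ref wrong, focal right, focal wrong]
    for row, g in zip(data, group):
        score = sum(row[j] for j in match_items)
        cells = strata.setdefault(score, [0, 0, 0, 0])
        cells[2 * g + (1 - row[item])] += 1

    observed = expected = variance = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        if n < 2:
            continue  # a single examinee carries no information
        n_ref, n_foc, n_right, n_wrong = a + b, c + d, a + c, b + d
        observed += a
        expected += n_ref * n_right / n
        variance += n_ref * n_foc * n_right * n_wrong / (n * n * (n - 1))
    if variance == 0:
        return 0.0
    return (abs(observed - expected) - 0.5) ** 2 / variance


def flag_items(data, group, anchor):
    """Return the set of items whose statistic exceeds the critical value."""
    n_items = len(data[0])
    return {j for j in range(n_items)
            if mh_chi2(data, group, j, anchor) > CHI2_CRIT}


def purify(data, group, max_iter=10):
    """Run one unpurified pass, then iterate with flagged items removed
    from the matching score until the flagged set stops changing."""
    n_items = len(data[0])
    first_pass = flag_items(data, group, range(n_items))
    flagged = set(first_pass)
    for _ in range(max_iter):
        anchor = [j for j in range(n_items) if j not in flagged]
        new_flagged = flag_items(data, group, anchor)
        if new_flagged == flagged:
            break
        flagged = new_flagged
    return first_pass, flagged


def toy_data():
    """Artificial 5-item data: item 0 is never answered correctly by the
    focal group; items 1-4 are identically distributed in both groups."""
    ref = [[(i >> k) & 1 for k in range(5)]
           for i in range(32) for _ in range(2)]
    foc = [[0] + [(i >> k) & 1 for k in range(4)]
           for i in range(16) for _ in range(4)]
    return ref + foc, [0] * len(ref) + [1] * len(foc)


if __name__ == "__main__":
    data, group = toy_data()
    first_pass, purified = purify(data, group)
    print("flagged without purification:", sorted(first_pass))
    print("flagged with purification:   ", sorted(purified))
    print("item 1, all items in matching score:",
          mh_chi2(data, group, 1, range(5)))
    print("item 1, item 0 removed from matching:",
          mh_chi2(data, group, 1, [1, 2, 3, 4]))
```

In this artificial data set the statistic for the DIF-free item 1 is larger when item 0 stays in the matching score than when item 0 is removed. That is the contamination effect purification is meant to reduce, although here it is too small to change which items are flagged.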
