
Detection of differential item functioning using structural equation modeling: A comparison of MACS and MIMIC

한국심리학회지: 일반 / Korean Journal of Psychology: General, (P)1229-067X; (E)2734-1127
2013, v.32 no.4, pp.1023-1052
윤수철 (Language Education Institute, Seoul National University)
이순묵 (Sungkyunkwan University)

Abstract

Two approaches to detecting differential item functioning (DIF) within the structural equation modeling framework are the mean and covariance structure (MACS) model and the multiple-indicators, multiple-causes (MIMIC) model. Although both are special cases of the structural equation model, they differ in their statistical assumptions and in how data are entered, so their performance may vary with the research situation. In particular, because the MIMIC model requires additional assumptions beyond those of the MACS model, it may be at a disadvantage in detecting uniform DIF when those assumptions are violated. On the other hand, unlike the MACS model, the MIMIC model uses a single input dataset that includes the group variable, so it can be expected to outperform the MACS model in detecting uniform DIF when sample sizes differ across groups. Although these possibilities have been noted in the literature, no study has systematically compared the two models across a range of conditions. We therefore compared the two models' uniform-DIF detection performance through a Monte Carlo simulation reflecting a variety of research situations. Specifically, we systematically manipulated the group effect (impact), differences in measurement-variable reliability, total sample size, the sample-size ratio, the size of DIF, and the DIF-detection strategy, and compared the two models' performance under these conditions. The results showed that the MIMIC model's uniform-DIF detection rate was not substantially degraded relative to the MACS model even when its additional assumptions were violated, and that the MIMIC model outperformed the MACS model when sample sizes differed between groups. Because an appropriate detection strategy is required for the effective use of either model, we discuss this issue and propose a practically desirable approach.
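The kind of simulated data the study describes — a one-factor model with a group effect (impact) on the latent trait and a uniform-DIF intercept shift on one item — can be sketched in a few lines. All parameter values below (loading, impact, DIF size, sample sizes) are illustrative defaults, not the study's actual design cells:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_uniform_dif(n_ref=500, n_foc=500, n_items=6, loading=0.7,
                         impact=0.3, dif_size=0.4, dif_item=0):
    """Continuous item responses under a one-factor model.

    The focal group's latent mean is shifted by `impact` (the group
    effect), and the intercept of `dif_item` is shifted by `dif_size`
    in the focal group only (uniform DIF). Defaults are illustrative.
    """
    n = n_ref + n_foc
    group = np.r_[np.zeros(n_ref), np.ones(n_foc)]   # 0 = reference, 1 = focal
    theta = rng.normal(impact * group, 1.0)          # latent trait with impact
    resid_sd = np.sqrt(1.0 - loading ** 2)           # unit-variance items
    items = loading * theta[:, None] + rng.normal(0.0, resid_sd, (n, n_items))
    items[:, dif_item] += dif_size * group           # intercept shift = uniform DIF
    return items, group

items, group = simulate_uniform_dif()
```

Unequal sample sizes — one of the manipulated factors — amount to changing the `n_ref` : `n_foc` ratio while holding their total fixed.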

keywords
Differential Item Functioning, Structural Equation Modeling, MACS, MIMIC

Abstract

Two models, MACS and MIMIC, can be used to detect differential item functioning (DIF) within a structural equation modeling framework. Although both can be considered special cases of the general structural equation model, they may perform differently across research contexts because of differences in their statistical assumptions and in the way each model uses data. In particular, since the MIMIC model requires some additional assumptions, its performance may decline when those assumptions are not satisfied. Furthermore, because the MIMIC model, unlike the MACS model, uses a single dataset that includes the group variable(s), its performance should be superior to that of the MACS model when sample sizes vary among groups. Although many articles have commented on these predictions, no systematic research comparing the two models under these circumstances had yet been conducted. We therefore investigated the performance of the two models through a Monte Carlo simulation study under various conditions, specifically the size of impact, differences in measurement-variable reliability, total sample size, sample-size ratio, the size of DIF, and the strategy for detecting DIF. We found that the MIMIC model's performance in detecting uniform DIF did not decline significantly even when one of its additional assumptions was violated. Moreover, the MIMIC model was superior to the MACS model when sample sizes differed between the two groups. Finally, we emphasize the importance of employing appropriate strategies for the effective use of the two models to detect uniform DIF.
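As a rough illustration of the MIMIC idea — not the full SEM estimation the study uses — one can regress each item on a proxy for the latent trait plus the group dummy; the group coefficient then plays the role of the MIMIC direct effect that flags uniform DIF. The data, the rest-score proxy, and all numbers below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative data: one factor, six items, uniform DIF on item 0.
n = 1000
group = (np.arange(n) >= n // 2).astype(float)       # 0 = reference, 1 = focal
theta = rng.normal(0.3 * group, 1.0)                 # impact: focal mean shifted
items = 0.7 * theta[:, None] + rng.normal(0.0, np.sqrt(0.51), (n, 6))
items[:, 0] += 0.4 * group                           # uniform DIF on item 0

def group_direct_effect(y, anchor, group):
    """OLS of one item on an anchor score and the group dummy.

    The group coefficient mimics the MIMIC direct effect: a value far
    from zero flags an intercept difference between groups that the
    trait proxy does not explain, i.e. uniform DIF.
    """
    X = np.column_stack([np.ones_like(y), anchor, group])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta[2]

# The rest score (mean of the other items) proxies the latent trait.
effects = [group_direct_effect(items[:, j],
                               np.delete(items, j, axis=1).mean(axis=1),
                               group)
           for j in range(6)]
```

In this sketch only the DIF item's group coefficient is far from zero; in the actual MIMIC model the latent trait is estimated jointly rather than proxied by a rest score, which is one reason an appropriate detection (anchor-selection) strategy matters.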


