• Comparison of missing data handling methods in cognitive diagnosis: Zero replacement, multiple imputation, and maximum likelihood estimation

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journal: Acta Psychologica Sinica (《心理学报》)

    Abstract: The problem of missing data is common in research, and cognitive diagnostic assessment (CDA) is no exception. Studies have shown that both the presence of missing values and the choice of missing data handling method affect the results of CDA. It is therefore necessary to pay closer attention to this problem in CDA and to choose appropriate handling methods. Although the problem has been explored before, previous studies did not consider multiple imputation (MI) or full information maximum likelihood (FIML), which are widely used in the field of missing data analysis. Moreover, previous studies neglected comparisons using empirical data and saturated models such as the GDINA model. Accordingly, the main purposes of this study are to introduce MI and FIML into CDA, to make a comprehensive comparison of different missing data handling methods, and to put forward suggestions for handling missing data in practice. The simulation study considered six factors: (1) sample size: 200, 400, and 1000 participants; (2) test length: 15 and 30 items; (3) item quality: high, medium, and low; (4) missing data mechanism: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR); (5) missing rate: 10%, 20%, and 30%; (6) missing data handling method: zero replacement (ZR), MI-CART, MI-PMM, MI-LOGREG.BOOT, the Expectation-Maximization algorithm (EM), and FIML. The GDINA model was used, and the analyses were carried out with the GDINA package in R. In addition, the PISA 2015 computer-based mathematics data were used to compare the practical value of these methods. The simulation results revealed that: (1) missing data reduced estimation accuracy; as sample size, test length, and item quality decreased and the missing rate increased, the absolute value of bias and the RMSE of all methods increased and their PCCR values decreased. (2) When estimating item parameters, EM performed best, followed by MI, while FIML and ZR were unstable. (3) When estimating participants' knowledge states (KS), EM and FIML performed best when the missing data mechanism was MCAR or MAR; when the mechanism was MNAR, EM, FIML, and ZR performed best. The empirical study further supported the simulation results, showing that: (1) EM, FIML, and MI-PMM each performed best on one or more of the empirical indicators; (2) the empirical results were very similar to the simulation results under the MNAR mechanism; (3) EM performed well on all indicators, ZR and FIML were slightly worse, followed by MI-PMM, MI-LOGREG.BOOT, and MI-CART. Based on these results, the following suggestions are offered: (1) EM and FIML should be the first choice; if researchers do not need a completed data set, FIML can be used preferentially. (2) When the missing data mechanism is MCAR or MAR and the test length is insufficient, researchers should avoid using ZR. Finally, several directions for future research are noted: (1) polytomous scoring situations should be studied; (2) the effectiveness of these methods should be tested in longitudinal research; (3) additional information-matrix methods for computing standard errors under missing data could be compared; (4) future research could investigate the missing data mechanisms underlying real data.
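    Of the handling methods compared in this abstract, zero replacement is the simplest to make concrete: every missing response is scored as incorrect. The following Python sketch is purely illustrative (the study itself used the GDINA package in R); the seed, sample size, and missing rate are hypothetical choices, and the MCAR mechanism is simulated by deleting cells with a fixed probability independent of any variable.

```python
import numpy as np

rng = np.random.default_rng(7)

# Simulated binary response matrix: 200 examinees x 15 items
# (values are illustrative, not the paper's data).
responses = rng.integers(0, 2, size=(200, 15)).astype(float)

# Impose a 20% MCAR mechanism: each cell is missing with the same
# probability, independent of responses and examinee characteristics.
mcar_mask = rng.random(responses.shape) < 0.20
incomplete = responses.copy()
incomplete[mcar_mask] = np.nan

# Zero replacement (ZR): treat every missing response as incorrect.
zr_filled = np.where(np.isnan(incomplete), 0.0, incomplete)

print(f"observed missing rate: {np.isnan(incomplete).mean():.3f}")
print(f"any NaN left after ZR: {np.isnan(zr_filled).any()}")
```

    Under MNAR (e.g., weaker examinees skipping harder items) this hard-zero scoring happens to align with the data-generating process, which is consistent with the abstract's finding that ZR is competitive under MNAR but risky under MCAR/MAR.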

  • Nonparametric cognitive diagnostic computerized adaptive testing considering item option information

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journal: Acta Psychologica Sinica (《心理学报》)

    Abstract: Most existing cognitive diagnostic computerized adaptive testing (CD-CAT) item selection methods ignore the diagnostic information that distractors provide for multiple-choice (MC) items. Consequently, some useful information is missed and resources are wasted. To overcome this, researchers proposed the Jensen-Shannon divergence (JSD) strategy to select items with the MC-DINA model. However, the JSD strategy needs large samples to obtain reliable estimates of the item parameters before the formal test, and this could compromise the items in the bank. By contrast, the nonparametric method does not require any parameter calibration before the formal test and can be used in small educational programs. The current study proposes two nonparametric item selection methods (i.e., HDDmc and JDDmc) for CD-CAT with MC items as well as two termination rules (i.e., MR and DR) for variable-length CD-CAT with MC items. Two simulation studies were conducted to examine the performance of these nonparametric item selection methods and termination rules. The first study examined the performance of the HDDmc and JDDmc with fixed-length CD-CAT. In this study, six factors were manipulated: the number of attributes (K = 4 vs. 6), the structure of the Q-matrix (simple vs. complex), the quality of the item bank (high vs. low vs. mixed), the distribution of the attribute profile (multivariate normal threshold model vs. discrete uniform distribution), the test length (two, three, or four times K), and the item selection method (HDDmc vs. JDDmc vs. JSD). Of these, item selection method was the within-group variable, and the rest were between-group variables. The results showed that: (1) the HDDmc and JDDmc produced higher attribute pattern matched ratios (PMRs) than the JSD method for most conditions; (2) the HDDmc and JDDmc produced similar PMRs for all conditions; (3) the HDDmc and JDDmc produced more even distributions of item exposure than the JSD method.
The second simulation study investigated the performance of the MR and DR with variable-length CD-CAT. Six factors were also manipulated in this study: the settings for the number of attributes, the structure of the Q-matrix, the quality of the item bank, and the distribution of the attribute profile were the same as in the first study; the other two factors were termination rules (MR, DR, D1, and D3) and item selection methods (HDDmc and JDDmc). Again, the first four were between-group variables, while termination rules and item selection methods were within-group variables. The results showed that: (1) the HDDmc and JDDmc yielded higher PMRs for MR and DR rules than for the D1 and D3 rules; (2) the HDDmc and JDDmc yielded longer test lengths for MR and DR rules than for the D1 and D3 rules, especially for the JDD rule. In sum, both nonparametric item selection methods and the two new termination rules proved appropriate for CD-CAT with MC items, which means they can be used to balance the trade-off between measurement accuracy and item exposure rate.
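    The "even distribution of item exposure" criterion in the results above can be quantified directly: an item's exposure rate is the proportion of examinees to whom it was administered, and a chi-square-type statistic measures how far the observed rates deviate from a perfectly balanced bank. The Python sketch below is a hypothetical illustration (not the authors' simulation code): the administration log is drawn at random rather than adaptively, and all sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)

n_examinees, bank_size, test_length = 500, 60, 12

# Hypothetical administration log: row i lists the items given to
# examinee i (a real CAT would choose them adaptively, not at random).
administered = np.array(
    [rng.choice(bank_size, size=test_length, replace=False)
     for _ in range(n_examinees)]
)

# Exposure rate of each item = times administered / number of examinees.
counts = np.bincount(administered.ravel(), minlength=bank_size)
exposure = counts / n_examinees

# Under perfectly even exposure every rate equals test_length / bank_size.
ideal = test_length / bank_size
# Chi-square-type evenness statistic: 0 means perfectly balanced exposure.
chi2 = np.sum((exposure - ideal) ** 2 / ideal)
print(f"max exposure rate: {exposure.max():.3f}, evenness chi2: {chi2:.4f}")
```

    A smaller statistic means exposure is spread more evenly over the bank, which is the sense in which HDDmc and JDDmc outperform JSD above.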

  • An option-level nonparametric method for cognitive diagnosis

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journal: Acta Psychologica Sinica (《心理学报》)

    Abstract: Cognitive diagnostic assessment (CDA) focuses on evaluating students' strengths and weaknesses in knowledge mastery, providing an opportunity for individualized teaching. CDA has therefore attracted the attention of many scholars, teachers, and students both at home and abroad. In CDA and in many standardized tests, multiple-choice (MC) items are a typical item type; they have the advantages of being unaffected by subjective scoring errors, improving test reliability, being easy to review, being quick to score, and meeting the needs of content balance. To exploit the potential of MC items for CDA, researchers proposed MC cognitive diagnosis models (MC-CDMs). However, these MC-CDMs are parametric methods that need a large sample size to obtain accurate parameter estimates; they are not suitable for small class-level samples, and the MCMC algorithm is very time-consuming. In this study, three nonparametric MC cognitive diagnosis methods based on Hamming distance are proposed, aiming to maximize the diagnostic efficacy of MC items while remaining suitable for diagnosis with small samples. Simulation study 1 considered four factors: sample size (30, 50, 100), test length (10, 20, 30), item quality (high and low), and the true model (MC-S-DINA1, MC-S-DINA2). Three nonparametric MC methods and two parametric models were compared. The results showed that in most conditions, the pattern accuracy rates and average attribute accuracy rates of the nonparametric MC method ($d_{\text{h-MC}}$) were higher than those of the parametric models, especially when the test length was short or the item quality was low. In a real test situation, the quality of different items in a test may vary greatly; accordingly, simulation study 2 set the first half of the items to high quality and the remaining items to low quality. The results showed that the pattern accuracy rates and average attribute accuracy rates of the nonparametric MC method ($d_{\text{ph-MC}}$) were higher than those of the parametric models in all conditions. In an empirical study, the nonparametric MC methods and the parametric models were used to analyze the same set of real data. The results showed that the nonparametric methods and the parametric models yielded highly consistent classifications, and the $d_{\text{ph-MC}}$ method produced satisfactory estimates. In sum, $d_{\text{h-MC}}$ was suitable in most conditions, especially when the test length was short or the item quality was low; when item quality varied widely across items, $d_{\text{ph-MC}}$ was a better choice than the parametric approaches.
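    The core idea behind Hamming-distance classification can be sketched compactly: under a conjunctive (DINA-type) ideal response rule, each candidate attribute pattern implies an ideal response vector, and an examinee is assigned the pattern whose ideal responses are closest in Hamming distance to the observed responses. The Python sketch below illustrates only this dichotomous core; the Q-matrix and responses are made up, and the authors' $d_{\text{h-MC}}$ and $d_{\text{ph-MC}}$ methods additionally exploit option-level information, which is not shown here.

```python
import itertools
import numpy as np

# Hypothetical Q-matrix: 5 items x 3 attributes (1 = item requires attribute).
Q = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
])
K = Q.shape[1]

# All 2^K candidate attribute patterns.
patterns = np.array(list(itertools.product([0, 1], repeat=K)))

# Conjunctive (DINA-type) ideal response: an item is answered correctly
# iff the examinee masters every attribute the item requires.
ideal = (patterns @ Q.T == Q.sum(axis=1)).astype(int)  # shape (2^K, items)

def classify(x):
    """Assign the attribute pattern whose ideal responses minimize the
    Hamming distance to the observed response vector x."""
    distances = np.abs(ideal - x).sum(axis=1)
    return patterns[np.argmin(distances)]

# An examinee answering items 1, 2, and 4 correctly matches the ideal
# responses of the pattern mastering attributes 1 and 2.
print(classify(np.array([1, 1, 0, 1, 0])))  # -> [1 1 0]
```

    Because no item parameters are estimated, the procedure works even with the class-sized samples (30-100 examinees) considered in simulation study 1.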

  • Nonparametric cognitive diagnostic computerized adaptive testing using distractor information

    Subjects: Psychology >> Psychological Measurement submitted time 2022-04-06

    Abstract:

    Most existing cognitive diagnostic computerized adaptive testing (CD-CAT) item selection methods ignore the diagnostic information that distractors provide for multiple-choice (MC) items. Consequently, some useful information is missed and resources are wasted. To overcome this, Yigit et al. (2019) proposed the Jensen–Shannon divergence (JSD) strategy to select items with the MC-DINA model (de la Torre, 2009). However, the JSD strategy needs large samples to obtain reliable estimates of the item parameters before the formal test, and this could compromise the items in the bank. By contrast, the nonparametric method does not require any parameter calibration before the formal test and can be used in small educational programs.
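    For readers unfamiliar with the criterion, the Jensen-Shannon divergence is a symmetrized, bounded relative of Kullback-Leibler divergence: the entropy of a weighted mixture of distributions minus the weighted sum of their entropies. The generic computation is sketched below in Python; the option distributions and weights are hypothetical, and the actual JSD strategy combines this quantity with MC-DINA model probabilities and posterior weights over attribute classes, which is not reproduced here.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in nats; 0 * log(0) is treated as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -np.sum(p[nz] * np.log(p[nz]))

def jsd(dists, weights):
    """Weighted Jensen-Shannon divergence of several distributions:
    H(sum_i w_i P_i) - sum_i w_i H(P_i). Non-negative; 0 iff all equal."""
    dists = np.asarray(dists, dtype=float)
    weights = np.asarray(weights, dtype=float)
    mixture = weights @ dists
    return entropy(mixture) - np.sum(weights * [entropy(d) for d in dists])

# Two hypothetical option-response distributions (4 options each) for two
# attribute classes, weighted by the classes' (made-up) probabilities.
p_class1 = [0.70, 0.10, 0.10, 0.10]
p_class2 = [0.10, 0.70, 0.10, 0.10]
print(jsd([p_class1, p_class2], [0.5, 0.5]))  # > 0: item separates classes
print(jsd([p_class1, p_class1], [0.5, 0.5]))  # 0.0: no diagnostic information
```

    An item whose option distributions differ sharply across attribute classes gets a large divergence and is therefore informative for classification; the large-sample calibration this requires is exactly what the nonparametric alternatives avoid.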

    The current study proposes two nonparametric item selection methods (i.e., HDDmc and JDDmc) for CD-CAT with MC items as well as two termination rules (i.e., MR and DR) for variable-length CD-CAT with MC items. Two simulation studies were conducted to examine the performance of these nonparametric item selection methods and termination rules.

    The first study examined the performance of the HDDmc and JDDmc with fixed-length CD-CAT. In this study, six factors were manipulated: the number of attributes (K = 4 vs. 6), the structure of the Q-matrix (simple vs. complex), the quality of the item bank (high vs. low vs. mixed), the distribution of the attribute profile (multivariate normal threshold model vs. discrete uniform distribution), the test length (two, three, or four times K), and the item selection method (HDDmc vs. JDDmc vs. JSD). Of these, item selection method was the within-group variable, and the rest were between-group variables. The results showed that: (1) the HDDmc and JDDmc produced higher attribute pattern matched ratios (PMRs) than the JSD method for most conditions; (2) the HDDmc and JDDmc produced similar PMRs for all conditions; (3) the HDDmc and JDDmc produced more even distributions of item exposure than the JSD method.

    The second simulation study investigated the performance of the MR and DR with variable-length CD-CAT. Six factors were also manipulated in this study: the settings for the number of attributes, the structure of the Q-matrix, the quality of the item bank, and the distribution of the attribute profile were the same as in the first study; the other two factors were termination rules (MR, DR, D1, and D3) and item selection methods (HDDmc and JDDmc). Again, the first four were between-group variables, while termination rules and item selection methods were within-group variables. The results showed that: (1) the HDDmc and JDDmc yielded higher PMRs for MR and DR rules than for the D1 and D3 rules; (2) the HDDmc and JDDmc yielded longer test lengths for MR and DR rules than for the D1 and D3 rules, especially for the JDD rule.

    In sum, both nonparametric item selection methods and the two new termination rules proved appropriate for CD-CAT with MC items, which means they can be used to balance the trade-off between measurement accuracy and item exposure rate.