• Development of Online Calibration Method Based on SCAD Penalty and EM Perspective in CD-CAT: A Study Based on the G-DINA Model

    Subjects: Psychology >> Psychological Measurement submitted time 2023-11-22

    Abstract: Cognitive diagnostic computerized adaptive testing (CD-CAT) provides a timely, accurate, and detailed diagnosis of an examinee’s strengths and weaknesses in the measured content. The diagnosis can serve as a reference for further study or remediation planning, which meets the practical need for efficient and detailed test results. Successful implementation of CD-CAT depends on an item bank, and maintaining that bank is challenging. Online calibration is a psychometrically popular way to maintain an item bank. However, little research exists on online calibration methods for CD-CAT that calibrate the Q-matrix and the item parameters simultaneously, and the existing methods were mostly developed for the deterministic inputs, noisy “and” gate (DINA) model. Compared with the DINA model, the generalized DINA (G-DINA) model has been applied more widely because it is less restrictive and can accommodate a wide range of test data in psychological and educational assessment. An online calibration method that jointly calibrates the Q-matrix and item parameters for less constrained models such as G-DINA would therefore be of clear practical value.
    In the current study, a new online calibration method suitable for the G-DINA model, SCADOCM, was proposed. SCADOCM is built on the smoothly clipped absolute deviation (SCAD) penalty and marginal maximum likelihood estimation with the EM algorithm (MMLE/EM). For a new item j, a SCAD-penalized log-likelihood function is formulated from the examinees’ responses to the item and their marginal attribute mastery probabilities, and the q-vector of the new item is estimated with the SCAD-based q-vector estimator. The EM algorithm is then used to estimate the item parameters of new item j from the posterior distributions of the examinees’ attribute patterns, their responses to new item j, and the estimated q-vector.
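The SCAD penalty named above has a standard closed form (Fan and Li, 2001), sketched below for illustration. The function `penalized_loglik`, the choice to penalize item-effect coefficients, the scaling by the sample size, and the default a = 3.7 are assumptions made here, not details taken from the abstract.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty for one coefficient (a > 2, lam >= 0)."""
    if lam < 0 or a <= 2:
        raise ValueError("SCAD requires lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: same linear penalty as the lasso.
        return lam * t
    if t <= a * lam:
        # Middle range: quadratic piece that tapers the penalty off.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


def penalized_loglik(loglik: float, coefs, lam: float, n: int) -> float:
    """Hypothetical objective: log-likelihood minus n times the summed SCAD
    penalty over the coefficients (an assumed form, for illustration only)."""
    return loglik - n * sum(scad_penalty(c, lam) for c in coefs)
```

Because the penalty is flat beyond a·lam, large effects are left almost unbiased while small ones can be shrunk to zero, which is the property that makes a sparse q-vector recoverable in principle.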
    To examine the performance of SCADOCM and compare it with the SIE method, two simulation studies (Study 1 and Study 2) were conducted. Study 1 was based on a simulated item bank, and Study 2 on a real item bank (the Internet addiction item bank; Shi, 2017). Four factors were manipulated: the calibration sample size (nj = 50, 100, 500, 1000, or 2000), the distribution of the attribute patterns (uniform, higher-order, or normal), the item quality (U(0.05, 0.15) vs. U(0.1, 0.3)), and the online calibration method (SCADOCM vs. SIE). The results showed that (1) SCADOCM had satisfactory calibration accuracy and calibration efficiency and was superior to the SIE method; the traditional SIE method was not applicable to the G-DINA model, and its Q-matrix estimation accuracy was low under all experimental conditions. (2) Under most conditions, the item calibration accuracy of both SCADOCM and SIE increased with the calibration sample size and with item quality, and it was higher under the uniform and higher-order distributions than under the normal distribution. (3) The calibration efficiency of SCADOCM decreased as the calibration sample size increased but was little affected by item quality or by the attribute pattern distribution; the calibration efficiency of SIE also decreased as the calibration sample size increased and was little affected by item quality, but it was slightly lower under the normal distribution than under the uniform and higher-order distributions.
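As a rough illustration of the simulation design, the four manipulated factors can be enumerated as a grid. The sketch assumes the factors were fully crossed, which the abstract does not state explicitly; the variable names are hypothetical.

```python
from itertools import product

# Factor levels as listed in the abstract.
sample_sizes = [50, 100, 500, 1000, 2000]
distributions = ["uniform", "higher-order", "normal"]
item_quality = [(0.05, 0.15), (0.1, 0.3)]  # bounds of the uniform distributions
methods = ["SCADOCM", "SIE"]

# Assumed fully crossed design: one dict per experimental condition.
conditions = [
    {"n_j": n, "distribution": d, "quality": q, "method": m}
    for n, d, q, m in product(sample_sizes, distributions, item_quality, methods)
]

print(len(conditions))  # 5 * 3 * 2 * 2 conditions
```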
    In sum, this study demonstrated that SCADOCM has higher item calibration accuracy and calibration efficiency than the SIE method, and that the traditional SIE method is not suitable for the G-DINA model. The study provides an efficient and accurate method for item calibration in CD-CAT and offers important support for promoting the application of CD-CAT in practice.

  • A New Dual-Objective CD-CAT Item Selection Method Based on the Gini Index

    Subjects: Psychology >> Psychological Measurement submitted time 2020-09-02

    Abstract: Existing literature has shown that dual-objective CD-CAT can serve the measurement objectives of both formative and summative assessment. The Gini index can measure the degree of uncertainty of a random variable, since a smaller Gini value indicates less uncertainty. This paper therefore proposed a Gini-index-based item selection method for dual-objective CD-CAT that measures the changes in the posterior probability of the knowledge state and in the confidence interval of the latent trait estimate. Following Bayesian decision theory, the potential information about a participant is detected from the participant’s responses and from the changes in the posterior distributions of the two random variables. A Monte Carlo simulation compared the performance of selection methods based on Gini, ASI, IPA, and JSD. Each item bank contained 250 items measuring 5 attributes in total, and each item measured at most 3 attributes. The true knowledge state of each participant was generated by HO-CDM and multivariate normal models (both means were 0, and the covariance coefficients were 0.8 and 0.2, respectively). G-DINA, DINA, and R-RUM were adopted as the cognitive diagnostic models, and the item bank for each of the three models included both CDM and 2PL parameters. Specifically, the CDM parameters were generated with a G-DINA package in R, with the slipping and guessing parameters drawn randomly from a uniform distribution on the range 0.05 to 0.25. The 2PL parameters were estimated with the mirt package from the responses of 3,000 participants to all items in the item bank. Four indexes were used to compare the item selection methods: pattern match rate, root mean square error of the latent trait, chi-square value, and time needed for item selection. The value of each index was the mean over 10 repeated simulations of 1,000 participants’ responses. The results showed that (1) The Gini and IPA selection methods performed similarly in pattern match rate, root mean square error of the latent trait, and chi-square value. Both methods had high measurement precision and low sensitivity to the CDM and to the distribution of participants’ cognitive patterns, which makes them applicable to item banks that mix cognitive diagnosis models. The Gini method slightly outperformed the IPA method in pattern match rate, and its item selection time was only one-tenth that of the IPA method. (2) Both the Gini and ASI selection methods are weighted linear combination approaches, and they performed very similarly in the short test. In the long test, although the item selection time of the ASI method was only one-third that of the Gini method, the Gini method was superior in measurement accuracy and chi-square value. (3) Although the JSD method outperformed the Gini method in uniformity of item bank usage and item selection time, its measurement accuracy was far lower. To summarize, the Gini, IPA, and ASI selection methods all have good measurement accuracy and are all recommended for short tests. For medium and long tests with a limited number of attributes and a smaller item bank, the Gini and IPA selection methods are recommended. As the number of attributes and the item bank size grow, the Gini method is recommended. When the attributes are highly correlated and both the number of attributes and the item bank are large, the ASI and JSD selection methods are recommended, with the ASI method slightly outperforming the JSD method in measurement accuracy.
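To make the role of the Gini index concrete, the sketch below computes G = 1 − Σ p² for a posterior over knowledge states and picks the candidate item with the smallest expected posterior Gini. It covers only the knowledge-state component under a DINA model with two attributes; the item parameters, function names, and the selection rule itself are illustrative assumptions, not the paper’s dual-objective index.

```python
from itertools import product


def gini(probs):
    """Gini index of a discrete distribution: 1 minus the sum of squared probabilities."""
    return 1.0 - sum(p * p for p in probs)


def dina_p_correct(alpha, q, guess, slip):
    """DINA success probability: 1 - slip if all required attributes are mastered, else guess."""
    mastered_all = all(a >= r for a, r in zip(alpha, q))
    return 1.0 - slip if mastered_all else guess


def posterior_update(prior, patterns, item, response):
    """Bayes update of the knowledge-state distribution after one scored response."""
    q, guess, slip = item
    weights = []
    for p, alpha in zip(prior, patterns):
        pc = dina_p_correct(alpha, q, guess, slip)
        weights.append(p * (pc if response == 1 else 1.0 - pc))
    total = sum(weights)
    return [w / total for w in weights]


def expected_posterior_gini(prior, patterns, item):
    """Posterior Gini averaged over the predictive distribution of the response."""
    q, guess, slip = item
    p1 = sum(p * dina_p_correct(a, q, guess, slip) for p, a in zip(prior, patterns))
    expected = 0.0
    for response, p_resp in ((1, p1), (0, 1.0 - p1)):
        if p_resp > 0:
            expected += p_resp * gini(posterior_update(prior, patterns, item, response))
    return expected


# Hypothetical example: 2 attributes, flat prior, two candidate items (q, guess, slip).
patterns = list(product([0, 1], repeat=2))
prior = [0.25] * len(patterns)
items = {"A": ((1, 0), 0.1, 0.1), "B": ((1, 0), 0.3, 0.3)}
best = min(items, key=lambda k: expected_posterior_gini(prior, patterns, items[k]))
print(best)
```

With these made-up parameters the less noisy item is preferred, since its responses concentrate the posterior more and so leave a smaller expected Gini value.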