  • Test Design Modes for Examinees with Different Cognitive Structures

    Subjects: Psychology >> Social Psychology submitted time 2023-03-27 Cooperative journals: 《心理学报》

    Abstract: Doctors have to use different medical technologies to diagnose different kinds of illness effectively. Similarly, teachers have to use well-designed tests to evaluate students with different cognitive structures accurately. To provide such an evaluation, we recommend adopting Cognitive Diagnostic Assessment (CDA). CDA measures students' specific cognitive structures and processing skills, thereby providing information about their cognitive strengths and weaknesses. The typical design procedure of a CDA test is as follows: first, identify the target attributes and their hierarchical relationships; second, design a Q matrix (which characterizes the test construct and content); finally, construct the test items. Within this design framework, two forms of test are available: the traditional test and the computerized adaptive test (CAT). The former has a fixed structure for all participants regardless of their cognitive structures, whereas the latter is tailored to each participant's cognitive structure. Researchers have not, however, considered test designs specific to different cognitive structures when using these two forms. As a result, the traditional test requires more items to evaluate a group of participants with mixed cognitive structures precisely, and a cognitive diagnosis computerized adaptive test (CD-CAT) uses its item bank inefficiently because of problems in assembling a suitable bank. The key to overcoming these hurdles is to explore designs tailored to participants with different cognitive structures.

    As discussed above, a reasonable diagnostic test should be specific to the cognitive structures of the target examinees so as to classify them precisely and efficiently. This is in line with CAT, in which an ideal item bank is the cornerstone for achieving this purpose. In this regard, Reckase (2003, 2007, 2010) proposed an approach named p-optimality for designing an optimal item bank. Inspired by p-optimality and guided by the characteristics of CDA, we propose a method for designing tests for different cognitive structures. We conducted a Monte Carlo simulation study to explore test design modes for different cognitive structures under six attribute hierarchical structures (Linear, Convergent, Divergent, Unstructured, Independent, and Mixture).

    The results show that: (1) under the same hierarchical structure, the optimal test design modes for different cognitive structures differ in test length, the initial exploration stage (Stage 0), and the accurate estimation stage (Stage 1); (2) the CD-CAT item bank we assembled according to these optimal test design modes outperforms other commonly used item banks in item pool usage, whether a fixed-length or a variable-length test is used. Based on these results, we provide suggestions for item bank assembly.
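
    The design procedure above starts from the attribute hierarchy, which determines the knowledge states (cognitive structures) that are logically possible and thereby constrains the Q matrix. As a minimal sketch of that first step, the Python snippet below enumerates the permissible knowledge states implied by a hierarchy's reachability matrix; the four-attribute linear hierarchy and all names are illustrative assumptions, not taken from the paper.

        import itertools
        import numpy as np

        def reachability(adjacency):
            # Boolean transitive closure of the direct-prerequisite matrix:
            # R[j, k] = 1 iff attribute j is (directly or indirectly) a
            # prerequisite of attribute k, with R[k, k] = 1 by convention.
            K = adjacency.shape[0]
            R = ((adjacency + np.eye(K, dtype=int)) > 0).astype(int)
            for _ in range(K):                    # repeated squaring reaches the fixed point
                R = ((R @ R) > 0).astype(int)
            return R

        def permissible_states(R):
            # Keep only the knowledge states consistent with the hierarchy:
            # mastering an attribute requires mastering all its prerequisites.
            K = R.shape[0]
            keep = []
            for alpha in itertools.product([0, 1], repeat=K):
                alpha = np.array(alpha)
                if all(alpha[k] == 0 or all(alpha[j] == 1 for j in range(K) if R[j, k])
                       for k in range(K)):
                    keep.append(alpha)
            return np.array(keep)

        # Linear hierarchy A1 -> A2 -> A3 -> A4 (direct prerequisites only)
        A = np.zeros((4, 4), dtype=int)
        A[0, 1] = A[1, 2] = A[2, 3] = 1
        print(permissible_states(reachability(A)))  # 5 states: 0000, 1000, 1100, 1110, 1111

    Under a linear hierarchy only K + 1 of the 2^K conceivable states survive, which illustrates why the hierarchical structure matters so much for test design.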

  • Research on Person-fit in Cognitive Diagnostic Assessment

    Subjects: Psychology >> Psychological Measurement; Psychology >> Statistics in Psychology; Psychology >> Educational Psychology submitted time 2022-05-12

    Abstract:

    Cognitive Diagnostic Assessment (CDA) has been widely used in educational assessment. By analyzing whether test-takers have mastered specific knowledge points or skills, it can provide guidance for further learning and teaching.

    In psychometrics, statistical methods for assessing the fit of an examinee's item responses to a postulated psychometric model are called person-fit statistics. Person-fit analysis helps verify individual diagnostic results and is mainly used to distinguish aberrant examinees from normal ones. Aberrant response patterns include "sleeping" behavior, fatigue, cheating, creative responding, random guessing, and cheating with randomness, and all of them can bias the estimation of an examinee's ability. Person-fit analysis helps researchers identify aberrant response patterns more accurately, so that aberrant respondents can be screened out and the validity of the test improved. In the past, most person-fit research was carried out within the Item Response Theory (IRT) framework, while only a few papers have dealt with person fit within the cognitive diagnosis model (CDM) framework. This study attempts to fill that gap by introducing a new method: a new person-fit index, R, is proposed.
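
    Of the existing statistics this study compares against, lz (Drasgow, Levine, & Williams, 1985) is the classic IRT person-fit index: the examinee's response log-likelihood standardized by its expectation and variance under the fitted model. The Python sketch below is a minimal version of that formula, assuming success probabilities taken from an already fitted model; the numbers in the example call are illustrative.

        import numpy as np

        def lz_statistic(u, p):
            # Standardized log-likelihood person-fit statistic l_z for one
            # examinee. u: 0/1 response vector; p: model-implied probabilities
            # of a correct response (must lie strictly between 0 and 1).
            u, p = np.asarray(u, float), np.asarray(p, float)
            l0 = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))  # observed log-likelihood
            e = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))   # expectation of l0
            v = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)    # variance of l0
            return (l0 - e) / np.sqrt(v)

        # Illustrative call: large negative values signal aberrant responding
        print(lz_statistic([1, 1, 0, 1, 0], [0.9, 0.8, 0.7, 0.6, 0.4]))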

    To verify the validity of the newly developed person-fit index, this study examines the Type I error rate and statistical power of the R index under different test lengths, item discrimination levels, and types of misfitting respondents, and compares it with the existing RCI and lz statistics. The Type I error rate was defined as the proportion of 1,000 normal response patterns, generated from the DINA model, that a person-fit statistic flagged as aberrant. The fixed conditions of the study were: 1,000 examinees, the DINA model as the cognitive diagnosis model, six attributes, and a fixed Q matrix. Finally, to demonstrate the practical value of the index, the R index was applied to empirical fraction subtraction data.
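
    To make the simulation design concrete: under the DINA model, an examinee answers item j correctly with probability 1 - s_j (one minus the slip parameter) if he or she masters every attribute the Q matrix requires for that item, and with the guessing probability g_j otherwise. The sketch below mirrors the Type I error check described above, reusing lz_statistic from the previous sketch as the flagging statistic; the random Q matrix, the slip/guess values of 0.2, and the -1.645 cutoff are illustrative assumptions, not the paper's settings (the abstract does not specify the R index itself).

        import numpy as np

        rng = np.random.default_rng(0)

        def dina_probs(alpha, Q, slip, guess):
            # DINA success probabilities: P(X_j = 1) = 1 - s_j when the
            # examinee masters all attributes item j requires, else g_j.
            eta = np.all(alpha >= Q, axis=1)        # conjunctive mastery indicator
            return np.where(eta, 1 - slip, guess)

        # Illustrative design: 20 items, 6 attributes, random Q matrix,
        # uniform slip/guess of 0.2 (assumed values, not the paper's)
        J, K = 20, 6
        Q = rng.integers(0, 2, size=(J, K))
        Q[Q.sum(axis=1) == 0, rng.integers(0, K)] = 1   # each item needs >= 1 attribute
        slip, guess = np.full(J, 0.2), np.full(J, 0.2)

        flags = 0
        for _ in range(1000):                       # 1,000 normal examinees
            alpha = rng.integers(0, 2, size=K)      # random knowledge state
            p = dina_probs(alpha, Q, slip, guess)
            u = (rng.random(J) < p).astype(int)     # model-consistent responses
            flags += lz_statistic(u, p) < -1.645    # lz_statistic: see previous sketch
        print(flags / 1000)                         # empirical Type I error rate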

    The results show that the Type I error rate of the R index is reasonable and stable around the nominal 0.05 level. As for statistical power, the power of every index against the various types of aberrant examinees improves as item discrimination increases, and most indices' power also trends upward as the number of items increases. Across the types of aberrant examinees, the R index performs best for random guessing and cheating with randomness, whereas the lz index performs better for fatigue, sleeping behavior, and creative responding. In the empirical study, 4.29% of examinees were flagged as aberrant.

    In sum, the power of the R index improves as item discrimination and the number of items increase, and the R index is the most robust of the indices examined when item discrimination is low. The R index has high power against aberrant behaviors such as creative responding, random guessing, and cheating with randomness.

    "