Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
The assumption of conditional independence between the responses and the response times (RTs) for a given person is common in RT modeling. However, when the speed of a test taker is not constant, this assumption will be violated. In this article we propose a conditional joint model for item responses and RTs, which incorporates a covariance structure to explain the local dependency between speed and accuracy. To obtain information about the population of test takers, the new model was embedded in the hierarchical framework proposed by van der Linden (2007). A fully Bayesian approach using a straightforward Markov chain Monte Carlo (MCMC) sampler was developed to estimate all parameters in the model. The deviance information criterion (DIC) and the Bayes factor (BF) were employed to compare the goodness of fit between the models with two different parameter structures. The Bayesian residual analysis method was also employed to evaluate the fit of the RT model. Based on the simulations, we conclude that (1) the new model noticeably improves the parameter recovery for both the item parameters and the examinees’ latent traits when the assumptions of conditional independence between the item responses and the RTs are relaxed and (2) the proposed MCMC sampler adequately estimates the model parameters. The applicability of our approach is illustrated with an empirical example, and the model fit indices indicated a preference for the new model.

2.
Structural equation modeling (SEM) is now a generic modeling framework for many multivariate techniques applied in the social and behavioral sciences. Many statistical models can be considered either as special cases of SEM or as part of the latent variable modeling framework. One popular extension is the use of SEM to conduct linear mixed-effects modeling (LMM) such as cross-sectional multilevel modeling and latent growth modeling. It is well known that LMM can be formulated as structural equation models. However, one main difference between the implementations in SEM and LMM is that maximum likelihood (ML) estimation is usually used in SEM, whereas restricted (or residual) maximum likelihood (REML) estimation is the default method in most LMM packages. This article shows how REML estimation can be implemented in SEM. Two empirical examples, a latent growth model and a meta-analysis, are used to illustrate the procedures implemented in OpenMx. Issues related to implementing REML in SEM are discussed.

3.
Psychometric models based on the structural equation modeling framework are commonly used in many multiple-choice test settings to assess measurement invariance of test items across examinee subpopulations. The premise of the current article is that they may also be useful in the context of performance assessment tests to test measurement invariance of raters. The modeling approach and how it can be used for performance tests with less than optimal rater designs are illustrated using a data set from a performance test designed to measure medical students’ patient management skills. The results suggest that group-specific rater statistics can help spot differences in rater performance that might be due to rater bias, identify specific weaknesses and strengths of individual raters, and enhance decisions related to future task development, rater training, and test scoring processes.

4.
5.
Reliability can be estimated using structural equation modeling (SEM). Two potential problems with this approach are that estimates may be unstable with small sample sizes and biased with misspecified models. A Monte Carlo study was conducted to investigate the quality of SEM estimates of reliability by themselves and relative to coefficient alpha. The SEM estimates showed minimal bias when the model was correctly specified if items were relatively well defined by their underlying factor(s). They tended to demonstrate somewhat greater bias when the model was misspecified, particularly underspecified. Overall, SEM estimates were more stable than anticipated. Researchers are more likely to obtain accurate estimates of reliability using SEM by conducting large-sample studies with well-constructed scales and critically assessing model fit.
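
As a rough illustration of the baseline that the SEM estimates are compared against, coefficient alpha can be computed directly from a persons-by-items score matrix. The sketch below is a minimal Python version; the function name `coefficient_alpha` and the toy data are assumptions made for illustration and are not taken from the article.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def coefficient_alpha(scores):
    """Coefficient (Cronbach's) alpha for a persons-by-items score matrix.

    `scores` is a list of rows, one per person, each holding k item scores.
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across persons.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the total (sum) score across persons.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: two items that rank three persons identically.
toy_scores = [[1, 1], [2, 2], [3, 3]]
alpha = coefficient_alpha(toy_scores)
```

With perfectly consistent items, as in the toy data, the item variances sum to half of the total-score variance and alpha reaches its upper bound of 1.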

6.
Differential equation models have intuitively meaningful parameters that can be mapped to developmental theories emphasizing nonlinear multiple timescale processes. Following this tradition, we map theoretical propositions of infant-parent self-/co-regulation to intrinsic and extrinsic dynamics captured by fishery models wherein fish’s reproduction and farmers’ harvesting contribute to population size. Integrated and estimated within a multilevel growth modeling framework, the model captures distinct components of self-/co-regulation and changes with age. We use simulations to examine viability of the model’s application to real-world data and illustrate the model’s utility using exemplar data drawn from a multiple timescale study of infant distress and mother soothing behaviors (N = 144 dyads) as the infant received routine immunizations at ages 2 and 6 months. Results highlight the benefits of articulating theoretical propositions within a differential equations framework, and using multiple timescale study designs with natural or experimentally induced perturbations in the study of self-/co-regulation and its development.

7.
This study compares five cognitive diagnostic models in search of optimal one(s) for English as a Second Language grammar test data. Using a unified modeling framework that can represent specific models with proper constraints, the article first fit the full model (the log-linear cognitive diagnostic model, LCDM) and investigated which model emerged as the dominant model. It then fit the dominant model and the other models to confirm that the model provides the best fit to the data. The model found to represent the largest number of items in the test was the Compensatory Reparameterized Unified Model (C-RUM), and the other models compared were the Deterministic-Input, Noisy-And (DINA), Deterministic Input, Noisy-Or-gate (DINO), and Noisy Input, Deterministic-Or-gate (NIDO). The absolute (item-association root mean square error values) and relative (information criteria) model fit indices also indicated that the LCDM and the C-RUM were the best fit to the data. More detailed analyses on the functioning of the C-RUM were conducted and the interpretation of the results was included in the discussion section. The article ends with some suggestions for future research based on the limitations of the study.

8.
It is known that the Rasch model is a special two-level hierarchical generalized linear model (HGLM). This article demonstrates that the many-faceted Rasch model (MFRM) is also a special case of the two-level HGLM, with a random intercept representing examinee ability on a test, and fixed effects for the test items, judges, and possibly other facets. This perspective suggests useful modeling extensions of the MFRM. For example, in the HGLM framework it is possible to model random effects for items and judges in order to assess their stability across examinees. The MFRM can also be extended so that item difficulty and judge severity are modeled as functions of examinee characteristics (covariates), for the purposes of detecting differential item functioning and differential rater functioning. Practical illustrations of the HGLM are presented through the analysis of simulated and real judge-mediated data sets involving ordinal responses.
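
For orientation, the two models named above are often written as follows. The notation is an assumption chosen here, not taken from the article: \theta_p is the ability of examinee p (the random intercept), \delta_i the difficulty of item i, \lambda_j the severity of judge j, and \tau_k the threshold for moving from category k-1 to k of an ordinal rating scale.

```latex
% Rasch model for a dichotomous response X_{pi}
\operatorname{logit} P(X_{pi} = 1 \mid \theta_p) = \theta_p - \delta_i,
\qquad \theta_p \sim N(0, \sigma^2)

% Many-facet Rasch model (rating scale form) for an ordinal rating
% by judge j of examinee p on item i, comparing adjacent categories
\ln \frac{P_{pijk}}{P_{pij(k-1)}} = \theta_p - \delta_i - \lambda_j - \tau_k
```

In both expressions the examinee term enters as a person-level random effect while the item, judge, and threshold terms are fixed effects, which is the structure the abstract describes as a two-level HGLM.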

9.
Structural equation models are typically evaluated on the basis of goodness-of-fit indexes. Despite the popularity of these indexes, the values they should attain to decide confidently between the acceptance and rejection of a model have been greatly debated. A recently proposed approach by means of equivalence testing has been recommended as a superior way to evaluate the goodness of fit of models. The approach has also been proposed as providing a necessary vehicle that can be used to advance the inferential nature of structural equation modeling as a confirmatory tool. The purpose of this article is to introduce readers to key ideas in equivalence testing and illustrate its use for conducting model–data fit assessments. Two confirmatory factor analysis examples, in which a priori specified latent variable models with known structure are tested against data, are used as illustrations. It is advocated that whenever the goodness of fit of a model is to be assessed researchers should always examine the resulting values obtained via the equivalence testing approach.

10.
Multidimensional item response theory (MIRT) provides an ideal foundation for modeling performance in complex domains, taking into account multiple basic abilities simultaneously, and representing different mixtures of the abilities required for different test items. This article provides a brief overview of different MIRT models, and the substantive implications of their differences for educational assessment. To illustrate the flexibility and benefits of MIRT, three application scenarios are described: to account for unintended multidimensionality when measuring a unidimensional construct, to model latent covariance structures between ability dimensions, and to model interactions of multiple abilities required for solving specific test items. All of these scenarios are illustrated by empirical examples. Finally, the implications of using MIRT models on educational processes are discussed.

11.
12.
The aim of the present study was to investigate the various dimensions underlying performance on the diagrams, tables, and maps subtest of the Swedish Scholastic Aptitude Test. Data from two test versions were analyzed using a structural equation modeling technique on 14,463 and 19,636 examinees, respectively. Two alternative three‐factor models containing (a) a general factor influencing all items, (b) a quantitative factor, and (c) either an end of test or complexity factor were identified. On the basis of the results, it may tentatively be concluded that the gender difference in performance of the test in favor of males is due to a great extent to the quantitative factor. An end of test effect seems indisputable, although this effect may be explained by an increasing difficulty of test items rather than, or in addition to, sheer speed. © 1999 John Wiley & Sons, Inc. J Res Sci Teach 36: 565–582, 1999

13.
Deaf and hearing college students' mean reaction times (RTs) were compared on a mental calculation task in which they had to verify the accuracy of solutions to addition and multiplication problems. The deaf students were divided into higher and lower readers. Higher deaf readers and hearing students had similar RTs and accuracy on addition problems; their RTs were greater in the voicing interference mode than in the manual tapping interference mode. The lower deaf readers showed no RT differences between the two interference modes and had consistently lower RT performance and score accuracy across the verification tasks. On the verification task for multiplication problems, all participants showed a greater RT effect for manual tapping. The lower deaf readers were significantly less accurate on multiplication problems.

14.
When dealing with missing responses, two types of omissions can be discerned: items can be skipped or not reached by the test taker. When the occurrence of these omissions is related to the proficiency process, the missingness is nonignorable. The purpose of this article is to present a tree‐based IRT framework for modeling responses and omissions jointly, taking into account that test takers as well as items can contribute to the two types of omissions. The proposed framework covers several existing models for missing responses, and many IRTree models can be estimated using standard statistical software. Further, simulated data is used to show that ignoring missing responses is less robust than often considered. Finally, as an illustration of its applicability, the IRTree approach is applied to data from the 2009 PISA reading assessment.

15.
How is affective change rated with positive adjectives such as "good" related to change rated with negative adjectives such as "bad"? Two nested perfect and imperfect forms of dynamic bipolarity are defined using latent change structural equation models based on tetrads of items. Perfect bipolarity means that latent change scores correlate -1. Meaningful structural equation modeling (SEM) analyses of self-rated affect may require analyzing polychoric correlations, if self-ratings are collected using ordered categories. The models were applied to six 4-wave datasets from Steyer and Riedl (2004). Results suggest that perfect bipolarity is generally compatible with valence self-ratings, whereas imperfect bipolarity is compatible with tension and energy self-ratings. Methodological and substantive limits of the approach are discussed.

16.
Componential IRT models for polytomous items are of particular interest in two contexts: componential research and test development. We assume that there are basic components, such as processes and knowledge structures, involved in solving cognitive tasks. In componential research, the subtask paradigm may be used to isolate such components in subtasks. In test development, items may be composed such that their response alternatives correspond with specific combinations of such components. In both cases the data may be modeled as polytomous items. With Bock's (1972) nominal model as a general framework, transformation matrices can be used to constrain the parameters of the response categories so as to reflect the componential design of the response categories. In this way, both main effects and interaction effects of components can be studied. An application to a spelling task demonstrates this approach.

17.
In many intervention and evaluation studies, outcome variables are assessed using a multimethod approach comparing multiple groups over time. In this article, we show how evaluation data obtained from a complex multitrait–multimethod–multioccasion–multigroup design can be analyzed with structural equation models. In particular, we show how the structural equation modeling approach can be used to (a) handle ordinal items as indicators, (b) test measurement invariance, and (c) test the means of the latent variables to examine treatment effects. We present an application to data from an evaluation study of an early childhood prevention program. A total of 659 children in intervention and control groups were rated by their parents and teachers on prosocial behavior and relational aggression before and after the program implementation. No mean change in relational aggression was found in either group, whereas an increase in prosocial behavior was found in both groups. Advantages and limitations of the proposed approach are highlighted.

18.
The determination of initial equilibrium shapes is a common problem in research work and engineering applications related to membrane structures. Using a general structural analysis framework of the finite particle method (FPM), this paper presents the first application of the FPM and a recently developed membrane model to the shape analysis of lightweight membranes. The FPM is rooted in vector mechanics and physical viewpoints. It discretizes the analyzed domain into a group of particles linked by elements, and the motion of the free particles is directly described by Newton's second law while the constrained ones follow the prescribed paths. An efficient physical modeling procedure of handling geometric nonlinearity has been developed to evaluate the particle interaction forces. To achieve the equilibrium shape as fast as possible, an integral-form, explicit time integration scheme has been proposed for solving the equation of motion. The equilibrium shape can be obtained naturally without nonlinear iterative correction and global stiffness matrix integration. Two classical curved surfaces of tension membranes produced under the uniform-stress condition are presented to verify the accuracy and efficiency of the proposed method.

19.
Schematic modeling is presented as an epistemologic framework for physics instruction. According to schematic modeling, models comprise the content core of scientific knowledge, and modeling is a major process for constructing and employing this knowledge. A model is defined by its composition and structure and is situated in a theory by its domain and organization. Modeling involves model selection, construction, validation, analysis, and deployment. Two groups of Lebanese high school and college students participated in problem-solving tutorials that followed a schematic modeling approach. Both groups improved significantly in problem-solving performance, and course achievement of students in the college group was significantly better than that of their control peers. © 1996 John Wiley & Sons, Inc.

20.
Large-scale assessments of student competencies address rather broad constructs and use parsimonious, unidimensional measurement models. Differential item functioning (DIF) in certain subpopulations usually has been interpreted as error or bias. Recent work in educational measurement, however, assumes that DIF reflects the multidimensionality that is inherent in broad competency constructs and leads to differential achievement profiles. Thus, DIF parameters can be used to identify the relative strengths and weaknesses of certain student subpopulations. The present paper explores profiles of mathematical competencies in upper secondary students from six countries (Austria, France, Germany, Sweden, Switzerland, the US). DIF analyses are combined with analyses of the cognitive demands of test items based on psychological conceptualisations of mathematical problem solving. Experts judged the cognitive demands of TIMSS test items, and these demand ratings were correlated with DIF parameters. We expected that cultural framings and instructional traditions would lead to specific aspects of mathematical problem solving being fostered in classroom instruction, which should be reflected in differential item functioning in international comparative assessments. Results for the TIMSS mathematics test were in line with expectations about cultural and instructional traditions in mathematics education of the six countries.
