Similar literature
 20 similar documents found; search took 515 ms
1.
Multilevel bifactor item response theory (IRT) models are commonly used to account for features of the data that are related to the sampling and measurement processes used to gather those data. These models conventionally make assumptions about the portions of the data structure that represent these features. Unfortunately, when data violate these models' assumptions but these models are used anyway, incorrect conclusions about the cluster effects could be made and potentially relevant dimensions could go undetected. To address the limitations of these conventional models, a more flexible multilevel bifactor IRT model that does not make these assumptions is presented, and this model is based on the generalized partial credit model. Details of a simulation study demonstrating this model outperforming competing models and showing the consequences of using conventional multilevel bifactor IRT models to analyze data that violate these models' assumptions are reported. Additionally, the model's usefulness is illustrated through the analysis of the Program for International Student Assessment data related to interest in science.

2.
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA and CC nonparametrically by replacing the role of the parametric IRT model in Lee's classification indices with a modified version of Ramsay's kernel‐smoothed item response functions. The performance of the nonparametric CA and CC indices is tested in simulation studies in various conditions with different generating IRT models, test lengths, and ability distributions. The nonparametric approach to CA often outperforms Lee's method and Livingston and Lewis's method, showing robustness to nonnormality in the simulated ability. The nonparametric CC index performs similarly to Lee's method and outperforms Livingston and Lewis's method when the ability distributions are nonnormal.

3.
The nature of anatomy education has changed substantially in recent decades, though the traditional multiple‐choice written examination remains the cornerstone of assessing students' knowledge. This study sought to measure the quality of a clinical anatomy multiple‐choice final examination using item response theory (IRT) models. One hundred seventy‐six students took a multiple‐choice clinical anatomy examination. One‐ and two‐parameter IRT models (difficulty and discrimination parameters) were used to assess item quality. The two‐parameter IRT model demonstrated a wide range in item difficulty, with a median of −1.0 and range from −2.0 to 0.0 (25th to 75th percentile). Similar results were seen for discrimination (median 0.6; range 0.4–0.8). The test information curve achieved maximum discrimination for an ability level one standard deviation below the average. There were 15 items with standardized loading less than 0.3, which was due to several factors: two items had two correct responses, one was not well constructed, two were too easy, and the others revealed a lack of detailed knowledge by students. The test used in this study was more effective in discriminating students of lower ability than those of higher ability. Overall, the quality of the examination in clinical anatomy was confirmed by the IRT models. Anat Sci Educ 3:17–24, 2010. © 2009 American Association of Anatomists.
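
As a rough illustration of why a test whose items have a median difficulty near −1.0 is most informative about one standard deviation below average ability, the sketch below evaluates a two-parameter logistic item. The logistic form without a scaling constant, the function names, and the ability grid are assumptions made for this example; the abstract does not state the exact parameterization or software used.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response at ability theta
    (a = discrimination, b = difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item: a^2 * P * (1 - P)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

# Illustrative values only: the median discrimination and difficulty
# quoted in the abstract, treated here as a single "typical" item.
a, b = 0.6, -1.0

# Hypothetical ability grid from -4.0 to 4.0 in steps of 0.1.
grid = [x / 10.0 for x in range(-40, 41)]

# The information of a 2PL item peaks where P = 0.5, i.e. at theta = b.
peak_theta = max(grid, key=lambda t: item_information(t, a, b))
```

Because the information of each 2PL item peaks at its own difficulty, a test dominated by items with negative difficulties separates lower-ability examinees better than higher-ability ones, which matches the abstract's conclusion.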

4.
A polytomous item is one for which the responses are scored according to three or more categories. Given the increasing use of polytomous items in assessment practices, item response theory (IRT) models specialized for polytomous items are becoming increasingly common. The purpose of this ITEMS module is to provide an accessible overview of polytomous IRT models. The module presents commonly encountered polytomous IRT models, describes their properties, and contrasts their defining principles and assumptions. After completing this module, the reader should have a sound understanding of what a polytomous IRT model is, the manner in which the equations of the models are generated from the model's underlying step functions, how widely used polytomous IRT models differ with respect to their definitional properties, and how to interpret the parameters of polytomous IRT models.
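
As one illustration of how a polytomous model's equations can be generated from step functions, the sketch below uses the partial credit model. That choice is an assumption, since the abstract names no specific model, and the symbols (ability θ, step difficulty b_ij for step j of item i, highest category m_i) are introduced only for this example.

```latex
% Step function: probability of scoring in category k rather than k-1 on item i
P(X_i = k \mid X_i \in \{k-1, k\}, \theta)
  = \frac{\exp(\theta - b_{ik})}{1 + \exp(\theta - b_{ik})},
  \quad k = 1, \dots, m_i

% Category probabilities implied by the step functions
% (the empty sum for h = 0 is defined as 0)
P(X_i = k \mid \theta)
  = \frac{\exp\left(\sum_{j=1}^{k} (\theta - b_{ij})\right)}
         {\sum_{h=0}^{m_i} \exp\left(\sum_{j=1}^{h} (\theta - b_{ij})\right)},
  \quad k = 0, \dots, m_i
```

Each step function is an ordinary dichotomous logistic model between two adjacent categories, and chaining the steps gives the full set of category probabilities.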

5.
A new approach to moral education using blended learning has been developed. This approach involves 10 scenarios that are designed as a web-based game and serves as a basis for group moral-consequence-based reasoning, which is developed based on a hypothetical-deductive model. The aim of the study was to examine the changes in students' blended learning interest and reasoning ability in a time series experimental design. After playing the game with the 10 initial scenarios during the first week of the study, participants were subjected to five blended learning sessions that required them to discuss the consequences of one of the 10 scenarios using hypothetical-deductive reasoning. After six weeks, the data from the 110 participants were analyzed using time series statistics. The results indicated that players were highly interested in the game, although their interest had a tendency to decrease slightly over time. Repetitive game play (i.e. practice) was positively associated with the players' moral reasoning performance. The study results may lend support to the design of a game with additional or more highly complex content for players to further develop students' consequential reasoning ability.

6.
Item Response Theory (IRT) models were applied to investigate the psychometric properties of Arthur and Day's Advanced Progressive Matrices-Short Form (APM-SF; 1994) [Arthur and Day (1994). Development of a short form for the Raven Advanced Progressive Matrices test. Educational and Psychological Measurement, 54, 395–403] in order to test whether the scale is a reliable and valid tool for assessing general fluid ability in a short time frame. The APM-SF was administered to 2264 high-school and university students. After the one-factor structure of the scale was confirmed, unidimensional IRT analyses for dichotomous data were applied to investigate the increases in item difficulty levels, the Test Information Function, and Differential Item Functioning across age, gender, and country (comparing Italian and British respondents). Additionally, validity measures were reported. The findings indicate that Arthur and Day's APM-SF is a sound instrument for assessing fluid ability within a short time frame.

7.
Background: Although on-demand testing is being increasingly used in many areas of assessment, it has not been adopted in high stakes examinations like the General Certificate of Secondary Education (GCSE) and General Certificate of Education Advanced level (GCE A level) offered by awarding organisations (AOs) in the UK. One of the major issues with on-demand testing is that some of the methods used for maintaining the comparability of standards over time in conventional testing are no longer available and the development of new methods is required.

Purpose: This paper proposes an item response theory (IRT) framework for implementing on-demand testing and maintaining the comparability of standards over time for general qualifications, including GCSEs and GCE A levels, in the UK and discusses procedures for its practical implementation.

Sources of evidence: Sources of evidence include literature from the fields of on-demand testing, the design of computer-based assessment, the development of IRT, and the application of IRT in educational measurement.

Main argument: On-demand testing presents many advantages over conventional testing. In view of the nature of general qualifications, including the use of multiple components and multiple question types, the advances made in item response modelling over the past 30 years, and the availability of complex IRT analysis software systems, coupled with increasing IRT expertise in awarding organisations, IRT models could be used to implement on-demand testing in high stakes examinations in the UK. The proposed framework represents a coherent and complete approach to maintaining standards in on-demand testing. The procedures for implementing the framework discussed in the paper could be adapted by people to suit their own needs and circumstances.

Conclusions: The use of IRT to implement on-demand testing could prove to be one of the viable approaches to maintaining standards over time or between test sessions for UK general qualifications.

8.
The use of digital tools like computers and tablets in institutional learning arenas gives rise to forms of flexibility where time and space boundaries become diffuse. Online learning sites are understood as being crucial today, especially in large parts of the Global North, where anyone anywhere potentially can become a student and have access to educational opportunities.

This study focuses on the analysis of recorded sessions, part of an ‘Italian for (adult) beginners’ online course. Our interests relate to accounting for how students negotiate different language varieties, including modalities, and how communication in virtual learning settings enables both flexible participation trajectories and identity positions in and across the boundaries of time and space.

The sociocultural and dialogical analyses here are framed in terms of fluidity of ‘glocal’ positions and (trans)languaging that emerge in and across time and space in Technology Mediated Communication. Our findings suggest that online environments support meaning-making where it is possible to identify alternative ways of (co)constructing and mediating learning. Such hybridity as well as the performative character of learning and identity display have important implications for online glocal communities.

9.
This article proceeds from the assumption that the aging of American society has consequences for the life roles of midlife and older persons. Seven points are developed in support of the assumption. They are as follows: dynamics and demographics of an aging population; education, a critical component of life in the future; a model of education for older adults; new roles for an aging society; literacy for older persons; older persons' activities in pursuit of lifelong education; and a view of the future that includes lifelong education for lifelong needs. The final section offers some speculations about what lifelong education will be like in 2010.

10.
The head and neck region is one of the most complex areas featured in the medical gross anatomy curriculum. The effectiveness of using three‐dimensional (3D) models to teach anatomy is a topic of much discussion in medical education research. However, the use of 3D stereoscopic models of the head and neck circulation in anatomy education has not been previously studied in detail. This study investigated whether 3D stereoscopic models created from computed tomographic angiography (CTA) data were efficacious teaching tools for the head and neck vascular anatomy. The test subjects were first year medical students at the University of Mississippi Medical Center. The assessment tools included: anatomy knowledge tests (prelearning session knowledge test and postlearning session knowledge test), mental rotation tests (spatial ability; presession MRT and postsession MRT), and a satisfaction survey. Results were analyzed using a Wilcoxon rank‐sum test and linear regression analysis. A total of 39 first year medical students participated in the study. The results indicated that all students who were exposed to the stereoscopic 3D vascular models in 3D learning sessions increased their ability to correctly identify the head and neck vascular anatomy. Most importantly, for students with low‐spatial ability, 3D learning sessions improved postsession knowledge scores to a level comparable to that demonstrated by students with high‐spatial ability, indicating that the use of 3D stereoscopic models may be particularly valuable to these students with low‐spatial ability. Anat Sci Educ 10: 34–45. © 2016 American Association of Anatomists.

11.
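This study used a common-item anchor-test design. The two-parameter IRT model in the R package ltm was used to estimate the item and person parameters of cognitive diagnostic tests of mathematics ability for primary school students in each grade, and the equateIRT package was used to equate the parameters of these tests across grades. The results show that, after equating, both the item difficulties of the tests and the students' mathematics ability increase steadily with grade, and that the differences in mathematics ability development across schools, ethnic groups, and genders are consistent with theoretical expectations. The study confirms the feasibility of using IRT vertical equating to construct a vertical scale of mathematics ability development across primary school grades, and provides the empirical evidence needed for designing systematic remedial instruction and building adaptive item banks.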
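
To make the idea of common-item linking concrete, the sketch below applies the mean-sigma method, one standard way of placing two separately calibrated two-parameter scales on a common metric. It is an assumption that this resembles what the study did: the abstract does not say which linking method was chosen in equateIRT, and the function names and anchor-item values here are hypothetical.

```python
from statistics import mean, pstdev

def mean_sigma_constants(b_from, b_to):
    """Return (A, B) such that A * b_from + B lies on the scale of b_to.
    Both lists hold difficulty estimates of the same anchor items,
    obtained from two separate calibrations."""
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B

def rescale_item(a, b, A, B):
    """Place a 2PL item's (discrimination, difficulty) on the target scale."""
    return a / A, A * b + B

# Hypothetical difficulties of three anchor items, calibrated in two grades.
b_lower_grade = [-1.0, 0.0, 1.0]
b_upper_grade = [-2.5, -1.0, 0.5]

# Constants that move lower-grade parameters onto the upper-grade scale.
A, B = mean_sigma_constants(b_lower_grade, b_upper_grade)

# A lower-grade item with a = 1.2, b = 0.0, expressed on the upper-grade scale.
a_linked, b_linked = rescale_item(1.2, 0.0, A, B)
```

Abilities transform the same way as difficulties (θ* = Aθ + B), so chaining such links across adjacent grades yields a single vertical scale on which growth can be compared.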

12.
The importance of reflection in supporting the continued professional learning of preservice practitioners is well recognised. This study examines one aspect of the outcomes of preservice teachers' reflection: the development of their own self-image as a teacher. In making the transition from student to teacher, preservice teachers create their own professional identity. Their ability to articulate this identity is examined through a new construct, a “teachers' voice”. A teachers' voice develops when preservice teachers interpret and reinterpret their experiences through the processes of reflection. A teachers' voice is articulated as part of the person's self-image. The construct, a teachers' voice, was investigated by examining changes in preservice teachers' contributions in an online discussion forum. Two complementary approaches of content analysis were applied. Both methods revealed changes in preservice teachers' levels of engagement and showed that in the first semester of preservice teacher education, the majority of preservice teachers moved towards a more professional stance in their contributions.

13.
Game-based learning environments hold significant promise for facilitating learning experiences that are both effective and engaging. To support individualised learning and support proactive scaffolding when students are struggling, game-based learning environments should be able to accurately predict student knowledge at early points in students' gameplay. Student knowledge is traditionally assessed prior to and after each student interacts with the learning environment with conventional methods, such as multiple choice content knowledge assessments. While previous student modelling approaches have leveraged machine learning to automatically infer students' knowledge, there is limited work that incorporates the fine-grained content from each question in these types of tests into student models that predict student performance at early junctures in gameplay episodes. This work investigates a predictive student modelling approach that leverages the natural language text of the post-gameplay content knowledge questions and the text of the possible answer choices for early prediction of fine-grained individual student performance in game-based learning environments. With data from a study involving 66 undergraduate students from a large public university interacting with a game-based learning environment for microbiology, Crystal Island, we investigate the accuracy and early prediction capacity of student models that use a combination of gameplay features extracted from student log files as well as distributed representations of post-test content assessment questions. The results demonstrate that by incorporating knowledge about assessment questions, early prediction models are able to outperform competing baselines that only use student game trace data with no question-related information. Furthermore, this approach achieves high generalisation, including predicting the performance of students on unseen questions.

Practitioner notes

What is already known about this topic
  • A distinctive characteristic of game-based learning environments is their capacity to enable fine-grained student assessment.
  • Adaptive game-based learning environments offer individualisation based on specific student needs and should be able to assess student competencies using early prediction models of those competencies.
  • Word embedding approaches from the field of natural language processing show great promise in the ability to encode semantic information that can be leveraged by predictive student models.
What this paper adds
  • Investigates word embeddings of assessment question content for reliable early prediction of student performance.
  • Demonstrates the efficacy of distributed word embeddings of assessment questions when used by early prediction models compared to models that use either no assessment information or discrete representations of the questions.
  • Demonstrates the efficacy and generalisability of word embeddings of assessment questions for predicting the performance of both new students on existing questions and existing students on new questions.
Implications for practice and/or policy
  • Word embeddings of assessment questions can enhance early prediction models of student knowledge, which can drive adaptive feedback to students who interact with game-based learning environments.
  • Practitioners should determine if new assessment questions will be developed for their game-based learning environment, and if so, consider using our student modelling framework that incorporates early prediction models pretrained with existing student responses to previous assessment questions and is generalisable to the new assessment questions by leveraging distributed word embedding techniques.
  • Researchers should consider the most appropriate way to encode the assessment questions in ways that early prediction models are able to infer relationships between the questions and gameplay behaviour to make accurate predictions of student competencies.

14.

15.
Informal learning experiences have risen to the forefront of science education as being beneficial to students' learning. However, it is not clear in what ways such experiences may be beneficial to students, nor how informal learning experiences may interface with classroom science instruction. This study aims to acquire a better understanding of these issues by investigating one aspect of science learning, scientific reasoning ability, with respect to the students' informal learning experiences and classroom science instruction. Specifically, the purpose of this study was to investigate possible differences in students' scientific reasoning abilities relative to their informal learning environments (impoverished, enriched), classroom teaching experiences (non-inquiry, inquiry) and the interaction of these variables. The results of two-way ANOVAs indicated that informal learning environments and classroom science teaching procedures showed significant main effects on students' scientific reasoning abilities. Students with enriched informal learning environments had significantly higher scientific reasoning abilities compared to those with impoverished informal learning environments. Likewise, students in inquiry-based science classrooms showed higher scientific reasoning abilities compared to those in non-inquiry science classrooms. There were no significant interaction effects. These results indicate the need for increased emphasis on both informal learning opportunities and inquiry-based instruction in science.

16.
Item response theory (IRT) models can be subsumed under the larger class of statistical models with latent variables. IRT models are increasingly used for the scaling of the responses derived from standardized assessments of competencies. The paper summarizes the strengths of IRT in contrast to more traditional techniques as well as in contrast to alternative models with latent variables (e.g., structural equation modeling). Subsequently, specific limitations of IRT and cases where other methods might be preferable are outlined.

17.
In this paper, the use of several measurement models for the analysis of data arising from learning hierarchies is discussed, and the results of an empirical investigation of the application of an Item Response Theory (IRT) model to a learning hierarchy in subtraction are examined. The analysis confronts the test developer’s original intentions with empirical data through the use of the IRT model, and three ways that the model can be useful in explicating the patterns of empirical results from a learning hierarchy are described.

18.
Cognitive diagnosis models (CDMs) continue to generate interest among researchers and practitioners because they can provide diagnostic information relevant to classroom instruction and student learning. However, its modeling component has outpaced its complementary component, test construction. Thus, most applications of cognitive diagnosis modeling involve retrofitting of CDMs to assessments constructed using classical test theory (CTT) or item response theory (IRT). This study explores the relationship between item statistics used in the CTT, IRT, and CDM frameworks using such an assessment, specifically a large-scale mathematics assessment. Furthermore, by highlighting differences between tests with varying levels of diagnosticity using a measure of item discrimination from a CDM approach, this study empirically uncovers some important CTT and IRT item characteristics. These results can be used to formulate practical guidelines in using IRT- or CTT-constructed assessments for cognitive diagnosis purposes.

19.
The posterior predictive model checking method is a flexible Bayesian model‐checking tool and has recently been used to assess fit of dichotomous IRT models. This paper extended previous research to polytomous IRT models. A simulation study was conducted to explore the performance of posterior predictive model checking in evaluating different aspects of fit for unidimensional graded response models. A variety of discrepancy measures (test‐level, item‐level, and pair‐wise measures) that reflected different threats to applications of graded IRT models to performance assessments were considered. Results showed that posterior predictive model checking exhibited adequate power in detecting different aspects of misfit for graded IRT models when appropriate discrepancy measures were used. Pair‐wise measures were found more powerful in detecting violations of the unidimensionality and local independence assumptions.

20.
The results of field research suggest that, contrary to being behaviorally inflexible, some amphibians may have the ability to respond effectively to changing environments. The performance of seven newts (Triturus viridescens) was studied across 20 successive reversals of a spatial discrimination problem in a dry T-maze. Submersion in shaded water served as reinforcement for correct responses. The subjects showed a decrease in mean errors across reversals and across ordinal trials within sessions. These results are discussed in terms of the importance of using biologically relevant methodologies in the study of comparative animal learning.
