Similar literature
20 similar documents found.
1.
In this digital ITEMS module, Dr. Roy Levy describes Bayesian approaches to psychometric modeling. He discusses how Bayesian inference is a mechanism for reasoning in a probability-modeling framework and is well-suited to core problems in educational measurement: reasoning from student performances on an assessment to make inferences about their capabilities more broadly conceived, as well as fitting models to characterize the psychometric properties of tasks. The approach is first developed in the context of estimating a mean and variance of a normal distribution before turning to the context of unidimensional item response theory (IRT) models for dichotomously scored data. Dr. Levy illustrates the process of fitting Bayesian models using the JAGS software facilitated through the R statistical environment. The module is designed to be relevant for students, researchers, and data scientists in various disciplines such as education, psychology, sociology, political science, business, health, and other social sciences. It contains audio-narrated slides, diagnostic quiz questions, and data-based activities with video solutions as well as curated resources and a glossary.
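
The first step described in this abstract, Bayesian estimation for a normal distribution, can be illustrated with a minimal sketch. It is not taken from the module, which uses JAGS through R. For brevity it assumes the data variance is known and places a conjugate normal prior on the mean only; the function name and arguments are hypothetical.

```python
def posterior_normal_mean(data, prior_mean, prior_var, data_var):
    """Conjugate update for the mean of a normal distribution.

    Assumes (for illustration only) that the data variance is known
    and that the prior on the mean is normal.
    Returns the posterior mean and posterior variance of the mean.
    """
    n = len(data)
    sample_mean = sum(data) / n
    prior_precision = 1.0 / prior_var   # information carried by the prior
    data_precision = n / data_var       # information carried by the data
    post_var = 1.0 / (prior_precision + data_precision)
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    return post_mean, post_var


if __name__ == "__main__":
    mean, var = posterior_normal_mean([1.0, 2.0, 3.0], prior_mean=0.0,
                                      prior_var=1.0, data_var=1.0)
    print(mean, var)
```

With more observations the data precision grows and the posterior mean moves from the prior mean toward the sample mean.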

2.
As the popularity of rich assessment scenarios increases, so must the availability of psychometric models capable of handling the resulting data. Dynamic Bayesian networks (DBNs) offer a fast, flexible option for characterizing student ability across time under psychometrically complex conditions. In this article, a brief introduction to DBNs is offered, followed by a review of the existing literature on the use of DBNs in educational and psychological measurement with a focus on methodological investigations and novel applications that may provide guidance for practitioners wishing to deploy these models. The article concludes with a discussion of future directions for research in the field.

3.
4.
New technology enables interactive and adaptive scenario-based tasks (SBTs) to be adopted in educational measurement. At the same time, it is a challenging problem to build appropriate psychometric models to analyze data collected from these tasks, due to the complexity of the data. This study focuses on process data collected from SBTs. We explore the potential of using concepts and methods from social network analysis to represent and analyze process data. Empirical data were collected from the assessment of Technology and Engineering Literacy, conducted as part of the National Assessment of Educational Progress. For the activity sequences in the process data, we created a transition network using weighted directed networks, with nodes representing actions and directed links connecting two actions only if the first action is followed by the second action in the sequence. This study shows how visualization of the transition networks represents process data and provides insights for item design. This study also explores how network measures are related to existing scoring rubrics and how detailed network measures can be used to make intergroup comparisons.
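
The transition network described in this abstract can be sketched in a few lines. This is only an illustration of the construction as stated (nodes are actions, a directed link joins two actions when the first is immediately followed by the second); using the transition count as the edge weight is an assumption, and the action names are invented.

```python
from collections import Counter


def transition_network(sequences):
    """Build a weighted directed transition network from action sequences.

    Each key is a (source_action, target_action) pair; each value counts
    how often the source was immediately followed by the target.
    Using raw counts as weights is an assumption made for this sketch.
    """
    edges = Counter()
    for seq in sequences:
        # Pair every action with its immediate successor.
        for src, dst in zip(seq, seq[1:]):
            edges[(src, dst)] += 1
    return edges


if __name__ == "__main__":
    # Hypothetical process data: one list of actions per examinee.
    logs = [
        ["open", "read", "answer"],
        ["open", "read", "read", "answer"],
    ]
    for (src, dst), weight in sorted(transition_network(logs).items()):
        print(f"{src} -> {dst}: {weight}")
```

Network measures such as node degree or edge density could then be computed from the returned edge counts.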

5.
Drawing valid inferences from modern measurement models is contingent upon a good fit of the data to the model. Violations of model-data fit have numerous consequences, limiting the usefulness and applicability of the model. As Bayesian estimation is becoming more common, understanding the Bayesian approaches for evaluating model-data fit is critical. In this instructional module, Allison Ames and Aaron Myers provide an overview of Posterior Predictive Model Checking (PPMC), the most common Bayesian model-data fit approach. Specifically, they review the conceptual foundation of Bayesian inference as well as PPMC and walk through the computational steps of PPMC using real-life data examples from simple linear regression and item response theory analysis. They provide guidance for how to interpret PPMC results and discuss how to implement PPMC for other model(s) and data. The digital module contains sample data, SAS code, diagnostic quiz questions, data-based activities, curated resources, and a glossary.

6.
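To address the difficulty of measuring product-quality parameters in real time during butanol fermentation, together with the low accuracy of existing measurement methods and their sensitivity to uncertain factors, a multilayer soft-sensor modeling method based on Bayesian inference and support vector regression (SVR) is proposed. Bayesian inference is first used to compute posterior probabilities, screen out biased data, and calibrate those data, and a first-layer SVR model is built. Bayesian inference is then applied for a second calibration, and a second-layer SVR model is built to correct the output of the first-layer model, yielding the final prediction and overcoming the model inaccuracy caused by disturbances and bias. The Bayesian-inference-based multilayer SVR (Bi-SVR) prediction model was applied to a butanol fermentation process. Simulation and experimental results show that, compared with a conventional SVR prediction model, prediction accuracy improved by 4.52% under low disturbance and by 5.37% under high disturbance.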

7.
Even though Bayesian estimation has recently become quite popular in item response theory (IRT), little work has been done on model checking from a Bayesian perspective. This paper applies the posterior predictive model checking (PPMC) method (Guttman, 1967; Rubin, 1984), a popular Bayesian model checking tool, to a number of real applications of unidimensional IRT models. The applications demonstrate how to exploit the flexibility of the posterior predictive checks to meet the needs of the researcher. This paper also examines practical consequences of misfit, an area often ignored in educational measurement literature while assessing model fit.

8.
Educational Assessment, 2013, 18(2): 125-146
States are implementing statewide assessment programs that classify students into proficiency levels that reflect state-defined performance standards. In an effort to provide support for score interpretations, this study examined the consistency of classifications based on competing item response theory (IRT) models for data from a state assessment program. Classification of students into proficiency levels was compared based on a 1-parameter vs. a 3-parameter IRT model. Despite an overall high level of agreement between classifications based on the 2 models, systematic differences were observed. Under the 1-parameter model, proficiency was underestimated for low proficiency classifications but overestimated for upper proficiency classifications. This resulted in higher "Below Basic" and "Advanced" classifications under 1-parameter vs. 3-parameter IRT applications. Implications of these differences are discussed.
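
For readers unfamiliar with the two models compared in this abstract, the sketch below shows the standard logistic item response functions. It is not the study's code: the parameter values are invented, and fixing the 1-parameter discrimination at 1 is one common convention assumed here.

```python
import math


def irf_3pl(theta, a, b, c):
    """3-parameter logistic item response function.

    theta: examinee proficiency; a: discrimination; b: difficulty;
    c: lower asymptote (pseudo-guessing). Returns P(correct response).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def irf_1pl(theta, b):
    """1-parameter logistic model: the 3PL with a = 1 and c = 0 (assumed convention)."""
    return irf_3pl(theta, a=1.0, b=b, c=0.0)


if __name__ == "__main__":
    # Hypothetical item: difficulty 0, guessing parameter 0.2 under the 3PL.
    for theta in (-3.0, 0.0, 3.0):
        print(theta, round(irf_1pl(theta, b=0.0), 3),
              round(irf_3pl(theta, a=1.0, b=0.0, c=0.2), 3))
```

The two curves differ most at low proficiency, where the 3-parameter model's lower asymptote keeps the probability of a correct response above the guessing level.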

9.
Many language proficiency tests include group oral assessments involving peer interaction. In such an assessment, examinees discuss a common topic with others. Human raters score each examinee's spoken performance on specially designed criteria. However, measurement models for analyzing group assessment data usually assume local person independence and thus fail to consider the impact of peer interaction on the assessment outcomes. This research advances an extended many-facet Rasch model for group assessments (MFRM-GA), accounting for local person dependence. In a series of simulations, we examined the MFRM-GA's parameter recovery and the consequences of ignoring peer interactions under the traditional modeling approach. We also used a real dataset from the English-speaking test of the Language Proficiency Assessment for Teachers (LPAT) routinely administered in Hong Kong to illustrate the efficiency of the new model. The discussion focuses on the model's usefulness for measuring oral language proficiency, practical implications, and future research perspectives.

10.
This article presents relevant research on Bayesian methods and their major applications to modeling in an effort to lay out differences between the frequentist and Bayesian paradigms and to look at the practical implications of these differences. Before research is reviewed, basic tenets and methods of the Bayesian approach to modeling are presented and contrasted with basic estimation results from a frequentist perspective. It is argued that Bayesian methods have become a viable alternative to traditional maximum likelihood-based estimation techniques and may be the only solution for more complex psychometric data structures. Hence, neither the applied nor the theoretical measurement community can afford to neglect the exciting new possibilities that have opened up on the psychometric horizon.

11.
Structural equation modeling is a common multivariate technique for the assessment of the interrelationships among latent variables. Structural equation models have been extensively applied in the behavioral, medical, and social sciences. Basic structural equation models consist of a measurement equation for characterizing latent variables through multiple observed variables and a mean regression-type structural equation for investigating how explanatory latent variables influence outcomes of interest. However, the conventional structural equation does not provide a comprehensive analysis of the relationship between latent variables. In this article, we introduce the quantile regression method into structural equation models to assess the conditional quantile of the outcome latent variable given the explanatory latent variables and covariates. The estimation is conducted in a Bayesian framework with a Markov chain Monte Carlo algorithm. The posterior inference is performed with the help of the asymmetric Laplace distribution. A simulation shows that the proposed method performs satisfactorily. An application to a study of chronic kidney disease is presented.
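
The link between quantile regression and the asymmetric Laplace distribution mentioned in this abstract can be shown with a short sketch. It covers only the standard check loss and the corresponding log-density, not the article's structural equation model; the function names are hypothetical.

```python
import math


def check_loss(u, tau):
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1[u < 0])."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def ald_log_density(y, mu, sigma, tau):
    """Log-density of the asymmetric Laplace distribution.

    f(y) = tau * (1 - tau) / sigma * exp(-rho_tau((y - mu) / sigma)),
    so maximizing this likelihood in mu minimizes the check loss,
    which is why it serves as a working likelihood for the tau-th quantile.
    """
    return math.log(tau * (1.0 - tau) / sigma) - check_loss((y - mu) / sigma, tau)


if __name__ == "__main__":
    # With tau = 0.25, positive residuals are penalized less than negative ones.
    print(check_loss(2.0, 0.25), check_loss(-2.0, 0.25))
    print(ald_log_density(0.0, mu=0.0, sigma=1.0, tau=0.5))
```

In a Bayesian analysis, this log-density would be summed over observations and combined with priors before sampling by Markov chain Monte Carlo.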

12.
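The quality of online interaction is an important factor in the effectiveness of online teaching. Existing systems for evaluating the effectiveness of interactive online teaching lack monitoring of the learning process and a good model of the learner; evaluation relies mostly on expert experience, is highly subjective and arbitrary, and yields results that deviate from actual values. Bayesian networks are very capable of handling problems involving uncertainty and are currently among the most effective theoretical models for probability-based uncertainty representation and intelligent reasoning. The proposed Bayesian-network-based evaluation system for interactive online teaching constructs a Bayesian network student model from relationships among domain knowledge and introduces a fuzzy mathematical transformation method to assess students' cognitive level. It can reduce interference from abnormal factors, improve the accuracy with which students' cognitive ability is evaluated, and support quality assessment of online teaching interaction and personalized learning guidance.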

13.
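Building on an analysis of research in China and abroad, this paper proposes a framework for studying the academic performance of university students along four dimensions: social and demographic characteristics, school environment, personal characteristics, and student engagement. Taking University A as an example, an integrated educational data system was built; drawing on different data sources, classification models were built with decision trees, Bayesian networks, artificial neural networks, and support vector machines, and the validity of the models was evaluated. The results indicate that the classification models of university students' academic performance have a degree of validity and practical value, and can serve as a reference for universities applying educational data mining to evidence-based management and to improving academic support systems.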

14.
In many applications of multilevel modeling, group-level (L2) variables for assessing group-level effects are generated by aggregating variables from a lower level (L1). However, the observed group mean might not be a reliable measure of the unobserved true group mean. In this article, we propose a Bayesian approach for estimating a multilevel latent contextual model that corrects for measurement error and sampling error (i.e., sampling only a small number of L1 units from an L2 unit) when estimating group-level effects of aggregated L1 variables. Two simulation studies were conducted to compare the Bayesian approach with the maximum likelihood approach implemented in Mplus. The Bayesian approach showed fewer estimation problems (e.g., inadmissible solutions) and more accurate estimates of the group-level effect than the maximum likelihood approach under problematic conditions (i.e., small number of groups, predictor variable with a small intraclass correlation). An application from educational psychology is used to illustrate the different estimation approaches.
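
Why an observed group mean can be unreliable when few L1 units are sampled or the intraclass correlation is small can be seen from the standard reliability formula for an aggregated mean. The sketch below is a generic illustration, not the article's model; the function name and example values are hypothetical, and equal group sizes are assumed.

```python
def group_mean_reliability(icc, n):
    """Reliability of an observed group mean based on n sampled L1 units.

    Uses the standard formula n * ICC / (1 + (n - 1) * ICC), which assumes
    equal group sizes and treats the L1 units as a sample from each group.
    """
    return n * icc / (1.0 + (n - 1) * icc)


if __name__ == "__main__":
    # Hypothetical values: a small intraclass correlation of 0.10.
    for n in (5, 20, 50):
        print(n, round(group_mean_reliability(0.10, n), 3))
```

Reliability rises with group size, so the problematic conditions named in the abstract (few sampled units, small intraclass correlation) are the ones where the observed mean is least trustworthy.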

15.
Mokken scale analysis (MSA) is a probabilistic-nonparametric approach to item response theory (IRT) that can be used to evaluate fundamental measurement properties with less strict assumptions than parametric IRT models. This instructional module provides an introduction to MSA as a probabilistic-nonparametric framework in which to explore measurement quality, with an emphasis on its application in the context of educational assessment. The module describes both dichotomous and polytomous formulations of the MSA model. Examples of the application of MSA to educational assessment are provided using data from a multiple-choice physical science assessment and a rater-mediated writing assessment.

16.
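This paper introduces complex network theory and methods into educational technology research. It first systematically reviews the fit between complex network theory and educational technology, in terms of the complexity and characteristics of complex networks and the complexity and network characteristics of educational technology. It then analyzes the properties that educational technology networks share with complex networks. Finally, it analyzes the aspects of educational technology to which complex network methods and theory can be applied, and identifies specific application areas.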

17.
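This paper summarizes the structural properties of complex networks and the main research results to date. It gives a brief summary of the static geometric properties of networks; summarizes and analyzes the mechanism models of regular networks, completely random networks, small-world networks, and scale-free networks; and describes research on the structural stability of networks. From the evolution of network mechanism models it draws lessons for evolution methods in the Internet, and raises some issues that need to be considered in modeling the local-world topology of the Internet.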

18.
The purpose of this paper is to define and evaluate the categories of cognitive models underlying at least three types of educational tests. We argue that while all educational tests may be based, explicitly or implicitly, on a cognitive model, the categories of cognitive models underlying tests often range in their development and in the psychological evidence gathered to support their value. For researchers and practitioners, awareness of different cognitive models may facilitate the evaluation of educational measures for the purpose of generating diagnostic inferences, especially about examinees' thinking processes, including misconceptions, strengths, and/or abilities. We think a discussion of the types of cognitive models underlying educational measures is useful not only for taxonomic ends, but also for becoming increasingly aware of evidentiary claims in educational assessment and for promoting the explicit identification of cognitive models in test development. We begin our discussion by defining the term cognitive model in educational measurement. Next, we review and evaluate three categories of cognitive models that have been identified for educational testing purposes using examples from the literature. Finally, we highlight the practical implications of "blending" models for the purpose of improving educational measures.

19.
The inference mediation hypothesis (IMH) assumes that individual difference factors that affect reading proficiency have direct and indirect effects on comprehension outcomes, with the indirect effects involving inference processes. The present study tested the IMH in a diverse sample of two- and four-year college students in a task that emphasizes comprehension of the passage (traditional assessment) and a task that emphasizes complex problem solving (scenario-based assessment, SBA). Participants were administered assessments of foundational skills that support reading, inference generation, a traditional assessment of comprehension proficiency, and a scenario-based reading assessment. The results support the IMH. However, the strength of the indirect relationships depended on the type of reading performance assessment. Coherence-building inferences partially mediated the relationship for the traditional assessment. Elaborative inferences partially mediated the relationship for the scenario-based assessment. The results are discussed in terms of theories of purposeful reading and implications for understanding college readiness.

20.
Computer modeling has been widely promoted as a means to attain higher order learning outcomes. Substantiating these benefits, however, has been problematic due to a lack of proper assessment tools. In this study, we compared computer modeling with expository instruction, using a tailored assessment designed to reveal the benefits of either mode of instruction. The assessment addresses proficiency in declarative knowledge, application, construction, and evaluation. The subscales differentiate between simple and complex structure. The learning task concerns the dynamics of global warming. We found that, for complex tasks, the modeling group outperformed the expository group on declarative knowledge and on evaluating complex models and data. No differences were found with regard to the application of knowledge or the creation of models. These results confirmed that modeling and direct instruction lead to qualitatively different learning outcomes, and that these two modes of instruction cannot be compared on a single “effectiveness measure”.
