1.

The Impact of Inaccurate “Informative” Priors for Growth Parameters in Bayesian Growth Mixture Modeling

Sarah Depaoli 《Structural equation modeling》2013,20(2):239-252

Within Bayesian estimation, prior distributions are placed on model parameters and these distributions can take on many different levels of informativeness. Although much of the research conducted within this estimation framework uses what are called diffuse (or noninformative) priors, there are certain models and modeling circumstances where it is preferable to use what are referred to as informative priors. This study focuses on the latter situation and examines the effects of inaccurate informative priors on the growth parameters within the context of growth mixture modeling. Overall, results indicated that growth mixture modeling is relatively robust to the use of inaccurate mean hyperparameters for the growth parameters, as long as the variance hyperparameters are somewhat large.
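The robustness result described above can be illustrated with a much simpler analogue than a growth mixture model. The sketch below is an assumption-laden toy, not the study's method: it uses a single normal mean with known data variance and a conjugate normal prior, and the function name, data values, and hyperparameters are all hypothetical. It shows how a badly centered prior mean matters less as the prior variance grows.

```python
def posterior_normal_mean(prior_mean, prior_var, data, data_var):
    """Conjugate update for a normal mean with known data variance.

    Returns the posterior mean and posterior variance.
    """
    n = len(data)
    sample_mean = sum(data) / n
    prior_precision = 1.0 / prior_var
    data_precision = n / data_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * sample_mean)
    return post_mean, post_var


# Hypothetical observations whose sample mean is about 5.05.
data = [4.8, 5.3, 5.1, 4.9, 5.4, 4.7, 5.2, 5.0]

# Both priors are centered on 0.0, far from the data (an "inaccurate" mean).
tight = posterior_normal_mean(prior_mean=0.0, prior_var=0.01,
                              data=data, data_var=1.0)
wide = posterior_normal_mean(prior_mean=0.0, prior_var=100.0,
                             data=data, data_var=1.0)

print("tight prior -> posterior mean", round(tight[0], 3))
print("wide prior  -> posterior mean", round(wide[0], 3))
```

With the small prior variance the posterior mean is pulled strongly toward the wrong prior mean; with the large prior variance it stays close to the sample mean.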

2.

Bayesian Versus Maximum Likelihood Estimation of Multitrait–Multimethod Confirmatory Factor Models

Jonathan Lee Helm Laura Castro-Schilo Zita Oravecz 《Structural equation modeling》2017,24(1):17-30

This article compares maximum likelihood and Bayesian estimation of the correlated trait–correlated method (CT–CM) confirmatory factor model for multitrait–multimethod (MTMM) data. In particular, Bayesian estimation with minimally informative prior distributions (that is, prior distributions that prescribe equal probability across the known mathematical range of a parameter) is investigated as a source of information to aid convergence. Results from a simulation study indicate that Bayesian estimation with minimally informative priors produces admissible solutions more often than maximum likelihood estimation (100.00% for Bayesian estimation, 49.82% for maximum likelihood). Extra convergence does not come at the cost of parameter accuracy; Bayesian parameter estimates showed comparable bias and better efficiency compared to maximum likelihood estimates. The results are echoed via 2 empirical examples. Hence, Bayesian estimation with minimally informative priors enables admissible solutions of the CT–CM model for MTMM data.

3.

The Use of Incorrect Informative Priors in the Estimation of MIMIC Model Parameters with Small Sample Sizes

W. Holmes Finch J.E. Miller 《Structural equation modeling》2019,26(4):497-508

Recently, advancements in Bayesian structural equation modeling (SEM), particularly software developments, have allowed researchers to more easily employ it in data analysis. With the potential for greater use come opportunities to apply Bayesian SEM in a wider array of situations, including for small sample size problems. Effective use of Bayesian estimation hinges on selection of appropriate prior distributions for model parameters. Researchers have suggested that informative priors may be useful with small samples, presuming that the mean of the prior is accurate with respect to the population mean. The purpose of this simulation study was to examine model parameter estimation for the Multiple Indicator Multiple Cause model when an informative prior distribution had an incorrect mean. Results demonstrated that the use of incorrect informative priors with somewhat larger variance than is typical yields more accurate parameter estimates than do naïve priors or maximum likelihood estimation. Implications for practice are discussed.

4.

Self‐report Case‐studies: an experiment in own classroom data collection by teachers

Trevor Kerry 《International Journal of Research & Method in Education》2013,36(1):103-111

The capacity of Bayesian methods in estimating complex statistical models is undeniable. Bayesian data analysis is seen as having a range of advantages, such as an intuitive probabilistic interpretation of the parameters of interest, the efficient incorporation of prior information to empirical data analysis, model averaging and model selection. As a simplified demonstration, we illustrate (1) how Bayesians test and compare two non‐nested growth curve models using Bayesian estimation with a non‐informative prior; (2) how Bayesians model and handle missing outcomes in the context of missing values; and (3) how Bayesians incorporate data‐based evidence from a previous data set, construct informative priors and treat them as extra information while conducting an up‐to‐date analogy analysis.

5.

Five Methods for Estimating Angoff Cut Scores with IRT

Adam E. Wyse 《Educational Measurement》2017,36(4):16-27

This article illustrates five different methods for estimating Angoff cut scores using item response theory (IRT) models. These include maximum likelihood (ML), expected a posteriori (EAP), modal a posteriori (MAP), and weighted maximum likelihood (WML) estimators, as well as the most commonly used approach based on translating ratings through the test characteristic curve (i.e., the IRT true‐score (TS) estimator). The five methods are compared using a simulation study and a real data example. Results indicated that the application of different methods can sometimes lead to different estimated cut scores, and that there can be some key differences in impact data when using the IRT TS estimator compared to other methods. It is suggested that one should carefully think about their choice of methods to estimate ability and cut scores because different methods have distinct features and properties. An important consideration in the application of Bayesian methods relates to the choice of the prior and the potential bias that priors may introduce into estimates.
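To make the contrast between the ML, MAP, and EAP estimators concrete, here is a minimal sketch that evaluates all three on a grid for one response pattern. It is only an illustration under stated assumptions: a two-parameter logistic model, a standard normal prior, and hypothetical item parameters and responses, none of which come from the article. The WML and TS estimators are not covered.

```python
import math


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_likelihood(theta, responses, items):
    total = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total


# Ability grid from -4 to 4 in steps of 0.01.
GRID = [-4.0 + 0.01 * i for i in range(801)]


def estimate_ability(responses, items):
    """Return grid-based ML, MAP, and EAP ability estimates."""
    log_lik = [log_likelihood(t, responses, items) for t in GRID]
    # Standard normal log prior, up to an additive constant (assumption).
    log_post = [ll - 0.5 * t * t for ll, t in zip(log_lik, GRID)]

    ml = GRID[max(range(len(GRID)), key=log_lik.__getitem__)]
    map_est = GRID[max(range(len(GRID)), key=log_post.__getitem__)]

    # EAP is the posterior mean; subtract the max before exponentiating
    # for numerical stability.
    peak = max(log_post)
    weights = [math.exp(v - peak) for v in log_post]
    eap = sum(t * w for t, w in zip(GRID, weights)) / sum(weights)
    return ml, map_est, eap


# Hypothetical (discrimination, difficulty) pairs and a response pattern.
items = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.5), (1.0, 1.0)]
responses = [1, 1, 1, 1, 0]

ml, map_est, eap = estimate_ability(responses, items)
print("ML:", round(ml, 2), "MAP:", round(map_est, 2), "EAP:", round(eap, 2))
```

For this mostly correct response pattern, the two Bayesian estimates are pulled toward the prior mean of zero relative to the ML estimate, which is the kind of prior-induced shrinkage the abstract's closing sentence refers to.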

6.

Comparison of Inverse Wishart and Separation-Strategy Priors for Bayesian Estimation of Covariance Parameter Matrix in Growth Curve Analysis

Haiyan Liu Zhiyong Zhang Kevin J. Grimm 《Structural equation modeling》2016,23(3):354-367

Growth curve modeling provides a general framework for analyzing longitudinal data from social, behavioral, and educational sciences. Bayesian methods have been used to estimate growth curve models, in which priors need to be specified for unknown parameters. For the covariance parameter matrix, the inverse Wishart prior is most commonly used due to its proper and conjugate properties. However, many researchers have pointed out that the inverse Wishart prior might not work as expected. The purpose of this study is to investigate the influence of the inverse Wishart prior and compare it with a class of separation-strategy priors on the parameter estimates of growth curve models. In this article, we illustrate the use of different types of priors with 2 real data analyses, and then conduct simulation studies to evaluate and compare these priors in estimating both linear and nonlinear growth curve models. For the linear model, the simulation study shows that both the inverse Wishart and the separation-strategy priors work well for the fixed effects parameters. For the Level 1 residual variance estimate, the separation-strategy prior performs better than the inverse Wishart prior. For the covariance matrix, the results are mixed. Overall, the inverse Wishart prior is suggested if the population correlation coefficient and at least 1 of the 2 marginal variances are large. Otherwise, the separation-strategy prior is preferred. For the nonlinear growth curve model, the separation-strategy priors work better than the inverse Wishart prior.

7.

The Impact of Moderate Priors For Bayesian Estimation and Testing of Item Factor Analysis Models When Maximum Likelihood is Unsuitable

Sierra A. Bainter Daniel E. Forster 《Structural equation modeling》2019,26(1):80-93

In psychological research, available data are often insufficient to estimate item factor analysis (IFA) models using traditional estimation methods, such as maximum likelihood (ML) or limited information estimators. Bayesian estimation with common-sense, moderately informative priors can greatly improve efficiency of parameter estimates and stabilize estimation. There are a variety of methods available to evaluate model fit in a Bayesian framework; however, past work investigating Bayesian model fit assessment for IFA models has assumed flat priors, which have no advantage over ML in limited data settings. In this paper, we evaluated the impact of moderately informative priors on ability to detect model misfit for several candidate indices: posterior predictive checks based on the observed score distribution, leave-one-out cross-validation, and the widely applicable information criterion (WAIC). We found that although Bayesian estimation with moderately informative priors is an excellent aid for estimating challenging IFA models, methods for testing model fit in these circumstances are inadequate.

8.

An Investigation of the Influence of Internal Test Bias on Regression Slope

《Applied Measurement in Education》2013,26(4):351-368

Through a large-scale simulation study, this article compares item parameter estimates obtained by the marginal maximum likelihood estimation (MMLE) and marginal Bayes modal estimation (MBME) procedures in the 3-parameter logistic model. The impact of different prior specifications on the MBME estimates is also investigated using carefully selected prior distributions. The results indicate that, in general, the MBME provides more accurate item parameter estimates than the MMLE procedure. The impact of different priors on the Bayesian estimates is modest when the examinee sample size is not extremely small.

9.

Mediation Effects In 2-1-1 Multilevel Model: Evaluation Of Alternative Estimation Methods

Jie Fang Kit-Tai Hau 《Structural equation modeling》2019,26(4):591-606

We compared six common methods in estimating the 2-1-1 (level-2 independent, level-1 mediator, level-1 dependent) multilevel mediation model with a random slope. They were the Bayesian with informative priors, the Bayesian with non-informative priors, the Monte-Carlo, the distribution of the product, the bias-corrected, and the bias-uncorrected parametric percentile residual bootstrap. The Bayesian method with informative priors was superior in relative mean square error (RMSE), power, interval width, and interval imbalance. The prior variance and prior mean were also varied and examined. Decreasing the prior variance increased the power, reduced RMSE and interval width when the prior mean was the true value, but decreasing the prior variance reduced the power when the prior mean was set incorrectly. The influence of misspecification of prior information of the b coefficient on multilevel mediation analysis was greater than that on coefficient a. An illustrative example with the Bayesian multilevel mediation was provided.
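Of the six methods listed, the Monte Carlo interval is the easiest to sketch in a few lines. The version below is a simplified single-level illustration, not the multilevel procedure evaluated in the article: it assumes the estimates of the a and b paths are independent and approximately normal, and the point estimates, standard errors, seed, and function name are hypothetical.

```python
import random


def monte_carlo_interval(a_hat, se_a, b_hat, se_b,
                         draws=20000, level=0.95, seed=1):
    """Percentile interval for the mediated effect a*b.

    Draws a and b independently from normal sampling distributions
    (an assumption) and takes percentiles of the simulated products.
    """
    rng = random.Random(seed)
    products = sorted(
        rng.gauss(a_hat, se_a) * rng.gauss(b_hat, se_b)
        for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = products[int(tail * draws)]
    upper = products[int((1.0 - tail) * draws) - 1]
    return lower, upper


# Hypothetical path estimates and standard errors.
lower, upper = monte_carlo_interval(a_hat=0.4, se_a=0.1,
                                    b_hat=0.5, se_b=0.1)
print("95% Monte Carlo interval for a*b:", round(lower, 3), round(upper, 3))
```

Because the product of two normal variables is skewed, the resulting interval is generally not symmetric around the point estimate a*b, which is one reason interval imbalance is reported alongside power and width.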

10.

A Comparison of IRT Proficiency Estimation Methods Under Adaptive Multistage Testing

Sooyeon Kim Tim Moses Hanwook Yoo 《Journal of Educational Measurement》2015,52(1):70-79

This inquiry is an investigation of item response theory (IRT) proficiency estimators’ accuracy under multistage testing (MST). We chose a two‐stage MST design that includes four modules (one at Stage 1, three at Stage 2) and three difficulty paths (low, middle, high). We assembled various two‐stage MST panels (i.e., forms) by manipulating two assembly conditions in each module, such as difficulty level and module length. For each panel, we investigated the accuracy of examinees’ proficiency levels derived from seven IRT proficiency estimators. The choice of Bayesian (prior) versus non‐Bayesian (no prior) estimators was of more practical significance than the choice of number‐correct versus item‐pattern scoring estimators. The Bayesian estimators were slightly more efficient than the non‐Bayesian estimators, resulting in smaller overall error. Possible score changes caused by the use of different proficiency estimators would be nonnegligible, particularly for low‐ and high‐performing examinees.

11.

Brief Research Report: Bayesian Versus REML Estimations With Noninformative Priors in Multilevel Single-Case Data

Eunkyeng Baek S. Natasha Beretvas Wim Van den Noortgate John M. Ferron 《Journal of Experimental Education》2020,88(4):698-710

Recently, researchers have used multilevel models for estimating intervention effects in single-case experiments that include replications across participants (e.g., multiple baseline designs) or for combining results across multiple single-case studies. Researchers estimating these multilevel models have primarily relied on restricted maximum likelihood (REML) techniques, but Bayesian approaches have also been suggested. The purpose of this Monte Carlo simulation study was to examine the impact of estimation method (REML versus Bayesian with noninformative priors) on the estimation of treatment effects (relative bias, root mean square error) and on the inferences about those effects (interval coverage) for autocorrelated multiple-baseline data. Simulated conditions varied with regard to the number of participants, series length, and distribution of the variance within and across participants. REML and Bayesian estimation led to estimates of the fixed effects that showed little to no bias but that differentially impacted the inferences about the fixed effects and the estimates of the variances. Implications for applied researchers and methodologists are discussed.

12.

Obtaining Test Blueprint Weights From Job Analysis Surveys

Judith A. Spray Chi-Yu Huang 《Journal of Educational Measurement》2000,37(3):187-201

A method for combining multiple scale responses from job or task surveys based on a hierarchical ranking scheme is presented. A rationale for placing the resulting ordinal information onto an interval scale of measurement using the Rasch Rating Scale Model is also provided. After a simple linear transformation, the item or task parameter estimates can be used to obtain item weights to be used in constructing test blueprints. Prior weights can then be used to modify the item weights after data collection, based either on content balancing requirements or Bayesian prior content weights from SMEs (subject matter experts). Finally, a method is suggested to link two or more surveys, again using the Rasch Rating Scale Model and the computer program, Bigsteps, when it is desirable to shorten the length of the typical job or task survey.

13.

A Review of Research on Estimating Logistic IRT Item Parameters with Small Samples

汪存友 (Wang Cunyou) 《考试研究》 (Examinations Research) 2014,(6):48-60

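A review of domestic and international research on estimating Logistic IRT item parameters with small samples identifies six estimation approaches: modifying the IRT model, supplying prior information, artificial neural networks, nonparametric estimation, classical test theory standardization, and data augmentation techniques. Future research should improve the existing estimation methods, use multiple error indices including the standard error, and carry out more rigorous simulation studies at sample sizes of 250 or fewer, using simulation experiments that combine simulated and real data.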

14.

A Comparison of Estimation Techniques for IRT Models With Small Samples

Holmes Finch Brian F. French 《Applied Measurement in Education》2019,32(2):77-96

The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3-parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact traditional marginal maximum likelihood (ML) estimation of IRT model parameters, including sample size, with smaller samples generally being associated with lower parameter estimation accuracy and inflated standard errors for the estimates. Given this deleterious impact of small samples on IRT model performance, using these techniques with low-incidence populations, where they might prove to be particularly useful, becomes difficult, especially with more complex models. Recently, a Pairwise estimation method for Rasch model parameters has been suggested for use with missing data, and may also hold promise for parameter estimation with small samples. This simulation study compared item difficulty parameter estimation accuracy of ML with the Pairwise approach to ascertain the benefits of this latter method. The results support the use of the Pairwise method with small samples, particularly for obtaining item location estimates.

15.

A comparison of the approaches of generalizability theory and item response theory in estimating the reliability of test scores for testlet-composed tests

Guemin Lee In-Yong Park 《Asia Pacific Education Review》2012,13(1):47-54

Previous assessments of the reliability of test scores for testlet-composed tests have indicated that item-based estimation methods overestimate reliability. This study was designed to address issues related to the extent to which item-based estimation methods overestimate the reliability of test scores composed of testlets and to compare several estimation methods for different measurement models using simulation techniques. Three types of estimation approach were conceptualized for generalizability theory (GT) and item response theory (IRT): item score approach (ISA), testlet score approach (TSA), and item-nested-testlet approach (INTA). The magnitudes of overestimation when applying item-based methods ranged from 0.02 to 0.06 and were related to the degrees of dependence among within-testlet items. Reliability estimates from TSA were lower than those from INTA due to the loss of information with IRT approaches. However, this could not be applied in GT. Specified methods in IRT produced higher reliability estimates than those in GT using the same approach. Relatively smaller magnitudes of error in reliability estimates were observed for ISA and for methods in IRT. Thus, it seems reasonable to use TSA as well as INTA for both GT and IRT. However, if there is a relatively large dependence among within-testlet items, INTA should be considered for IRT due to nonnegligible loss of information.

16.

Covariate Balance in Bayesian Propensity Score Approaches for Observational Studies

《Journal of research on educational effectiveness》2013,6(2):280-302

Bayesian alternatives to frequentist propensity score approaches have recently been proposed. However, few studies have investigated their covariate balancing properties. This article compares a recently developed two-step Bayesian propensity score approach to the frequentist approach with respect to covariate balance. The effects of different priors on covariate balance are evaluated and the differences between frequentist and Bayesian covariate balance are discussed. Results of the case study reveal that both the Bayesian and frequentist propensity score approaches achieve good covariate balance. The frequentist propensity score approach performs slightly better on covariate balance for stratification and weighting methods, whereas the two-step Bayesian approach offers slightly better covariate balance in the optimal full matching method. Results of a comprehensive simulation study reveal that accuracy and precision of prior information on propensity score model parameters do not greatly influence balance performance. Results of the simulation study also show that overall, the optimal full matching method provides the best covariate balance and treatment effect estimates compared to the stratification and weighting methods. A unique feature of covariate balance within Bayesian propensity score analysis is that we can obtain a distribution of balance indices in addition to the point estimates so that the variation in balance indices can be naturally captured to assist in covariate balance checking.

17.

Power in Bayesian Mediation Analysis for Small Sample Research

Milica Miočević David P. MacKinnon Roy Levy 《Structural equation modeling》2017,24(5):666-683

Bayesian methods have the potential for increasing power in mediation analysis (Koopman, Howe, Hollenbeck, & Sin, 2015; Yuan & MacKinnon, 2009). This article compares the power of Bayesian credibility intervals for the mediated effect to the power of normal theory, distribution of the product, percentile, and bias-corrected bootstrap confidence intervals at N ≤ 200. Bayesian methods with diffuse priors have power comparable to the distribution of the product and bootstrap methods, and Bayesian methods with informative priors had the most power. Varying degrees of precision of prior distributions were also examined. Increased precision led to greater power only when N ≥ 100 and the effects were small, N < 60 and the effects were large, and N < 200 and the effects were medium. An empirical example from psychology illustrated a Bayesian analysis of the single mediator model from prior selection to interpreting results.

18.

An NCME Instructional Module on Item‐Fit Statistics for Item Response Theory Models

Allison J. Ames Randall D. Penfield 《Educational Measurement》2015,34(3):39-48

Drawing valid inferences from item response theory (IRT) models is contingent upon a good fit of the data to the model. Violations of model‐data fit have numerous consequences, limiting the usefulness and applicability of the model. This instructional module provides an overview of methods used for evaluating the fit of IRT models. Upon completing this module, the reader will have an understanding of traditional and Bayesian approaches for evaluating model‐data fit of IRT models, the relative advantages of each approach, and the software available to implement each method.

19.

A Comparative Study of the Effects of Recency of Instruction on the Stability of IRT and Conventional Item Parameter Estimates

Linda L. Cook Daniel R. Eignor Hessy L. Taft 《Journal of Educational Measurement》1988,25(1):31-45

A potential concern for individuals interested in using item response theory (IRT) with achievement test data is that such tests have been specifically designed to measure content areas related to course curriculum and students taking the tests at different points in their coursework may not constitute samples from the same population. In this study, data were obtained from three administrations of two forms of a Biology achievement test. Data from the newer of the two forms were collected at a spring administration, made up of high school sophomores just completing the Biology course, and at a fall administration, made up mostly of seniors who completed their instruction in the course from 6–18 months prior to the test administration. Data from the older form, already on scale, were collected at only a fall administration, where the sample was comparable to the newer form fall sample. IRT and conventional item difficulty parameter estimates for the common items across the two forms were compared for each of the two form/sample combinations. In addition, conventional and IRT score equatings were performed between the new and old forms for each of the form/sample combinations. Widely disparate results were obtained between the equatings based on the two form/sample combinations. Conclusions are drawn about the use of both classical test theory and IRT in situations such as that studied, and implications of the results for achievement test validity are also discussed.

20.

A new approach to test score equating using item response theory with fixed C-parameters

Guemin Lee Anne R. Fitzpatrick 《Asia Pacific Education Review》2008,9(3):248-261

Because parameter estimates from different calibration runs under the IRT model are linearly related, a linear equation can convert IRT parameter estimates onto another scale metric without changing the probability of a correct response (Kolen & Brennan, 1995, 2004). This study was designed to explore a new approach to finding a linear equation by fixing C-parameters for anchor items in IRT equating. A rationale for fixing C-parameters for anchor items in IRT equating can be established from the fact that the C-parameters are not affected by any linear transformation. This new approach can avoid the difficulty in getting accurate C-parameters for anchor items embedded in the application of the IRT model. Based upon our findings in this study, we would recommend using the new approach to fix C-parameters for anchor items in IRT equating. This work was supported by a Korea Research Foundation Grant funded by the Korean Government (MOEHRD, Basic Research
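The invariance property this abstract relies on can be checked numerically. The sketch below assumes the usual three-parameter logistic form without a scaling constant, and the parameter values, slope, intercept, and function names are hypothetical. Rescaling ability with a linear equation, dividing the discrimination by the slope, and applying the same linear equation to the difficulty leaves the response probability unchanged, while the C-parameter is carried over untouched.

```python
import math


def p_3pl(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def rescale(theta, a, b, c, slope, intercept):
    """Move ability and item parameters onto a linearly related scale.

    theta and b follow the linear equation, a is divided by the slope,
    and c is unchanged.
    """
    return slope * theta + intercept, a / slope, slope * b + intercept, c


# Hypothetical examinee and item on the original scale.
theta, a, b, c = 0.3, 1.2, -0.5, 0.2

new_theta, new_a, new_b, new_c = rescale(theta, a, b, c,
                                         slope=1.5, intercept=-0.4)

before = p_3pl(theta, a, b, c)
after = p_3pl(new_theta, new_a, new_b, new_c)
print("probability before:", before)
print("probability after: ", after)
```

The two printed probabilities agree up to floating-point rounding because the product a(θ − b) is the same on both scales.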