Similar literature
 20 similar articles found (search time: 353 ms)
1.
McDonald goodness‐of‐fit indices based on maximum likelihood, asymptotic distribution free, and the Satorra‐Bentler scale correction estimation methods are investigated. Sampling experiments are conducted to assess the magnitude of error for each index under variations in distributional misspecification, structural misspecification, and sample size. The Satorra‐Bentler correction‐based index is shown to have the least error under each distributional misspecification level when the model has correct structural specification. The scaled index also performs adequately when there is minor structural misspecification and distributional misspecification. However, when a model has major structural misspecification with distributional misspecification, none of the estimation methods perform adequately.

2.
Proper model specification is an issue for researchers, regardless of the estimation framework being utilized. Typically, indexes are used to compare the fit of one model to the fit of an alternate model. These indexes only provide an indication of relative fit and do not necessarily point toward proper model specification. There is a procedure in the Bayesian framework called posterior predictive checking that is designed theoretically to detect model misspecification for observed data. However, the performance of the posterior predictive check procedure has thus far not been directly examined under different conditions of mixture model misspecification. This article addresses this task and aims to provide additional insight into whether or not posterior predictive checks can detect model misspecification within the context of Bayesian growth mixture modeling. Results indicate that this procedure can only identify mixture model misspecification under very extreme cases of misspecification.

3.
A Monte Carlo simulation study was conducted to investigate the effects on structural equation modeling (SEM) fit indexes of sample size, estimation method, and model specification. Based on a balanced experimental design, samples were generated from a prespecified population covariance matrix and fitted to structural equation models with different degrees of model misspecification. Ten SEM fit indexes were studied. Two primary conclusions were suggested: (a) some fit indexes appear to be noncomparable in terms of the information they provide about model fit for misspecified models and (b) estimation method strongly influenced almost all the fit indexes examined, especially for misspecified models. These 2 issues do not seem to have drawn enough attention from SEM practitioners. Future research should study not only different models vis‐à‐vis model complexity, but a wider range of model specification conditions, including correctly specified models and models specified incorrectly to varying degrees.

4.
In observed‐score equipercentile equating, the goal is to make scores on two scales or tests measuring the same construct comparable by matching the percentiles of the respective score distributions. If the tests consist of different items with multiple categories for each item, a suitable model for the responses is a polytomous item response theory (IRT) model. The parameters from such a model can be utilized to derive the score probabilities for the tests and these score probabilities may then be used in observed‐score equating. In this study, the asymptotic standard errors of observed‐score equating using score probability vectors from polytomous IRT models are derived using the delta method. The results are applied to the equivalent groups design and the nonequivalent groups design with either chain equating or poststratification equating within the framework of kernel equating. The derivations are presented in a general form and specific formulas for the graded response model and the generalized partial credit model are provided. The asymptotic standard errors are accurate under several simulation conditions relating to sample size, distributional misspecification and, for the nonequivalent groups design, anchor test length.

5.
Using a complex simulation study we investigated parameter recovery, classification accuracy, and performance of two item‐fit statistics for correct and misspecified diagnostic classification models within a log‐linear modeling framework. The basic manipulated test design factors included the number of respondents (1,000 vs. 10,000), attributes (3 vs. 5), and items (25 vs. 50) as well as different attribute correlations (.50 vs. .80) and marginal attribute difficulties (equal vs. different). We investigated misspecifications of interaction effect parameters under correct Q‐matrix specification and two types of Q‐matrix misspecification. While the misspecification of interaction effects had little impact on classification accuracy, invalid Q‐matrix specifications led to notably decreased classification accuracy. Two proposed item‐fit indexes were more strongly sensitive to overspecification of Q‐matrix entries for items than to underspecification. Information‐based fit indexes AIC and BIC were sensitive to both over‐ and underspecification.

6.
This study used Monte Carlo methods to investigate the accuracy and utility of estimators of overall error and error due to approximation in structural equation models. The effects of sample size, indicator reliabilities, and degree of misspecification were examined. The rescaled noncentrality parameter (McDonald & Marsh, 1990) was examined as a measure of approximation error, whereas the one‐ and two‐sample cross‐validation indices and a sample estimator of overall error (EFo) proposed by Browne and Cudeck (1989, 1993) were presented as measures of overall error. The rescaled noncentrality parameter and EFo provided extremely accurate estimates of the amounts of approximation and overall error, respectively. However, although models with errors of omission produced larger estimates of approximation and overall error, the presence of errors of inclusion had little or no effect on estimates of either type of error. The cross‐validation indices and sample estimator of overall error reached minimum values for the same model as an empirically derived measure of overall error only for models with large amounts of specification error. Implications for the use of these estimators in choosing among competing models were discussed.

7.
This simulation study demonstrates how the choice of estimation method affects indexes of fit and parameter bias for different sample sizes when nested models vary in terms of specification error and the data demonstrate different levels of kurtosis. Using a fully crossed design, data were generated for 11 conditions of peakedness, 3 conditions of misspecification, and 5 different sample sizes. Three estimation methods (maximum likelihood [ML], generalized least squares [GLS], and weighted least squares [WLS]) were compared in terms of overall fit and the discrepancy between estimated parameter values and the true parameter values used to generate the data. Consistent with earlier findings, the results show that ML compared to GLS under conditions of misspecification provides more realistic indexes of overall fit and less biased parameter values for paths that overlap with the true model. However, despite recommendations found in the literature that WLS should be used when data are not normally distributed, we find that WLS under no conditions was preferable to the 2 other estimation procedures in terms of parameter bias and fit. In fact, only for large sample sizes (N = 1,000 and 2,000) and mildly misspecified models did WLS provide estimates and fit indexes close to the ones obtained for ML and GLS. For wrongly specified models WLS tended to give unreliable estimates and over-optimistic values of fit.

8.
Fit indexes are an important tool in the evaluation of model fit in structural equation modeling (SEM). Currently, the newest confidence interval (CI) for fit indexes proposed by Zhang and Savalei (2016) is based on the quantiles of a bootstrap sampling distribution at a single level of misspecification. This method, despite a great improvement over naive and model-based bootstrap methods, still suffers from unsatisfactory coverage. In this work, we propose a new method of constructing bootstrap CIs for various fit indexes. This method directly inverts a bootstrap test and produces a CI that involves levels of misspecification that would not be rejected in a bootstrap test. Similar in rationale to a parametric CI of root mean square error of approximation (RMSEA) based on a noncentral χ² distribution and a profile-likelihood CI of model parameters, this approach is shown to have better performance than the approach of Zhang and Savalei (2016), with more accurate coverage and more efficient widths.

9.
Assessing the correctness of a structural equation model is essential to avoid drawing incorrect conclusions from empirical research. In the past, the chi-square test was recommended for assessing the correctness of the model but this test has been criticized because of its sensitivity to sample size. As a reaction, an abundance of fit indexes have been developed. The result of these developments is that structural equation modeling packages are now producing a large list of fit measures. One would think that this progression has led to a clear understanding of evaluating models with respect to model misspecifications. In this article we question the validity of approaches for model evaluation based on overall goodness-of-fit indexes. The argument against such usage is that they do not provide an adequate indication of the “size” of the model's misspecification. That is, they vary dramatically with the values of incidental parameters that are unrelated to the misspecification in the model. This is illustrated using simple but fundamental models. As an alternative method of model evaluation, we suggest using the expected parameter change in combination with the modification index (MI) and the power of the MI test.

10.
A problem central to structural equation modeling is measurement model specification error and its propagation into the structural part of nonrecursive latent variable models. Full-information estimation techniques such as maximum likelihood are consistent when the model is correctly specified and the sample size large enough; however, any misspecification within the model can affect parameter estimates in other parts of the model. The goals of this study included comparing the bias, efficiency, and accuracy of hypothesis tests in nonrecursive latent variable models with indirect and direct feedback loops. We compare the performance of maximum likelihood, two-stage least-squares and Bayesian estimators in nonrecursive latent variable models with indirect and direct feedback loops under various degrees of misspecification in small to moderate sample size conditions.

11.
Latent Markov (LM) models are increasingly used in a wide range of research areas including psychological, sociological, educational, and medical sciences. Methods to perform power computations are lacking, however. This article presents methods for performing power analysis in LM models. Two cases of tests of hypotheses on the transition parameters of LM models are considered. The first case concerns the situation where the likelihood ratio test statistic follows a chi-square distribution, implying that the power computation can also be based on this theoretical distribution. In the second case, power needs to be computed based on empirical distributions constructed via Monte Carlo methods. Numerical studies are conducted to illustrate the proposed power computation methods and to investigate design factors affecting the power of this test.

12.
Approximations to the distributions of goodness-of-fit indexes in structural equation modeling are derived with the assumption of multivariate normality and slight misspecification of models. The fit indexes considered in this article are Jöreskog and Sörbom's goodness-of-fit index (GFI) and the adjusted GFI, McDonald's absolute GFI, Steiger and Lind's root mean squared error of approximation, Steiger's Γ1 and Γ2, Bentler and Bonett's normed fit index, Bollen's incremental fit index and ρ1, Tucker and Lewis's index ρ2, and Bentler's fit index (McDonald and Marsh's relative noncentrality index). An approximation to the asymptotic covariance matrix for the fit indexes is derived by using the delta method. Furthermore, approximations to the densities of the fit indexes are obtained from the transformations of the asymptotically noncentral chi-square distributed variable. A simulation is carried out to confirm the accuracy of the approximations.

13.
In previous research (Hu & Bentler, 1998, 1999), 2 conclusions were drawn: standardized root mean squared residual (SRMR) was the most sensitive to misspecified factor covariances, and a group of other fit indexes were most sensitive to misspecified factor loadings. Based on these findings, a 2-index strategy (that is, SRMR coupled with another index) was proposed in model fit assessment to detect potential misspecification in both the structural and measurement model parameters. Based on our reasoning and empirical work presented in this article, we conclude that SRMR is not necessarily most sensitive to misspecified factor covariances (structural model misspecification), the group of indexes (TLI, BL89, RNI, CFI, Gamma hat, Mc, or RMSEA) are not necessarily more sensitive to misspecified factor loadings (measurement model misspecification), and the rationale for the 2-index presentation strategy appears to have questionable validity.

14.
As with any psychometric models, the validity of inferences from cognitive diagnosis models (CDMs) determines the extent to which these models can be useful. For inferences from CDMs to be valid, it is crucial that the fit of the model to the data is ascertained. Based on a simulation study, this study investigated the sensitivity of various fit statistics for absolute or relative fit under different CDM settings. The investigation covered various types of model–data misfit that can occur with the misspecifications of the Q‐matrix, the CDM, or both. Six fit statistics were considered: –2 log likelihood (–2LL), Akaike's information criterion (AIC), Bayesian information criterion (BIC), and residuals based on the proportion correct of individual items (p), the correlations (r), and the log‐odds ratio of item pairs (l). An empirical example involving real data was used to illustrate how the different fit statistics can be employed in conjunction with each other to identify different types of misspecifications. With these statistics and the saturated model serving as the basis, relative and absolute fit evaluation can be integrated to detect misspecification efficiently.

15.
Model fit indices are being increasingly recommended and used to select the number of factors in an exploratory factor analysis. Growing evidence suggests that the recommended cutoff values for common model fit indices are not appropriate for use in an exploratory factor analysis context. A particularly prominent problem in scale evaluation is the ubiquity of correlated residuals and imperfect model specification. Our research focuses on a scale evaluation context and the performance of four standard model fit indices: root mean square error of approximation (RMSEA), standardized root mean square residual (SRMR), comparative fit index (CFI), and Tucker–Lewis index (TLI), and two equivalence test-based model fit indices: RMSEAt and CFIt. We use Monte Carlo simulation to generate and analyze data based on a substantive example using the positive and negative affective schedule (N = 1,000). We systematically vary the number and magnitude of correlated residuals as well as nonspecific misspecification, to evaluate the impact on model fit indices in fitting a two-factor exploratory factor analysis. Our results show that all fit indices, except SRMR, are overly sensitive to correlated residuals and nonspecific error, resulting in solutions that are overfactored. SRMR performed well, consistently selecting the correct number of factors; however, previous research suggests it does not perform well with categorical data. In general, we do not recommend using model fit indices to select the number of factors in a scale evaluation framework.

16.
We evaluate the performance of the most common estimators of latent Markov (LM) models with covariates in the presence of direct effects of the covariates on the indicators of the LM model. In LM modeling it is common practice not to model such direct effects, ignoring the consequences that might have on the overall model fit and the parameters of interest. However, in the general literature about latent variable modeling it is well known that unmodeled direct effects can severely bias the parameter estimates of the model at hand. We evaluate how the presence of direct effects influences the bias and efficiency of the 3 most common estimators of LM models, the 1-step, 2-step, and 3-step approaches. Furthermore, we propose amendments (that were thus far not used in the context of LM modeling) to the 2- and 3-step approaches that make it possible to account for direct effects and eliminate bias as a consequence. This is done by modeling the (possible) direct effects in the first step of the stepwise estimation procedures. We evaluate the proposed estimators through an extensive simulation study, and illustrate them via a real data application. Our results show, first, that the augmented 2-step and 3-step approaches are unbiased and efficient estimators of LM models with direct effects. Second, ignoring the direct effects leads to biased estimates with all existing estimators, the 1-step approach being the most sensitive.

17.
In educational and psychological measurement, a person-fit statistic (PFS) is designed to identify aberrant response patterns. For parametric PFSs, valid inference depends on several assumptions, one of which is that the item response theory (IRT) model is correctly specified. Previous studies have used empirical data sets to explore the effects of model misspecification on PFSs. We further this line of research by using a simulation study, which allows us to explore issues that may be of interest to practitioners. Results show that, depending on the generating and analysis item models, Type I error rates at fixed values of the latent variable may be greatly inflated, even when the aggregate rates are relatively accurate. Results also show that misspecification is most likely to affect PFSs for examinees with extreme latent variable scores. Two empirical data analyses are used to illustrate the importance of model specification.

18.
Appropriate model specification is fundamental to unbiased parameter estimates and accurate model interpretations in structural equation modeling. Thus detecting potential model misspecification has drawn the attention of many researchers. This simulation study evaluates the efficacy of the Bayesian approach (the posterior predictive checking, or PPC procedure) under multilevel bifactor model misspecification (i.e., ignoring a specific factor at the within level). The impact of model misspecification on structural coefficients was also examined in terms of bias and power. Results showed that the PPC procedure performed better in detecting multilevel bifactor model misspecification, when the misspecification became more severe and sample size was larger. Structural coefficients were increasingly negatively biased at the within level, as model misspecification became more severe. Model misspecification at the within level affected the between-level structural coefficient estimates more when data dependency was lower and the number of clusters was smaller. Implications for researchers are discussed.

19.
Allowance for multiple chances to answer constructed response questions is a prevalent feature in computer‐based homework and exams. We consider the use of item response theory in the estimation of item characteristics and student ability when multiple attempts are allowed but no explicit penalty is deducted for extra tries. This is common practice in online formative assessments, where the number of attempts is often unlimited. In these environments, some students may not always answer‐until‐correct, but may rather terminate a response process after one or more incorrect tries. We contrast the cases of graded and sequential item response models, both unidimensional models which do not explicitly account for factors other than ability. These approaches differ not only in terms of log‐odds assumptions but, importantly, in terms of handling incomplete data. We explore the consequences of model misspecification through a simulation study and with four online homework data sets. Our results suggest that model selection is insensitive for complete data, but quite sensitive to whether missing responses are regarded as informative (of inability) or not (e.g., missing at random). Under realistic conditions, a sequential model with similar parametric degrees of freedom to a graded model can account for more response patterns and outperforms the latter in terms of model fit.

20.
When the multivariate normality assumption is violated in structural equation modeling, a leading remedy involves estimation via normal theory maximum likelihood with robust corrections to standard errors. We propose that this approach might not be best for forming confidence intervals for quantities with sampling distributions that are slow to approach normality, or for functions of model parameters. We implement and study a robust analog to likelihood-based confidence intervals based on inverting the robust chi-square difference test of Satorra (2000). We compare robust standard errors and the robust likelihood-based approach versus resampling methods in confirmatory factor analysis (Studies 1 & 2) and mediation analysis models (Study 3) for both single parameters and functions of model parameters, and under a variety of nonnormal data generation conditions. The percentile bootstrap emerged as the method with the best calibrated coverage rates and should be preferred if resampling is possible, followed by the robust likelihood-based approach.
