Similar literature
20 similar documents retrieved (search time: 62 ms)
1.
Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and 2 well-known robust test statistics. A modification to the Satorra–Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the 4 test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies 7 sample sizes and 3 distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ2 test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra–Bentler scaled test statistic performed best overall, whereas the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.
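A minimal Python sketch of the two standard robust statistics compared here, assuming the maximum-likelihood chi-square and the scaling quantities a = tr(UΓ) and b = tr((UΓ)²) have already been computed by the SEM software; the article's new small-sample statistic is not reproduced.

    from scipy import stats

    def sb_scaled_test(t_ml, df, trace_ug):
        """Satorra-Bentler mean-scaled statistic: rescale T_ML so that its
        mean matches that of the reference chi-square distribution."""
        c = trace_ug / df                    # scaling correction factor
        t_scaled = t_ml / c
        return t_scaled, stats.chi2.sf(t_scaled, df)

    def mean_variance_adjusted_test(t_ml, trace_ug, trace_ug2):
        """Mean- and variance-adjusted (Satterthwaite-type) statistic:
        match both the mean and the variance of the reference distribution."""
        t_adj = t_ml * trace_ug / trace_ug2
        df_adj = trace_ug ** 2 / trace_ug2   # adjusted, possibly fractional, df
        return t_adj, df_adj, stats.chi2.sf(t_adj, df_adj)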

2.
We highlight critical conceptual and statistical issues and how to resolve them in conducting Satorra–Bentler (SB) scaled difference chi-square tests. Concerning the original (Satorra & Bentler, 2001) and new (Satorra & Bentler, 2010) scaled difference tests, a fundamental difference exists in how to properly compute a model's scaling correction factor (c), depending on the particular structural equation modeling software used. Because of how LISREL 8 defines the SB scaled chi-square, LISREL users should compute c for each model by dividing the model's normal theory weighted least-squares (NTWLS) chi-square by its SB chi-square, to recover c accurately with both tests. EQS and Mplus users, in contrast, should divide the model's maximum likelihood (ML) chi-square by its SB chi-square to recover c. Because ML estimation does not minimize the NTWLS chi-square, however, it can produce a negative difference in nested NTWLS chi-square values. Thus, we recommend the standard practice of testing the scaled difference in ML chi-square values for models M1 and M0 (after properly recovering c for each model), to avoid an inadmissible test numerator. We illustrate the difference in computations across software programs for the original and new scaled tests and provide LISREL, EQS, and Mplus syntax in both single- and multiple-group form for specifying the model M10 that is involved in the new test.
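As a sketch of the computation summarized above, the original (2001) scaled difference test can be written as the following Python function once each model's scaling correction factor has been recovered (c = χ²_NTWLS / χ²_SB in LISREL 8, or c = χ²_ML / χ²_SB in EQS and Mplus); this is illustrative only and is not the authors' syntax.

    from scipy import stats

    def sb_scaled_difference(t0_ml, df0, c0, t1_ml, df1, c1):
        # M0 is the more restricted (nested) model, so df0 > df1.
        df_diff = df0 - df1
        # scaling factor for the difference test (Satorra & Bentler, 2001)
        c_diff = (df0 * c0 - df1 * c1) / df_diff
        t_diff = (t0_ml - t1_ml) / c_diff       # scaled difference statistic
        return t_diff, df_diff, stats.chi2.sf(t_diff, df_diff)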

3.
McDonald goodness-of-fit indices based on maximum likelihood, asymptotic distribution free, and the Satorra–Bentler scale correction estimation methods are investigated. Sampling experiments are conducted to assess the magnitude of error for each index under variations in distributional misspecification, structural misspecification, and sample size. The Satorra–Bentler correction-based index is shown to have the least error under each distributional misspecification level when the model has correct structural specification. The scaled index also performs adequately when there is minor structural misspecification and distributional misspecification. However, when a model has major structural misspecification with distributional misspecification, none of the estimation methods perform adequately.

4.
When the assumption of multivariate normality is violated and the sample sizes are relatively small, existing test statistics such as the likelihood ratio statistic and Satorra–Bentler's rescaled and adjusted statistics often fail to provide reliable assessment of overall model fit. This article proposes four new corrected statistics, aiming for better model evaluation with nonnormally distributed data at small sample sizes. A Monte Carlo study is conducted to compare the performances of the four corrected statistics against those of existing statistics regarding Type I error rate. Results show that the performances of the four new statistics are relatively stable compared with those of existing statistics. In particular, Type I error rates of a new statistic are close to the nominal level across all sample sizes under a condition of asymptotic robustness. Other new statistics also exhibit improved Type I error control, especially with nonnormally distributed data at small sample sizes.

5.
Classical accounts of maximum likelihood (ML) estimation of structural equation models for continuous outcomes involve normality assumptions: standard errors (SEs) are obtained using the expected information matrix and the goodness of fit of the model is tested using the likelihood ratio (LR) statistic. Satorra and Bentler (1994) introduced SEs and mean adjustments or mean and variance adjustments to the LR statistic (involving also the expected information matrix) that are robust to nonnormality. However, in recent years, SEs obtained using the observed information matrix and alternative test statistics have become available. We investigate what choice of SE and test statistic yields better results using an extensive simulation study. We found that robust SEs computed using the expected information matrix coupled with a mean- and variance-adjusted LR test statistic (i.e., MLMV) is the optimal choice, even with normally distributed data, as it yielded the best combination of accurate SEs and Type I errors.

6.
This article investigates the effect of the number of item response categories on chi-square statistics for confirmatory factor analysis to assess whether a greater number of categories increases the likelihood of identifying spurious factors, as previous research had concluded. Four types of continuous single-factor data were simulated for a 20-item test: (a) uniform for all items, (b) symmetric unimodal for all items, (c) negatively skewed for all items, or (d) negatively skewed for 10 items and positively skewed for 10 items. For each of the 4 types of distributions, item responses were divided to yield item scores with 2, 4, or 6 categories. The results indicated that the chi-square statistic for evaluating a single-factor model was most inflated (suggesting spurious factors) for 2-category responses and became less inflated as the number of categories increased. However, the Satorra–Bentler scaled chi-square tended not to be inflated even for 2-category responses, except if the continuous item data had both negatively and positively skewed distributions.
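The categorization step amounts to thresholding the simulated continuous responses; a minimal sketch, assuming equal-probability (quantile) cut points within each item, which the abstract does not specify:

    import numpy as np

    def categorize(continuous_scores, n_categories):
        """Divide continuous item scores into ordered categories 0, ..., k-1
        using equal-probability (quantile) cut points."""
        x = np.asarray(continuous_scores, dtype=float)
        probs = np.linspace(0, 1, n_categories + 1)[1:-1]
        return np.digitize(x, np.quantile(x, probs))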

7.
In the application of the Satorra–Bentler scaling correction, the choice of normal-theory weight matrix (i.e., the model-predicted vs. the sample covariance matrix) in the calculation of the correction remains unclear. Different software programs use different matrices by default. This simulation study investigates the discrepancies due to the weight matrices in the robust chi-square statistics, standard errors, and chi-square-based model fit indexes. This study varies the sample sizes at 100, 200, 500, and 1,000; kurtoses at 0, 7, and 21; and degrees of model misspecification, measured by the population root mean square error of approximation (RMSEA), at 0, .03, .05, .08, .10, and .15. The results favor the use of the model-predicted covariance matrix because it results in lower false rejection rates under the correctly specified model, as well as more accurate standard errors across all conditions. For the sample-corrected robust RMSEA, comparative fit index (CFI), and Tucker–Lewis index (TLI), the 2 matrices result in negligible differences.
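For reference, the chi-square-based fit indexes examined here are computed from the target-model and baseline chi-squares roughly as sketched below; plugging in scaled chi-squares gives the naive robust versions, whereas the sample-corrected robust RMSEA, CFI, and TLI studied in the article involve additional corrections not shown.

    import numpy as np

    def fit_indexes(chi2_m, df_m, chi2_b, df_b, n):
        """RMSEA, CFI, and TLI from target-model (m) and baseline-model (b) chi-squares.
        Note: some programs use n rather than n - 1 in the RMSEA denominator."""
        rmsea = np.sqrt(max(chi2_m - df_m, 0.0) / (df_m * (n - 1)))
        cfi = 1 - max(chi2_m - df_m, 0.0) / max(chi2_b - df_b, chi2_m - df_m, 1e-12)
        tli = ((chi2_b / df_b) - (chi2_m / df_m)) / ((chi2_b / df_b) - 1)
        return rmsea, cfi, tli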

8.
When the assumption of multivariate normality is violated or when a discrepancy function other than (normal theory) maximum likelihood is used in structural equation models, the null distribution of the test statistic may not be χ2 distributed. Most existing methods to approximate this distribution only match up to 2 moments. In this article, we propose 2 additional approximation methods: a scaled F distribution that matches 3 moments simultaneously and a direct Monte Carlo–based weighted sum of i.i.d. χ2 variates. We also conduct comprehensive simulation studies to compare the new and existing methods for both maximum likelihood and nonmaximum likelihood discrepancy functions and to separately evaluate the effect of sampling uncertainty in the estimated weights of the weighted sum on the performance of the approximation methods.
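The Monte Carlo approach is simple to sketch: given estimated weights (the eigenvalues of UΓ), simulate the weighted sum of independent χ²(1) variates and take the empirical tail probability. The weights are assumed to be already estimated; the scaled-F, three-moment approximation is not shown.

    import numpy as np

    def monte_carlo_pvalue(t_observed, weights, n_draws=100_000, seed=0):
        """Approximate P(sum_j w_j * chi2_1 > T) by direct simulation."""
        rng = np.random.default_rng(seed)
        weights = np.asarray(weights, dtype=float)
        # each row is one draw of the weighted sum of independent chi-square(1) variates
        draws = rng.chisquare(df=1, size=(n_draws, weights.size)) @ weights
        return np.mean(draws >= t_observed)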

9.
This study examined and compared various statistical methods for detecting individual differences in change. Considering 3 issues including test forms (specific vs. generalized), estimation procedures (constrained vs. unconstrained), and nonnormality, we evaluated 4 variance tests including the specific Wald variance test, the generalized Wald variance test, the specific likelihood ratio (LR) variance test, and the generalized LR variance test under both constrained and unconstrained estimation for both normal and nonnormal data. For the constrained estimation procedure, both the mixture distribution approach and the alpha correction approach were evaluated for their performance in dealing with the boundary problem. To deal with the nonnormality issue, we used the sandwich standard error (SE) estimator for the Wald tests and the Satorra–Bentler scaling correction for the LR tests. Simulation results revealed that testing a variance parameter and the associated covariances (generalized) had higher power than testing the variance solely (specific), unless the true covariances were zero. In addition, the variance tests under constrained estimation outperformed those under unconstrained estimation in terms of higher empirical power and better control of Type I error rates. Among all the studied tests, for both normal and nonnormal data, the robust generalized LR and Wald variance tests with the constrained estimation procedure were generally more powerful and had better Type I error rates for testing variance components than the other tests. Results from the comparisons between specific and generalized variance tests and between constrained and unconstrained estimation were discussed.
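To illustrate the boundary problem in its simplest form, consider testing a single variance constrained to be nonnegative: under the null the LR statistic follows an equal mixture of χ²(0) and χ²(1) rather than χ²(1), so the p-value is halved. The sketch below covers only that simplest case; the article's generalized tests involve additional covariance parameters and the Satorra–Bentler scaling.

    from scipy import stats

    def boundary_lr_pvalue(lr_stat):
        """p-value for H0: variance = 0 vs. H1: variance > 0 using the
        0.5*chi2(0) + 0.5*chi2(1) mixture (chi-bar-square) null distribution."""
        if lr_stat <= 0:
            return 1.0
        return 0.5 * stats.chi2.sf(lr_stat, df=1)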

10.
We describe and evaluate a random permutation test of measurement invariance with ordered-categorical data. To calculate a p-value for the observed (Δ)χ2, an empirical reference distribution is built by repeatedly shuffling the grouping variable, then saving the χ2 from a configural model, or the Δχ2 between configural and scalar-invariance models, fitted to each permuted dataset. The current gold standard in this context is a robust mean- and variance-adjusted Δχ2 test proposed by Satorra (2000), which yields inflated Type I errors, particularly when thresholds are asymmetric, unless sample sizes are quite large (Bandalos, 2014; Sass et al., 2014). In a Monte Carlo simulation, we compare permutation to three implementations of Satorra's robust χ2 across a variety of conditions evaluating configural and scalar invariance. Results suggest permutation can better control Type I error rates while providing comparable power under conditions in which the standard robust test yields inflated errors.
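A schematic of the permutation procedure, assuming a hypothetical helper fit_delta_chi2(data, group) that fits the configural and scalar models and returns the Δχ2 between them (the helper is not part of any particular package):

    import numpy as np

    def permutation_pvalue(data, group, fit_delta_chi2, n_perm=1000, seed=0):
        """Empirical p-value for the observed delta-chi2, built by repeatedly
        shuffling the grouping variable and refitting."""
        rng = np.random.default_rng(seed)
        observed = fit_delta_chi2(data, group)
        null_dist = np.empty(n_perm)
        for i in range(n_perm):
            shuffled = rng.permutation(group)       # break any true group difference
            null_dist[i] = fit_delta_chi2(data, shuffled)
        # proportion of permuted values at least as extreme as the observed one
        return (1 + np.sum(null_dist >= observed)) / (n_perm + 1)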

11.
Robust corrections to standard errors and test statistics have wide applications in structural equation modeling (SEM). The original SEM development, due to Satorra and Bentler (1988, 1994), was to account for the effect of nonnormality. Muthén (1993) proposed corrections to accompany certain categorical data estimators, such as cat-LS or cat-DWLS. Other applications of robust corrections exist. Despite the diversity of applications, all robust corrections are constructed using the same underlying rationale: They correct for inefficiency of the chosen estimator. The goal of this article is to make the formulas behind all types of robust corrections more intuitive. This is accomplished by building an analogy with similar equations in linear regression and then by reformulating the SEM model as a nonlinear regression model.
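The linear-regression analogy can be made concrete with the familiar heteroskedasticity-robust (sandwich) covariance estimator, which has the same bread-meat-bread structure as robust SEM corrections; this generic HC0 sketch is not the article's SEM formulation.

    import numpy as np

    def ols_sandwich_se(X, y):
        """OLS coefficients with HC0 sandwich standard errors:
        (X'X)^-1 X' diag(e^2) X (X'X)^-1."""
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        bread = np.linalg.inv(X.T @ X)
        beta = bread @ X.T @ y
        resid = y - X @ beta
        meat = X.T @ (X * resid[:, None] ** 2)      # X' diag(e^2) X
        cov = bread @ meat @ bread
        return beta, np.sqrt(np.diag(cov))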

12.
At times, the same set of test questions is administered under different measurement conditions that might affect the psychometric properties of the test scores enough to warrant different score conversions for the different conditions. We propose a procedure for assessing the practical equivalence of conversions developed for the same set of test questions but administered under different measurement conditions. This procedure assesses whether the use of separate conversions for each condition has a desirable or undesirable effect. We distinguish effects due to differences in difficulty from effects due to rounding conventions. The proposed procedure provides objective empirical information that assists in deciding to report a common conversion for a set of items or a different conversion for the set of items when the set is administered under different measurement conditions. To illustrate the use of the procedure, we consider the case where a scrambled test form is used along with a base test form. If section order effects are detected between the scrambled and base forms, a decision needs to be made whether to report a single common conversion for both forms or to report separate conversions.

13.
14.
The purpose of the study was to determine whether a subject weighted (SW) multiple-choice test taking procedure would result in higher and more reliable scores than the conventional (C) multiple-choice test taking procedure in general and at different levels of risk taking. A 25-item statistics test was administered twice to 122 graduate students. The differences between means and the differences between reliabilities for the SW and C scores in general and at different levels of risk taking were not statistically significant, p < .05. Some additional information on the SW procedure was also reported. The implications for the practitioner are that the SW multiple-choice test taking procedure has not provided sufficient evidence to warrant its use at this time.

15.
The study examined two approaches for equating subscores: (1) equating subscores using internal common items as the anchor, and (2) equating subscores using equated and scaled total scores as the anchor. Since equated total scores are comparable across the new and old forms, they can be used as an anchor to equate the subscores. Both chained linear and chained equipercentile methods were used. Data from two tests were used to conduct the study, and results showed that when more internal common items were available (i.e., 10–12 items), using common items to equate the subscores is preferable. However, when the number of common items is very small (i.e., five to six items), using total scaled scores to equate the subscores is preferable. For both tests, not equating (i.e., using raw subscores) is not reasonable, as it resulted in a considerable amount of bias.
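For concreteness, a sketch of the chained linear link used for both anchor choices: the new-form subscore is first linked to the anchor in the new-form group, and the anchor is then linked to the old-form subscore scale in the old-form group. Variable names are illustrative only.

    def chained_linear_equate(x, new_form_stats, old_form_stats):
        """Chained linear equating of a new-form subscore x to the old-form scale.
        new_form_stats: (mean_x, sd_x, mean_anchor, sd_anchor) in the new-form group
        old_form_stats: (mean_y, sd_y, mean_anchor, sd_anchor) in the old-form group
        The anchor may be the internal common-item score or the equated total score."""
        mx, sx, mv_new, sv_new = new_form_stats
        my, sy, mv_old, sv_old = old_form_stats
        v = mv_new + (sv_new / sx) * (x - mx)       # step 1: x onto the anchor scale
        return my + (sy / sv_old) * (v - mv_old)    # step 2: anchor onto the old-form scale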

16.
17.
The asymptotic performance of structural equation modeling tests and standard errors is influenced by two factors: the model and the asymptotic covariance matrix Γ of the sample covariances. Although most simulation studies clearly specify model conditions, specification of Γ is usually limited to values of univariate skewness and kurtosis. We illustrate that marginal skewness and kurtosis are not sufficient to adequately specify a nonnormal simulation condition by showing that asymptotic standard errors and test statistics vary substantially among distributions with identical skewness and kurtosis. We argue therefore that Γ should be reported when presenting the design of simulation studies. We show how Γ can be exactly calculated under the widely used Vale–Maurelli transform. We suggest plotting the elements of Γ and reporting the eigenvalues associated with the test statistic. R code is provided.
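To make the role of Γ concrete, the sketch below computes its standard sample (ADF-type) estimate, the fourth-order moment matrix of the nonduplicated sample covariances; the article's contribution is the exact analytic Γ under the Vale–Maurelli transform, which is not reproduced here.

    import numpy as np

    def estimate_gamma(X):
        """Sample estimate of Gamma, the asymptotic covariance matrix of the
        nonduplicated sample covariances s_ij (i <= j), based on fourth moments."""
        X = np.asarray(X, dtype=float)
        Z = X - X.mean(axis=0)
        p = Z.shape[1]
        pairs = [(i, j) for i in range(p) for j in range(i, p)]   # nonduplicated (i, j)
        # each column holds the centered cross-products z_i * z_j for one pair
        W = np.column_stack([Z[:, i] * Z[:, j] for i, j in pairs])
        return np.cov(W, rowvar=False, bias=True)   # Gamma_{ij,kl} = m_ijkl - s_ij * s_kl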

18.
In structural equation models, outliers could result in inaccurate parameter estimates and misleading fit statistics when using traditional methods. To robustly estimate structural equation models, iteratively reweighted least squares (IRLS; Yuan & Bentler, 2000) has been proposed, but not thoroughly examined. We explore the large-sample properties of IRLS and its effect on parameter recovery, model fit, and aberrant data identification. A parametric bootstrap technique is proposed to determine the tuning parameters of IRLS, which results in improved Type I error rates in aberrant data identification, for data sets generated from homogenous populations. Scenarios concerning (a) simulated data, (b) contaminated data, and (c) a real data set are studied. Results indicate good parameter recovery, model fit, and aberrant data identification when noisy observations are drawn from a real data set, but lackluster parameter recovery and identification of aberrant data when the noise is parametrically structured. Practical implications and further research are discussed.
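One IRLS-style reweighting cycle for the mean and covariance can be sketched as follows: observations with large Mahalanobis distances are down-weighted (here with simple Huber-type weights) and the estimates are recomputed until convergence. The tuning constant plays the role of the parameter the article proposes to choose by parametric bootstrap; the specific weight functions of Yuan and Bentler (2000) are not reproduced.

    import numpy as np

    def irls_mean_cov(X, tuning=3.0, max_iter=50, tol=1e-6):
        """Iteratively reweighted estimate of the mean vector and covariance matrix."""
        X = np.asarray(X, dtype=float)
        mu, cov = X.mean(axis=0), np.cov(X, rowvar=False)
        w = np.ones(X.shape[0])
        for _ in range(max_iter):
            diff = X - mu
            d = np.sqrt(np.sum(diff @ np.linalg.inv(cov) * diff, axis=1))  # Mahalanobis distances
            w = np.where(d <= tuning, 1.0, tuning / d)   # Huber-type down-weighting
            mu_new = np.average(X, axis=0, weights=w)
            diff = X - mu_new
            cov_new = (w[:, None] * diff).T @ diff / w.sum()
            if np.max(np.abs(mu_new - mu)) < tol:
                return mu_new, cov_new, w
            mu, cov = mu_new, cov_new
        return mu, cov, w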

19.
We consider upper bound estimates for the weighted eigenvalues of the uniformly elliptic problem arising from membrane vibration. Using trial functions, integration by parts, the Rayleigh theorem, and inequality estimates, we establish an upper bound for the (n+1)th eigenvalue in terms of the first n eigenvalues, with coefficients that are independent of the measure of the domain. This result has wide applications in mechanics and physics.

20.
This study addressed the sampling error and linking bias that occur with small samples in a nonequivalent groups anchor test design. We proposed a linking method called the synthetic function, which is a weighted average of the identity function and a traditional equating function (in this case, the chained linear equating function). Specifically, we compared the synthetic, identity, and chained linear functions for various-sized samples from two types of national assessments. One design used a highly reliable test and an external anchor, and the other used a relatively low-reliability test and an internal anchor. The results from each of these methods were compared to the criterion equating function derived from the total samples with respect to linking bias and error. The study indicated that the synthetic functions might be a better choice than the chained linear equating method when samples are not large and, as a result, unrepresentative.
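The synthetic function itself is a one-line shrinkage of the small-sample equating function toward the identity; a sketch, with the weight w treated as given (the article discusses how it is chosen):

    def synthetic_conversion(x, chained_linear, w):
        """Weighted average of the chained linear equating function and the identity:
        w = 1 reproduces chained linear equating, w = 0 reports the raw score x."""
        return w * chained_linear(x) + (1.0 - w) * x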
