Similar documents
20 similar documents retrieved (search time: 593 ms)
1.
When the assumption of multivariate normality is violated and the sample sizes are relatively small, existing test statistics such as the likelihood ratio statistic and Satorra–Bentler's rescaled and adjusted statistics often fail to provide reliable assessment of overall model fit. This article proposes four new corrected statistics, aiming for better model evaluation with nonnormally distributed data at small sample sizes. A Monte Carlo study is conducted to compare the performances of the four corrected statistics against those of existing statistics regarding Type I error rate. Results show that the performances of the four new statistics are relatively stable compared with those of existing statistics. In particular, Type I error rates of a new statistic are close to the nominal level across all sample sizes under a condition of asymptotic robustness. Other new statistics also exhibit improved Type I error control, especially with nonnormally distributed data at small sample sizes.
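As a rough illustration of why mean rescaling helps here: the limiting distribution of such fit statistics is a weighted sum of chi-square(1) variates, and comparing the raw sum to a chi-square(df) reference inflates rejection rates whenever the weights exceed 1. The sketch below (with made-up eigenvalues, not values from the article) simulates that sum and compares naive and mean-rescaled rejection rates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical eigenvalues of the relevant weight matrix under nonnormality;
# values above 1 inflate the naive chi-square test. Illustrative numbers only.
lam = np.array([3.0, 2.5, 2.0, 1.5, 1.0])
df = len(lam)
crit = 11.0705  # chi-square(5) upper 5% critical value

# The statistic is asymptotically a weighted sum of independent chi-square(1)s.
z2 = rng.chisquare(1.0, size=(20000, df))
T = z2 @ lam

naive_rate = float(np.mean(T > crit))                  # T compared directly to chi2(df)
rescaled_rate = float(np.mean(T / lam.mean() > crit))  # mean-rescaled (SB-style) statistic
```

With these weights the naive rate is badly inflated, while the mean-rescaled statistic lands much closer to the nominal 5% level (though not exactly on it, since rescaling matches only the mean).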

2.
McDonald goodness-of-fit indices based on maximum likelihood, asymptotic distribution free, and the Satorra-Bentler scale correction estimation methods are investigated. Sampling experiments are conducted to assess the magnitude of error for each index under variations in distributional misspecification, structural misspecification, and sample size. The Satorra-Bentler correction-based index is shown to have the least error under each distributional misspecification level when the model has correct structural specification. The scaled index also performs adequately when there is minor structural misspecification and distributional misspecification. However, when a model has major structural misspecification with distributional misspecification, none of the estimation methods perform adequately.

3.
We introduce and evaluate a new class of approximations to common test statistics in structural equation modeling. Such test statistics asymptotically follow the distribution of a weighted sum of i.i.d. chi-square variates, where the weights are eigenvalues of a certain matrix. The proposed eigenvalue block averaging (EBA) method involves creating blocks of these eigenvalues and replacing them within each block with the block average. The Satorra–Bentler scaling procedure is a special case of this framework, using one single block. The proposed procedure applies also to difference testing among nested models. We investigate the EBA procedure both theoretically in the asymptotic case, and with simulation studies for the finite-sample case, under both maximum likelihood and diagonally weighted least squares estimation. Comparison is made with 3 established approximations: Satorra–Bentler, the scaled and shifted, and the scaled F tests.
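A minimal sketch of the EBA idea, with hypothetical helper names (`eba_weights`, `eba_pvalue`) and a Monte Carlo reference distribution standing in for the exact weighted chi-square sum; with a single block the weights all collapse to the overall mean, recovering Satorra–Bentler-style scaling:

```python
import numpy as np

def eba_weights(eigs, n_blocks):
    """Sort eigenvalues and replace each (roughly equal-sized) block by its average."""
    eigs = np.sort(np.asarray(eigs, dtype=float))[::-1]
    out = np.empty_like(eigs)
    for block in np.array_split(np.arange(len(eigs)), n_blocks):
        out[block] = eigs[block].mean()
    return out

def eba_pvalue(T, eigs, n_blocks, reps=20000, seed=0):
    """Monte Carlo p-value of T against the block-averaged weighted chi-square sum."""
    rng = np.random.default_rng(seed)
    w = eba_weights(eigs, n_blocks)
    ref = rng.chisquare(1.0, size=(reps, len(w))) @ w
    return float(np.mean(ref >= T))
```

For example, `eba_weights([4, 2, 2, 0], 2)` averages within two blocks to give `[3, 3, 1, 1]`, while one block gives the flat mean `[2, 2, 2, 2]`.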

4.
In the application of the Satorra–Bentler scaling correction, the choice of normal-theory weight matrix (i.e., the model-predicted vs. the sample covariance matrix) in the calculation of the correction remains unclear. Different software programs use different matrices by default. This simulation study investigates the discrepancies due to the weight matrices in the robust chi-square statistics, standard errors, and chi-square-based model fit indexes. This study varies the sample sizes at 100, 200, 500, and 1,000; kurtoses at 0, 7, and 21; and degrees of model misspecification, measured by the population root mean square error of approximation (RMSEA), at 0, .03, .05, .08, .10, and .15. The results favor the use of the model-predicted covariance matrix because it yields lower false rejection rates under the correctly specified model, as well as more accurate standard errors across all conditions. For the sample-corrected robust RMSEA, comparative fit index (CFI), and Tucker–Lewis index (TLI), the 2 matrices result in negligible differences.

5.
Classical accounts of maximum likelihood (ML) estimation of structural equation models for continuous outcomes involve normality assumptions: standard errors (SEs) are obtained using the expected information matrix and the goodness of fit of the model is tested using the likelihood ratio (LR) statistic. Satorra and Bentler (1994) introduced SEs and mean or mean-and-variance adjustments to the LR statistic (also involving the expected information matrix) that are robust to nonnormality. However, in recent years, SEs obtained using the observed information matrix and alternative test statistics have become available. We investigate what choice of SE and test statistic yields better results using an extensive simulation study. We found that robust SEs computed using the expected information matrix coupled with a mean- and variance-adjusted LR test statistic (i.e., MLMV) is the optimal choice, even with normally distributed data, as it yielded the best combination of accurate SEs and Type I errors.
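The mean-and-variance ("MV") adjustment mentioned here can be sketched in eigenvalue terms: pick a scale a and an adjusted degrees of freedom d* so that the first two moments of the scaled statistic match a chi-square(d*), using E[T] = Σλ and Var[T] = 2Σλ². The eigenvalues below are illustrative, not from the study:

```python
import numpy as np

# Illustrative eigenvalues of the relevant weight matrix (made-up values).
lam = np.array([3.0, 2.0, 1.5, 1.0, 0.5])

# Mean-and-variance adjustment: T/a is referred to a chi-square with d* degrees
# of freedom, where a and d* are chosen by moment matching.
a = (lam ** 2).sum() / lam.sum()
d_star = lam.sum() ** 2 / (lam ** 2).sum()

mean_adjusted = lam.sum() / a                  # E[T/a]; equals d* by construction
var_adjusted = 2 * (lam ** 2).sum() / a ** 2   # Var[T/a]; equals 2 * d* by construction
```

The check that both moments match a chi-square(d*) is pure algebra, so it holds for any choice of eigenvalues.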

6.
The asymptotically distribution-free (ADF) test statistic depends on very mild distributional assumptions and is theoretically superior to many other so-called robust tests available in structural equation modeling. The ADF test, however, often leads to model overrejection even at modest sample sizes. To overcome its poor small-sample performance, a family of robust test statistics obtained by modifying the ADF statistics was recently proposed. This study investigates by simulation the performance of the new modified test statistics. The results revealed that although a few of the test statistics adequately controlled Type I error rates in each of the examined conditions, most performed quite poorly. This result underscores the importance of choosing a modified test statistic that performs well for specific examined conditions. A parametric bootstrap method is proposed for identifying such a best-performing modified test statistic. Through further simulation it is shown that the proposed bootstrap approach performs well.

7.
The accuracy of structural model parameter estimates in latent variable mixture modeling was explored with a 3 (sample size) × 3 (exogenous latent mean difference) × 3 (endogenous latent mean difference) × 3 (correlation between factors) × 3 (mixture proportions) factorial design. In addition, the efficacy of several likelihood-based statistics (Akaike's Information Criterion [AIC], Bayesian Information Criterion [BIC], the sample-size adjusted BIC [ssBIC], the consistent AIC [CAIC], the Vuong-Lo-Mendell-Rubin adjusted likelihood ratio test [aVLMR]), classification-based statistics (the classification likelihood information criterion [CLC], the integrated classification likelihood [ICL-BIC], the normalized entropy criterion [NEC], entropy), and distributional statistics (multivariate skew and kurtosis test) was examined to determine which statistics best recover the correct number of components. Results indicate that the structural parameters were recovered, but the model fit statistics were not exceedingly accurate. The ssBIC statistic was the most accurate statistic, and the CLC, ICL-BIC, and aVLMR showed limited utility. However, none of these statistics were accurate for small samples (n = 500).

8.
We highlight critical conceptual and statistical issues and how to resolve them in conducting Satorra–Bentler (SB) scaled difference chi-square tests. Concerning the original (Satorra & Bentler, 2001) and new (Satorra & Bentler, 2010) scaled difference tests, a fundamental difference exists in how to properly compute a model's scaling correction factor (c), depending on the particular structural equation modeling software used. Because of how LISREL 8 defines the SB scaled chi-square, LISREL users should compute c for each model by dividing the model's normal theory weighted least-squares (NTWLS) chi-square by its SB chi-square, to recover c accurately with both tests. EQS and Mplus users, in contrast, should divide the model's maximum likelihood (ML) chi-square by its SB chi-square to recover c. Because ML estimation does not minimize the NTWLS chi-square, however, it can produce a negative difference in nested NTWLS chi-square values. Thus, we recommend the standard practice of testing the scaled difference in ML chi-square values for models M1 and M0 (after properly recovering c for each model), to avoid an inadmissible test numerator. We illustrate the difference in computations across software programs for the original and new scaled tests and provide LISREL, EQS, and Mplus syntax in both single- and multiple-group form for specifying the model M10 that is involved in the new test.
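The scaled difference computation described here can be sketched with the Satorra–Bentler (2001) formula. The function name is made up, and recovering the per-model correction factors c0 and c1 correctly (from ML or NTWLS chi-squares, depending on the software) is assumed to have been done beforehand:

```python
def sb_scaled_difference(T1_ml, T0_ml, c1, c0, df1, df0):
    """Satorra-Bentler (2001) scaled difference chi-square for nested models.

    M1 is the less restrictive model (df1 < df0); T1_ml and T0_ml are the
    ML chi-square values, c1 and c0 the scaling correction factors recovered
    for each model. Returns the scaled difference statistic and its df.
    """
    df_diff = df0 - df1
    # Correction factor for the difference test, weighted by degrees of freedom.
    c_diff = (df0 * c0 - df1 * c1) / df_diff
    return (T0_ml - T1_ml) / c_diff, df_diff
```

When both models share the same correction factor c, the formula reduces to the plain rescaled difference (T0 - T1) / c, which is a quick sanity check on any implementation.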

9.
This article investigates the effect of the number of item response categories on chi-square statistics for confirmatory factor analysis to assess whether a greater number of categories increases the likelihood of identifying spurious factors, as previous research had concluded. Four types of continuous single-factor data were simulated for a 20-item test: (a) uniform for all items, (b) symmetric unimodal for all items, (c) negatively skewed for all items, or (d) negatively skewed for 10 items and positively skewed for 10 items. For each of the 4 types of distributions, item responses were divided to yield item scores with 2, 4, or 6 categories. The results indicated that the chi-square statistic for evaluating a single-factor model was most inflated (suggesting spurious factors) for 2-category responses and became less inflated as the number of categories increased. However, the Satorra-Bentler scaled chi-square tended not to be inflated even for 2-category responses, except if the continuous item data had both negatively and positively skewed distributions.

10.
The asymptotically distribution free (ADF) method is often used to estimate parameters or test models without a normal distribution assumption on variables, both in covariance structure analysis and in correlation structure analysis. However, little has been done to study the differences in behavior of the ADF method in covariance versus correlation structure analysis. We compared the behaviors of 3 test statistics frequently used to evaluate structural equation models with nonnormally distributed variables: the χ2 test TAGLS and its small-sample variants TYB and TF(AGLS). Results showed that the ADF method in correlation structure analysis with test statistic TAGLS performs much better at small sample sizes than the corresponding test for covariance structures. In contrast, test statistics TYB and TF(AGLS) under the same conditions generally perform better with covariance structures than with correlation structures. It is proposed that excessively large and variable condition numbers of weight matrices are a cause of the poor behavior of ADF test statistics in small samples, and results showed that these condition numbers increase systematically, with substantially increasing variance, as sample size decreases. Implications for research and practice are discussed.
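To see how weight-matrix conditioning can be examined, the sketch below builds the core ingredient of an ADF weight matrix (the sample covariance of the unique cross-products of the observed variables) at two sample sizes and compares condition numbers. This is an illustrative construction under normal data, not the article's exact procedure:

```python
import numpy as np

rng = np.random.default_rng(1)

def adf_weight_condition(n, p=4):
    """Condition number of the sample covariance of vech(x x'), the core
    ingredient of an ADF weight matrix, estimated at sample size n."""
    x = rng.standard_normal((n, p))
    iu = np.triu_indices(p)
    # Each row holds the p(p+1)/2 unique cross-products for one observation.
    cp = np.stack([np.outer(xi, xi)[iu] for xi in x])
    return np.linalg.cond(np.cov(cp, rowvar=False))

cond_small = adf_weight_condition(50)    # noisy fourth-order moments
cond_large = adf_weight_condition(5000)  # close to the population matrix
```

With fourth-order moments involved, the small-sample estimate is typically far more ill-conditioned than the large-sample one, which is the mechanism the abstract points to.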

11.
We describe and evaluate a random permutation test of measurement invariance with ordered-categorical data. To calculate a p-value for the observed (Δ)χ2, an empirical reference distribution is built by repeatedly shuffling the grouping variable, then saving the χ2 from a configural model, or the Δχ2 between configural and scalar-invariance models, fitted to each permuted dataset. The current gold standard in this context is a robust mean- and variance-adjusted Δχ2 test proposed by Satorra (2000), which yields inflated Type I errors, particularly when thresholds are asymmetric, unless sample sizes are quite large (Bandalos, 2014; Sass et al., 2014). In a Monte Carlo simulation, we compare permutation to three implementations of Satorra's robust χ2 across a variety of conditions evaluating configural and scalar invariance. Results suggest permutation can better control Type I error rates while providing comparable power under conditions where the standard robust test yields inflated errors.
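The permutation machinery itself is simple to sketch. Here a squared standardized mean difference stands in for the configural χ2 (fitting actual invariance models is beyond this sketch), and all names are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def perm_pvalue(stat_fn, data, groups, n_perm=999, seed=0):
    """Generic permutation p-value: shuffle group labels to build the reference
    distribution of the statistic under the no-group-difference null."""
    prng = np.random.default_rng(seed)
    observed = stat_fn(data, groups)
    ref = np.array([stat_fn(data, prng.permutation(groups)) for _ in range(n_perm)])
    # Add-one correction keeps the p-value strictly above zero.
    return (1 + np.sum(ref >= observed)) / (n_perm + 1)

# Stand-in "fit statistic": squared standardized mean difference between groups.
def mean_diff_stat(x, g):
    a, b = x[g == 0], x[g == 1]
    return (a.mean() - b.mean()) ** 2 / (a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))

x = rng.standard_normal(80)   # one generated sample; no true group effect
g = np.repeat([0, 1], 40)
p = perm_pvalue(mean_diff_stat, x, g)
```

In the article's setting, `stat_fn` would instead refit the configural (or configural and scalar) models to the permuted data and return the χ2 or Δχ2.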

12.
This simulation study assesses the statistical performance of two mathematically equivalent parameterizations for multitrait–multimethod data with interchangeable raters—a multilevel confirmatory factor analysis (CFA) and a classical CFA parameterization. The sample sizes of targets and raters, the factorial structure of the trait factors, and rater missingness are varied. The classical CFA approach yields a high proportion of improper solutions under conditions with small sample sizes and indicator-specific trait factors. In general, trait factor related parameters are more sensitive to bias than other types of parameters. For multilevel CFAs, there is a drastic bias in fit statistics under conditions with unidimensional trait factors on the between level, where root mean square error of approximation (RMSEA) and χ2 distributions reveal a downward bias, whereas the between standardized root mean square residual is biased upwards. In contrast, RMSEA and χ2 for classical CFA models are severely upwardly biased in conditions with a high number of raters and a small number of targets.

13.
A 2-stage robust procedure, as well as an R package, rsem, were recently developed for structural equation modeling with nonnormal missing data by Yuan and Zhang (2012). Several test statistics that have been used for complete data analysis are employed to evaluate model fit in the 2-stage robust method. However, properties of these statistics under robust procedures for incomplete nonnormal data analysis have never been studied. This study aims to systematically evaluate and compare 5 test statistics: a test statistic derived from normal-distribution-based maximum likelihood, a rescaled chi-square statistic, an adjusted chi-square statistic, a corrected residual-based asymptotically distribution-free chi-square statistic, and a residual-based F statistic. These statistics are evaluated under a linear growth curve model by varying 8 factors: population distribution, missing data mechanism, missing data rate, sample size, number of measurement occasions, covariance between the latent intercept and slope, variance of measurement errors, and downweighting rate of the 2-stage robust method. The performance of the test statistics varies, and the one derived from the 2-stage normal-distribution-based maximum likelihood performs much worse than the other 4. Application of the 2-stage robust method and of the test statistics is illustrated through growth curve analysis of mathematical ability development, using data on the Peabody Individual Achievement Test mathematics assessment from the National Longitudinal Survey of Youth 1997 Cohort.

14.
Statistical theories of goodness-of-fit tests in structural equation modeling are based on asymptotic distributions of test statistics. When the model includes a large number of variables or the population is not multivariate normal, the asymptotic distributions do not approximate the distribution of the test statistics well at small sample sizes. A variety of methods have been developed to improve the accuracy of hypothesis testing at small sample sizes. However, all these methods have limitations, especially for nonnormally distributed data. We propose a Monte Carlo test that controls Type I error more accurately than existing approaches for both normally and nonnormally distributed data at small sample sizes. Extensive simulation studies show that the suggested Monte Carlo test has a more accurate observed significance level than other tests, with reasonable power to reject misspecified models.
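The general shape of such a Monte Carlo test can be sketched as follows: simulate datasets under the fitted null model, recompute the statistic on each, and take the exceedance proportion as the p-value. The toy statistic below (a scaled sample variance, whose reference distribution is known) merely stands in for an SEM fit statistic:

```python
import numpy as np

def monte_carlo_pvalue(observed_T, simulate_T, reps=999, seed=0):
    """Monte Carlo test: compare the observed statistic to statistics computed
    from datasets simulated under the fitted null model."""
    rng = np.random.default_rng(seed)
    ref = np.array([simulate_T(rng) for _ in range(reps)])
    # Add-one correction keeps the p-value strictly above zero.
    return (1 + np.sum(ref >= observed_T)) / (reps + 1)

# Toy stand-in: (n-1) * s^2 for a standard normal sample, so the exact reference
# is chi-square(n-1) and the Monte Carlo p-value can be checked against it.
n = 25
def simulate_T(rng):
    x = rng.standard_normal(n)
    return (n - 1) * x.var(ddof=1)

p = monte_carlo_pvalue(observed_T=30.0, simulate_T=simulate_T, reps=1999)
```

Because the reference distribution is simulated at the actual sample size rather than taken from asymptotics, the p-value inherits the finite-sample behavior of the statistic, which is the point of the proposal.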

15.
We examine the accuracy of p values obtained using the asymptotic mean and variance (MV) correction to the distribution of the sample standardized root mean squared residual (SRMR) proposed by Maydeu-Olivares to assess the exact fit of SEM models. In a simulation study, we found that under normality, the MV-corrected SRMR statistic provides reasonably accurate Type I errors even in small samples and for large models, clearly outperforming the current standard, that is, the likelihood ratio (LR) test. When data show excess kurtosis, MV-corrected SRMR p values are only accurate in small models (p = 10), or in medium-sized models (p = 30) if no skewness is present and sample sizes are at least 500. Overall, when data are not normal, the MV-corrected LR test seems to outperform the MV-corrected SRMR. We elaborate on these findings by showing that the asymptotic approximation to the mean of the SRMR sampling distribution is quite accurate, whereas the asymptotic approximation to the standard deviation is not.
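One common form of the sample SRMR (standardizing residuals by the sample standard deviations and averaging over the unique covariance-matrix elements; exact conventions vary across programs) can be sketched as:

```python
import numpy as np

def srmr(S, Sigma):
    """Standardized root mean squared residual between the sample covariance S
    and the model-implied covariance Sigma, averaged over unique elements."""
    S, Sigma = np.asarray(S, float), np.asarray(Sigma, float)
    d = np.sqrt(np.diag(S))
    resid = (S - Sigma) / np.outer(d, d)   # standardized residuals
    iu = np.triu_indices(S.shape[0])       # unique (upper-triangular) elements
    return np.sqrt(np.mean(resid[iu] ** 2))
```

A perfect-fit model gives SRMR = 0, and a single off-diagonal residual of .10 in a 2-variable model gives sqrt(.01 / 3), since three unique elements enter the average.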

16.
Mean and mean-and-variance corrections are the 2 major principles for developing test statistics when distributional conditions are violated. In structural equation modeling (SEM), mean-rescaled and mean-and-variance-adjusted test statistics have been recommended in different contexts. However, recent studies indicated that their Type I error rates vary from 0% to 100% as the number of variables p increases. Can we still trust the 2 principles, and what alternative rules can be used to develop test statistics for SEM with "big data"? This article addresses these issues with a large-scale Monte Carlo study. Results indicate that the empirical mean and standard deviation of each statistic can differ from their expected values by many standardized units when p is large. Thus, the problems in Type I error control with the 2 statistics arise because the statistics do not possess the properties they are intended to have, not because the mean and mean-and-variance corrections themselves are flawed. However, the 2 principles need to be implemented using small-sample methodology instead of asymptotics. Results also indicate that distributions other than chi-square might better describe the behavior of test statistics in SEM with big data.

17.
Smoothing is designed to yield smoother equating results that can reduce random equating error without introducing very much systematic error. The main objective of this study is to propose a new statistic and to compare its performance to the performance of the Akaike information criterion and likelihood ratio chi-square difference statistics in selecting the smoothing parameter for polynomial loglinear equating under the random groups design. These model selection statistics were compared for four sample sizes (500, 1,000, 2,000, and 3,000) and eight simulated equating conditions, including both conditions where equating is not needed and conditions where equating is needed. The results suggest that all model selection statistics tend to improve the equating accuracy by reducing the total equating error. The new statistic tended to have less overall error than the other two methods.

18.
This study examined and compared various statistical methods for detecting individual differences in change. Considering 3 issues including test forms (specific vs. generalized), estimation procedures (constrained vs. unconstrained), and nonnormality, we evaluated 4 variance tests including the specific Wald variance test, the generalized Wald variance test, the specific likelihood ratio (LR) variance test, and the generalized LR variance test under both constrained and unconstrained estimation for both normal and nonnormal data. For the constrained estimation procedure, both the mixture distribution approach and the alpha correction approach were evaluated for their performance in dealing with the boundary problem. To deal with the nonnormality issue, we used the sandwich standard error (SE) estimator for the Wald tests and the Satorra–Bentler scaling correction for the LR tests. Simulation results revealed that testing a variance parameter and the associated covariances (generalized) had higher power than testing the variance solely (specific), unless the true covariances were zero. In addition, the variance tests under constrained estimation outperformed those under unconstrained estimation in terms of higher empirical power and better control of Type I error rates. Among all the studied tests, for both normal and nonnormal data, the robust generalized LR and Wald variance tests with the constrained estimation procedure were generally more powerful and had better Type I error rates for testing variance components than the other tests. Results from the comparisons between specific and generalized variance tests and between constrained and unconstrained estimation were discussed.
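For the boundary problem mentioned above, the mixture-distribution approach for a single variance parameter refers the LR statistic to a 50:50 chi-bar-square mixture rather than a plain chi-square. A minimal sketch (supporting only the small degrees of freedom needed here, via closed-form chi-square tails) is:

```python
import math

def chi2_sf(T, df):
    """Chi-square survival function for df in {1, 2} (enough for this sketch)."""
    if df == 1:
        return math.erfc(math.sqrt(T / 2.0))
    if df == 2:
        return math.exp(-T / 2.0)
    raise ValueError("sketch supports df 1 or 2 only")

def boundary_mixture_pvalue(T, df_full):
    """p-value for an LR test of one variance on the boundary: a 50:50 mixture
    of chi-square(df_full - 1) and chi-square(df_full); with df_full = 1 the
    first component is a point mass at zero."""
    lower = chi2_sf(T, df_full - 1) if df_full > 1 else 0.0
    return 0.5 * lower + 0.5 * chi2_sf(T, df_full)
```

With one variance and no covariance (df_full = 1), the naive chi-square(1) critical value of 3.84 yields a mixture p-value of about .025, half the naive .05, which is why ignoring the boundary makes the test conservative.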

19.
The asymptotic performance of structural equation modeling tests and standard errors are influenced by two factors: the model and the asymptotic covariance matrix Γ of the sample covariances. Although most simulation studies clearly specify model conditions, specification of Γ is usually limited to values of univariate skewness and kurtosis. We illustrate that marginal skewness and kurtosis are not sufficient to adequately specify a nonnormal simulation condition by showing that asymptotic standard errors and test statistics vary substantially among distributions with skewness and kurtosis that are identical. We argue therefore that Γ should be reported when presenting the design of simulation studies. We show how Γ can be exactly calculated under the widely used Vale–Maurelli transform. We suggest plotting the elements of Γ and reporting the eigenvalues associated with the test statistic. R code is provided.

20.
Ill conditioning of covariance and weight matrices used in structural equation modeling (SEM) is a possible source of inadequate performance of SEM statistics in nonasymptotic samples. A maximum a posteriori (MAP) covariance matrix is proposed for weight matrix regularization in normal theory generalized least squares (GLS) estimation. Maximum likelihood (ML), GLS, and regularized GLS test statistics (RGLS and rGLS) are studied by simulation in a 15-variable, 3-factor model with 15 levels of sample size varying from 60 to 100,000. A key result showed that in terms of nominal rejection rates, RGLS outperformed ML at all sample sizes below 500, and GLS at most sample sizes below 500. In larger samples, their performance was equivalent. The second regularization methodology (rGLS) performed well asymptotically, but poorly in small samples. Regularization in SEM deserves further study.
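A simple stand-in for the regularization idea (shrinking an ill-conditioned matrix toward a scaled identity; this is generic ridge-style shrinkage, not the article's exact MAP estimator) can be sketched as:

```python
import numpy as np

def ridge_regularize(S, gamma):
    """Shrink S toward a scaled identity with weight gamma in [0, 1]; a generic
    stand-in for the MAP-style weight-matrix regularization discussed above."""
    p = S.shape[0]
    target = np.trace(S) / p * np.eye(p)
    return (1 - gamma) * S + gamma * target

# Ill-conditioned 2x2 covariance (nearly singular).
S = np.array([[1.0, 0.999],
              [0.999, 1.0]])
S_reg = ridge_regularize(S, 0.1)
```

Even a modest shrinkage weight sharply reduces the condition number while leaving the matrix symmetric, which stabilizes the weight-matrix inverse that GLS-type statistics depend on.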


Copyright © Beijing Qinyun Technology Development Co., Ltd. (京ICP备09084417号)