1.

Evaluating Model Fit With Ordered Categorical Data Within a Measurement Invariance Framework: A Comparison of Estimators

Daniel A. Sass Thomas A. Schmitt Herbert W. Marsh 《Structural equation modeling》2013,20(2):167-180

A paucity of research has compared estimation methods within a measurement invariance (MI) framework and determined if research conclusions using normal-theory maximum likelihood (ML) generalize to the robust ML (MLR) and weighted least squares means and variance adjusted (WLSMV) estimators. Using ordered categorical data, this simulation study aimed to address these queries by investigating 342 conditions. When testing for metric and scalar invariance, Δχ² results revealed that Type I error rates varied across estimators (ML, MLR, and WLSMV) with symmetric and asymmetric data. The Δχ² power varied substantially based on the estimator selected, type of noninvariant indicator, number of noninvariant indicators, and sample size. Although some of the changes in approximate fit indexes (ΔAFI) are relatively sample size independent, researchers who use the ΔAFI with WLSMV should use caution, as these statistics do not perform well with misspecified models. As a supplemental analysis, our results evaluate and suggest cutoff values based on previous research.

2.

Examining Chi-Square Test Statistics Under Conditions of Large Model Size and Ordinal Data

Dexin Shi Christine DiStefano Heather L. McDaniel Zhehan Jiang 《Structural equation modeling》2018,25(6):924-945

This study examined the effect of model size on the chi-square test statistics obtained from ordinal factor analysis models. The performance of six robust chi-square test statistics was compared across various conditions, including number of observed variables (p), number of factors, sample size, model (mis)specification, number of categories, and threshold distribution. Results showed that the unweighted least squares (ULS) robust chi-square statistics generally outperform the diagonally weighted least squares (DWLS) robust chi-square statistics. The ULSM estimator performed the best overall. However, when fitting ordinal factor analysis models with a large number of observed variables and small sample size, the ULSM-based chi-square tests may yield empirical variances that are noticeably larger than the theoretical values and inflated Type I error rates. On the other hand, when the number of observed variables is very large, the mean- and variance-corrected chi-square test statistics (e.g., based on ULSMV and WLSMV) could produce empirical variances conspicuously smaller than the theoretical values and Type I error rates lower than the nominal level, and demonstrate lower power rates to reject misspecified models. Recommendations for applied researchers and future empirical studies involving large models are provided.

3.

Is Parceling Really Necessary? A Comparison of Results From Item Parceling and Categorical Variable Methodology

Deborah L. Bandalos 《Structural equation modeling》2013,20(2):211-240

This study examined the efficacy of 4 different parceling methods for modeling categorical data with 2, 3, and 4 categories and with normal, moderately nonnormal, and severely nonnormal distributions. The parceling methods investigated were isolated parceling in which items were parceled with other items sharing the same source of variance, and distributed parceling in which items were parceled with items influenced by different factors. These parceling strategies were crossed with strategies in which items were either parceled with similarly distributed or differently distributed items, to create 4 different parceling methods. Overall, parceling together items influenced by different factors and with different distributions resulted in better model fit, but high levels of parameter estimate bias. Across all parceling methods, parameter estimate bias ranged from 20% to over 130%. Parceling strategies were contrasted with use of the WLSMV estimator for categorical, unparceled data. Results based on this estimator are encouraging, although some bias was found when high levels of nonnormality were present. Values of the chi-square and root mean squared error of approximation based on WLSMV also resulted in Type II error rates for misspecified models when data were severely nonnormally distributed.

4.

A Comparison of Diagonal Weighted Least Squares Robust Estimation Techniques for Ordinal Data

Christine DiStefano Grant B. Morgan 《Structural equation modeling》2013,20(3):425-438

This study compared diagonal weighted least squares robust estimation techniques available in 2 popular statistical programs: diagonal weighted least squares (DWLS; LISREL version 8.80) and weighted least squares–mean (WLSM) and weighted least squares–mean and variance adjusted (WLSMV; Mplus version 6.11). A 20-item confirmatory factor analysis was estimated using item-level ordered categorical data. Three different nonnormality conditions were applied to 2- to 7-category data with sample sizes of 200, 400, and 800. Convergence problems were seen with nonnormal data when DWLS was used with few categories. Both DWLS and WLSMV produced accurate parameter estimates; however, bias in standard errors of parameter estimates was extreme for select conditions when nonnormal data were present. The robust estimators generally reported acceptable model–data fit, unless few categories were used with nonnormal data at smaller sample sizes; WLSMV yielded better fit than WLSM for most indices.

5.

A Regularized GLS for Structural Equation Modeling

Erin H. Arruda Peter M. Bentler 《Structural equation modeling》2017,24(5):657-665

Ill conditioning of covariance and weight matrices used in structural equation modeling (SEM) is a possible source of inadequate performance of SEM statistics in nonasymptotic samples. A maximum a posteriori (MAP) covariance matrix is proposed for weight matrix regularization in normal theory generalized least squares (GLS) estimation. Maximum likelihood (ML), GLS, and regularized GLS test statistics (RGLS and rGLS) are studied by simulation in a 15-variable, 3-factor model with 15 levels of sample size varying from 60 to 100,000. A key result showed that in terms of nominal rejection rates, RGLS outperformed ML at all sample sizes below 500, and GLS at most sample sizes below 500. In larger samples, their performance was equivalent. The second regularization methodology (rGLS) performed well asymptotically, but poorly in small samples. Regularization in SEM deserves further study.

6.

Causal Mediation Analysis With a Binary Outcome and Multiple Continuous or Ordinal Mediators: Simulations and Application to an Alcohol Intervention

Trang Quynh Nguyen Yenny Webb-Vargas Ina M. Koning Elizabeth A. Stuart 《Structural equation modeling》2016,23(3):368-383

We investigate a method to estimate the combined effect of multiple continuous/ordinal mediators on a binary outcome: (a) fit a structural equation model with probit link for the outcome and identity/probit link for continuous/ordinal mediators, (b) predict potential outcome probabilities, and (c) compute natural direct and indirect effects. Step (b) involves rescaling the latent continuous variable underlying the outcome to address residual mediator variance and covariance. We evaluate the estimation of risk-difference- and risk-ratio-based effects (RDs, RRs) using the maximum likelihood (ML), mean-and-variance-adjusted weighted least squares (WLSMV) and Bayes estimators in Mplus. Across most variations in path-coefficient and mediator-residual-correlation signs and strengths, and confounding situations investigated, the method performs well with all estimators, but favors ML/WLSMV for RDs with continuous mediators, and Bayes for RRs with ordinal mediators. Bayes outperforms ML/WLSMV regardless of mediator type when estimating RRs with small potential outcome probabilities and in two other special cases. An adolescent alcohol prevention study is used for illustration.

7.

Effects of Missing Data Methods in Structural Equation Modeling With Nonnormal Longitudinal Data

Tacksoo Shin Mark L. Davison Jeffrey D. Long 《Structural equation modeling》2013,20(1):70-98

The purpose of this study is to investigate the effects of missing data techniques in longitudinal studies under diverse conditions. A Monte Carlo simulation examined the performance of 3 missing data methods in latent growth modeling: listwise deletion (LD), maximum likelihood estimation using the expectation and maximization algorithm with a nonnormality correction (robust ML), and the pairwise asymptotically distribution-free method (pairwise ADF). The effects of 3 independent variables (sample size, missing data mechanism, and distribution shape) were investigated on convergence rate, parameter and standard error estimation, and model fit. The results favored robust ML over LD and pairwise ADF in almost all respects. The exceptions included convergence rates under the most severe nonnormality in the missing not at random (MNAR) condition and recovery of standard error estimates across sample sizes. The results also indicate that nonnormality, small sample size, MNAR, and multicollinearity might adversely affect convergence rate and the validity of statistical inferences concerning parameter estimates and model fit statistics.

8.

A Two-Stage Approach to Synthesizing Covariance Matrices in Meta-Analytic Structural Equation Modeling

Mike W.L. Cheung Wai Chan 《Structural equation modeling》2013,20(1):28-53

A great obstacle for wider use of structural equation modeling (SEM) has been the difficulty in handling categorical variables. Two data sets with known structure between 2 related binary outcomes and 4 independent binary variables were generated. Four SEM strategies and resulting apparent validity were tested: robust maximum likelihood (ML), tetrachoric correlation matrix input followed by SEM ML analysis, SEM ML estimation for the sum of squares and cross-products (SSCP) matrix input obtained by the log-linear model that treated all variables as dependent, and asymptotic distribution-free (ADF) SEM estimation. SEM based on the SSCP matrix obtained by the log-linear model and SEM using robust ML estimation correctly identified the structural relation between the variables. SEM using ADF added an extra parameter. SEM based on tetrachoric correlation input did not specify the data generating process correctly. Apparent validity was similar for all models presented. Data transformation used in log-linear modeling can serve as an input for SEM.

9.

Logistic Regression Procedure Using Penalized Maximum Likelihood Estimation for Differential Item Functioning

Sunbok Lee 《Journal of Educational Measurement》2020,57(3):443-457

In the logistic regression (LR) procedure for differential item functioning (DIF), the parameters of LR have often been estimated using maximum likelihood (ML) estimation. However, ML estimation suffers from finite-sample bias. Furthermore, ML estimation for LR can be substantially biased in the presence of rare event data. The bias of ML estimation due to small samples and rare event data can degrade the performance of the LR procedure, especially when testing the DIF of difficult items in small samples. Penalized ML (PML) estimation was originally developed to reduce the finite-sample bias of conventional ML estimation and is also known to reduce the bias in the estimation of LR for rare event data. The goal of this study is to compare the performance of the LR procedures based on ML and PML estimation in terms of statistical power and Type I error. In a simulation study, Swaminathan and Rogers's Wald test based on PML estimation (PSR) showed the highest statistical power in most of the simulation conditions, and LRT based on conventional PML estimation (PLRT) showed the most robust and stable Type I error. The trade-off between bias and variance is addressed in the discussion section.

10.

Relative Performance of Categorical Diagonally Weighted Least Squares and Robust Maximum Likelihood Estimation

Deborah L. Bandalos 《Structural equation modeling》2013,20(1):102-116

Robust maximum likelihood (ML) and categorical diagonally weighted least squares (cat-DWLS) estimation have both been proposed for use with categorized and nonnormally distributed data. This study compares results from the 2 methods in terms of parameter estimate and standard error bias, power, and Type I error control, with unadjusted ML and WLS estimation methods included for purposes of comparison. Conditions manipulated include model misspecification, level of asymmetry, level of categorization, sample size, and type and size of the model. Results indicate that the cat-DWLS estimation method results in the least parameter estimate and standard error bias under the majority of conditions studied. Cat-DWLS parameter estimates and standard errors were generally the least affected by model misspecification of the estimation methods studied. Robust ML also performed well, yielding relatively unbiased parameter estimates and standard errors. However, both cat-DWLS and robust ML resulted in low power under conditions of high data asymmetry, small sample sizes, and mild model misspecification. For more optimal conditions, power for these estimators was adequate.

11.

Effect of Unequal Variances in Proficiency Distributions on Type-I Error of the Mantel-Haenszel Chi-square Test for Differential Item Functioning

Patrick O. Monahan Robert D. Ankenmann 《Journal of Educational Measurement》2005,42(2):101-131

Empirical studies demonstrated Type-I error (TIE) inflation (especially for highly discriminating easy items) of the Mantel-Haenszel chi-square test for differential item functioning (DIF), when data conformed to item response theory (IRT) models more complex than Rasch, and when IRT proficiency distributions differed only in means. However, no published study manipulated proficiency variance ratio (VR). Data were generated with the three-parameter logistic (3PL) IRT model. Proficiency VRs were 1, 2, 3, and 4. The present study suggests inflation may be greater, and may affect all highly discriminating items (low, moderate, and high difficulty), when IRT proficiency distributions of reference and focal groups differ also in variances. Inflation was greatest on the 21-item test (vs. 41) and 2,000 total sample size (vs. 1,000). Previous studies had not systematically examined sample size ratio. Sample size ratio of 1:1 produced greater TIE inflation than 3:1, but primarily for total sample size of 2,000.

12.

Loglinear analysis of cross-classified ordinal data: applications in developmental research

J A Green 《Child development》1988,59(1):1-25

This article provides an introduction to loglinear analysis of cross-classification tables, including tables with nominal and ordinal variables. Loglinear models offer several advantages over the more commonly used chi-square test of independence, including the ability to analyze 3-, 4-, and higher-way interactions, the ability to determine whether the association between variables is linear or nonlinear, and the ability to interpret scale scores assigned to categories of an ordinal variable. After a review of the advantages of loglinear modeling, the chi-square test of independence is compared with the loglinear model of independence. This comparison serves to introduce the notation and terminology of loglinear modeling. The overall strategy of loglinear modeling is introduced next; then special loglinear models for ordinal data are reviewed. Each model discussed in the article is applied to data from the developmental literature.

13.

Confirmatory Factor Analysis of Ordinal Variables With Misspecified Models

Fan Yang-Wallentin Karl G. Jöreskog Hao Luo 《Structural equation modeling》2013,20(3):392-423

Ordinal variables are common in many empirical investigations in the social and behavioral sciences. Researchers often apply the maximum likelihood method to fit structural equation models to ordinal data. This assumes that the observed measures have normal distributions, which is not the case when the variables are ordinal. A better approach is to use polychoric correlations and fit the models using methods such as unweighted least squares (ULS), maximum likelihood (ML), weighted least squares (WLS), or diagonally weighted least squares (DWLS). In this simulation evaluation we study the behavior of these methods in combination with polychoric correlations when the models are misspecified. We also study the effect of model size and number of categories on the parameter estimates, their standard errors, and the common chi-square measures of fit when the models are both correct and misspecified. When used routinely, these methods give consistent parameter estimates but ULS, ML, and DWLS give incorrect standard errors. Correct standard errors can be obtained for these methods by robustification using an estimate of the asymptotic covariance matrix W of the polychoric correlations. When used in this way the methods are here called RULS, RML, and RDWLS.

14.

Using Instrumental Variables to Estimate the Parameters in Unconditional and Conditional Second-Order Latent Growth Models

Steffen Nestler 《Structural equation modeling》2013,20(3):461-473

This article applies Bollen’s (1996) 2-stage least squares/instrumental variables (2SLS/IV) approach for estimating the parameters in an unconditional and a conditional second-order latent growth model (LGM). First, the 2SLS/IV approach for the estimation of the means and the path coefficients in a second-order LGM is derived. An empirical example is then used to show that 2SLS/IV yields estimates that are similar to maximum likelihood (ML) in the estimation of a conditional second-order LGM. Three subsequent simulation studies are then presented to show that the new approach is as accurate as ML and that it is more robust against misspecifications of the growth trajectory than ML. Together, these results suggest that 2SLS/IV should be considered as an alternative to the commonly applied ML estimator.

15.

Performance of Bootstrapping Approaches to Model Test Statistics and Parameter Standard Error Estimation in Structural Equation Modeling

《Structural equation modeling》2013,20(3):353-377

Though the common default maximum likelihood estimator used in structural equation modeling is predicated on the assumption of multivariate normality, applied researchers often find themselves with data clearly violating this assumption and without sufficient sample size to utilize distribution-free estimation methods. Fortunately, promising alternatives are being integrated into popular software packages. Bootstrap resampling, which is offered in AMOS (Arbuckle, 1997), is one potential solution for estimating model test statistic p values and parameter standard errors under nonnormal data conditions. This study is an evaluation of the bootstrap method under varied conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Accuracy of the test statistic p values is evaluated in terms of model rejection rates, whereas accuracy of bootstrap standard error estimates takes the form of bias and variability of the standard error estimates themselves.

16.

The Performance of ML, GLS, and WLS Estimation in Structural Equation Modeling Under Conditions of Misspecification and Nonnormality

《Structural equation modeling》2013,20(4):557-595

This simulation study demonstrates how the choice of estimation method affects indexes of fit and parameter bias for different sample sizes when nested models vary in terms of specification error and the data demonstrate different levels of kurtosis. Using a fully crossed design, data were generated for 11 conditions of peakedness, 3 conditions of misspecification, and 5 different sample sizes. Three estimation methods (maximum likelihood [ML], generalized least squares [GLS], and weighted least squares [WLS]) were compared in terms of overall fit and the discrepancy between estimated parameter values and the true parameter values used to generate the data. Consistent with earlier findings, the results show that ML compared to GLS under conditions of misspecification provides more realistic indexes of overall fit and less biased parameter values for paths that overlap with the true model. However, despite recommendations found in the literature that WLS should be used when data are not normally distributed, we find that WLS under no conditions was preferable to the 2 other estimation procedures in terms of parameter bias and fit. In fact, only for large sample sizes (N = 1,000 and 2,000) and mildly misspecified models did WLS provide estimates and fit indexes close to the ones obtained for ML and GLS. For wrongly specified models WLS tended to give unreliable estimates and over-optimistic values of fit.

17.

A Comparison of Estimation Techniques for IRT Models With Small Samples

Holmes Finch Brian F. French 《Applied Measurement in Education》2019,32(2):77-96

The usefulness of item response theory (IRT) models depends, in large part, on the accuracy of item and person parameter estimates. For the standard 3 parameter logistic model, for example, these parameters include the item parameters of difficulty, discrimination, and pseudo-chance, as well as the person ability parameter. Several factors impact traditional marginal maximum likelihood (ML) estimation of IRT model parameters, including sample size, with smaller samples generally being associated with lower parameter estimation accuracy and inflated standard errors for the estimates. Given this deleterious impact of small samples on IRT model performance, estimation becomes difficult when these techniques are used with low-incidence populations, where they might prove particularly useful, especially with more complex models. Recently, a Pairwise estimation method for Rasch model parameters has been suggested for use with missing data, and may also hold promise for parameter estimation with small samples. This simulation study compared item difficulty parameter estimation accuracy of ML with the Pairwise approach to ascertain the benefits of this latter method. The results support the use of the Pairwise method with small samples, particularly for obtaining item location estimates.

18.

EMPIRICAL COMPARISON OF SELECTED ITEM BIAS DETECTION PROCEDURES WITH BIAS MANIPULATION

MICHAEL J. SUBKOVIAK JOANNE S. MACK GAIL H. IRONSON ROBERT D. CRAIG 《Journal of Educational Measurement》1984,21(1):49-58

Biased test items were intentionally imbedded within a set of test items, and the resulting instrument was administered to large samples of blacks and whites. Three popular item bias detection procedures were then applied to the data: (1) the three-parameter item characteristic curve procedure, (2) the chi-square method, and (3) the transformed item difficulty approach. The three-parameter item characteristic curve procedure proved most effective at detecting the intentionally biased test items; and the chi-square method was viewed as the best alternative. The transformed item difficulty approach has certain limitations yet represents a practical alternative if sample size, lack of computer facilities, or the like preclude the use of the other two procedures.

19.

Evaluation of a New Mean Scaled and Moment Adjusted Test Statistic for SEM

Xiaoxiao Tong Peter M. Bentler 《Structural equation modeling》2013,20(1):148-156

Recently a new mean scaled and skewness adjusted test statistic was developed for evaluating structural equation models in small samples and with potentially nonnormal data, but this statistic has received only limited evaluation. The performance of this statistic is compared to normal theory maximum likelihood and 2 well-known robust test statistics. A modification to the Satorra–Bentler scaled statistic is developed for the condition that sample size is smaller than degrees of freedom. The behavior of the 4 test statistics is evaluated with a Monte Carlo confirmatory factor analysis study that varies 7 sample sizes and 3 distributional conditions obtained using Headrick's fifth-order transformation to nonnormality. The new statistic performs badly in most conditions except under the normal distribution. The goodness-of-fit χ² test based on maximum-likelihood estimation performed well under normal distributions as well as under a condition of asymptotic robustness. The Satorra–Bentler scaled test statistic performed best overall, whereas the mean scaled and variance adjusted test statistic outperformed the others at small and moderate sample sizes under certain distributional conditions.

20.

Effect of Sample Size Ratio and Model Misfit When Using the Difficulty Parameter Differences Procedure to Detect DIF

Ángela I. Berrío Juana Gómez-Benito 《Journal of Experimental Education》2019,87(3):367-383

This study examined the effect of sample size ratio and model misfit on the Type I error rates and power of the Difficulty Parameter Differences procedure using Winsteps. A unidimensional 30-item test with responses from 130,000 examinees was simulated and four independent variables were manipulated: sample size ratio (20/100/250/500/1000); model fit/misfit (1PL and 3PL with c = .15 models); impact (no difference/mean differences/variance differences/mean and variance differences); and percentage of items with uniform and nonuniform DIF (0%/10%/20%). In general, the results indicate the importance of ensuring model fit to achieve greater control of Type I error and adequate statistical power. The manipulated variables produced inflated Type I error rates, which were well controlled when a measure of DIF magnitude was applied. Sample size ratio also had an effect on the power of the procedure. The paper discusses the practical implications of these results.