Similar Documents
 20 similar documents found.
1.
Nonparametric and robust statistics (those using trimmed means and Winsorized variances) were compared for their ability to detect treatment effects in the 2-sample case. In particular, 2 specialized tests, designed to be sensitive to treatment effects when the distributions of the data are skewed to the right, were compared with 2 nonspecialized nonparametric (Wilcoxon-Mann-Whitney; Mann & Whitney, 1947; Wilcoxon, 1949) and trimmed (Yuen, 1974) tests for 6 nonnormal distributions that varied according to their measures of skewness and kurtosis. As expected, the specialized tests provided more power to detect treatment effects, particularly for the nonparametric comparison. However, when distributions were symmetric, the nonspecialized tests were more powerful; therefore, across all the distributions investigated, power differences did not favor the specialized tests. Consequently, the specialized tests are not recommended; researchers would have to know the shapes of the distributions that they work with in order to benefit from specialized tests. In addition, the nonparametric approach resulted in more power than the trimmed-means approach did.
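The two nonspecialized procedures compared above can be sketched briefly. SciPy's `ttest_ind` with a nonzero `trim` computes the trimmed (Yuen, 1974) test, and `mannwhitneyu` gives the Wilcoxon-Mann-Whitney test. The lognormal distributions, shift, trimming proportion, and sample sizes below are illustrative choices, not the study's design.

```python
# Minimal sketch: Yuen's trimmed-means test vs. the Wilcoxon-Mann-Whitney
# test on right-skewed data. All distributional choices are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Right-skewed samples: lognormal control vs. a shifted treatment group.
control = rng.lognormal(mean=0.0, sigma=1.0, size=40)
treatment = rng.lognormal(mean=0.5, sigma=1.0, size=40)

# Yuen's (1974) test: Welch-type t test on 20% trimmed means.
t_yuen, p_yuen = stats.ttest_ind(control, treatment, trim=0.2, equal_var=False)

# Wilcoxon-Mann-Whitney rank-sum test.
u_stat, p_u = stats.mannwhitneyu(control, treatment, alternative="two-sided")

print(f"Yuen trimmed t: p = {p_yuen:.4f}")
print(f"Mann-Whitney U: p = {p_u:.4f}")
```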

2.
This paper presents the results of a simulation study to compare the performance of the Mann-Whitney U test, Student's t test, and the alternate (separate variance) t test for two mutually independent random samples from normal distributions, with both one-tailed and two-tailed alternatives. The estimated probability of a Type I error was controlled (in the sense of being reasonably close to the attainable level) by all three tests when the variances were equal, regardless of the sample sizes. However, it was controlled only by the alternate t test for unequal variances with unequal sample sizes. With equal sample sizes, the probability was controlled by all three tests regardless of the variances. When it was controlled, we also compared the power of these tests and found very little difference. This means that very little power will be lost if the Mann-Whitney U test is used instead of tests that require the assumption of normal distributions.
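The key comparison above, which test holds the Type I error rate when variances and sample sizes are both unequal, can be reproduced with a small simulation. The sample sizes, variance ratio, and replication count here are illustrative choices:

```python
# Estimated Type I error of Student's t, the separate-variance (Welch) t,
# and the Mann-Whitney U test for normal data with unequal variances and
# unequal sample sizes. Design values are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n1, n2, sd1, sd2 = 10, 40, 3.0, 1.0   # smaller sample paired with larger variance
reps, alpha = 2000, 0.05
rejections = {"student_t": 0, "welch_t": 0, "mann_whitney": 0}

for _ in range(reps):
    x = rng.normal(0.0, sd1, n1)      # equal means: any rejection is a Type I error
    y = rng.normal(0.0, sd2, n2)
    if stats.ttest_ind(x, y, equal_var=True).pvalue < alpha:
        rejections["student_t"] += 1
    if stats.ttest_ind(x, y, equal_var=False).pvalue < alpha:
        rejections["welch_t"] += 1
    if stats.mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha:
        rejections["mann_whitney"] += 1

rates = {k: v / reps for k, v in rejections.items()}
print(rates)  # only the Welch t stays near the nominal .05 in this setting
```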

3.
This Monte Carlo simulation study investigated the impact of nonnormality on estimating and testing mediated effects with the parallel process latent growth model and 3 popular methods for testing the mediated effect (i.e., Sobel’s test, the asymmetric confidence limits, and the bias-corrected bootstrap). It was found that nonnormality had little effect on the estimates of the mediated effect, standard errors, empirical Type I error, and power rates in most conditions. In terms of empirical Type I error and power rates, the bias-corrected bootstrap performed best. Sobel’s test produced very conservative Type I error rates when the estimated mediated effect and its standard error were related; when that relationship was weak or absent, the Type I error rate was closer to the nominal .05 value.
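Sobel's test, the first of the three methods named above, divides the product of the two path estimates by a first-order standard error. A minimal sketch for a simple X to M to Y mediation model (the generating coefficients and sample size are illustrative):

```python
# Sobel's test for a simple mediation model X -> M -> Y.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)             # a path
y = 0.5 * m + 0.2 * x + rng.normal(size=n)   # b path plus a direct effect

def ols(design_cols, response):
    """OLS coefficients and standard errors via the normal equations."""
    X = np.column_stack([np.ones(len(response))] + design_cols)
    beta = np.linalg.solve(X.T @ X, X.T @ response)
    resid = response - X @ beta
    cov = np.linalg.inv(X.T @ X) * (resid @ resid) / (len(response) - X.shape[1])
    return beta, np.sqrt(np.diag(cov))

beta_m, se_m = ols([x], m)
a, se_a = beta_m[1], se_m[1]           # slope of M on X
beta_y, se_y = ols([m, x], y)
b, se_b = beta_y[1], se_y[1]           # slope of Y on M, controlling X

se_ab = np.sqrt(a**2 * se_b**2 + b**2 * se_a**2)  # Sobel's first-order SE
z = a * b / se_ab
p = 2 * stats.norm.sf(abs(z))
print(f"ab = {a * b:.3f}, z = {z:.2f}, p = {p:.4f}")
```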

4.
The authors used Johnson's transformation with approximate test statistics to test the homogeneity of simple linear regression slopes when both xij and the error terms may have nonnormal distributions and there is Type I heteroscedasticity, Type II heteroscedasticity, or complete heteroscedasticity. The test statistic was first transformed by Johnson's method within each group to correct for nonnormality; an approximate test, such as the Welch test or the DeShon-Alexander test, was then applied to accommodate the heteroscedasticity and to test the homogeneity of the regression slopes. Computer simulations showed that the proposed technique can control the Type I error rate under various circumstances. Finally, the authors provide an example to demonstrate the calculation.

5.
Type I error rate and power for the t test, Wilcoxon-Mann-Whitney (U) test, van der Waerden Normal Scores (NS) test, and Welch-Aspin-Satterthwaite (W) test were compared for two independent random samples drawn from nonnormal distributions. Data with varying degrees of skewness (S) and kurtosis (K) were generated using Fleishman's (1978) power function. Five sample size combinations were used with both equal and unequal variances. For nonnormal data with equal variances, the power of the U test exceeded the power of the t test regardless of sample size. When the sample sizes were equal but the variances were unequal, the t test proved to be the most powerful test. When variances and sample sizes were unequal, the W test became the test of choice because it was the only test that maintained its nominal Type I error rate.
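Fleishman's (1978) power function, the data-generation device used above, transforms a standard normal Z into Y = a + bZ + cZ^2 + dZ^3 with coefficients solved to hit target skewness and excess kurtosis (and a = -c to keep the mean at zero). A sketch, solving the coefficient system numerically; the target moments are illustrative:

```python
# Fleishman (1978) power-function method for generating nonnormal data.
import numpy as np
from scipy import stats
from scipy.optimize import fsolve

def fleishman_system(coef, skew, exkurt):
    b, c, d = coef
    eq1 = b**2 + 6*b*d + 2*c**2 + 15*d**2 - 1                       # unit variance
    eq2 = 2*c*(b**2 + 24*b*d + 105*d**2 + 2) - skew                 # target skewness
    eq3 = 24*(b*d + c**2*(1 + b**2 + 28*b*d)
              + d**2*(12 + 48*b*d + 141*c**2 + 225*d**2)) - exkurt  # target excess kurtosis
    return [eq1, eq2, eq3]

skew, exkurt = 1.0, 2.0                     # illustrative targets
b, c, d = fsolve(fleishman_system, x0=[1.0, 0.1, 0.0], args=(skew, exkurt))
a = -c                                      # forces the mean to zero

rng = np.random.default_rng(3)
z = rng.standard_normal(100_000)
y = a + b*z + c*z**2 + d*z**3
print(f"skew = {stats.skew(y):.3f}, excess kurtosis = {stats.kurtosis(y):.3f}")
```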

6.
A computer program generated power functions of the Student t test and Mann-Whitney U test under violation of the parametric assumption of homogeneity of variance for equal and unequal sample sizes. In addition to depression and elevation of nominal significance levels of the t test observed by Hsu and by Scheffé, the entire power functions of both the t test and the U test were depressed or elevated. When the smaller sample was associated with a smaller variance, the U test was more powerful in detecting differences over the entire range of possible differences between population means. When sample sizes were equal, or when the smaller sample had the larger variance, the t test was more powerful over this entire range. These results show that replacement of the t test by a nonparametric alternative under violation of homogeneity of variance does not necessarily maximize correct decisions.

7.
Increasing the correlation between the independent variable and the mediator (the a coefficient) increases the effect size (ab) for mediation analysis; however, increasing a by definition increases collinearity in mediation models. As a result, the standard errors of product tests increase. The variance inflation caused by increases in a at some point outweighs the increase in the effect size (ab) and results in a loss of statistical power. This phenomenon also occurs with nonparametric bootstrapping approaches because the variance of the bootstrap distribution of ab approximates the variance expected from normal theory. Both variances increase dramatically when a exceeds the b coefficient, thus explaining the power decline with increases in a. Implications for statistical analysis and applied researchers are discussed.
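The tradeoff described above can be made concrete with first-order (Sobel-type) formulas. With standardized X and M, corr(X, M) = a, so the sampling variance of the b estimate is inflated by 1/(1 - a^2), while ab grows only linearly in a. A numeric sketch, assuming unit residual variance for Y and illustrative values of n and b:

```python
# Numeric sketch of the power tradeoff: the z statistic for ab is not
# monotone in a, because collinearity inflates var(b-hat) by 1/(1 - a^2).
# Sample size, b, and the unit-residual-variance assumption are illustrative.
import numpy as np

n, b = 100, 0.3
zs = {}
for a in [0.1, 0.3, 0.5, 0.7, 0.9]:
    var_a = (1 - a**2) / n            # var of a-hat with standardized variables
    var_b = 1.0 / (n * (1 - a**2))    # collinearity inflates var of b-hat
    se_ab = np.sqrt(a**2 * var_b + b**2 * var_a)  # first-order SE of the product
    zs[a] = a * b / se_ab
    print(f"a={a:.1f}  ab={a * b:.2f}  se={se_ab:.3f}  z={zs[a]:.2f}")
```

The printed z values rise and then fall as a grows, matching the power decline the abstract describes for large a.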

8.
One challenge in mediation analysis is to generate a confidence interval (CI) with high coverage and power that maintains a nominal significance level for any well-defined function of indirect and direct effects in the general context of structural equation modeling (SEM). This study proposes a Monte Carlo extension that finds the CIs for any well-defined function of the coefficients of SEM, such as the product of k coefficients or the ratio of contrasts of indirect effects. Finally, we conduct a small-scale simulation study to compare CIs produced by the Monte Carlo, nonparametric bootstrap, and asymptotic-delta methods. Based on our simulation study, we recommend researchers use the Monte Carlo method to test a complex function of indirect effects.
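The Monte Carlo method referred to above draws coefficients from normal distributions centered at their estimates with the estimated standard errors, evaluates the function of interest on each draw, and takes percentile limits. A sketch for a product of k = 3 coefficients; the estimates and standard errors are illustrative:

```python
# Monte Carlo confidence interval for a function of model coefficients.
import numpy as np

rng = np.random.default_rng(4)
estimates = np.array([0.4, 0.5, 0.3])   # e.g., three path coefficients (illustrative)
ses = np.array([0.10, 0.10, 0.08])

draws = rng.normal(estimates, ses, size=(20_000, 3))
product = draws.prod(axis=1)            # the well-defined function of interest
lo, hi = np.percentile(product, [2.5, 97.5])
print(f"95% Monte Carlo CI for the product: [{lo:.3f}, {hi:.3f}]")
```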

9.
When both model misspecifications and nonnormal data are present, it is unknown how trustworthy various point estimates, standard errors (SEs), and confidence intervals (CIs) are for standardized structural equation modeling parameters. We conducted simulations to evaluate maximum likelihood (ML), conventional robust SE estimator (MLM), Huber–White robust SE estimator (MLR), and the bootstrap (BS). We found (a) ML point estimates can sometimes be quite biased at finite sample sizes if misfit and nonnormality are serious; (b) ML and MLM generally give egregiously biased SEs and CIs regardless of the degree of misfit and nonnormality; (c) MLR and BS provide trustworthy SEs and CIs given medium misfit and nonnormality, but BS is better; and (d) given severe misfit and nonnormality, MLR tends to break down and BS begins to struggle.
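The Huber-White ("sandwich") idea behind MLR-type robust standard errors is easiest to see in the regression special case: compare model-based SEs, which assume a correctly specified (homoscedastic) error variance, with sandwich SEs that use the observed squared residuals. This is an illustration of the estimator's logic, not the study's SEM setup; the data-generating values are illustrative:

```python
# Model-based vs. Huber-White (HC0 sandwich) standard errors for OLS.
import numpy as np

rng = np.random.default_rng(5)
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n) * (1.0 + np.abs(x))  # heteroscedastic noise

X = np.column_stack([np.ones(n), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

sigma2 = resid @ resid / (n - 2)
se_model = np.sqrt(np.diag(XtX_inv) * sigma2)      # assumes constant error variance

meat = X.T @ (X * (resid**2)[:, None])             # HC0 "meat" of the sandwich
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
print("model-based SEs:", np.round(se_model, 3))
print("robust (HC0) SEs:", np.round(se_robust, 3))
```

Under misspecified error variance the two disagree, and the robust SEs are the trustworthy ones, mirroring result (b) vs. (c) above.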

10.
When cut scores for classifications occur on the total score scale, popular methods for estimating classification accuracy (CA) and classification consistency (CC) require assumptions about a parametric form of the test scores or about a parametric response model, such as item response theory (IRT). This article develops an approach to estimate CA and CC nonparametrically by replacing the role of the parametric IRT model in Lee's classification indices with a modified version of Ramsay's kernel-smoothed item response functions. The performance of the nonparametric CA and CC indices is tested in simulation studies in various conditions with different generating IRT models, test lengths, and ability distributions. The nonparametric approach to CA often outperforms Lee's method and Livingston and Lewis's method, showing robustness to nonnormality in the simulated ability. The nonparametric CC index performs similarly to Lee's method and outperforms Livingston and Lewis's method when the ability distributions are nonnormal.
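A kernel-smoothed item response function of the Ramsay type is essentially a Nadaraya-Watson regression of 0/1 item scores on an ability proxy. A sketch; the 2PL generating item, bandwidth, and grid are illustrative:

```python
# Gaussian-kernel (Nadaraya-Watson) estimate of an item response function.
import numpy as np

rng = np.random.default_rng(6)
n = 2000
theta = rng.normal(size=n)                        # ability proxy (illustrative)
p_true = 1 / (1 + np.exp(-1.2 * (theta - 0.3)))   # 2PL-style generating item
u = rng.binomial(1, p_true)                       # observed 0/1 item scores

def kernel_irf(grid, theta, u, h=0.3):
    """Kernel-smoothed estimate of P(U = 1 | theta) at each grid point."""
    w = np.exp(-0.5 * ((grid[:, None] - theta[None, :]) / h) ** 2)
    return (w * u).sum(axis=1) / w.sum(axis=1)

grid = np.linspace(-2, 2, 9)
est = kernel_irf(grid, theta, u)
print(np.round(est, 2))  # rises with ability, no parametric form assumed
```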

11.
Gibbons and Chakraborti's (1991) interpretation of recent simulation results and their recommendations to researchers are misleading in some respects. The present note emphasizes that the Mann-Whitney test is not a suitable replacement of the Student t test when variances and sample sizes are unequal, irrespective of whether the assumption of normality is satisfied or violated. When both normality and homogeneity of variance are violated together, an effective procedure, not widely known to researchers in education and psychology, is the Fligner-Policello test or, alternatively, the Welch t' test in conjunction with transformation of the original scores to ranks.
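The second alternative named above, the Welch t' test applied to rank-transformed scores, takes one line once the pooled ranks are computed. A sketch with illustrative skewed, heteroscedastic samples:

```python
# Welch t' test on the pooled ranks (the rank-transform alternative above).
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
x = rng.lognormal(0.0, 1.0, size=15)        # skewed, smaller sample
y = rng.lognormal(0.3, 1.5, size=40)        # skewed, larger variance

ranks = stats.rankdata(np.concatenate([x, y]))   # pool, then rank
rx, ry = ranks[:len(x)], ranks[len(x):]
res = stats.ttest_ind(rx, ry, equal_var=False)   # Welch t' on the ranks
print(f"Welch-on-ranks: t = {res.statistic:.2f}, p = {res.pvalue:.4f}")
```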

12.
This article investigates likelihood-based difference statistics for testing nonlinear effects in structural equation modeling using the latent moderated structural equations (LMS) approach. In addition to the standard difference statistic TD, 2 robust statistics have been developed in the literature to ensure valid results under the conditions of nonnormality or small sample sizes: the robust TDR and the “strictly positive” TDRP. These robust statistics had not yet been examined in combination with LMS. In 2 Monte Carlo studies we investigate the performance of these methods for testing quadratic or interaction effects subject to different sources of nonnormality: nonnormality due to the nonlinear terms and nonnormality due to the distribution of the predictor variables. The results indicate that TD is preferable to both TDR and TDRP. Under the condition of strong nonlinear effects and nonnormal predictors, TDR often produced negative differences and TDRP showed no desirable power.

13.
The authors investigated 2 issues concerning the power of latent growth modeling (LGM) in detecting linear growth: the effect of the number of repeated measurements on LGM's power in detecting linear growth and the comparison between LGM and some other approaches in terms of power for detecting linear growth. A Monte Carlo simulation design was used, with 3 crossed factors (growth magnitude, number of repeated measurements, and sample size) and 1,000 replications within each cell condition. The major findings were as follows: For 3 repeated measurements, a substantial proportion of samples failed to converge in structural equation modeling; the number of repeated measurements did not show any effect on the statistical power of LGM in detecting linear growth; and the LGM approach outperformed both the dependent t test and repeated-measures analysis of variance (ANOVA) in terms of statistical power for detecting growth under conditions of small growth magnitude and small to moderate sample sizes. The multivariate repeated-measures ANOVA approach consistently underperformed the other tests.

14.
Conventionally, moderated mediation analysis is conducted through adding relevant interaction terms into a mediation model of interest. In this study, we illustrate how to conduct moderated mediation analysis by directly modeling the relation between the indirect effect components, including a and b, and the moderators, to permit easier specification and interpretation of moderated mediation. With this idea, we introduce a general moderated mediation model that can be used to model many different moderated mediation scenarios, including the scenarios described in Preacher, Rucker, and Hayes (2007). Then we discuss how to estimate and test the conditional indirect effects and to test whether a mediation effect is moderated using Bayesian approaches. How to implement the estimation in both BUGS and Mplus is also discussed. Performance of Bayesian methods is evaluated and compared to that of frequentist methods, including maximum likelihood (ML) with 1st-order and 2nd-order delta method standard errors and ML with bootstrap (percentile or bias-corrected confidence intervals), via a simulation study. The results show that Bayesian methods with diffuse (vague) priors implemented in both BUGS and Mplus yielded unbiased estimates, higher power than the ML methods with delta-method standard errors and the ML method with bootstrap percentile confidence intervals, and power comparable to the ML method with bootstrap bias-corrected confidence intervals. We also illustrate the application of these methods with the real data example used in Preacher et al. (2007). Advantages and limitations of applying Bayesian methods to moderated mediation analysis are also discussed.

15.
The authors compared the Type I error rate and the power to detect differences in slopes and additive treatment effects of analysis of covariance (ANCOVA) and randomized block (RB) designs with a Monte Carlo simulation. For testing differences in slopes, 3 methods were compared: the test of slopes from ANCOVA, the omnibus Block × Treatment interaction, and the linear component of the Block × Treatment interaction of RB. In the test for adjusted means, 2 variations of both ANCOVA and RB were used. The power of the omnibus test of the interaction decreased dramatically as the number of blocks used increased and was always considerably smaller than the specific test of differences in slopes found in ANCOVA. Tests for means when there were concomitant differences in slopes showed that only ANCOVA uniformly controlled Type I error under all configurations of design variables. The most powerful option in almost all simulations for tests of both slopes and means was ANCOVA.

16.
Though the common default maximum likelihood estimator used in structural equation modeling is predicated on the assumption of multivariate normality, applied researchers often find themselves with data clearly violating this assumption and without sufficient sample size to utilize distribution-free estimation methods. Fortunately, promising alternatives are being integrated into popular software packages. Bootstrap resampling, which is offered in AMOS (Arbuckle, 1997), is one potential solution for estimating model test statistic p values and parameter standard errors under nonnormal data conditions. This study is an evaluation of the bootstrap method under varied conditions of nonnormality, sample size, model specification, and number of bootstrap samples drawn from the resampling space. Accuracy of the test statistic p values is evaluated in terms of model rejection rates, whereas accuracy of bootstrap standard error estimates takes the form of bias and variability of the standard error estimates themselves.
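The bootstrap idea evaluated above is simple to sketch: resample cases with replacement, re-estimate the parameter, and use the standard deviation of the bootstrap estimates as its standard error. Here a correlation stands in for an SEM parameter; the nonnormal (chi-square) data, sample size, and replication count are illustrative:

```python
# Nonparametric (case-resampling) bootstrap standard error of a parameter.
import numpy as np

rng = np.random.default_rng(8)
n, reps = 150, 1000
x = rng.chisquare(df=3, size=n)            # nonnormal data
y = 0.5 * x + rng.chisquare(df=3, size=n)

boot = np.empty(reps)
for i in range(reps):
    idx = rng.integers(0, n, size=n)       # resample cases with replacement
    boot[i] = np.corrcoef(x[idx], y[idx])[0, 1]

se_boot = boot.std(ddof=1)                 # bootstrap SE of the correlation
print(f"r = {np.corrcoef(x, y)[0, 1]:.3f}, bootstrap SE = {se_boot:.3f}")
```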

17.
Recent advances in testing mediation have found that certain resampling methods and tests based on the mathematical distribution of 2 normal random variables substantially outperform the traditional z test. However, these studies have primarily focused only on models with a single mediator and 2 component paths. To address this limitation, a simulation was conducted to evaluate these alternative methods in a more complex path model with multiple mediators and indirect paths with 2 and 3 paths. Methods for testing contrasts of 2 effects were evaluated also. The simulation included 1 exogenous independent variable, 3 mediators, and 2 outcomes, and varied sample size, number of paths in the mediated effects, test used to evaluate effects, effect sizes for each path, and the value of the contrast. Confidence intervals were used to evaluate the power and Type I error rate of each method, and were examined for coverage and bias. The bias-corrected bootstrap had the least biased confidence intervals, greatest power to detect nonzero effects and contrasts, and the most accurate overall Type I error. All tests had less power to detect 3-path effects and more inaccurate Type I error compared to 2-path effects. Confidence intervals were biased for mediated effects, as found in previous studies. Results for contrasts did not vary greatly by test, although resampling approaches had somewhat greater power and might be preferable because of ease of use and flexibility.
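The bias-corrected (BC) bootstrap that performs best above shifts the percentile endpoints by z0, the normal quantile of the proportion of bootstrap estimates falling below the sample estimate. A sketch for a 2-path mediated effect ab; the generating model and replication count are illustrative:

```python
# Bias-corrected bootstrap confidence interval for a mediated effect ab.
import numpy as np
from scipy import stats

rng = np.random.default_rng(9)
n, reps = 200, 2000
x = rng.normal(size=n)
m = 0.4 * x + rng.normal(size=n)
y = 0.4 * m + rng.normal(size=n)

def ab(x, m, y):
    """Product of the a path (M on X) and b path (Y on M, controlling X)."""
    a = np.polyfit(x, m, 1)[0]
    X = np.column_stack([np.ones(len(y)), m, x])
    b = np.linalg.lstsq(X, y, rcond=None)[0][1]
    return a * b

est = ab(x, m, y)
boot = np.empty(reps)
for r in range(reps):
    idx = rng.integers(0, n, size=n)
    boot[r] = ab(x[idx], m[idx], y[idx])

z0 = stats.norm.ppf((boot < est).mean())            # bias-correction constant
lo_p = stats.norm.cdf(2 * z0 + stats.norm.ppf(0.025))
hi_p = stats.norm.cdf(2 * z0 + stats.norm.ppf(0.975))
lo, hi = np.quantile(boot, [lo_p, hi_p])            # shifted percentile endpoints
print(f"ab = {est:.3f}, 95% BC bootstrap CI: [{lo:.3f}, {hi:.3f}]")
```

With z0 = 0 (no median bias) the interval reduces to the ordinary percentile bootstrap.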

18.
The relation between test reliability and statistical power has been a controversial issue, perhaps due in part to a 1975 publication in the Psychological Bulletin by Overall and Woodward, “Unreliability of Difference Scores: A Paradox for the Measurement of Change”, in which they demonstrated that a Student t test based on pretest-posttest differences can attain its greatest power when the difference score reliability is zero. In the present article, the authors attempt to explain this paradox by demonstrating in several ways that power is not a mathematical function of reliability unless either true score variance or error score variance is constant.
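The authors' point can be illustrated numerically: two settings with identical difference-score reliability can have different power, because power depends on the total (true plus error) variance rather than on reliability itself. A sketch using a normal approximation to the paired t test; all values are illustrative:

```python
# Same reliability, different power: a numeric sketch of the paradox above.
import numpy as np
from scipy import stats

def power(mu_d, var_true, var_error, n, alpha=0.05):
    """Approximate power of a paired test via the normal distribution."""
    sd_d = np.sqrt(var_true + var_error)
    ncp = mu_d / (sd_d / np.sqrt(n))          # standardized noncentrality
    crit = stats.norm.ppf(1 - alpha / 2)
    return stats.norm.sf(crit - ncp)

def reliability(var_true, var_error):
    return var_true / (var_true + var_error)  # difference-score reliability

p_a = power(0.5, 1.0, 1.0, 30)   # reliability .5, total variance 2
p_b = power(0.5, 2.0, 2.0, 30)   # reliability .5, total variance 4: less power
print(reliability(1, 1), round(p_a, 3))
print(reliability(2, 2), round(p_b, 3))
```

Holding error variance constant while varying true-score variance (or vice versa) restores a monotone reliability-power relation, which is exactly the qualification the abstract states.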

19.
Little research has examined factors influencing statistical power to detect the correct number of latent classes using latent profile analysis (LPA). This simulation study examined power related to interclass distance between latent classes given true number of classes, sample size, and number of indicators. Seven model selection methods were evaluated. None had adequate power to select the correct number of classes with a small (Cohen's d = .2) or medium (d = .5) degree of separation. With a very large degree of separation (d = 1.5), the Lo–Mendell–Rubin test (LMR), adjusted LMR, bootstrap likelihood ratio test, Bayesian Information Criterion (BIC), and sample-size-adjusted BIC were good at selecting the correct number of classes. However, with a large degree of separation (d = .8), power depended on number of indicators and sample size. Akaike's Information Criterion and entropy poorly selected the correct number of classes, regardless of degree of separation, number of indicators, or sample size.
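BIC-based class enumeration, one of the methods that performs well above under strong separation, can be sketched on a univariate two-class mixture: fit k = 1 and k = 2 normal models (k = 2 via a small EM algorithm) and keep the model with the lower BIC. The separation here is deliberately large for this single-indicator sketch (the study used multiple indicators); all values are illustrative:

```python
# BIC class enumeration for a univariate two-class normal mixture.
import numpy as np

rng = np.random.default_rng(10)
data = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])
n = len(data)

# k = 1: single normal. BIC = -2 logL + p log(n), with p = 2 parameters.
mu, sd = data.mean(), data.std()
ll1 = np.sum(-0.5 * np.log(2 * np.pi * sd**2) - (data - mu)**2 / (2 * sd**2))
bic1 = -2 * ll1 + 2 * np.log(n)

# k = 2: two-component mixture fit by EM; p = 5 parameters.
w, mu1, mu2, s1, s2 = 0.5, data.min(), data.max(), sd, sd
for _ in range(200):
    d1 = w * np.exp(-(data - mu1)**2 / (2 * s1**2)) / s1
    d2 = (1 - w) * np.exp(-(data - mu2)**2 / (2 * s2**2)) / s2
    r = d1 / (d1 + d2)                       # E step: class-1 responsibilities
    w = r.mean()                             # M step: update all parameters
    mu1, mu2 = np.average(data, weights=r), np.average(data, weights=1 - r)
    s1 = np.sqrt(np.average((data - mu1)**2, weights=r))
    s2 = np.sqrt(np.average((data - mu2)**2, weights=1 - r))
ll2 = np.sum(np.log((d1 + d2) / np.sqrt(2 * np.pi)))
bic2 = -2 * ll2 + 5 * np.log(n)
print(f"BIC k=1: {bic1:.1f}  BIC k=2: {bic2:.1f}")  # lower BIC wins
```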

20.
This article introduces a bootstrap generalization to the Modified Parallel Analysis (MPA) method of test dimensionality assessment using factor analysis. This methodology, based on the use of Marginal Maximum Likelihood nonlinear factor analysis, provides for the calculation of a test statistic based on a parametric bootstrap using the MPA methodology for generation of synthetic datasets. Performance of the bootstrap test was compared with the likelihood ratio difference test and the DIMTEST procedure using a Monte Carlo simulation. The bootstrap test was found to exhibit much better control of the Type I error rate than the likelihood ratio difference test, and comparable power to DIMTEST under most conditions. A major conclusion to be taken from this research is that under many real-world conditions, the bootstrap MPA test presents a useful alternative for practitioners using Marginal Maximum Likelihood factor analysis to test for multidimensional testing data.
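The parallel-analysis idea underlying MPA can be sketched in its classical (Horn) form: retain factors whose observed correlation-matrix eigenvalues exceed the corresponding eigenvalues from synthetic datasets of the same size. MPA itself generates the synthetic data from a fitted item factor model; this sketch uses independent normal data and a one-factor generating model, both illustrative:

```python
# Parallel analysis: observed eigenvalues vs. synthetic-data eigenvalues.
import numpy as np

rng = np.random.default_rng(11)
n, p = 400, 8
f = rng.normal(size=(n, 1))
data = 0.7 * f + rng.normal(size=(n, p)) * np.sqrt(1 - 0.49)  # one common factor

obs_eigs = np.sort(np.linalg.eigvalsh(np.corrcoef(data.T)))[::-1]

sim_eigs = np.array([
    np.sort(np.linalg.eigvalsh(np.corrcoef(rng.normal(size=(n, p)).T)))[::-1]
    for _ in range(200)
])
threshold = np.percentile(sim_eigs, 95, axis=0)      # 95th percentile per position
n_factors = int(np.argmin(obs_eigs > threshold))     # first eigenvalue not exceeding
print(f"retained factors: {n_factors}")
```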


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号