1.

Maximal Criterion Validity and Scale Criterion Validity: A Latent Variable Modeling Approach for Examining Their Difference

Tenko Raykov, Siegfried Gabler, Dimiter M. Dimitrov 《Structural equation modeling》2016,23(4):544-554

This article studies the difference between the criterion validity coefficient of the widely used overall scale score for a unidimensional multicomponent measuring instrument and the maximal criterion validity coefficient that is achievable with a linear combination of its components. A necessary and sufficient condition of their identity is presented in the case of measurement errors being uncorrelated among themselves and with a used criterion. An upper bound of the difference in these validity coefficients is provided, indicating that it cannot exceed the discrepancy between the maximal reliability and composite reliability indexes. A readily applicable latent variable modeling procedure is discussed that can be used for point and interval estimation of the difference between the maximal and scale criterion validity coefficients. The outlined method is illustrated with a numerical example.
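
The two coefficients compared in this abstract can be sketched numerically. The example below is a minimal illustration, not the authors' latent variable modeling procedure (which also provides interval estimates): it assumes a hypothetical covariance matrix for three components and a criterion, computes the correlation of the unit-weighted sum score with the criterion, and compares it with the largest correlation any linear combination can reach (the multiple correlation). All names and numbers are assumptions chosen for illustration.

```python
import math

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting (small systems only)."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def scale_validity(cov_xx, cov_xc, var_c):
    """Correlation between the unit-weighted sum score and the criterion."""
    cov_sum_c = sum(cov_xc)
    var_sum = sum(sum(row) for row in cov_xx)
    return cov_sum_c / math.sqrt(var_sum * var_c)

def maximal_validity(cov_xx, cov_xc, var_c):
    """Largest correlation with the criterion over all linear combinations."""
    w = solve(cov_xx, cov_xc)  # optimal weights, determined up to scale
    return math.sqrt(sum(wi * ci for wi, ci in zip(w, cov_xc)) / var_c)

# Hypothetical covariances, for illustration only (not from the article).
cov_xx = [[1.00, 0.40, 0.30],
          [0.40, 1.00, 0.35],
          [0.30, 0.35, 1.00]]
cov_xc = [0.50, 0.30, 0.20]
var_c = 1.0

r_scale = scale_validity(cov_xx, cov_xc, var_c)
r_max = maximal_validity(cov_xx, cov_xc, var_c)
print(round(r_scale, 3), round(r_max, 3), round(r_max - r_scale, 3))
```

The difference printed last is the quantity the article bounds and estimates; here it is simply computed from assumed population values.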

2.

Higher Validity in the Face of Lower Reliability: Another Look

《Applied Measurement in Education》2013,26(3):249-253

A test segment that lacks content validity with respect to a criterion may be deleted for that reason. At issue is the effect on reliability and validity as measured by the coefficients arising from classical test theory. Assuming that the predictor test has some reasonable degree of internal consistency, deleting a segment of meaningful size is certain to reduce reliability. However, Feldt (1997) showed that a concomitant rise in the validity coefficient may occur under certain limited conditions. The present research further characterizes the circumstances under which validity changes may occur as a result of deletion of a predictor test segment. Specifically, for a positive outcome, one seeks a relatively large correlation between the scores from the deleted segment and the remaining items coupled with a relatively low correlation between scores from the deleted segment and the criterion.
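
The condition stated at the end of this abstract can be checked with the standard formula for the correlation of a sum with a third variable. The sketch below is an illustration under assumed values, not the article's derivation: A denotes the retained items, B the segment considered for deletion, and C the criterion; every number is hypothetical.

```python
import math

def composite_validity(sd_a, sd_b, r_ab, r_ac, r_bc):
    """Correlation of the sum A + B with criterion C, from segment-level statistics."""
    cov_with_c = sd_a * r_ac + sd_b * r_bc  # covariance with a unit-variance criterion
    var_sum = sd_a ** 2 + sd_b ** 2 + 2 * sd_a * sd_b * r_ab
    return cov_with_c / math.sqrt(var_sum)

# Hypothetical values: B correlates strongly with A (0.6) but weakly with C (0.1).
r_ac = 0.5
full = composite_validity(sd_a=1.0, sd_b=1.0, r_ab=0.6, r_ac=r_ac, r_bc=0.1)
after_deletion = r_ac  # validity of A alone

print(round(full, 3), after_deletion)
```

With these assumed values the full test has lower validity than the shortened one, which is the pattern the abstract describes: a high segment-remainder correlation paired with a low segment-criterion correlation.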

3.

Testing Criterion Correlations With Scale Component Measurement Errors Using Latent Variable Modeling

Tenko Raykov, George A. Marcoulides, Siegfried Gabler, Youngjun Lee 《Structural equation modeling》2017,24(3):468-474

A latent variable modeling method for testing criterion correlations with measurement error terms in multicomponent measuring instruments is outlined. The approach is based on an application of the Benjamini–Hochberg multiple testing procedure and can be used when assumptions of validity estimation related procedures need to be examined. The method also allows studying the extent to which criterion validity coefficients might be due to the relationship between a presumed underlying latent construct evaluated by a psychometric scale and a criterion variable, or could be a consequence of the relation between measurement error in the overall scale score and the criterion. The discussed procedure is widely applicable with popular latent variable modeling software, and is illustrated using a numerical example.
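
The Benjamini–Hochberg step mentioned in this abstract can be sketched on its own. The example below is a minimal illustration of the standard step-up procedure only; it does not reproduce the latent variable model that would supply the p-values, and the p-values shown are hypothetical (one per scale component, as if each tested a correlation between that component's error term and the criterion).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis whose p-value ranks at or below k_max.
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Hypothetical p-values, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
flags = benjamini_hochberg(p_values, q=0.05)
print(flags)
```

Because the procedure is step-up, a p-value above its own threshold is still rejected when a larger p-value meets its threshold; `benjamini_hochberg([0.03, 0.04])` rejects both for that reason.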

4.

Steps for the Application of the Johnson-Neyman Technique-A Sample Analysis

Robert H. Koenker, Carl W. Hansen 《Journal of Experimental Education》2013,81(3):164-173

A number of mental-test theorists have called attention to the fact that increasing test reliability beyond an optimal point can actually lead to a decrement in the validity of that test with respect to a criterion. This non-monotonic relation between reliability and validity has been referred to by Loevinger as the “attenuation paradox,” because Spearman’s correction for attenuation leads one to expect that increasing reliability will always increase validity. In this paper a mathematical link between test reliability and test validity is derived which takes into account the correlation between error scores on a test and error scores on a criterion measure the test is designed to predict. It is proved that when the correlation between these two sets of error scores is positive, the non-monotonic relation between test reliability and test validity which has been viewed as a paradox occurs universally.
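
The expectation that the abstract attributes to Spearman's correction can be made concrete. The sketch below covers only the classical case with uncorrelated error scores, where observed validity rises monotonically with reliability; it does not reproduce the paper's extended link for correlated errors. Function names and values are assumptions for illustration.

```python
import math

def observed_validity(true_corr, rel_x, rel_y):
    """Observed test-criterion correlation under classical test theory,
    assuming error scores on test and criterion are uncorrelated."""
    return true_corr * math.sqrt(rel_x * rel_y)

def correct_for_attenuation(observed_corr, rel_x, rel_y):
    """Spearman's correction: estimate the correlation between true scores."""
    return observed_corr / math.sqrt(rel_x * rel_y)

# Hypothetical values: true-score correlation 0.6, criterion reliability 0.8.
for rel_x in (0.5, 0.7, 0.9):
    r_obs = observed_validity(0.6, rel_x, 0.8)
    r_true = correct_for_attenuation(r_obs, rel_x, 0.8)
    print(rel_x, round(r_obs, 3), round(r_true, 3))
```

Under this model each increase in test reliability raises observed validity, which is why the non-monotonic relation described in the abstract was regarded as a paradox.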

5.

Curriculum-Based Measurement in the Content Areas: Validity of Vocabulary-Matching as an Indicator of Performance in Social Studies

Christine A. Espin, Todd W. Busch, Jongho Shin, Ron Kruschwitz 《Learning disabilities research & practice》2001,16(3):142-151

In this study, we examined the reliability and validity of two curriculum-based measures as indicators of performance in a content-area classroom. Participants were 58 students in a 7th-grade social studies class. CBM measures were student- and administrator-read vocabulary-matching probes. Criterion measures were knowledge pre- and post-tests, the social studies subtest of the Iowa Test of Basic Skills, and student grades. Results revealed moderate alternate-form reliability for both vocabulary-matching measures. Reliability of the measures was increased by combining scores across two testing sessions. Correlations between the predictor and criterion variables were moderate to moderately strong, with the exception of those between vocabulary-matching and student grades. Observed scores for students with LD were lower than for students without LD on both student- and administrator-read vocabulary-matching measures. Few differences in reliability and validity coefficients were found between the student- and administrator-read measures. Results are discussed in terms of the use of CBM as a system for monitoring performance and designing interventions for students with learning disabilities in content-area classrooms.

6.

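Preliminary Development of a Class Collective Efficacy Scale for Secondary School Students

Liu Shengmin (刘盛敏), Chen Yongsheng (陈永胜) 《集美大学学报(教育科学版)》2010,11(3):61-64

Based on prior research and interview findings, a preliminary questionnaire on class collective efficacy for secondary school students was developed and administered to 194 students, whose responses were used for exploratory analysis. Responses from a further 1,773 students were used for confirmatory factor analysis; 159 students completed a retest, and 359 also completed the criterion measure. Exploratory factor analysis yielded four factors (cooperation, ability, expectation, and effort) that explained 52.62% of the total variance, and confirmatory factor analysis showed that the parameters of the four-factor model reached acceptable levels. The scale's internal consistency (α) and split-half reliability were 0.878 and 0.859, respectively, and its test-retest reliability was 0.703. With the collective efficacy belief scale of Gao Fengqiang et al. as the criterion for class collective efficacy, the correlation was 0.541. The scale therefore has good reliability and validity and can be used with secondary school students.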
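
The two internal-consistency indices reported in this entry (the α coefficient and split-half reliability) follow standard formulas, sketched below. This is a minimal illustration with hypothetical item scores, not the study's data or analysis; the odd-even split with a Spearman-Brown step-up is one common convention and is an assumption here, since the entry does not say how the halves were formed.

```python
from statistics import mean, variance

def cronbach_alpha(items):
    """items: list of per-item score lists, all over the same respondents."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_var = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var / variance(totals))

def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half(items):
    """Odd-even split-half reliability with the Spearman-Brown step-up."""
    odd = [sum(s) for s in zip(*items[0::2])]
    even = [sum(s) for s in zip(*items[1::2])]
    r = pearson(odd, even)
    return 2 * r / (1 + r)

# Hypothetical scores: 4 items (rows) answered by 6 respondents (columns).
items = [
    [3, 4, 2, 5, 4, 1],
    [2, 4, 3, 5, 3, 1],
    [3, 5, 2, 4, 4, 2],
    [2, 3, 2, 5, 5, 1],
]
alpha = cronbach_alpha(items)
sh = split_half(items)
print(round(alpha, 3), round(sh, 3))
```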

7.

Interval Estimation of Optimal Scores from Multiple-Component Measuring Instruments via SEM

《Structural equation modeling》2013,20(2):252-263

A structural equation modeling based method is outlined that accomplishes interval estimation of individual optimal scores resulting from multiple-component measuring instruments evaluating single underlying latent dimensions. The procedure capitalizes on the linear combination of a prespecified set of measures that is associated with maximal reliability and validity. The approach is useful when one is interested in evaluating plausible ranges for subject scores on the composite exhibiting highest measurement consistency and strongest linear relation with a given criterion. The method is illustrated with a numerical example.

8.

The Reliability and Validity of Weighted Composite Scores

《Applied Measurement in Education》2013,26(3):221-240

The scores on 2 distinct tests (e.g., essay and objective) are often combined to create a composite score, which is used to make decisions. The validity of the observed composite can sometimes be evaluated relative to an external criterion. However, in cases where no criterion is available, the observed composite has generally been evaluated in terms of its reliability. The analyses in this article are based on a simple, content-based model for the validity of the observed composite as an estimate of a target composite, based on a priori weights for the 2 tests. The results suggest that giving extra weight to the more reliable of the 2 observed scores tends to improve the reliability of the composite, and up to a point tends to improve its validity. Giving too much weight to the more reliable score can decrease the validity of the observed composite as a measure of the target composite.
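
The reliability half of this result can be illustrated with the classical formula for the reliability of a weighted composite, assuming uncorrelated measurement errors. The sketch below does not implement the article's content-based validity model or its target composite; the weights, standard deviations, reliabilities, and intertest correlation are all hypothetical.

```python
def composite_reliability(weights, sds, reliabilities, r12):
    """Reliability of w1*X1 + w2*X2, assuming uncorrelated measurement errors."""
    (w1, w2), (s1, s2), (rel1, rel2) = weights, sds, reliabilities
    error_var = (w1 * s1) ** 2 * (1 - rel1) + (w2 * s2) ** 2 * (1 - rel2)
    total_var = (w1 * s1) ** 2 + (w2 * s2) ** 2 + 2 * w1 * w2 * s1 * s2 * r12
    return 1 - error_var / total_var

# Hypothetical tests: objective (reliability 0.9) and essay (reliability 0.6),
# both with unit standard deviation and an observed intercorrelation of 0.5.
sds = (1.0, 1.0)
rels = (0.9, 0.6)
equal = composite_reliability((0.5, 0.5), sds, rels, r12=0.5)
tilted = composite_reliability((0.8, 0.2), sds, rels, r12=0.5)
print(round(equal, 3), round(tilted, 3))
```

Shifting weight toward the more reliable test raises composite reliability in this example; whether validity also improves depends on the target composite, which this sketch leaves out.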

9.

Mark L. Davison, Ernest C. Davenport Jr., Yu‐Feng Chang, Kory Vue, Shiyang Su 《Journal of Educational Measurement》2015,52(3):263-279

Criterion‐related profile analysis (CPA) can be used to assess whether subscores of a test or test battery account for more criterion variance than does a single total score. Application of CPA to subscore evaluation is described, compared to alternative procedures, and illustrated using SAT data. Considerations other than validity and reliability are discussed, including broad societal goals (e.g., affirmative action), fairness, and ties in expected criterion predictions. In simulation data, CPA results were sensitive to subscore correlations, sample size, and the proportion of criterion‐related variance accounted for by the subscores. CPA can be a useful component in a thorough subscore evaluation encompassing subscore reliability, validity, distinctiveness, fairness, and broader societal goals.

10.

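Development of an Attributional Style Questionnaire for College Students and a Study of Its Reliability and Validity

Zhang Xuejun (张学军) 《新乡教育学院学报》2003,16(4):8-10

The questionnaire consists of four subquestionnaires: positive academic achievement, positive interpersonal relationships, negative academic performance, and negative interpersonal relationships. Each subquestionnaire contains several items, for 16 items in total (8 for positive events and 8 for negative events). The reliability coefficients (Cronbach's α) of the subquestionnaires range from 0.533 to 0.744, and the split-half reliability of the full questionnaire is 0.549. Principal component analysis and criterion-related validity were used to assess the questionnaire's validity. The results indicate that the questionnaire has fairly satisfactory reliability and validity and can serve as a tool for measuring and assessing the attributional style of college students.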

11.

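A Study on Measuring and Evaluating the Work Enthusiasm of Secondary School Teachers

Yu Dunwang (余敦旺) 《赣南师范学院学报》2000,(2):56-59

This study used a teacher work-enthusiasm rating scale to assess 42 secondary school teachers and tested the scale's reliability and validity. The results show that using the scale to assess teachers' work enthusiasm is feasible and meaningful, and that the scale has a degree of reliability and criterion-related validity; however, its items should be further screened and grouped by factor analysis. The paper also proposes establishing a unified scoring system to make scores more comparable.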

12.

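Development of a Self-Rating Scale of Psychological Sub-Health for College Students

Wei Bo (韦波), Liang Yongfeng (梁永锋), Li Yi'ang (李毅昂) 《高教论坛》2012,(8):3-5

This paper describes the development of a self-rating scale of psychological sub-health for college students and the analysis of its reliability and validity. The results show that the scale comprises six dimensions: psychological adjustment, physical condition, learning adaptation, interpersonal relationships, social adaptation, and sleep disturbance. Correlations between the six dimensions and the total scale range from 0.695 to 0.824, indicating good reliability and construct validity. Cronbach's α is 0.916 for the total scale and 0.810, 0.809, 0.785, 0.807, 0.796, and 0.661 for the subscales. The scale can be used to screen college students for symptoms of psychological sub-health.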

13.

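A Preliminary Test Analysis of the Quality of School Life Scale Applied to College Students in Mainland China

Hu Pingjian (胡平建) 《安康学院学报》2010,22(3):17-19

Using students at a university in mainland China as participants, this study tested the reliability and validity of the Quality of School Life Scale (QSLS). The results show that the QSLS has good reliability and validity among mainland Chinese college students and can be used to measure their quality of school life. Cronbach's α for the total score is 0.896, and the test-retest reliability is 0.843; test-retest reliabilities of the subscales range from 0.782 to 0.859. Correlations between the total score and the subscales range from 0.653 to 0.815, and correlations among the factors range from 0.269 to 0.773. In confirmatory factor analysis, all indices met the statistical requirements.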

14.

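Preliminary Development of a Learning Motivation Questionnaire for Secondary School Students

Liu Zhijun (刘志军), Bai Xuejun (白学军), Li Binghuang (李炳煌) 《中学教育》2010,(6):56-61

The Learning Motivation Questionnaire for Secondary School Students was developed following standardized procedures, using exploratory and confirmatory factor analysis. The questionnaire's internal consistency and test-retest reliability coefficients reach 0.7, the structure of learning motivation fits well, factor loadings range from 0.4 to 0.9, and criterion validity largely meets psychometric requirements. The questionnaire is a five-factor measurement instrument with good reliability and validity.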

15.

Heidegger's Theory of Truth and its Importance for the Quality of Qualitative Research

RAUNO HUTTUNEN, LEENA KAKKORI 《Journal of Philosophy of Education》2020,54(3):600-616

When reliability and validity were introduced as validation criteria for empirical research in the human sciences, quantitative research methods prevailed, and theory of science relied on neopositivism (Vienna Circle) or postpositivism (scientific realism). Within this worldview, notions of reliability and validity as criteria of scientific goodness were introduced. Reliability and validity were associated with the correspondence theory of truth, which is mostly ill-suited to the needs of qualitative research. For that reason, qualitative research must look for other kinds of validation criteria. The article elaborates the problems arising when the correspondence theory of truth is used as an ultimate criterion in evaluating qualitative research and proposes Heidegger's hermeneutical or alethetical idea of truth as a more suitable approach.

16.

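Development of a Questionnaire on Non-Intellectual Factors in Junior High School Students' Mathematics Learning

Wang Guangming (王光明), Li Shuang (李爽) 《数学教育学报》2020,(1):29-39

Drawing on a literature review, existing established questionnaires, and expert consultation, and taking into account the characteristics of junior high school students' mathematics learning, the authors developed the "Questionnaire on Non-Intellectual Factors in Junior High School Students' Mathematics Learning". After item analysis, exploratory factor analysis, and confirmatory factor analysis, the questionnaire was revised accordingly. The final questionnaire comprises five dimensions (motivation, attitude, volition, character, and emotion and affect) and 13 corresponding factors. It shows good reliability (internal consistency, test-retest, and split-half) and validity (content, construct, and criterion), and can serve as an effective tool for measuring non-intellectual factors in junior high school students' mathematics learning.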

17.

TOWARD AN INTEGRATION OF THEORY AND METHOD FOR CRITERION-REFERENCED TESTS

RONALD K. HAMBLETON, MELVIN R. NOVICK 《Journal of Educational Measurement》1973,10(3):159-170

In this paper, an attempt has been made to synthesize some of the current thinking in the area of criterion-referenced testing as well as to provide the beginning of an integration of theory and method for such testing. Since criterion-referenced testing is viewed from a decision-theoretic point of view, approaches to reliability and validity estimation consistent with this philosophy are suggested. Also, to improve the decision-making accuracy of criterion-referenced tests, a Bayesian procedure for estimating true mastery scores has been proposed. This Bayesian procedure uses information about other members of a student's group (collateral information), but the resulting estimation is still criterion referenced rather than norm referenced in that the student is compared to a standard rather than to other students. In theory, the Bayesian procedure increases the “effective length” of the test by improving the reliability, the validity, and more importantly, the decision-making accuracy of the criterion-referenced test scores.

18.

Validity of alternative approaches for the identification of learning disabilities: operationalizing unexpected underachievement

Fletcher JM, Denton C, Francis DJ 《Journal of learning disabilities》2005,38(6):545-552

This article reviews the validity of models based on (a) aptitude-achievement discrepancies, (b) low achievement, (c) intraindividual differences, and (d) response to instruction for the classification and identification of learning disabilities (LD). Models based on aptitude-achievement discrepancies and intraindividual differences showed little evidence of discriminant validity. Low achievement models had stronger discriminant validity but do not adequately assess the most significant component of the LD construct, unexpected underachievement. All three of these status models have limited reliability because of their reliance on a measurement at a single time point. Models that incorporate response to instruction have stronger reliability and validity but cannot represent the sole criterion for LD identification. Hybrid models combining low achievement and response to instruction most clearly capture the LD construct and have the most direct relation to instruction. The assessment of students for LD must reflect a stronger underlying classification that takes into account relations with other developmental disorders as well as the reliability and validity of the underlying classification and resultant identification system.

19.

The Orientation to Life Questionnaire: Validation of a Measure to Assess Older Adults’ Sense of Coherence

Sofia von Humboldt, Isabel Leal 《Educational gerontology》2015,41(6):451-465

Increasingly, the literature suggests that the sense of coherence (SOC) positively influences well-being in later life. This study reports the assessment of the following psychometric properties: distributional properties, construct, criterion and external-related validities, and reliability of the Orientation to Life Questionnaire (OtLQ) in a cross-national population of older adults. We recruited 1291 community-dwelling older adults aged 75–102 years (M = 83.9; SD = 6.68). Convenience sampling was used to gather questionnaire data. The construct validity was asserted by confirmatory factor analysis and convergent and discriminant validity. Moreover, criterion and external-related validities, as well as distributional properties and reliability, were also tested. Data gathered with the 29-item OtLQ scale showed overall good psychometric properties in terms of distributional properties, construct, criterion, and external-related validities, as well as reliability. Three factors were validated for the OtLQ scale: (a) comprehensibility; (b) manageability; and (c) meaningfulness. We validated the three-factor OtLQ scale, which produced valid and reliable data for a cross-national sample with older adults. Hence, it is an adequate instrument for assessing sense of coherence among older people in health care practice and program development contexts.

20.

Proximal Versus Distal Validity Coefficients for Teacher Observational Instruments

Robert J. Marzano 《The Teacher Educator》2014,49(2):89-96

This study examined the use of measures of student learning computed using end-of-year assessments (distal measures) versus measures of student learning associated with a single lesson (proximal measures) as criterion scores for the validity of observations of teachers' pedagogical skills. The validity coefficients computed using distal measures were significantly lower than the validity coefficient computed using proximal measures. Assumptions underlying the current emphasis on distal measures were challenged. Possible ways to generate more proximal measures were explored.