Search results: 17 records found.
1.
This paper demonstrates and discusses the use of think aloud protocols (TAPs) as an approach for examining and confirming sources of differential item functioning (DIF). The TAPs are used to investigate to what extent surface characteristics of the items that are identified by expert reviews as sources of DIF are supported by empirical evidence from examinee thinking processes in the English and French versions of a Canadian national assessment. In this research, the TAPs confirmed sources of DIF identified by expert reviews for 10 out of 20 DIF items. The moderate agreement between TAPs and expert reviews indicates that evidence from expert reviews cannot be considered sufficient in deciding whether DIF items are biased and such judgments need to include evidence from examinee thinking processes.
2.
Analysis of Differential Item Functioning in the NAEP History Assessment (total citations: 1; self-citations: 0; citations by others: 1)
The Mantel-Haenszel approach for investigating differential item functioning was applied to U.S. history items that were administered as part of the National Assessment of Educational Progress. On some items, blacks, Hispanics, and females performed more poorly than other students, conditional on number-right score. It was hypothesized that this resulted, in part, from the fact that ethnic and gender groups differed in their exposure to the material included in the assessment. Supplementary Mantel-Haenszel analyses were undertaken in which the number of historical periods studied, as well as score, was used as a conditioning variable. Contrary to expectation, the additional conditioning did not lead to a reduction in the number of DIF items. Both methodological and substantive explanations for this unexpected result were explored.
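As a rough illustration of the Mantel-Haenszel approach named in this abstract, the sketch below computes the common odds ratio for one item after stratifying examinees by a matching score, then converts it to the ETS delta scale (MH D-DIF = -2.35 ln α). It is not the procedure used in the study: the function name, the `"ref"`/`"focal"` group labels, and the input layout are assumptions made for this example only.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(responses):
    """Return (alpha_mh, mh_d_dif) for a single item.

    `responses` is an iterable of (group, score, correct) tuples, where
    group is "ref" or "focal" (hypothetical labels), score is the matching
    variable (e.g. number-right score), and correct is 0 or 1.
    """
    # One 2x2 table per score stratum:
    # [ref correct, ref incorrect, focal correct, focal incorrect]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in responses:
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        tables[score][idx] += 1

    num = 0.0  # sum over strata of (ref correct * focal incorrect) / n
    den = 0.0  # sum over strata of (ref incorrect * focal correct) / n
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n

    if den == 0:
        raise ValueError("odds ratio undefined: no discordant cells")

    alpha = num / den
    # ETS delta scale: negative values mean the item is relatively
    # harder for the focal group at the same matching score.
    return alpha, -2.35 * math.log(alpha)


def expand(score, a, b, c, d):
    """Helper for the example: turn cell counts into response tuples."""
    return ([("ref", score, 1)] * a + [("ref", score, 0)] * b
            + [("focal", score, 1)] * c + [("focal", score, 0)] * d)


# Two score strata with made-up counts, purely for illustration.
data = expand(1, 3, 1, 1, 3) + expand(2, 2, 2, 2, 2)
alpha_mh, d_dif = mantel_haenszel_dif(data)
```

Conditioning on an additional variable, as in the supplementary analyses described above, would amount to using a composite stratum key (for example a tuple of score and number of historical periods studied) in place of `score`.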
3.
4.
In this study, we examine the degree of construct comparability and possible sources of incomparability of the English and French versions of the Programme for International Student Assessment (PISA) 2003 problem-solving measure administered in Canada. Several approaches were used to examine construct comparability at the test- (examination of test data structure, reliability comparisons and test characteristic curves) and item-levels (differential item functioning, item parameter correlations, and linguistic comparisons). Results from the test-level analyses indicate that the two language versions of PISA are highly similar as shown by similarity of internal consistency coefficients, test data structure (same number of factors and item factor loadings) and test characteristic curves for the two language versions of the tests. However, results of item-level analyses reveal several differences between the two language versions as shown by large proportions of items displaying differential item functioning, differences in item parameter correlations (discrimination parameters) and number of items found to contain linguistic differences.
5.
This Issue
6.
Similar to educators in mathematics, science, and reading, history educators around the world have mobilized curricular reform movements toward including complex thinking in history education, advancing historical thinking, developing historical consciousness, and teaching competence in historical sense making. These reform movements, including the Common Core Standards, are beginning to include historical thinking. Despite these developments, inclusion of historical thinking in assessments has been slow: The great majority of history assessments, both large-scale and classroom-based, still focus on fragmented pieces of information. In this article, we discuss the challenges in assessment of historical thinking, describe how these issues were dealt with in a 1-hr test of students' ability to reason about “enemy aliens” in Canada during World War I, and make recommendations for future assessments.
7.
Heterogeneity within English language learner (ELL) groups has been documented. Previous research on differential item functioning (DIF) analyses suggests that accurate DIF detection rates are reduced greatly when groups are heterogeneous. In this simulation study, we investigated the effects of heterogeneity within linguistic (ELL) groups on the accuracy of DIF detection. Heterogeneity within such groups may occur for a myriad of reasons including differential lengths of time residing in English-speaking countries, degrees of exposure to English-speaking environments, and amounts of English instruction. Our findings revealed that at high levels of within-group heterogeneity, DIF detection is at the level of chance, implying that a large proportion of DIF items might remain undetected when assessing heterogeneous populations, potentially leading to the development of biased tests. Based on our findings, we urge test development organizations to consider heterogeneity within ELL and other heterogeneous focus groups in their routine DIF analyses.
8.
Differential item functioning (DIF) analyses have been used as the primary method in large-scale assessments to examine fairness for subgroups. Currently, DIF analyses are conducted utilizing manifest methods using observed characteristics (gender and race/ethnicity) for grouping examinees. Homogeneity of item responses is assumed, denoting that all examinees respond to test items using a similar approach. This assumption may not hold with all groups. In this study, we demonstrate the first application of the latent class (LC) approach to investigate DIF and its sources with heterogeneous (linguistic minority) groups. We found at least three LCs within each linguistic group, suggesting the need to empirically evaluate this assumption in DIF analysis. We obtained larger proportions of DIF items with larger effect sizes when LCs within language groups versus the overall (majority/minority) language groups were examined. The illustrated approach could be used to improve the ways in which DIF analyses are typically conducted to enhance DIF detection accuracy and score-based inferences when analyzing DIF with heterogeneous populations.
9.
10.
Multiple scoring is widely used in large-scale assessments. The use of a single response for making multiple inferences, as is done in multiple scoring, has implications for the validity of these inferences and interpretations based on assessment results. The purpose of this article is to review two types of multiple scoring practices and discuss how multiple scoring affects inferences.