1.

How Item Writers Understand Depth of Knowledge

Adam E. Wyse Steven G. Viger 《Educational Assessment》2013,18(4):185-206

An important part of test development is ensuring alignment between test forms and content standards. One common way of measuring alignment is the Webb (1997, 2007) alignment procedure. This article investigates (a) how well item writers understand components of the definition of Depth of Knowledge (DOK) from the Webb alignment procedure and (b) how consistent their DOK ratings are with ratings provided by other committees of educators across grade levels, content areas, and alternate assessment levels in a Midwestern state alternate assessment system. Results indicate that many item writers understand key features of DOK. However, some item writers struggled to articulate what DOK means and had some misconceptions. Additional analyses suggested some lack of consistency between the item writer DOK ratings and the committee DOK ratings. Some notable differences were found across alternate assessment levels and content areas. Implications for future item writing training and alignment studies are provided.

2.

The Rating and Matching Item-Objective Alignment Methods

Jerome V. D'Agostino Megan E. Welsh Adriana D. Cimetta Lia D. Falco Shannon Smith Waverely Hester VanWinkle 《Applied Measurement in Education》2013,26(1):1-21

Central to the standards-based assessment validation process is an examination of the alignment between state standards and test items. Several alignment analysis systems have emerged recently, but most rely on either traditional rating or matching techniques. Few, if any, analyses have been reported on the degree of consistency between the two methods and on the item and objective characteristics that influence judges' decisions. We randomly assigned judges to either rate item-objective links or match items to objectives while reviewing the 2004 Arizona high school mathematics standards and assessment. Across items we found moderate convergence between methods, and we detected apparent reasons for divergently scored items. We also found that judges relied on item and objective content and intellectual skill features to render decisions. Based on our evidence, we contend that a thorough alignment analysis would involve judges using both rating and matching, while focusing on both content and intellectual skill. The findings have important implications for states when examining the alignment between their standards and assessments.

3.

Gauging Item Alignment Through Online Systems While Controlling for Rater Effects

Daniel Anderson Shawn Irvin Julie Alonzo Gerald A. Tindal 《Educational Measurement》2015,34(1):22-33

The alignment of test items to content standards is critical to the validity of decisions made from standards‐based tests. Generally, alignment is determined from judgments made by a panel of content experts, with ratings either averaged or reconciled through consensus reached in discussion. When the pool of items to be reviewed is large, or the content‐matter experts are broadly distributed geographically, panel methods present significant challenges. This article illustrates the use of an online methodology for gauging item alignment that does not require that raters convene in person, reduces the overall cost of the study, increases time flexibility, and offers an efficient means for reviewing large item banks. Latent trait methods are applied to the data to control for between‐rater severity, evaluate intrarater consistency, and provide item‐level diagnostic statistics. Use of this methodology is illustrated with a large pool (1,345) of interim‐formative mathematics test items. Implications for the field and limitations of this approach are discussed.

4.

Alignment of Content and Effectiveness of Mathematics Assessment Items

《Educational Assessment》2013,18(4):333-356

Alignment has taken on increased importance given the current high-stakes nature of assessment. To make well-informed decisions about student learning on the basis of test results, assessment items need to be well aligned with standards. Project 2061 of the American Association for the Advancement of Science (AAAS) has developed a procedure for analyzing the content and quality of assessment items. The authors of this study used this alignment procedure to closely examine 2 mathematics assessment items. Student work on these 2 items was analyzed to determine whether the conclusions reached through the use of the alignment procedure could be validated. It was found that the Project 2061 alignment procedure was effective in providing a tool for in-depth analysis of the mathematical content of the item and a set of standards and in identifying 1 particular content standard that was most closely aligned with the item. Through analyzing student work samples and student interviews, it was also found that students' thinking may not correspond to the standard identified as best aligned with the learning goals of the item. This finding highlights the potential usefulness of analyzing student work to clarify any additional deficiencies of an assessment item not revealed by an alignment procedure.

5.

Standards-Based Assessment of Academic Achievement: A Study of the Webb Model

岳喜腾 (Yue Xiteng) 张雨强 (Zhang Yuqiang) 《全球教育展望》2011,(10):79-85

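Assessment of academic achievement must be based on curriculum standards; this reflects the state's demand for excellence in the overall quality of basic education. Dr. Webb is a leading figure in the American "standards-based assessment" movement. He proposed a complete alignment framework covering the dimensions, procedures, and methods for analyzing the alignment between achievement assessments and curriculum standards, and he developed a web-based tool for aligning assessments with standards. A thorough, in-depth study of the Webb model is of great significance for reforming the assessment of student academic achievement in basic education in China.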

6.

Rater Agreement in Test‐to‐Curriculum Alignment Reviews

A. Traynor H. E. Merzdorf 《Educational Measurement》2018,37(3):55-64

During the development of large‐scale curricular achievement tests, recruited panels of independent subject‐matter experts use systematic judgmental methods (often collectively labeled “alignment” methods) to rate the correspondence between a given test's items and the objective statements in a particular curricular standards document. High disagreement among the expert panelists may indicate problems with training, feedback, or other steps of the alignment procedure. Existing procedural recommendations for alignment reviews have been derived largely from single‐panel research studies; support for their use during operational large‐scale test development may be limited. Synthesizing data from more than 1,000 alignment reviews of state achievement tests, this study identifies features of test–standards alignment review procedures that impact agreement about test item content. The researchers then use their meta‐regression results to propose some practical suggestions for alignment review implementation.

7.

On the alignment of teachers’ mathematical content knowledge assessments with the common core state standards

Yasemin Copur-Gencturk Erik Jacobson Richard Rasiej 《Journal of Mathematics Teacher Education》2022,25(3):267-291

Instruments designed to measure teachers’ knowledge for teaching mathematics have been widely used to evaluate the impact of professional development and to investigate the role of teachers’ knowledge in teaching and student learning. These instruments assess a mixture of content knowledge and pedagogical content knowledge. However, little attention has been given to the content alignment between such instruments and curricular standards, particularly in regard to how content knowledge and pedagogical content knowledge items are distributed across mathematical topics. This article provides content maps for two widely used teacher assessment instruments in the USA relative to the widely adopted Common Core State Standards. This common reference enables comparisons of content alignment both between the instruments and between parallel forms within each instrument. The findings indicate that only a small number of items on both instruments are designed to capture teachers’ pedagogical content knowledge and that the majority of these items are focused on curricular topics in the later grades rather than in the early grades. Furthermore, some forms designed for use as pre- and post-assessment of professional development or teacher education are not parallel in terms of curricular topics, so estimates of teachers’ knowledge growth based on these forms may not mean what users assume. The implications of these findings for teacher educators and researchers who use teacher knowledge instruments are discussed.

8.

Using Response Time to Detect Item Preknowledge in Computer‐Based Licensure Examinations

Hong Qian Dorota Staniewska Mark Reckase Ada Woo 《Educational Measurement》2016,35(1):38-47

This article addresses the issue of how to detect item preknowledge using item response time data in two computer‐based large‐scale licensure examinations. Item preknowledge is indicated by an unexpectedly short response time and a correct response. Two samples were used for detecting item preknowledge for each examination. The first sample was from the early stage of the operational test and was used for item calibration. The second sample was from the late stage of the operational test, which may feature item preknowledge. The purpose of this research was to explore whether there was evidence of item preknowledge and compromised items in the second sample using the parameters estimated from the first sample. The results showed that for one nonadaptive operational examination, two items (of 111) were potentially exposed, and two candidates (of 1,172) showed some indications of preknowledge on multiple items. For another licensure examination that featured computerized adaptive testing, there was no indication of item preknowledge or compromised items. Implications for detected aberrant examinees and compromised items are discussed in the article.

9.

Assessing quality characteristics of center-based early childhood environments in Germany and Portugal: A cross-national study

Wolfgang Tietze Joaquim Bairrão Teresa Barreiros Leal Hans-Guenther Rossbach 《European Journal of Psychology of Education - EJPE》1998,13(2):283-298

Based on the knowledge that direct assessment of the quality of early childhood educational environments is becoming more and more important, this article deals with a coordinated adaptation of the Early Childhood Environment Rating Scale (ECERS) in Germany and Portugal. The ECERS, developed by Harms and Clifford (1980), is a complex rating scale, comprising 37 items, which is implemented to assess major quality characteristics of early childhood classrooms and has been used quite extensively in the US as well as in other countries. In this article we describe how the scale was adapted and we report selected results of a jointly coordinated comparative study on a sample of 103 early childhood classrooms in Germany and 88 in Portugal. The classrooms were classified according to three different types in each country. In particular, we report the item characteristics of the adapted versions, reliabilities and intercorrelations of the total scale and the a priori subscales, and results of factor analyses. A one-way ANOVA indicates group differences between the six types of classrooms that can be meaningfully interpreted.

10.

Integrated, Comprehensive Alignment as a Foundation for Measuring Student Progress

Joseph Martineau Pamela Paek John Keene Thomas Hirsch 《Educational Measurement》2007,26(1):28-35

11.

Assessing the Life Science Knowledge of Students and Teachers Represented by the K–8 National Science Standards

Philip M. Sadler Harold Coyle Nancy Cook Smith Jaimie Miller Joel Mintzes Kimberly Tanner John Murray 《CBE life sciences education》2013,12(3):553-575

We report on the development of an item test bank and associated instruments based on the National Research Council (NRC) K–8 life sciences content standards. Utilizing hundreds of studies in the science education research literature on student misconceptions, we constructed 476 unique multiple-choice items that measure the degree to which test takers hold either a misconception or an accepted scientific view. Tested nationally with 30,594 students, following their study of life science, and their 353 teachers, these items reveal a range of interesting results, particularly student difficulties in mastering the NRC standards. Teachers also answered test items and demonstrated a high level of subject matter knowledge reflecting the standards of the grade level at which they teach, exhibiting few misconceptions of their own. In addition, teachers predicted the difficulty of each item for their students and which of the wrong answers would be the most popular. Teachers were found to generally overestimate their own students’ performance and to have a high level of awareness of the particular misconceptions that their students hold on the K–4 standards, but a low level of awareness of misconceptions related to the 5–8 standards.

12.

General Knowledge Standards for Early Childhood Teacher Education in the United States

朱宗顺 (Zhu Zongshun) 《学前教育研究》2006,(9):54-56

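The American association for early childhood education (美国幼儿教育协会) has developed preparation standards for the early childhood education profession, setting out training requirements for practitioners at every level. For the general knowledge that candidates in early childhood education need, the organization specifies training requirements in language and literacy, the arts, mathematics, physical activity and physical education, science, and social studies. These requirements can inform curriculum reform in early childhood teacher education in China.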

13.

Validating Measurement of Knowledge Integration in Science Using Multiple-Choice and Explanation Items

Hee-Sun Lee Ou Lydia Liu Marcia C. Linn 《Applied Measurement in Education》2013,26(2):115-136

This study explores measurement of a construct called knowledge integration in science using multiple-choice and explanation items. We use construct and instructional validity evidence to examine the role multiple-choice and explanation items play in measuring students' knowledge integration ability. For construct validity, we analyze item properties such as alignment, discrimination, and target range on the knowledge integration scale using a Rasch Partial Credit Model analysis. For instructional validity, we test the sensitivity of multiple-choice and explanation items to knowledge integration instruction using a cohort comparison design. Results show that (1) one third of correct multiple-choice responses are aligned with higher levels of knowledge integration while three quarters of incorrect multiple-choice responses are aligned with lower levels of knowledge integration, (2) explanation items discriminate between high and low knowledge integration ability students much more effectively than multiple-choice items, (3) explanation items measure a wider range of knowledge integration levels than multiple-choice items, and (4) explanation items are more sensitive to knowledge integration instruction than multiple-choice items.

14.

The Culture of Early Education: An Observational Study of Hong Kong Kindergartens

刘丽薇 (Liu Liwei) 《幼儿教育》2006,(1)

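As understanding of how young children learn has deepened, early childhood education worldwide has been shifting from a knowledge-transmission paradigm to a knowledge-construction paradigm. This article examines current mainstream kindergarten practice in Hong Kong and the sociocultural factors that influence early childhood education there. Systematic observation shows that Hong Kong kindergartens generally combine direct instruction with inquiry-based teaching. In language teaching, the idea of emergent literacy instruction is gradually being accepted; in mathematics education, teaching is moving from traditional lecture and drill toward activity-based teaching. Cultural beliefs about the socialization of young children are reflected in teaching practice and in interactions with children. Although ideas about how young children learn best are well represented in current practice, related beliefs from traditional Chinese culture still dominate in the classroom, and the two are interwoven and blended in teaching practice.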

15.

The Impact of an Ongoing Professional Development Program on Prekindergarten Teachers' Mathematics Practices

Jenifer S. Thornton Courtney L. Crim Jacqueline Hawkins 《Journal of Early Childhood Teacher Education》2013,34(2):150-161

Mathematics is a natural part of daily life for young children as they explore and investigate the world around them. To build on these experiences, and to begin establishing a mathematical foundation, early childhood educators must not only be knowledgeable about mathematical concepts, they must also be aware of the most developmentally appropriate ways in which to teach these concepts to young children. After participation in an ongoing professional development program, specifically targeting teachers of prekindergarten children in public school, Preschool Programs for Children with Disabilities (PPCD), Head Start, and child care settings, teachers reported positive changes in math practices. Specifically, teachers reported a stronger alignment to national mathematics standards and increased awareness pertaining to developmentally appropriate mathematics practices as they apply to early childhood classrooms. Teachers reported a shift towards more hands-on activities and a shift away from the use of worksheets in their prekindergarten classrooms. Implications from this study suggest that ongoing professional development that is designed to meet the specific needs of early childhood educators can have a positive impact on reported mathematics content knowledge and instructional practices.

16.

IMPLEMENTING COLLEGE AND CAREER STANDARDS IN MATH METHODS COURSE FOR EARLY CHILDHOOD AND ELEMENTARY EDUCATION TEACHER CANDIDATES

Joohi Lee 《International Journal of Science and Mathematics Education》2016,14(1):177-192

This study aimed to measure the efficacy of incorporating College and Career Readiness Standards (CCRS) math standards into math methods courses for early childhood and elementary education teacher candidates at an urban university in the Dallas-Fort Worth metroplex. A total of 161 college seniors (teacher candidates) enrolled in the mathematics methods courses for early childhood and elementary education participated in this study. The results showed that incorporating CCRS into the math methods courses was successful, as shown by the significant differences between pre- and post-assessments in both skills- and knowledge-associated items. Although only two disposition items differed significantly between pre- and post-assessments, teacher candidates scored higher on disposition items than on skills- and knowledge-associated items in the pre-assessments.

17.

Methodological Choices in the Content Analysis of Textbooks for Measuring Alignment With Standards

Morgan S. Polikoff Nan Zhou Shauna E. Campbell 《Educational Measurement》2015,34(3):10-17

With the recent adoption of the Common Core standards in many states, there is a need for quality information about textbook alignment to standards. While there are many existing content analysis procedures, these generally have little, if any, validity or reliability evidence. One exception is the Surveys of Enacted Curriculum (SEC), which has been widely used to analyze the alignment among standards, assessments, and teachers’ instruction. However, the SEC can be time‐consuming and expensive when used for this purpose. This study extends the SEC to the analysis of entire mathematics textbooks and investigates whether the results of SEC alignment analyses are affected if the content analysis procedure is simplified. The results indicate that analyzing only every fifth item produces nearly identical alignment results with no effect on the reliability of content analyses.

18.

SAT Differential Item Performance for Nine Handicapped Groups

Randy Elliot Bennett Donald A. Rock Bruce A. Kaplan 《Journal of Educational Measurement》1987,24(1):41-55

The purpose of this study was to identify broad classes of items that behave differentially for handicapped examinees taking special, extended-time administrations of the Scholastic Aptitude Test (SAT). To identify these item classes, the performance of nine handicapped groups and one nonhandicapped group on each of two forms of the SAT was investigated through a two-stage procedure. The first stage centered on the performance of item clusters. Individual items composing clusters showing questionable performance were then examined. This two-stage procedure revealed little indication of differentially functioning item classes. However, some notable instances of differential performance at the item level were detected, the most serious of which affected visually impaired students taking the braille edition of the test.

19.

The Birth and Hidden Meaning of the Word “Kindergarten”

罗瑶 (Luo Yao) 《学科教育》2014,(1):88-94

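“Kindergarten” is a name familiar to almost everyone, yet the word conceals a little-known deeper meaning intended by its creator, Froebel. In his search for unity, Froebel discovered a unified force in natural development and human development, as well as the different forms this force takes at each stage of development and its inner continuity. He further found that human nature lies in the human spiritual force, and that the form this spiritual force takes in childhood corresponds to the forms that force takes in crystals, plants, and animals at the stages of natural development. In early childhood specifically, the force manifests itself in the same way as the vital force of plants, as “the externalization of inner life”; early childhood education must therefore learn from the tending of plants and let young children freely express their inner life through play. This is the deeper reason why Froebel chose “children's garden” (Kindergarten) as the name for institutions of early childhood education.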

20.

Answer Changing on Objective Tests

《The Journal of educational research》2012,105(6):313-315

In an attempt to identify some of the causes of answer-changing behavior, the effects of four test- and item-specific variables were evaluated. Three samples of New Zealand school children of different ages were administered tests of study skills. The number of answer changes per item was compared with the position of each item in a group of items, the position of each item in the test, the discrimination index, and the difficulty index of each item. It is shown that answer changes were more likely to be made on items occurring early in a group of items and toward the end of a test. There was also a tendency for difficult items and items with poor discrimination to be changed more frequently. Some implications of answer changing for the design of tests are discussed.