Similar literature
20 similar documents found.
1.
Although many studies have examined the alignment of state standards with large-scale assessment and instruction, fewer have attended to alignment concerning alternate assessments for students with significant disabilities. This study was designed to (1) compare expectations in one state's alternate assessment (AA) with curricular priorities reflected in students' Individualized Education Programs (IEPs), and (2) consider the effect of this relationship on AA scores. The study was conducted in a state whose AA consisted of standardized performance tasks measuring reading comprehension (RC) and number systems (NUM). Archival data, including AA scores and IEPs for 292 students, were analyzed. The average IEP emphasized speaking, writing, and measurement, and objectives primarily required simple recall skills. Half of the IEPs contained no objectives aligned with RC. More than one third of the IEPs did not align with NUM. Assessment–IEP alignment had a moderate effect on Reading test scores, but not on Math test scores. Recommendations are made for future investigations of the taught curriculum for this population, and for professional development to improve alignment of instruction with assessments.

2.
This article examines three typical approaches to alternate assessment for students with significant cognitive disabilities: portfolios, performance assessments, and rating scales. A detailed analysis of common and unique design features of these approaches is provided, including features of each approach that influence the psychometric quality of their results. Validity imperatives for alternate assessments are reviewed, and approaches for addressing the need for validity evidence are outlined. The article concludes with an examination of three technical challenges (alignment, scores and scoring, and standard setting) common to all alternate assessments. In light of these challenges, existing methods and professional testing standards are endorsed as necessary guidance for understanding and advancing alternate assessment practices.

3.
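The Webb alignment procedure is an important tool for judging the alignment between assessments and curriculum standards. It has four criteria: categorical concurrence, depth-of-knowledge consistency, range-of-knowledge correspondence, and balance of representation, each with its own graded acceptable levels. The challenge for alignment research lies in judging "what alignment is good enough"; although this judgment can draw on numerous empirical studies, it also involves some degree of subjectivity.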

4.
We interviewed special educators (a) whose students with disabilities (SWDs) were proficient on the 2008 general education assessment but were assigned to the 2009 alternate assessment based on modified achievement standards (AA‐MAS), and (b) whose students with mild disabilities took the 2008 alternate assessment based on alternate achievement standards (AA‐AAS) and then the 2009 AA‐MAS. We explored teachers’ rationales for test‐type assignment, student characteristics, and quality of instruction to determine the test‐type decisions’ appropriateness. All teachers based their decisions on combinations of factors in the guidelines plus subjective and noninstructional factors. Findings raised concerns about the subjectivity of the assessment assignment system and the inappropriate grade‐level instruction for SWDs. Future research, implications of these findings, and limitations are discussed.

5.
The relationships between ratings on the Idaho Alternate Assessment (IAA) for 116 students with significant disabilities and corresponding ratings for the same students on two norm-referenced teacher rating scales were examined to gain evidence about the validity of resulting IAA scores. To contextualize these findings, another group of 54 students who had disabilities but were not officially eligible for the alternate assessment was also assessed. Evidence to support the validity of the inferences about IAA scores was mixed, yet promising. Specifically, the relationships among the reading, language arts, and mathematics achievement level ratings on the IAA and the concurrent scores on the ACES-Academic Skills scales for the eligible students varied across grade clusters, but in general were moderate. These findings provided evidence that IAA scales measure skills indicative of the state's content standards. This point was further reinforced by moderate to high correlations between the IAA and the Idaho State Achievement Test (ISAT) for the students who were not eligible. Additional evidence concerning the valid use of the IAA was provided by logistic regression results showing that the scores do an excellent job of differentiating students who were eligible from those not eligible to participate in an alternate assessment. The collective evidence for the validity of the IAA scores suggests it is a promising assessment for NCLB accountability of students with significant disabilities. The methods of establishing this evidence have the potential to advance validation efforts of other states' alternate assessments.

6.
The development of alternate assessments for students with disabilities plays a pivotal role in state and national accountability systems. An important assumption in the use of alternate assessments in these accountability systems is that scores are comparable on different test forms across diverse groups of students over time. The use of test equating is a common way that states attempt to establish score comparability on different test forms. However, equating presents many unique, practical, and technical challenges for alternate assessments. This article provides case studies of equating for two alternate assessments in Michigan and an approach to determine whether or not equating would be preferred to not equating on these assessments. This approach is based on examining equated score and performance-level differences and investigating population invariance across subgroups of students with disabilities. Results suggest that using an equating method with these data appeared to have a minimal impact on proficiency classifications. The population invariance assumption was suspect for some subgroups and equating methods, with some large potential differences observed.

7.
This article presents findings from two projects designed to improve evaluations of technical quality of alternate assessments for students with the most significant cognitive disabilities. We argue that assessment technical documents should allow for the evaluation of the construct validity of the alternate assessments following the traditions of Cronbach (1971), Messick (1989, 1995), Linn, Baker, and Dunbar (1991), and Shepard (1993). The projects used the work of Knowing What Students Know (Pellegrino, Chudowsky, & Glaser, 2001) to structure and focus the collection and evaluation of assessment information. The heuristic of the assessment triangle (Pellegrino et al., 2001) was particularly useful in emphasizing that the validity evaluation needs to consider the logical connections among the characteristics of the students tested and how they develop domain proficiency (the cognition vertex), the nature of the assessment (the observation vertex), and the ways in which the assessment results are interpreted (the interpretation vertex). This project has shown that in addition to designing more valid assessments, the growing body of knowledge about the psychology of achievement testing can be useful for structuring evaluations of technical quality.

8.
This introduction to the special issue titled Alternate Assessments Based on Modified Academic Achievement Standards: New Policy, New Practices, and Persistent Challenges addresses the federal policy introducing the new alternate assessment for students with persistent academic difficulties, as well as related implementation issues that will be more thoroughly considered throughout the journal. Three guidelines are identified within the policy for alternate assessments based on modified academic achievement standards (AA-MASs), including that (a) a state's grade-level academic content standards cannot be modified for an AA-MAS, (b) a state's general test can be modified for an AA-MAS, and (c) a state's achievement standards can be modified for an AA-MAS so long as they remain on grade level. This article introduces key issues including identification of students eligible for an AA-MAS, the degree of modification that can be applied to develop an AA-MAS, and the current state of AA-MAS development across the nation. The article concludes with overviews of each contribution in the journal.

9.
Because of the unique nature of the students eligible for alternate assessments based on modified academic achievement standards, their varied access to the general education curriculum, and their unique learning needs, innovative psychometric thinking and practice are needed to assure high technical quality of alternate assessments. Indeed, we must at least marshal state-of-the-art procedures to secure strong psychometric evidence to support appropriate and meaningful design and use of these important assessments. The authors contributing work to this special issue, Alternate Assessments Based on Modified Academic Achievement Standards, address important issues and provide guidance to policymakers, test developers, and educators. They also each raise important technical quality issues. This article offers a brief review of such psychometric considerations, in light of the work and comments of the special issue authors.

10.
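Alternate Assessment for Children with Severe Cognitive Disabilities in the United States   Total citations: 1 (self-citations: 1; citations by others: 0)
In the United States, children with severe cognitive disabilities who cannot take part in the standard accountability tests are required to receive alternate assessment. This article introduces the basics of alternate assessment for these children, including its legal basis, assessment standards, and assessment formats. It also discusses the positive effects of alternate assessment and the problems it faces, and looks ahead to future research trends.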

11.
Evaluating the multiple characteristics of alignment has taken a prominent role in educational assessment and accountability systems given its attention in the No Child Left Behind legislation (NCLB). Leading to this rise in popularity, alignment methodologies that examined relationships among curriculum, academic content standards, instruction, and assessments were proposed as strategies to evaluate evidence of the intended uses and interpretations of test scores. In this article, we propose a framework for evaluating alignment studies based on similar concepts that have been recommended for standard setting (Kane). This framework provides guidance to practitioners about how to identify sources of validity evidence for an alignment study and make judgments about the strength of the evidence that may impact the interpretation of the results.

12.
Nebraska districts use different strategies for measuring student performance on the state's content standards. District assessments differ in type and technical quality. Six quality criteria were endorsed by the state. These criteria cover content and curricular validity, fairness, and appropriateness of score interpretations. District assessment portfolios document how well assessments meet these criteria. Districts receive ratings on how well their assessments meet each of the quality criteria and are given a rating from Unacceptable to Exemplary. This article presents these technical quality criteria and explains how they are (a) individually rated and (b) combined for the district's overall quality rating.

13.
Alignment has been defined as the extent to which curricular expectations and assessments are in agreement and work together to provide guidance for educators' efforts to facilitate students' progress toward desired academic outcomes. The Council of Chief State School Officers has identified three preferred models as frameworks for evaluating alignment: Webb's alignment model, the Surveys of Enacted Curriculum model, and the Achieve model. Each model consists of a series of indices that summarize or describe the general match or coherence between state standards, large‐scale assessments, and, in some cases, classroom instruction. This article provides an overview of these frameworks for evaluating alignment and their applications in educational practice and the research literature. After providing an introduction to the use of alignment to evaluate large‐scale accountability systems, the article presents potential extensions of alignment for use with vulnerable populations (e.g., students with disabilities, preschoolers), individual students, and classroom teachers. These proposed applications can provide information for facilitating efforts to improve teachers' classroom instruction and students' educational achievement. © 2008 Wiley Periodicals, Inc.

14.
A process for judging the alignment between curriculum standards and assessments developed by the author is presented. This process produces information on the relationship of standards and assessments on four alignment criteria: Categorical Concurrence, Depth of Knowledge Consistency, Range of Knowledge Correspondence, and Balance of Representation. Five issues that have arisen from conducting alignment studies are identified but not resolved. All of these issues relate to making a decision about what alignment is good enough. Pragmatic decisions have been made to specify acceptable levels for each of the alignment criteria. The assumptions are described. The issues discussed arise from a change in the underlying assumptions and from considering variations in the purpose for an assessment. The existence of such issues reinforces that alignment judgments have an element of subjectivity.

15.
This article reports on the collaboration of six states to study how simulation‐based science assessments can become transformative components of multi‐level, balanced state science assessment systems. The project studied the psychometric quality, feasibility, and utility of simulation‐based science assessments designed to serve formative purposes during a unit and to provide summative evidence of end‐of‐unit proficiencies. The frameworks of evidence‐centered assessment design and model‐based learning shaped the specifications for the assessments. The simulations provided the three most common forms of accommodations in state testing programs: audio recording of text, screen magnification, and support for extended time. The SimScientists program at WestEd developed simulation‐based, curriculum‐embedded, and unit benchmark assessments for two middle school topics, Ecosystems and Force & Motion. These were field‐tested in three states. Data included student characteristics, responses to the assessments, cognitive labs, classroom observations, and teacher surveys and interviews. UCLA CRESST conducted an evaluation of the implementation. Feasibility and utility were examined in classroom observations, teacher surveys and interviews, and by the six‐state Design Panel. Technical quality data included AAAS reviews of the items' alignment with standards and quality of the science, cognitive labs, and assessment data. Student data were analyzed using multidimensional Item Response Theory (IRT) methods. IRT analyses demonstrated the high psychometric quality (reliability and validity) of the assessments and their discrimination between content knowledge and inquiry practices. Students performed better on the interactive, simulation‐based assessments than on the static, conventional items in the posttest. Importantly, gaps between performance of the general population and English language learners and students with disabilities were considerably smaller on the simulation‐based assessments than on the posttests. The Design Panel participated in development of two models for integrating science simulations into a balanced state science assessment system. © 2012 Wiley Periodicals, Inc. J Res Sci Teach 49: 363–393, 2012

16.
This article describes an alignment study conducted to evaluate the alignment between Indiana's Kindergarten content standards and items on the Indiana Standards Tool for Alternate Reporting (ISTAR). Alignment is the extent to which standards and assessments are in agreement, working together to guide educators' efforts to support children's learning and development. The alignment process in this study represented a modification of Webb's nationally recognized method of alignment analysis to early childhood assessments and standards. The alignment panel (N = 13) in this study consisted of early childhood educators and educational leaders from all geographic regions of the state. Panel members were asked to rate the depth of knowledge (DOK) stage of each objective in Kindergarten standards; rate the DOK stage for each item on the ISTAR rating scale; and identify the one or two objectives from the standards to which each ISTAR item corresponded. Analysis of the panel's responses suggested the ISTAR inconsistently conformed to Webb's DOK consistency and ROK correspondence criteria for alignment. A promising finding was the strong alignment of the ISTAR Level F1 and F2 scales to the Kindergarten standards. This result provided evidence of the developmental continuum of skills and knowledge that are assessed by the ISTAR items.

17.
Criteria and standards‐based assessment models are increasingly being adopted by universities as effective practice. However, the promise of these models of assessment may not be realised unless teachers can find ways of making criteria and standards understandable to students. Exemplars or examples of previous students’ work of high and low quality can make criteria and standards concrete. Recent research has focussed on the use of exemplars to help students understand criteria and standards, and less emphasis has been given to exemplars simply as guides for students. This mixed methods study explores students’ perceptions of the usefulness of exemplars and different types of feedback for guiding them in completing assessments. A combination of engaging in marking and discussing exemplars, and receiving individualised and standards‐based feedback provides the most helpful guidance for students’ effective learning.

18.
Since Federal regulations have given states the option to implement alternate assessments based on modified academic achievement standards (AA-MAS) as part of their accountability systems for a small group of students with disabilities, a number of states have made decisions about whether or not to develop and implement such an assessment. State-level directors of assessment and directors of special education were surveyed about their state's decisions on implementing AA-MAS. Improvements in accessibility and appropriateness were reasons given for choosing to implement an AA-MAS, whereas lack of resources and guidance were identified as barriers. This article presents the findings from a survey on 22 states' decisions concerning implementation of AA-MAS.

19.
20.
Students with the most significant cognitive disabilities (SCD) are the 1% of the total student population who have a disability or multiple disabilities that significantly impact intellectual functioning and adaptive behaviors and who require individualized instruction and substantial supports. Historically, these students have received little instruction in science and the science assessments they have participated in have not included age‐appropriate science content. Guided by a theory of action for a new assessment system, an eight‐state consortium developed multidimensional alternate content standards and alternate assessments in science for students in three grade bands (3–5, 6–8, 9–12) that are linked to the Next Generation Science Standards (NGSS Lead States, 2013) and A Framework for K‐12 Science Education (Framework; National Research Council, 2012). The great variability within the population of students with SCD necessitates variability in the assessment content, which creates inherent challenges in establishing technical quality. To address this issue, a primary feature of this assessment system is the use of hypothetical cognitive models to provide a structure for variability in assessed content. System features and subsequent validity studies were guided by a theory of action that explains how the proposed claims about score interpretation and use depend on specific assumptions about the assessment, as well as precursors to the assessment. This paper describes evidence for the main claim that test scores represent what students know and can do. We present validity evidence for the assumptions about the assessment and its precursors, related to this main claim. The assessment was administered to over 21,000 students in eight states in 2015–2016. We present selected evidence from system components, procedural evidence, and validity studies. We evaluate the validity argument and demonstrate how it supports the claim about score interpretation and use.
