Similar literature
1.
The mainstream research on scoring rubrics has emphasized the summative aspect of assessment. In recent years, the use of rubrics for formative purposes has gained more attention. This research has, however, not been conclusive. The aim of this study is therefore to review the research on formative use of rubrics, in order to investigate if, and how, rubrics have an impact on student learning. In total, 21 studies about rubrics were analyzed through content analysis. Sample, subject/task, design, procedure, and findings were compared among the different studies in relation to effects on student performance and self-regulation. Findings indicate that rubrics may have the potential to influence student learning positively, but also that there are several different ways for the use of rubrics to mediate improved performance and self-regulation. There are a number of factors identified that may moderate the effects of using rubrics formatively, as well as factors that need further investigation.

2.
A review of rubric use in higher education   Cited by: 5 (self-citations: 3, citations by others: 2)
This paper critically reviews the empirical research on the use of rubrics at the post‐secondary level, identifies gaps in the literature and proposes areas in need of research. Studies of rubrics in higher education have been undertaken in a wide range of disciplines and for multiple purposes, including increasing student achievement, improving instruction and evaluating programmes. While student perceptions of rubrics are generally positive and some authors report positive responses to rubric use by instructors, others noted a tendency for instructors to resist using them. Two studies suggested that rubric use was associated with improved academic performance, while one did not. The potential of rubrics to identify the need for improvements in courses and programmes has been demonstrated. Studies of the validity of rubrics have shown that clarity and appropriateness of language is a central concern. Studies of rater reliability tend to show that rubrics can lead to a relatively common interpretation of student performance. Suggestions for future research include the use of more rigorous research methods, more attention to validity and reliability, a closer focus on learning and research on rubric use in diverse educational contexts.

3.
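Education should be standards-based: corresponding performance criteria and scoring rubrics should run through the entire teaching and learning process, so that education can truly achieve its intended goals. Performance criteria and scoring rubrics play an important role in standards-based education and should be used to promote teaching and learning.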

4.
In the United Kingdom, the majority of national assessments involve human raters. The processes by which raters determine the scores to award are central to the assessment process and affect the extent to which valid inferences can be made from assessment outcomes. Thus, understanding rater cognition has become a growing area of research in the United Kingdom. This study investigated rater cognition in the context of the assessment of school‐based project work for high‐stakes purposes. Thirteen teachers across three subjects were asked to “think aloud” whilst scoring example projects. Teachers also completed an internal standardization exercise. Nine professional raters across the same three subjects standardized a set of project scores whilst thinking aloud. The behaviors and features attended to were coded. The data provided insights into aspects of rater cognition such as reading strategies, emotional and social influences, evaluations of features of student work (which aligned with scoring criteria), and how overall judgments are reached. The findings can be related to existing theories of judgment. Based on the evidence collected, the cognition of teacher raters did not appear to be substantially different from that of professional raters.

5.
6.
In response to the demand for sound science assessments, this article presents the development of a latent construct called knowledge integration as an effective measure of science inquiry. Knowledge integration assessments ask students to link, distinguish, evaluate, and organize their ideas about complex scientific topics. The article focuses on assessment topics commonly taught in 6th- through 12th-grade classes. Items from both published standardized tests and previous knowledge integration research were examined in 6 subject-area tests. Results from Rasch partial credit analyses revealed that the tests exhibited satisfactory psychometric properties with respect to internal consistency, item fit, weighted likelihood estimates, discrimination, and differential item functioning. Compared with items coded using dichotomous scoring rubrics, those coded with the knowledge integration rubrics yielded significantly higher discrimination indexes. The knowledge integration assessment tasks, analyzed using knowledge integration scoring rubrics, demonstrate strong promise as effective measures of complex science reasoning in varied science domains.

7.
A tool for self‐assessment in secondary art education was developed and tested. The tool includes rubrics for assessing production and reception activities in art education and consists of visual and text rubrics. The criteria in the rubrics are based on the Common European Framework of Reference for Visual Literacy which was developed by The European Network of Visual Literacy (ENViL). The way teachers and students use the rubrics, whether they consider them helpful and to what extent students’ self‐assessments are in line with teacher assessments were studied. It was concluded that teachers work with the rubrics intensively and both students and teachers appreciate their visual form. However, it was found that the agreement between teachers and students about the students’ scores was moderate and needed to improve. The results show that it is untrue that students, or boys in particular, overestimate their own performance in art education. The current study contributes to the development of feasible and valid assessment criteria and instruments in secondary art education.

8.
Content‐based automated scoring has been applied in a variety of science domains. However, many prior applications involved simplified scoring rubrics without considering rubrics representing multiple levels of understanding. This study tested a concept‐based scoring tool for content‐based scoring, c‐rater™, for four science items with rubrics aiming to differentiate among multiple levels of understanding. The items showed moderate to good agreement with human scores. The findings suggest that automated scoring has the potential to score constructed‐response items with complex scoring rubrics, but in its current design cannot replace human raters. This article discusses sources of disagreement and factors that could potentially improve the accuracy of concept‐based automated scoring.

9.
Interpreting and creating graphs plays a critical role in scientific practice. The K-12 Next Generation Science Standards call for students to use graphs for scientific modeling, reasoning, and communication. To measure progress on this dimension, we need valid and reliable measures of graph understanding in science. In this research, we designed items to measure graph comprehension, critique, and construction and developed scoring rubrics based on the knowledge integration (KI) framework. We administered the items to over 460 middle school students. We found that the items formed a coherent scale and had good reliability using both item response theory and classical test theory. The KI scoring rubric showed that most students had difficulty linking graph features to science concepts, especially when asked to critique or construct graphs. In addition, students with limited access to computers as well as those who speak a language other than English at home have less integrated understanding than others. These findings point to the need to increase the integration of graphing into science instruction. The results suggest directions for further research leading to comprehensive assessments of graph understanding.

10.
Rubrics have attained considerable importance in the authentic and sustainable assessment paradigm; nevertheless, few studies have examined their contribution to validity, especially outside the domain of educational studies. This empirical study used a quantitative approach to analyse the validity of a rubrics-based performance assessment. Raters evaluated the performance of 84 first-year university students producing service-learning projects for the Conservation–Restoration and Design degrees. The study data comprised the 9240 scores given by two teachers and three student tutors, who assessed the students’ projects on three occasions during the semester. Factor analyses confirmed that the students attained the expected learning outcomes and made significant learning progress. This learning progress was also corroborated by analyses of variance. The attainment of the learning goals and the evidence of learning progress demonstrated the validity of the inferences drawn from the assessment system. In addition, the results highlighted the need to consider rubrics as a first-order teaching resource and not only as a scoring tool.

11.
Typical assessment systems often measure isolated ideas rather than the coherent understanding valued in current science classrooms. Such assessments may motivate students to memorize, rather than to use new ideas to solve complex problems. To meet the requirements of the Next Generation Science Standards, instruction needs to emphasize sustained investigations, and assessments need to create a detailed picture of students’ conceptual understanding and reasoning processes.

This article describes the design process and potential for automated scoring of 2 forms of inquiry assessment: Energy Stories and MySystem. To design these assessments, we formed a partnership of teachers, discipline experts, researchers, technologists, and psychometricians to align curriculum, assessments, and rubrics. We illustrate how these items document middle school students’ reasoning about energy flow in life science. We used evidence from review by science teachers and experts in the discipline; classroom experiments; and psychometric analysis to validate the assessments, rubrics, and automated scoring.

12.
Argumentation is fundamental to science education, both as a prominent feature of scientific reasoning and as an effective mode of learning—a perspective reflected in contemporary frameworks and standards. The successful implementation of argumentation in school science, however, requires a paradigm shift in science assessment from the measurement of knowledge and understanding to the measurement of performance and knowledge in use. Performance tasks requiring argumentation must capture the many ways students can construct and evaluate arguments in science, yet such tasks are both expensive and resource-intensive to score. In this study we explore how machine learning text classification techniques can be applied to develop efficient, valid, and accurate constructed-response measures of students' competency with written scientific argumentation that are aligned with a validated argumentation learning progression. Data come from 933 middle school students in the San Francisco Bay Area and are based on three sets of argumentation items in three different science contexts. The findings demonstrate that we have been able to develop computer scoring models that can achieve substantial to almost perfect agreement between human-assigned and computer-predicted scores. Model performance was slightly weaker for harder items targeting higher levels of the learning progression, largely due to the linguistic complexity of these responses and the sparsity of higher-level responses in the training data set. Comparing the efficacy of different scoring approaches revealed that breaking down students' arguments into multiple components (e.g., the presence of an accurate claim or providing sufficient evidence), developing computer models for each component, and combining scores from these analytic components into a holistic score produced better results than holistic scoring approaches. 
However, this analytical approach was found to be differentially biased when scoring responses from English learner (EL) students as compared to responses from non-EL students on some items. Differences in severity between human and computer scores for EL students under these approaches are explored, and potential sources of bias in automated scoring are discussed.

13.
14.
Educational Assessment, 2013, 18(3): 201-224
This article discusses an approach to analyzing performance assessments that identifies potential reasons for misfitting items and uses this information to improve on items and rubrics for these assessments. Specifically, the approach involves identifying psychometric features and qualitative features of items and rubrics that may possibly influence misfit; examining relations between these features and the fit statistic; conducting an analysis of student responses to a sample of misfitting items; and finally, based on the results of the previous analyses, modifying characteristics of the items or rubrics and reexamining fit. A mathematics performance assessment containing 53 constructed-response items scored on a holistic scale from 0 to 4 is used to illustrate the approach. The 2-parameter graded response model (Samejima, 1969) is used to calibrate the data. Implications of this method of data analysis for improving performance assessment items and rubrics are discussed as well as issues and limitations related to the use of the approach.

15.
An assessment‐oriented design‐based research model was applied to existing inquiry‐oriented multimedia programs in astronomy, biology, and ecology. Building on emerging situative theories of assessment, the model extends prevailing views of formative assessment for learning by embedding “discursive” formative assessment more directly into the curriculum. Three twenty‐hour curricula were designed and aligned to content standards, and three levels of assessments were developed and used to assess and enhance learning for each curriculum. These assessments included three or four informal “activity‐oriented” quizzes and discursive formative feedback rubrics supporting collective discourse, a “curriculum‐oriented” examination of individual conceptual understanding, and a “standards‐oriented” test measuring aggregated achievement of targeted standards. After two design‐research cycles, worthwhile scientific argumentation and statistically significant gains were attained for two of the three packages on the exam and test. Achievement gains were comparable to or larger than those of students in comparison classrooms. Many existing innovations could be enhanced and evaluated in this fashion; designing these strategies directly into innovations could have an even greater impact on discourse, understanding, and achievement. © 2012 Wiley Periodicals, Inc. J Res Sci Teach 49: 1240–1270, 2012

16.
This study considered middle school mathematics teachers’ use of rubrics to score non‐traditional tasks. A group of eighth‐grade teachers attended a two‐day workshop where they evaluated assessment tasks and discussed the use of an associated scoring rubric. Scored samples of student work submitted by the teachers indicated that they had difficulty using the rubrics for scoring. When compared to expert ratings, all except one teacher had discrepancies in scoring and some discrepancies indicated major problems. These discrepancies appear to be related to whether the task contained familiar or unfamiliar content and the mix of procedure and explanation the task required. Several other factors related to discrepancies, such as leniency errors, teacher knowledge, and the halo effect, are also discussed. With the expanded use of rubrics in many arenas, these results show the need for more professional development related to rubric use.

17.
Researchers have documented the impact of rater effects, or raters’ tendencies to give different ratings than would be expected given examinee achievement levels, in performance assessments. However, the degree to which rater effects influence person fit, or the reasonableness of test-takers’ achievement estimates given their response patterns, has not been investigated. In rater-mediated assessments, person fit reflects the reasonableness of rater judgments of individual test-takers’ achievement over components of the assessment. This study illustrates an approach to visualizing and evaluating person fit in assessments that involve rater judgment using rater-mediated person response functions (rm-PRFs). The rm-PRF approach allows analysts to consider the impact of rater effects on person fit in order to identify individual test-takers for whom the assessment results may not have a straightforward interpretation. A simulation study is used to evaluate the impact of rater effects on person fit. Results indicate that rater effects can compromise the interpretation and use of performance assessment results for individual test-takers. Recommendations are presented that call researchers and practitioners to supplement routine psychometric analyses for performance assessments (e.g., rater reliability checks) with rm-PRFs to identify students whose ratings may have compromised interpretations as a result of rater effects, person misfit, or both.

18.
This paper describes an online early childhood assessment course that was developed through a multi-university collaboration with support from a state improvement grant. Collaborators from three universities developed the course to address a new early childhood unified license (birth to age 8, regular and special education) in the state of Kansas. After reviewing the new state content standards, we identified targeted understandings, performance assessments, and online activities for 15 modules using a backward design process. Emphasis was placed on active learning through synchronous and asynchronous interactions facilitated by the use of a course management system. Positive evidence of learning was indicated by anonymous student feedback, pretest/posttest gain scores, and performance assessments evaluated with rubrics. We viewed the implementation of the course as a success and anticipate that it may lead to more sharing of online coursework in Kansas teacher education programs in the future.

19.
This paper reports on a study where rubrics have been used to convey assessment expectations to students (n = 176) in three different assessment situations in professional education. These situations are: (1) the development of a survey instrument, which was part of a course in statistics and epidemiology; (2) an inspection of a house, which was part of a course about the functions of buildings for real estate brokers and (3) a workshop in communication with patients, which was part of a course in the evaluation of diagnostic procedures and treatments of oral infections in dental education. In all situations, students’ perceptions and uses of the rubrics were investigated. Findings suggest that it is indeed possible to convey expectations to students through the use of rubrics, in the sense that students not only appreciate the efforts to make assessment criteria transparent, but may also use the criteria in order to support and self-assess their performance. Important features of the rubrics, which were found to facilitate students’ understanding and use of the criteria in these situations, are presented and discussed.

20.
Africa Education Review, 2013, 10(3): 415-428
The assessment rubric is increasingly gaining recognition as a valuable tool in teaching and learning in higher education. While many studies have examined the value of rubrics for students, research into the lecturers’ usage of rubrics is limited. This article explores the lecturers’ perceptions of rubrics, in particular their use and design and the role they can play in informing one's teaching practice and in curriculum review and development. The data show that many lecturers use the rubric in a very mechanical and unconscious manner and view it mostly as a grading tool with limited instructional value. While acknowledging the rubric as a reflective tool for students, lecturers do not perceive it as having the same benefits for them. The findings therefore suggest more conversations around the role that rubrics can play in informing one's teaching practice and course design. They also suggest further research into this area.
