1.

The use of scoring rubrics for formative assessment purposes revisited: A review

《Educational Research Review》2013

The mainstream research on scoring rubrics has emphasized the summative aspect of assessment. In recent years, the use of rubrics for formative purposes has gained more attention. This research has, however, not been conclusive. The aim of this study is therefore to review the research on formative use of rubrics, in order to investigate if, and how, rubrics have an impact on student learning. In total, 21 studies about rubrics were analyzed through content analysis. Sample, subject/task, design, procedure, and findings were compared among the different studies in relation to effects on student performance and self-regulation. Findings indicate that rubrics may have the potential to influence students' learning positively, but also that there are several different ways for the use of rubrics to mediate improved performance and self-regulation. There are a number of factors identified that may moderate the effects of using rubrics formatively, as well as factors that need further investigation.

2.

A review of rubric use in higher education (total citations: 5; self-citations: 3; citations by others: 2)

Y. Malini Reddy, Heidi Andrade 《Assessment & Evaluation in Higher Education》2010,35(4):435-448

This paper critically reviews the empirical research on the use of rubrics at the post‐secondary level, identifies gaps in the literature and proposes areas in need of research. Studies of rubrics in higher education have been undertaken in a wide range of disciplines and for multiple purposes, including increasing student achievement, improving instruction and evaluating programmes. While student perceptions of rubrics are generally positive and some authors report positive responses to rubric use by instructors, others noted a tendency for instructors to resist using them. Two studies suggested that rubric use was associated with improved academic performance, while one did not. The potential of rubrics to identify the need for improvements in courses and programmes has been demonstrated. Studies of the validity of rubrics have shown that clarity and appropriateness of language is a central concern. Studies of rater reliability tend to show that rubrics can lead to a relatively common interpretation of student performance. Suggestions for future research include the use of more rigorous research methods, more attention to validity and reliability, a closer focus on learning and research on rubric use in diverse educational contexts.

3.

Standards-based education: using scoring rubrics to promote student learning

Bian Yufang, Shan Huaijun 《教育理论与实践》 (Theory and Practice of Education) 2006,(13)

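Education should be standards-based: corresponding performance criteria and scoring rubrics should run through the entire teaching process, so that education can truly achieve its intended goals. Performance criteria and scoring rubrics play an important role in standards-based education and should be used to promote teaching and learning.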

4.

An Investigation of Rater Cognition in the Assessment of Projects

Victoria Crisp 《Educational Measurement》2012,31(3):10-20

In the United Kingdom, the majority of national assessments involve human raters. The processes by which raters determine the scores to award are central to the assessment process and affect the extent to which valid inferences can be made from assessment outcomes. Thus, understanding rater cognition has become a growing area of research in the United Kingdom. This study investigated rater cognition in the context of the assessment of school‐based project work for high‐stakes purposes. Thirteen teachers across three subjects were asked to “think aloud” whilst scoring example projects. Teachers also completed an internal standardization exercise. Nine professional raters across the same three subjects standardized a set of project scores whilst thinking aloud. The behaviors and features attended to were coded. The data provided insights into aspects of rater cognition such as reading strategies, emotional and social influences, evaluations of features of student work (which aligned with scoring criteria), and how overall judgments are reached. The findings can be related to existing theories of judgment. Based on the evidence collected, the cognition of teacher raters did not appear to be substantially different from that of professional raters.

5.

Rubrics vs. self-assessment scripts effect on self-regulation, performance and self-efficacy in pre-service teachers

Ernesto Panadero, Jesús Alonso-Tapia, Eloísa Reche 《Studies in Educational Evaluation》2013

6.

Assessing Knowledge Integration in Science: Construct, Measures, and Evidence

Ou Lydia Liu, Hee-Sun Lee, Carolyn Hofstetter, Marcia C. Linn 《Educational Assessment》2013,18(1):33-55

In response to the demand for sound science assessments, this article presents the development of a latent construct called knowledge integration as an effective measure of science inquiry. Knowledge integration assessments ask students to link, distinguish, evaluate, and organize their ideas about complex scientific topics. The article focuses on assessment topics commonly taught in 6th- through 12th-grade classes. Items from both published standardized tests and previous knowledge integration research were examined in 6 subject-area tests. Results from Rasch partial credit analyses revealed that the tests exhibited satisfactory psychometric properties with respect to internal consistency, item fit, weighted likelihood estimates, discrimination, and differential item functioning. Compared with items coded using dichotomous scoring rubrics, those coded with the knowledge integration rubrics yielded significantly higher discrimination indexes. The knowledge integration assessment tasks, analyzed using knowledge integration scoring rubrics, demonstrate strong promise as effective measures of complex science reasoning in varied science domains.

7.

Self‐Assessment in Art Education through a Visual Rubric

Talita Groenendijk, Andrea Kárpáti, Folkert Haanstra 《The International Journal of Art & Design Education》2020,39(1):153-175

A tool for self‐assessment in secondary art education was developed and tested. The tool includes rubrics for assessing production and reception activities in art education and consists of visual and text rubrics. The criteria in the rubrics are based on the Common European Framework of Reference for Visual Literacy which was developed by The European Network of Visual Literacy (ENViL). The way teachers and students use the rubrics, whether they consider them helpful and to what extent students’ self‐assessments are in line with teacher assessments was studied. It was concluded that teachers work with the rubrics intensively and both students and teachers appreciate its visual form. However, it was found that the agreement between teachers and students about the students’ scores was moderate and needed to improve. The results show that it is untrue that students, or boys in particular, overestimate their own performance in art education. The current study contributes to the development of feasible and valid assessment criteria and instruments in secondary art education.

8.

Automated Scoring of Constructed‐Response Science Items: Prospects and Obstacles

Ou Lydia Liu, Chris Brew, John Blackmore, Libby Gerard, Jacquie Madhok, Marcia C. Linn 《Educational Measurement》2014,33(2):19-28

Content‐based automated scoring has been applied in a variety of science domains. However, many prior applications involved simplified scoring rubrics without considering rubrics representing multiple levels of understanding. This study tested a concept‐based scoring tool for content‐based scoring, c‐rater™, for four science items with rubrics aiming to differentiate among multiple levels of understanding. The items showed moderate to good agreement with human scores. The findings suggest that automated scoring has the potential to score constructed‐response items with complex scoring rubrics, but in its current design cannot replace human raters. This article discusses sources of disagreement and factors that could potentially improve the accuracy of concept‐based automated scoring.

9.

Measuring Graph Comprehension, Critique, and Construction in Science

Kevin Lai, Julio Cabrera, Jonathan M. Vitale, Jacquie Madhok, Robert Tinker, Marcia C. Linn 《Journal of Science Education and Technology》2016,25(4):665-681

Interpreting and creating graphs plays a critical role in scientific practice. The K-12 Next Generation Science Standards call for students to use graphs for scientific modeling, reasoning, and communication. To measure progress on this dimension, we need valid and reliable measures of graph understanding in science. In this research, we designed items to measure graph comprehension, critique, and construction and developed scoring rubrics based on the knowledge integration (KI) framework. We administered the items to over 460 middle school students. We found that the items formed a coherent scale and had good reliability using both item response theory and classical test theory. The KI scoring rubric showed that most students had difficulty linking graph features to science concepts, especially when asked to critique or construct graphs. In addition, students with limited access to computers as well as those who speak a language other than English at home have less integrated understanding than others. These findings point to the need to increase the integration of graphing into science instruction. The results suggest directions for further research leading to comprehensive assessments of graph understanding.

10.

The contribution of rubrics to the validity of performance assessment: a study of the conservation–restoration and design undergraduate degrees

José-Luis Menéndez-Varela 《Assessment & Evaluation in Higher Education》2016,41(2):228-244

Rubrics have attained considerable importance in the authentic and sustainable assessment paradigm; nevertheless, few studies have examined their contribution to validity, especially outside the domain of educational studies. This empirical study used a quantitative approach to analyse the validity of a rubrics-based performance assessment. Raters evaluated the performance of 84 first-year university students producing service-learning projects for the Conservation–Restoration and Design degrees. The study data comprised the 9240 scores given by two teachers and three student tutors, who assessed the students’ projects on three occasions during the semester. Factor analyses confirmed that the students attained the expected learning outcomes and made significant learning progress. This learning progress was also corroborated by analyses of variance. The attainment of the learning goals and the evidence of learning progress demonstrated the validity of the inferences drawn from the assessment system. In addition, the results highlighted the need to consider rubrics as a first-order teaching resource and not only as a scoring tool.

11.

Designing and Validating Assessments of Complex Thinking in Science

Kihyun Ryoo, Marcia C. Linn 《Theory Into Practice》2015,54(3):238-254

Typical assessment systems often measure isolated ideas rather than the coherent understanding valued in current science classrooms. Such assessments may motivate students to memorize, rather than to use new ideas to solve complex problems. To meet the requirements of the Next Generation Science Standards, instruction needs to emphasize sustained investigations, and assessments need to create a detailed picture of students’ conceptual understanding and reasoning processes.

This article describes the design process and potential for automated scoring of 2 forms of inquiry assessment: Energy Stories and MySystem. To design these assessments, we formed a partnership of teachers, discipline experts, researchers, technologists, and psychometricians to align curriculum, assessments, and rubrics. We illustrate how these items document middle school students’ reasoning about energy flow in life science. We used evidence from review by science teachers and experts in the discipline; classroom experiments; and psychometric analysis to validate the assessments, rubrics, and automated scoring.

12.

Using automated analysis to assess middle school students' competence with scientific argumentation

Christopher D. Wilson, Kevin C. Haudek, Jonathan F. Osborne, Zoë E. Buck Bracey, Tina Cheuk, Brian M. Donovan, Molly A. M. Stuhlsatz, Marisol M. Santiago, Xiaoming Zhai 《Journal of Research in Science Teaching》2024,61(1):38-69

Argumentation is fundamental to science education, both as a prominent feature of scientific reasoning and as an effective mode of learning—a perspective reflected in contemporary frameworks and standards. The successful implementation of argumentation in school science, however, requires a paradigm shift in science assessment from the measurement of knowledge and understanding to the measurement of performance and knowledge in use. Performance tasks requiring argumentation must capture the many ways students can construct and evaluate arguments in science, yet such tasks are both expensive and resource-intensive to score. In this study we explore how machine learning text classification techniques can be applied to develop efficient, valid, and accurate constructed-response measures of students' competency with written scientific argumentation that are aligned with a validated argumentation learning progression. Data come from 933 middle school students in the San Francisco Bay Area and are based on three sets of argumentation items in three different science contexts. The findings demonstrate that we have been able to develop computer scoring models that can achieve substantial to almost perfect agreement between human-assigned and computer-predicted scores. Model performance was slightly weaker for harder items targeting higher levels of the learning progression, largely due to the linguistic complexity of these responses and the sparsity of higher-level responses in the training data set. Comparing the efficacy of different scoring approaches revealed that breaking down students' arguments into multiple components (e.g., the presence of an accurate claim or providing sufficient evidence), developing computer models for each component, and combining scores from these analytic components into a holistic score produced better results than holistic scoring approaches. 
However, this analytical approach was found to be differentially biased when scoring responses from English learner (EL) students as compared to responses from non-EL students on some items. Differences in the severity between human and computer scores for EL students between these approaches are explored, and potential sources of bias in automated scoring are discussed.

13.

Rubrics vs. self-assessment scripts: effects on first year university students’ self-regulation and performance / Rúbricas y guiones de autoevaluación: efectos sobre la autorregulación y el rendimiento de estudiantes universitarios de primer año

Ernesto Panadero, Jesús Alonso-Tapia, Juan-Antonio Huertas 《Infancia y Aprendizaje》2013,36(1):149-183

14.

Assessing Writing Portfolios: Issues in the Validity and Meaning of Scores

《Educational Assessment》2013,18(3):201-224

This article discusses an approach to analyzing performance assessments that identifies potential reasons for misfitting items and uses this information to improve on items and rubrics for these assessments. Specifically, the approach involves identifying psychometric features and qualitative features of items and rubrics that may possibly influence misfit; examining relations between these features and the fit statistic; conducting an analysis of student responses to a sample of misfitting items; and finally, based on the results of the previous analyses, modifying characteristics of the items or rubrics and reexamining fit. A mathematics performance assessment containing 53 constructed-response items scored on a holistic scale from 0 to 4 is used to illustrate the approach. The 2-parameter graded response model (Samejima, 1969) is used to calibrate the data. Implications of this method of data analysis for improving performance assessment items and rubrics are discussed as well as issues and limitations related to the use of the approach.

15.

Assessment as learning: Enhancing discourse, understanding, and achievement in innovative science curricula

Daniel T. Hickey, Gita Taasoobshirazi, Dionne Cross 《Journal of Research in Science Teaching》2012,49(10):1240-1270

An assessment‐oriented design‐based research model was applied to existing inquiry‐oriented multimedia programs in astronomy, biology, and ecology. Building on emerging situative theories of assessment, the model extends prevailing views of formative assessment for learning by embedding “discursive” formative assessment more directly into the curriculum. Three twenty‐hour curricula were designed and aligned to content standards, and three levels of assessments were developed and used to assess and enhance learning for each curriculum. These assessments included three or four informal “activity‐oriented” quizzes and discursive formative feedback rubrics supporting collective discourse, a “curriculum‐oriented” examination of individual conceptual understanding, and a “standards‐oriented” test measuring aggregated achievement of targeted standards. After two design‐research cycles, worthwhile scientific argumentation and statistically significant gains were attained for two of the three packages on the exam and test. Achievement gains were comparable to or larger than those of students in comparison classrooms. Many existing innovations could be enhanced and evaluated in this fashion; designing these strategies directly into innovations could have an even greater impact on discourse, understanding, and achievement. © 2012 Wiley Periodicals, Inc. J Res Sci Teach 49: 1240–1270, 2012

16.

Teachers' use of rubrics to score non‐traditional tasks: factors related to discrepancies in scoring

Sherry L. Meier, Beverly S. Rich, JoAnn Cady 《Assessment in Education: Principles, Policy & Practice》2006,13(1):69-95

This study considered middle school mathematics teachers’ use of rubrics to score non‐traditional tasks. A group of eighth‐grade teachers attended a two‐day workshop where they evaluated assessment tasks and discussed the use of an associated scoring rubric. Scored samples of student work submitted by the teachers indicated that they had difficulty using the rubrics for scoring. When compared to expert ratings, all except one teacher had discrepancies in scoring and some discrepancies indicated major problems. These discrepancies appear to be related to whether the task contained familiar or unfamiliar content and the mix of procedure and explanation the task required. Several other factors related to discrepancies, such as leniency errors, teacher knowledge, and the halo effect, are also discussed. With the expanded use of rubrics in many arenas, these results show the need for more professional development related to rubric use.

17.

Exploring the Impact of Rater Effects on Person Fit in Rater-Mediated Assessments

Stefanie A. Wind 《Educational Measurement》2020,39(4):76-94

Researchers have documented the impact of rater effects, or raters’ tendencies to give different ratings than would be expected given examinee achievement levels, in performance assessments. However, the degree to which rater effects influence person fit, or the reasonableness of test-takers’ achievement estimates given their response patterns, has not been investigated. In rater-mediated assessments, person fit reflects the reasonableness of rater judgments of individual test-takers’ achievement over components of the assessment. This study illustrates an approach to visualizing and evaluating person fit in assessments that involve rater judgment using rater-mediated person response functions (rm-PRFs). The rm-PRF approach allows analysts to consider the impact of rater effects on person fit in order to identify individual test-takers for whom the assessment results may not have a straightforward interpretation. A simulation study is used to evaluate the impact of rater effects on person fit. Results indicate that rater effects can compromise the interpretation and use of performance assessment results for individual test-takers. Recommendations are presented that call researchers and practitioners to supplement routine psychometric analyses for performance assessments (e.g., rater reliability checks) with rm-PRFs to identify students whose ratings may have compromised interpretations as a result of rater effects, person misfit, or both.

18.

RECONSTRUCTING CHILDREN'S EXPERIENCE WHEN TEACHING AND ASSESSING THEM: LESSONS FROM DEWEY

Francisco Sousa 《Journal of Early Childhood Teacher Education》2013,34(3):313-320

This paper describes an online early childhood assessment course that was developed through a multi-university collaboration with support from a state improvement grant. Collaborators from three universities developed the course to address a new early childhood unified license (birth to age 8, regular and special education) in the state of Kansas. After reviewing the new state content standards, we identified targeted understandings, performance assessments, and online activities for 15 modules using a backward design process. Emphasis was placed on active learning through synchronous and asynchronous interactions facilitated by the use of a course management system. Positive evidence of learning was indicated by anonymous student feedback, pretest/posttest gain scores, and performance assessments evaluated with rubrics. We viewed the implementation of the course as a success and anticipate that it may lead to more sharing of online coursework in Kansas teacher education programs in the future.

19.

Rubrics as a way of providing transparency in assessment

Anders Jonsson 《Assessment & Evaluation in Higher Education》2014,39(7):840-852

This paper reports on a study where rubrics have been used to convey assessment expectations to students (n = 176) in three different assessment situations in professional education. These situations are: (1) the development of a survey instrument, which was part of a course in statistics and epidemiology; (2) an inspection of a house, which was part of a course about the functions of buildings for real estate brokers and (3) a workshop in communication with patients, which was part of a course in the evaluation of diagnostic procedures and treatments of oral infections in dental education. In all situations, students’ perceptions and uses of the rubrics were investigated. Findings suggest that it is indeed possible to convey expectations to students through the use of rubrics, in the sense that students not only appreciate the efforts to make assessment criteria transparent, but may also use the criteria in order to support and self-assess their performance. Important features of the rubrics, which were found to facilitate students’ understanding and use of the criteria in these situations, are presented and discussed.

20.

Lecturers’ perceptions: The value of assessment rubrics for informing teaching practice and curriculum review and development

《Africa Education Review》2013,10(3):415-428

The assessment rubric is increasingly gaining recognition as a valuable tool in teaching and learning in higher education. While many studies have examined the value of rubrics for students, research into the lecturers’ usage of rubrics is limited. This article explores the lecturers’ perceptions of rubrics, in particular their use and design and the role they can play in informing one's teaching practice and in curriculum review and development. The data show that many lecturers use the rubric in a very mechanical and unconscious manner and view it mostly as a grading tool with limited instructional value. While acknowledging the rubric as a reflective tool for students, lecturers do not perceive it as having the same benefits for them. The findings therefore suggest more conversations around the role that rubrics can play in informing one's teaching practice and course design. They also suggest further research into this area.