Similar literature
Found 20 similar documents; search took 46 ms
1.
The effects of rating scale format (behaviorally anchored vs. Likert) and rater training on leniency and halo in student ratings of instruction were investigated. The subjects (N=269) were students enrolled in required courses at a graduate theological seminary in the Southwest United States. A repeated measures design controlling for teacher and course was used. Findings indicated: (a) training was effective in reducing leniency and halo in ratings from both instruments; (b) trained raters exhibited less leniency on two rating dimensions when using behaviorally anchored rating scales (BARS) than when using the Likert scale; and (c) trained raters exhibited less halo when using the Likert than when using the BARS. The findings demonstrate the importance of focusing efforts to improve quality of ratings on the students rather than on the format of the instrument. Presented at the Twenty-Eighth Annual Forum of the Association for Institutional Research, Phoenix, Ariz., May 1988.

2.
Teacher evaluation systems commonly rely on observation of teaching practice (OTP) by school principals. However, the value of OTP as evidence of teacher effectiveness depends on its psychometric quality. In this study, we address a key aspect of the psychometric quality of principals’ OTP ratings. Specifically, we investigate the degree to which rating scale categories have a consistent interpretation across teaching episodes and practices. Results suggest that the 1,324 principals’ use of the rating scale categories functioned as intended overall. However, we also found that the midpoint category is underutilized and that rating categories do not always reflect similar levels of teaching effectiveness across teaching episodes and practices. When such discrepancies occur, we cannot assume principals’ ratings reflect a consistent level of teacher effectiveness within and across classrooms. This is a critical component of validity evidence that can inform the interpretation of OTP ratings and point to areas for improvement in both the rubrics and in principals’ training for classroom observations.

3.
Historically, research focusing on rater characteristics and rating contexts that enable the assignment of accurate ratings and research focusing on statistical indicators of accurate ratings has been conducted by separate communities of researchers. This study demonstrates how existing latent trait modeling procedures can identify groups of raters who may be of substantive interest to those studying the experiential, cognitive, and contextual aspects of ratings. We employ two data sources in our demonstration: simulated data and data from a large‐scale state‐wide writing assessment. We apply latent trait models to these data to identify examples of rater leniency, centrality, inaccuracy, and differential dimensionality; and we investigate the association between rater training procedures and the manifestation of rater effects in the real data.

4.
5.
6.
Two hundred and thirty-six teachers were independently rated by their principals and supervisors on twenty-three scales of teacher competence. Each teacher received forty-six ratings (23 from a principal and 23 from a supervisor). The rating scales were intercorrelated and the resulting matrix factor analyzed. Two correlated factors emerged, one corresponding to principals’ ratings and the other to supervisors’ ratings. The results were interpreted to mean that the rating scales generated data that were more a reflection of the rater’s point of view than of a teacher’s actual classroom behavior.

7.
8.
Definitions of formative assessment include assessment, feedback and differentiated instruction as key components. We investigated the effects of prepared teaching materials designed to support teachers using learning progress assessment (LPA) to give feedback and adapt differentiated instruction. We also examined to what extent this modular approach can be implemented in regular reading lessons in third grade. In a three-group design all teachers (N = 44, N = 945 students) employed a computer-based LPA tool, while teachers in two conditions additionally received prepared feedback material (FB) or feedback and reading instruction material (FB+FM), to support the implementation of the different components of formative assessment. Over the course of one school year, we assessed the implementation outcomes using questionnaires as well as students’ reading achievement and further student outcomes. While acceptability is high, teacher ratings of feasibility are low. In comparison to the LPA group, the additional support in the form of prepared materials had no effects on student outcomes. Results are discussed regarding the question of how teachers can optimally be supported in using formative assessment.

9.
In science education, reform frequently is conceived and implemented in a top-down fashion, whether teachers are required to engage in change by their principals or superintendents (through high-stakes testing and accountability measures) or by researchers, who inform teachers about alternatives they ought to implement. In this position paper on science education policy, I draw on first philosophy to argue for a different approach to reform, one that involves all stakeholders (teachers, interns, school and university supervisors, and, above all, students) who participate in efforts to understand and change their everyday praxis of teaching and learning. Once all stakeholders experience control over the shaping and changing of classroom learning (i.e., experience agency), they may recognize that they really are in it together, that is, they experience a sense of solidarity. Drawing on ethnographic vignettes, science teaching examples, and philosophical concepts, I outline how more democratic approaches to reform can be enabled.

10.
In this article, different inspection models are compared in terms of their impact on school improvement and the mechanisms each of these models generates to have such an impact. Our theoretical framework was drawn from the programme theories of six countries’ school inspection systems (i.e. the Netherlands, England, Sweden, Ireland, the province of Styria in Austria and the Czech Republic). We describe how inspection models differ in the scheduling and frequency of visits (using a differentiated or cyclical approach), the evaluation of process and/or output standards, and the consequences of visits, and how these models lead to school improvement through the setting of expectations, the use of performance feedback and actions of the school's stakeholders. These assumptions were tested by means of a survey of principals in primary and secondary schools in these countries (n = 2239). The data analysis followed a three-step approach: (1) confirmatory factor analyses, (2) path modelling and (3) fitting of multiple-indicator multiple-cause models. The results indicate that Inspectorates of Education that use a differentiated model (in addition to regular visits), in which they evaluate both educational practices and outcomes of schools and publicly report inspection findings of individual schools, are the most effective. These changes seem to be mediated by improvements in the schools’ self-evaluations and the schools' stakeholders’ awareness of the findings in the public inspection reports. However, differentiated inspections also lead to unintended consequences as principals report on narrowing the curriculum and on discouraging teachers from experimenting with new teaching methods.

11.
Teacher evaluation commonly includes classroom observations conducted by principals. Despite widespread use, little is known about the quality of principal ratings. We investigated 1,324 principals’ rating accuracy of six teaching practices at the conclusion of training within an authentic teacher evaluation system. Data are from a video-based exam of four 10-minute classroom observations. Many-Facet Rasch modeling revealed that (1) overall principals had high accuracy, but individuals varied substantially, and (2) some teaching episodes and practices were easier to rate accurately. For example, promotes critical thinking was rated more accurately than uses formative assessment. Because Many-Facet Rasch modeling estimates individuals’ accuracy patterns across teaching episodes and practices, it is a useful tool for identifying areas that individual principals, or groups, may need additional training (e.g., evaluating formative assessment). Implications for improving training of principals to conduct classroom observations for teacher evaluation are discussed.

12.
Researchers have documented the impact of rater effects, or raters’ tendencies to give different ratings than would be expected given examinee achievement levels, in performance assessments. However, the degree to which rater effects influence person fit, or the reasonableness of test-takers’ achievement estimates given their response patterns, has not been investigated. In rater-mediated assessments, person fit reflects the reasonableness of rater judgments of individual test-takers’ achievement over components of the assessment. This study illustrates an approach to visualizing and evaluating person fit in assessments that involve rater judgment using rater-mediated person response functions (rm-PRFs). The rm-PRF approach allows analysts to consider the impact of rater effects on person fit in order to identify individual test-takers for whom the assessment results may not have a straightforward interpretation. A simulation study is used to evaluate the impact of rater effects on person fit. Results indicate that rater effects can compromise the interpretation and use of performance assessment results for individual test-takers. Recommendations are presented that call researchers and practitioners to supplement routine psychometric analyses for performance assessments (e.g., rater reliability checks) with rm-PRFs to identify students whose ratings may have compromised interpretations as a result of rater effects, person misfit, or both.

13.
The Classroom Assessment Scoring System (CLASS; Pianta et al., 2008) is a popular measure of teacher–child interactions. Despite its prominence, CLASS scores have fairly weak relations with various child outcomes (e.g., Zaslow et al., 2010). One potential reason for these findings could be systematic differences in observer severity. As such, the purpose of this study was to explore the scope and impact of rater effects on CLASS scores with a sample of 77 teachers who were rated by 13 observers. Results indicated significant rater effects across all three CLASS domains. Adjusting for these effects, however, did not improve relations between CLASS scores and child outcomes. Implications for the CLASS and related assessments are discussed.

14.
Teachers’ interpersonal behavior in class is important for teacher and student emotions. Often the same rater (either teacher or students) is used to assess both perceptions of teacher behavior and emotions, which makes the assessment vulnerable to common-method bias. Including other perspectives on teacher behavior has been proposed as a solution, but it is unclear to what extent different perspectives are correlated and how to separate their shared and unique variance in explaining emotions. Behavior of 80 teachers was rated from three perspectives (observers, students, and teachers) in terms of Agency (i.e., social influence) and Communion (i.e., friendliness). The three perspectives overlapped more strongly for teacher agency than for communion. Especially for students, teacher communion was a stronger predictor of emotions than agency. Our innovative statistical approach showed that the strong association between ratings of teacher behavior and emotions of the same rater is unlikely to result from common-method bias only.

15.
Despite considerable interest in the topic of instructional quality in research as well as practice, little is known about the quality of its assessment. Using generalizability analysis as well as content analysis, the present study investigates how reliably and validly instructional quality is measured by observer ratings. Twelve trained raters judged 57 videotaped lesson sequences with regard to aspects of domain-independent instructional quality. Additionally, 3 of these sequences were judged by 390 untrained raters (i.e., student teachers and teachers). Depending on scale level and dimension, 16–44% of the variance in ratings could be attributed to instructional quality, whereas rater bias accounted for 12–40% of the variance. Although the trained raters referred more often to aspects considered essential for instructional quality, this was not reflected in the reliability of their ratings. The results indicate that observer ratings should be treated in a more differentiated manner in the future.

16.
《师资教育杂志》2012,38(1):61-75
Teachers' thinking about four conceptions of teaching (i.e., apprenticeship‐developmental, nurturing, social reform, and transmission) was captured using the Teaching Perspectives Inventory (TPI). New Zealand and Queensland have very similar teaching‐related policies and practices but differences around assessment policies and practices are expected to influence teachers' conceptions of teaching. Results from two surveys (New Zealand primary (n = 241) and Queensland primary (n = 784) and secondary (n = 614) teachers) found acceptably fitting models. TPI models were not invariant between primary and secondary teachers in Queensland while the models for primary teachers in Queensland and New Zealand were partially invariant. There were only small differences in mean perspectives scores, except for transmission, which elicited large differences.

17.
This study examined similarities and differences in the perceptions of principals and teachers about the use of differentiated strategies for gifted learners and studied principals’ perceptions about schoolwide differentiation. Comparisons of these perceptions have been undocumented to date. Participants included 867 teachers and 120 principals from government schools in Sydney, Australia. A mixed methods approach was used, including online questionnaires and case studies of principals. Results revealed significant differences between the perceptions of principals and teachers about differentiated practices. The case studies demonstrate that exemplary principals continually enhance their understanding of differentiated learning and build their teachers’ collective capacity for educating gifted learners. The findings indicate the need for stronger pedagogical congruence between principals and teachers in educating the gifted, ongoing professional education of principals and teachers in gifted education, and effective leadership actions for schoolwide differentiated learning.

18.
This study examined the relationship of teachers’ ratings of students’ 21st century skills (i.e., persistence, curiosity, externalizing and internalizing affect, and cognition) via the Human Behavior Rating Scale: Brief (HBRS: Brief; Eaves & Woods‐Groves, 2011) with student performance. Midwestern K‐11 teachers (n = 96) rated students (n = 1,689) via the HBRS: Brief and the Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997). Students’ academic (i.e., standardized tests) and behavioral (i.e., office discipline referrals [ODRs] and absences) performance was compared with HBRS: Brief ratings. Hierarchical linear modeling revealed that teachers’ ratings of students’ 21st century skills were related as follows: (a) persistence with SDQ conduct problems, academic performance, and absences; (b) curiosity with SDQ emotional symptoms; (c) externalizing affect with SDQ conduct problems, academic performance, and ODRs; (d) internalizing affect with SDQ emotional symptoms and academic performance; and (e) cognition with academics.

19.
Research Findings: This study investigated the relationship between features of the classroom environment and misalignment between teacher and observer ratings of preschoolers’ classroom engagement and the extent to which years of teaching experience moderated this relationship. In a sample of 116 preschoolers and 21 teachers in 29 classrooms, classroom engagement was assessed using teacher report and independent direct observation. Classroom-level predictors included severity of challenging behaviors, child:adult ratio, and the percentage of students from economically disadvantaged backgrounds. Lower misalignment was noted on ratings of negative engagement than positive engagement. Multilevel regression analyses revealed that misalignment was largely independent of variation in classroom factors, except for a consistent interaction between years of teaching experience and child:adult ratio. Specifically, observers’ ratings were less misaligned with novice teachers’ ratings than veterans’ in classrooms with a high child:adult ratio, whereas the opposite trend was found in classrooms with low child:adult ratios. Practice or Policy: Findings are discussed in light of current best practice calling for a multidisciplinary approach when conducting assessments on young children, despite little guidance on how to handle misalignment between raters when determining service eligibility and developing effective interventions.

20.
The past decade has witnessed growing interest in the study of the perceptual differences between principals and teachers, and a number of inconsistent results have been documented. This study examined differences between principals’ and teachers’ perceptions of principal instructional leadership and tested the hypothesis that power distance (PD) moderates the differences between the two parties. Based on survey data collected from 132 Chinese principals and 1708 teachers, the results revealed no significant differences in the total and dimensional levels of instructional leadership; however, PD moderated the perceptual differences. Specifically, when the principals reported a low PD, their self-ratings of their instructional leadership were higher than the teachers’ ratings, and conversely, when the principals reported a high PD, their self-ratings were lower than the teachers’ evaluations. However, the result was contrary to the hypothesis when PD was reported by teachers. The theoretical and practical implications are discussed.
