1.

Uncovering Multivariate Structure in Classroom Observations in the Presence of Rater Errors

Daniel F. McCaffrey Kun Yuan Terrance D. Savitsky J. R. Lockwood Maria O. Edelen 《Educational Measurement》2015,34(2):34-46

We examine the factor structure of scores from the CLASS‐S protocol obtained from observations of middle school classroom teaching. Factor analysis has been used to support both interpretations of scores from classroom observation protocols, like CLASS‐S, and the theories about teaching that underlie them. However, classroom observations contain multiple sources of error, most predominantly rater errors. We demonstrate that errors in scores made by two raters on the same lesson have a factor structure that is distinct from the factor structure at the teacher level. Consequently, the “standard” approach of analyzing teacher‐level average dimension scores can yield incorrect inferences about the factor structure at the teacher level and possibly misleading evidence about the validity of scores and theories of teaching. We consider alternative hierarchical estimation approaches designed to prevent the contamination of estimated teacher‐level factors. These alternative approaches find a teacher‐level factor structure for CLASS‐S that consists of strongly correlated support and classroom management factors. Our results have implications for future studies using factor analysis on classroom observation data to develop validity evidence and test theories of teaching and for practitioners who rely on the results of such studies to support their use and interpretation of the classroom observation scores.

2.

An experimental evaluation of three teacher quality measures: Value-added, classroom observations, and student surveys

《Economics of Education Review》2019

Nearly every state evaluates teacher performance using multiple measures, but evidence has largely shown that only one such measure—teachers’ effects on student achievement (i.e., value-added)—captures teachers’ causal effects. We conducted a random assignment experiment in 66 fourth- and fifth-grade mathematics classrooms to evaluate the predictive validity of three measures of teacher performance: value-added, classroom observations, and student surveys. Combining our results with those from two previous random assignment experiments, we provide additional experimental evidence that value-added measures are unbiased predictors of teacher performance. Though results for the other two measures are less precise, we find that classroom observation scores are predictive of teachers’ performance after random assignment while student surveys are not. These results thus lend support to teacher evaluation systems that use value-added and classroom observations, but suggest practitioners should proceed with caution when considering student survey measures for teacher evaluation.

3.

A hierarchical approach to Students’ Assessments of Instruction

Alan Socha 《Assessment & Evaluation in Higher Education》2013,38(1):94-113

A teacher evaluation system can be threatening to faculty, especially if used for summative decisions. Therefore, it is important to obtain valid and pertinent information. Since students are extensively exposed to course elements, students’ evaluation of instruction should be one of several components in the teacher evaluation system. Since traditional methods, such as Cronbach’s alpha and ordinary least squares regression, do not address the hierarchical data of the classroom, the current study used the statistical techniques of confirmatory factor analysis and hierarchical linear modelling in order to properly investigate the reliability and validity of the Students’ Assessment of Instruction (SAI) instrument. Use of hierarchical linear modelling to analyse teacher evaluation instruments could not be found in the literature, although it has been used in educational settings. This study will illustrate its usefulness in determining what measures are related, either as evidence of validity or as a bias, to instructional effectiveness. Student responses were also compared with faculty self-evaluations, one indicator of effective teaching, in order to determine if the SAI does measure instructional effectiveness. Overall, the SAI was found to have good reliability and validity with relatively few biases and could be used to extract five distinguishable traits of instructional effectiveness.

4.

The Influence of Technology on the Classroom Climate of Social Studies Classrooms: A Multidimensional Approach

Mucherah Wilfridah M. 《Learning Environments Research》2003,6(1):37-57

This study examined the classroom climate in social studies classrooms using technology as measured by the Classroom Climate Questionnaire (CCQ), classroom observations, and teacher interviews. The questionnaire was administered to 306 students in 14 classrooms across three public urban middle schools. Exploratory and confirmatory factor analyses revealed six factors: Teacher Support and Structure, Rule Clarity and Teacher Control, Involvement in Teacher Structured Activities, Involvement with Computers, Competition with Computers, and Innovation. Significant differences among schools emerged for Involvement in Teacher Structured Activities, Innovation, and Involvement with Computers. Gender effects were found for Involvement with Computers and Competition with Computers, with males reporting higher scores than females on both aspects of classroom climate. Observational data and interviews provide interpretive information. Implications and future research are discussed.

5.

Teacher self-efficacy: a classroom-organization conceptualization

《Teaching and Teacher Education》2002,18(6):675-686

In current conceptualizations teacher sense of self-efficacy relates mainly to teaching tasks, in particular within the classroom context. This article offers a new conceptualization of teacher self-efficacy based on a broader work spectrum, comprising classroom and school-organizational contexts, with empirical evidence to support its validity. Participants were 555 teachers who served as respondents, filling out a self-report questionnaire. In a factor analysis of the scores, a two-factor structure emerged which consisted of teacher self-efficacy in the classroom and in the school-organizational domain. Each factor possessed professional tasks and inter-relation elements. The study suggests a new definition of teacher self-efficacy.

6.

A Validity Argument in Support of the Use of College Admissions Test Scores for Federal Accountability

Wayne J. Camara Krista Mattern Michelle Croft Sara Vispoel Paul Nichols 《Educational Measurement》2019,38(4):12-26

In 2018, 26 states administered a college admissions test to all public school juniors. Nearly half of those states proposed to use those scores as their academic achievement indicators for federal accountability under the Every Student Succeeds Act (ESSA); many others are planning to use those scores for other accountability purposes. Accountability encompasses a number of different uses and subsumes a variety of claims. For states proposing to use summative tests for accountability, a validity argument needs to be developed, which entails delineating each specific use of test scores associated with accountability, identifying appropriate evidence, and offering a rebuttal to counterclaims. The aim of this article is to support states in developing a validity argument for use of college admission test scores for accountability by identifying claims that are applicable across states, along with summarizing existing evidence as it relates to each of these claims. As outlined by The Standards for Educational and Psychological Testing, multiple sources of evidence are used to address each claim. A series of threats to the validity argument, including weaker alignment with content standards and potential influences in narrowing teaching, are reviewed. Finally, the article contrasts validity evidence, primarily from research on the ACT, with regulatory requirements from ESSA. The Standards and guidance addressing the use of a “nationally recognized high school academic assessment” (Elementary and Secondary Education Act (ESEA), Negotiated Rulemaking Committee; Department of Education) are the primary sources for the organization of validity evidence.

7.

A comparison of actual and preferred classroom environments as perceived by science teachers and students

Darrell L. Fisher Barry J. Fraser 《Journal of Research in Science Teaching》1983,20(1):55-61

This study of perceptions of classroom environment is distinctive in that, first, it made use of two instruments (the Individualized Classroom Environment Questionnaire and Classroom Environment Scale) which have had very little use in prior science education research and, second, it involved assessment not only of student perceptions of actual environment, but also of student perceptions of preferred environments and teacher perceptions of actual environment. Administration of these instruments to a sample of 2175 junior high school students in 116 classes revealed that the environment scales exhibited satisfactory internal consistency reliability and discriminant validity in each of the three forms (student actual, student preferred, and teacher actual), and that there were some fascinating systematic differences between the profiles of environment scale scores obtained for the different forms. In particular, it was generally found that students preferred a more favorable classroom environment than was perceived as being actually present and that teachers perceived the environment of their classes more favorably than did students in the same classrooms.

8.

Concurrent Validity of the Classroom Strategies Scale–Teacher Form: A Preliminary Investigation

Linda A. Reddy Christopher M. Dudek Angelique J. Rualo Gregory A. Fabiano 《Educational Assessment》2016,21(4):267-277

The present study investigated the concurrent validity of the Classroom Strategies Scale–Teacher Form (CSS-T), a multidimensional teacher formative assessment of instructional and behavioral management practices. The CSS-T is compared with the Classroom Assessment Scoring System (CLASS), a well-known teacher assessment of overall classroom quality. A sample of 126 kindergarten through 5th-grade general education teachers self-reported on their usage of empirically supported instructional and behavioral management strategies as measured by the CSS-T while a certified independent observer completed the CLASS. Correlational analyses were used to compare CSS-T frequency and discrepancy scores and the CLASS scores. As hypothesized, results demonstrated significant positive (CSS-T frequency scale scores) and negative (CSS-T discrepancy scale scores) correlations between specific CLASS domains and dimensions, thus providing initial evidence for the concurrent and discriminant validity of the CSS-T. Implications of findings are discussed.

9.

The effect of a teacher questioning strategy training program on teaching behavior, student achievement, and retention

Paul B. Otto Robert F. Schuck 《Journal of Research in Science Teaching》1983,20(6):521-528

The use of questions in the classroom has been employed throughout the recorded history of teaching. One still hears the term “Socratic method” during discussions of questioning procedures. The use of teacher questions is presently viewed as a viable procedure for effective instruction. This study was conducted to investigate the feasibility of training teachers in the use of a questioning technique and the resultant effect upon student learning. The Post-Test Only Control Group Design was used in randomly assigning teachers and students to experimental and control groups. A group of teachers was trained in the use of a specific questioning technique. Follow-up periodic observations were made of questioning technique behavior while teaching science units to groups of students. Post-unit achievement tests were administered to the student groups to obtain evidence of a relationship between the implementation of specific types of teacher questions and student achievement and retention. Analysis of observation data indicated a higher use of managerial and rhetorical questions by the control group than the experimental group. The experimental group employed a greater number of recall and data gathering questions as well as higher order data processing and data verification type questions. The student posttest achievement scores for both units of instruction were greater for the experimental groups than for the control groups. The retention scores for both units were greater for the experimental groups than for the control groups.

10.

Use of multitrait-multimethod modelling to validate actual and preferred forms of the What Is Happening In this Class? (WIHIC) questionnaire

Jeffrey P. Dorman 《Learning Environments Research》2008,11(3):179-193

This article describes the validation of scores on actual and preferred forms of the What Is Happening In this Class? (WIHIC). The WIHIC is a 56-item instrument that assesses seven classroom environment dimensions: Student Cohesiveness, Teacher Support, Involvement, Investigation, Task Orientation, Cooperation and Equity. A sample of 978 secondary school students from Australia responded to actual and preferred forms of the WIHIC. Separate confirmatory factor analyses for the actual and preferred forms supported the seven-scale a priori structure of the instrument. Fit statistics indicated a good fit of the models to the data. The use of multitrait-multimethod modelling with the seven scales as traits and the two forms of the instrument as methods supported the WIHIC’s construct validity. This research has provided strong evidence of the sound psychometric properties of the WIHIC.

11.

Validation of a National Teacher Assessment and Improvement System

Sandy Taut María Verónica Santelices Brian Stecher 《Educational Assessment》2013,18(4):163-199

The task of validating a teacher assessment and improvement system is similar whether the system operates in the United States or in another country. Chile has a national teacher evaluation system (NTES) that is standards based, uses multiple instruments, and is intended to serve both formative and summative purposes. For the past 6 years the authors have performed validation research on NTES using a variety of methods and data sources. This article describes our validation research agenda, the results of major validation studies, and an integration of the existing evidence, and it offers the authors' preliminary judgment about NTES's validity. The article also offers a critical reflection regarding the decisions taken while driving the long and winding validation road, and the lessons we learned during this politically and methodologically complex journey.

12.

Comparing teaching practices, teacher content knowledge and pay in Punjab

《International Journal of Educational Development》2020

This study utilises a unique school based survey of public, private and Public Partnership Programme (PPP) schools in Punjab, Pakistan to identify the correlates of: teacher behaviour in the classroom, teacher content knowledge mastery, and teacher pay. The study finds that private school teachers are associated with lower classroom observation scores than public school teachers, and both private and PPP school teachers are less likely to exhibit content knowledge mastery in Urdu than public school teachers (however, this is not the case for math and English). Public school teachers recruited after the introduction of test-based teacher recruitment are 30 percentage points more likely to demonstrate math content knowledge mastery than teachers recruited prior to test-based recruitment. The study also finds evidence of a pay gap between male and female teachers in both public and PPP schools after controlling for other characteristics. Teachers’ math content knowledge scores were found to be statistically significant correlates of teacher pay for private school teachers, but not for PPP or public school teachers. For female PPP school teachers, having higher academic qualifications is associated with higher wages; however, this is not the case for male teachers in these schools.

13.

Teachers' Instructional Use of Summative Student Assessment Data

Nancy R. Hoover 《Applied Measurement in Education》2013,26(3):219-231

This study examines the extent to which classroom teachers self-report using summative assessment data in formative ways to shape instruction. A Web-based survey was administered to elementary, middle, and high school teachers in a large, suburban school district in central Virginia. Teachers reported administering a variety of summative assessments with varying frequency, and analyzing data at the aggregate level, most often using central tendency statistics. Useful methods for disaggregating data by content standards or student subgroups were not as frequently reported. Regardless of the methods of data analysis, a majority of teachers reported using assessment results to evaluate their instructional practice and make adjustments to support student learning. The results suggest, however, that teachers engaged in a cursory analysis of student performance fairly regularly but conducted more in-depth analyses less often. The study raises questions about how teachers can effectively use summative data for instructional purposes.

14.

Teacher Candidates' Perceptions About Grading and Constructivist Teaching

Sarah M. Bonner Peggy P. Chen 《Educational Assessment》2013,18(2):57-77

The Survey of Assessment Beliefs (SAB) was developed to measure teacher candidates' perceptions about grading practices. After piloting, the SAB was administered to 222 teacher candidates at a large northeastern urban university, along with a measure of their beliefs about teaching. Candidates were found to support many grading practices not recommended by professional standards. Support for grading practices that deviate from professional recommendations was positively associated with support for constructivist approaches. Significant differences were found in grading and teaching attitudes between elementary and secondary education teacher candidates. Teacher candidates became more moderate in endorsing nonstandard grading practices following coursework in classroom assessment but on average maintained a tendency to approve academically enabling grading practices. This study provides empirical evidence about possible areas of tension between constructivist learning theory and principles of educational measurement, and it helps classroom assessment teachers understand the needs of their target audiences.

15.

Building and Supporting a Validity Argument for a Standards‐Based Classroom Assessment of English Proficiency Based on Teacher Judgments

Lorena Llosa 《Educational Measurement》2008,27(3):32-42

Using an argument‐based approach to validation, this study examines the quality of teacher judgments in the context of a standards‐based classroom assessment of English proficiency. Using Bachman's (2005) assessment use argument (AUA) as a framework for the investigation, this paper first articulates the claims, warrants, rebuttals, and backing needed to justify the link between teachers' scores on the English Language Development (ELD) Classroom Assessment and the interpretations made about students' language ability. Then the paper summarizes the findings of two studies—one quantitative and one qualitative—conducted to gather the necessary backing to support the warrants and, in particular, address the rebuttals about teacher judgments in the argument. The quantitative study examined the assessment in relation to another measure of the same ability—the California English Language Development Test—using confirmatory factor analysis of multitrait‐multimethod data and provided evidence in support of the warrant that states that the ELD Classroom Assessment measures English proficiency as defined by the California ELD Standards. The qualitative study examined the processes teachers engaged in while scoring the classroom assessment using verbal protocol analysis. The findings of this study serve to support the rebuttals in the validity argument that state that there are inconsistencies in teachers' scoring. The paper concludes by providing an explanation for these seemingly contradictory findings using the AUA as a framework and discusses the implications of the findings for the use of standards‐based classroom assessments based on teacher judgments.

16.

Implementing summative assessment with a formative flavour: a case study in a large class

Jaclyn Broadbent Ernesto Panadero David Boud 《Assessment & Evaluation in Higher Education》2018,43(2):307-322

Teaching a large class can present real challenges in design, management and standardisation of assessment practices. One of the main dilemmas for university teachers is how to implement effective formative assessment practices with accompanying high-quality feedback consistently over time with large classroom groups. This article reports on how elements of formative practices can be implemented as part of summative assessment in very large undergraduate cohorts (n = 1500 in one semester), studying in different modes (on- and off-campus), with multiple markers, and under common cost and time constraints. Design features implemented include the use of exemplars, rubrics and audio feedback. The article draws on the reflections of the leading teacher, and argues that, for summative assessment to benefit learners, it should contain formative assessment elements. The teaching practices utilised in the case study provide some means to resolve the tensions between formative assessment and summative assessment that may be more generally applicable.

17.

Inside the pre-kindergarten door: Classroom climate and instructional time allocation in Tulsa's pre-K programs

Deborah A. Phillips William T. Gormley Amy E. Lowenstein 《Early Childhood Research Quarterly》2009

18.

A comparative judgement approach to teacher assessment

Suzanne McMahon 《Assessment in Education: Principles, Policy & Practice》2015,22(3):368-389

We report one teacher’s response to a top-down shift from external examinations to internal teacher assessment for summative purposes in the Republic of Ireland. The teacher adopted a comparative judgement approach to the assessment of secondary students’ understanding of a chemistry experiment. The aims of the research were to investigate whether comparative judgement can produce assessment outcomes that are valid and reliable without producing undue workload for the teachers involved. Comparative judgement outcomes correlated as expected both with test marks and with existing student achievement data, supporting the validity of the approach. Further analysis suggested that teacher judgement privileged scientific understanding, whereas marking privileged factual recall. The estimated reliability of the outcome was acceptably high, but comparative judgement was notably more time-consuming than marking. We consider how validity and efficiency might be improved and the contributions that comparative judgement might offer to summative assessment, moderation of teacher assessment and peer assessment.

19.

Measuring elementary teacher stress and coping in the classroom: Validity evidence for the Classroom Appraisal of Resources and Demands

Richard G. Lambert Christopher McCarthy Megan O'Donnell Chuang Wang 《Psychology in the schools》2009,46(10):973-988

The Classroom Appraisal of Resources and Demands (CARD, elementary version) was used to investigate teacher stress among a sample of elementary teachers (n = 521). The CARD measures teacher stress by examining the subjective experience of both classroom demands and resources provided by the school, and thereby attempts to capture the situationally specific nature of teacher stress. This study attempted to examine whether the CARD can provide reliable and valid information that addresses the call by experts in the field of teacher stress research for measures that consider each teacher's specific occupational circumstances. Specifically, the factor structure of the CARD was supported empirically. Further evidence was offered for the construct and concurrent validity by correlations between CARD scales scores and other measures theoretically relevant to teacher well‐being: general health, teacher efficacy, self‐critical attitudes, and burnout symptoms. © 2009 Wiley Periodicals, Inc.

20.

Teachers Speak Out on Assessment Practices

Shannan McNair Ambika Bhargava Leah Adams Sally Edgerton Bess Kypros 《Early Childhood Education Journal》2003,31(1):23-31

A 1997 statewide survey of Michigan teachers, administrators, and parents about assessment practices revealed that all 3 groups held similar views about what constitutes appropriate assessment in the early years, and they put little faith in test scores. This study reports on follow-up interviews aimed at determining the types, frequency, and utility of assessment techniques used by classroom teachers. Specifically, this study focused on the types of assessment techniques used by a sample of elementary teachers, including how often they use paper-and-pencil tests, how often they write observation notes and what they do with the notes, whether they use children's portfolios as assessment, and whether their teaching is influenced by mandated tests. Study findings revealed that paper-and-pencil tests were regularly used by teachers in grades 3 and 4 (92%), and rarely or occasionally used by the teachers below that level (16% rarely and 20% occasionally). Seventy-three percent of the early level teachers and 76% of the teachers in grades 3 and 4 used observation for summative rather than formative analysis. Teachers in both groups used checklists frequently, primarily for summative purposes. Portfolios, like other assessment tools, are used primarily for summative rather than formative purposes.