Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
This study illustrates how generalizability theory can be used to evaluate the dependability of school-level scores in situations where test forms have been matrix sampled within schools, and to estimate the minimum number of forms required to achieve acceptable levels of score reliability. Data from a statewide performance assessment in reading, writing, and language usage were analyzed in a series of generalizability studies using a person:(school × form) design that provided variance component estimates for four sources: school, form, school × form, and person:(school × form). Six separate scores were examined. The results of the generalizability studies were then used in decision studies to determine the impact on score reliability when the number of forms administered within schools was varied. Results from the decision studies indicated that score generalizability could be improved when the number of forms administered within schools was increased from one to three forms, but that gains in generalizability were small when the number of forms was increased beyond three. The implications of these results for planning large-scale performance assessments are discussed.
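
As a rough illustration of the decision-study logic described in this abstract, the sketch below computes a relative generalizability coefficient for school means under a person:(school × form) design while the number of forms is varied. The formula is the standard one for this kind of nested design; the variance components, the function name, and the choice to hold the number of students per form fixed are all assumptions made for illustration and are not taken from the study.

```python
def school_mean_g_coefficient(var_school, var_school_form, var_person,
                              n_forms, n_persons_per_form):
    """Relative G coefficient for school means, person:(school x form) design.

    Relative error variance combines the school x form interaction,
    averaged over forms, and the person-within-cell residual, averaged
    over every student tested in the school.
    """
    relative_error = (var_school_form / n_forms
                      + var_person / (n_forms * n_persons_per_form))
    return var_school / (var_school + relative_error)


# Hypothetical variance components (not values reported by the study).
components = dict(var_school=0.10, var_school_form=0.03, var_person=0.87)

# Vary the number of forms, keeping 20 students per form (an assumption).
for n_forms in (1, 2, 3, 6):
    coef = school_mean_g_coefficient(n_forms=n_forms,
                                     n_persons_per_form=20,
                                     **components)
    print(f"forms={n_forms}: G coefficient = {coef:.3f}")
```

With components of this shape, the coefficient rises with each added form but by smaller and smaller amounts, which is the general pattern the abstract reports.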

2.
Increasingly, assessment practitioners use generalizability coefficients to estimate the reliability of scores from performance tasks. Little research, however, examines the relation between the estimation of generalizability coefficients and the number of rubric scale points and score distributions. The purpose of the present research is to inform assessment practitioners of (a) the optimum number of scale points necessary to achieve the best estimates of generalizability coefficients and (b) the possible biases of generalizability coefficients when the distribution of scores is non-normal. Results from this study indicate that the number of scale points substantially affects the generalizability estimates. Generalizability estimates increase as scale points increase, with little bias after scales reach 12 points. Score distributions had little effect on generalizability estimates.

3.
This study evaluates four growth prediction models, namely projection, student growth percentile, trajectory, and transition table, that are commonly used to forecast (and give schools credit for) middle school students' future proficiency. Analyses focused on vertically scaled summative mathematics assessments, and two performance standards conditions (high rigor and low rigor) were examined. Results suggest that, when “status plus growth” is the accountability metric a state uses to reward or sanction schools, growth prediction models offer value above and beyond status‐only accountability systems in most, but not all, circumstances. Predictive growth models offer little value beyond status‐only systems if the future target proficiency cut score is rigorous. Conversely, certain models (e.g., projection) provide substantial additional value when the future target cut score is relatively low. In general, growth prediction models' predictive value is limited by a lack of power to detect students who are truly on‐track. Limitations and policy implications are discussed, including the utility of growth projection models in assessment and accountability systems organized around ambitious college‐readiness goals.

4.
Student–teacher interactions are dynamic relationships that change and evolve over the course of a school year. Measuring classroom quality through observations that focus on these interactions presents challenges when observations are conducted throughout the school year. Variability in observed scores could reflect true changes in the quality of student–teacher interaction or simply reflect measurement error. Classroom observation protocols should be designed to minimize measurement error while allowing measurable changes in the construct of interest. Treating occasions as fixed multivariate outcomes allows true changes to be separated from random measurement error. These outcomes may also be summarized through trend score composites to reflect different types of growth over the school year. We demonstrate the use of multivariate generalizability theory to estimate reliability for trend score composites, and we compare the results to traditional methods of analysis. Reliability estimates computed for average, linear, quadratic, and cubic trend scores from 118 classrooms participating in the MyTeachingPartner study indicate that universe scores account for between 57% and 88% of observed score variance.

5.
Using student ratings to assess instructional quality of schools should fulfill three requirements: (1) an appropriate level of inter-rater agreement within schools, (2) systematic variance of student ratings between schools, and (3) an adequate reliability level of aggregated student ratings. Using international PISA data (2000–2012; 81 countries, over 55,300 schools, over 1.3 million 15-year-olds), this study investigated how these requirements were met regarding indicators of instructional quality (classroom management, cognitive activation, individual learning support). We computed the inter-rater agreement index rWG(J), as well as the intraclass correlations ICC(1) and ICC(2). Our results showed that (1) student ratings demonstrated a moderate or strong level of agreement for most indicators of instructional quality and (2) instructional quality assessed by students varied systematically between schools. Yet, (3) reliability of aggregated student ratings was not sufficient in many countries. We discuss these results regarding conventions to evaluate agreement, variability, and reliability of student ratings at the school level.
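
For readers unfamiliar with the two intraclass correlations named in this abstract, the following sketch shows one common way to obtain them from a balanced one-way ANOVA: ICC(1) as the reliability of a single student's rating and ICC(2) as the reliability of the school mean. The abstract does not say which estimator the authors used (a multilevel model is equally plausible), so the function name, the balanced-design restriction, and the toy ratings are assumptions for illustration only.

```python
from statistics import mean


def icc1_icc2(groups):
    """ICC(1) and ICC(2) from a balanced one-way ANOVA.

    groups: list of equally sized lists of ratings, one list per school.
    """
    k = len(groups[0])  # raters (students) per school
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("this sketch needs balanced groups of size >= 2")
    n = len(groups)  # number of schools
    if n < 2:
        raise ValueError("at least two schools are required")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Between-school and within-school mean squares.
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means)
                    for x in g) / (n * (k - 1))

    icc1 = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    icc2 = (ms_between - ms_within) / ms_between
    return icc1, icc2


# Toy ratings from three hypothetical schools, three students each.
ratings = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
single_rater, school_mean = icc1_icc2(ratings)
print(f"ICC(1) = {single_rater:.3f}, ICC(2) = {school_mean:.3f}")
```

In this formulation ICC(2) equals the Spearman-Brown step-up of ICC(1) for k raters, so with low ICC(1) values and few students per school the aggregated ratings can remain unreliable even when schools differ systematically.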

6.
In the UK, USA and elsewhere, school accountability systems increasingly compare schools using value-added measures of school performance derived from pupil scores in high-stakes standardised tests. Rather than naïvely comparing school average scores, which largely reflect school intake differences in prior attainment, these measures attempt to compare the average progress or improvement pupils make during a year or phase of schooling. Schools, however, also differ in terms of their pupil demographic and socioeconomic characteristics and these factors also predict why some schools subsequently score higher than others. Many therefore argue that value-added measures unadjusted for pupil background are biased in favour of schools with more ‘educationally advantaged’ intakes. But others worry that adjusting for pupil background entrenches socioeconomic inequities and excuses low-performing schools. In this article we explore these theoretical arguments and their practical importance in the context of the ‘Progress 8’ secondary school accountability system in England, which has chosen to ignore pupil background. We reveal how the reported low or high performance of many schools changes dramatically once adjustments are made for pupil background, and these changes also affect the reported differential performances of regions and of different school types. We conclude that accountability systems which choose to ignore pupil background are likely to reward and punish the wrong schools and this will likely have detrimental effects on pupil learning. These findings, especially when coupled with more general concerns surrounding high-stakes testing and school value-added models, raise serious doubts about their use in school accountability systems.

7.
The purpose of this study was to investigate methods of estimating the reliability of school-level scores using generalizability theory and multilevel models. Two approaches, ‘students within schools’ and ‘students within schools and subject areas,’ were conceptualized and implemented in this study. Four methods resulting from the combination of these two approaches with generalizability theory and multilevel models were compared for both balanced and unbalanced data. The generalizability theory and multilevel models for the ‘students within schools’ approach produced the same variance components and reliability estimates for the balanced data, while failing to do so for the unbalanced data. The different results from the two models can be explained by the fact that they use different procedures to estimate the variance components that are, in turn, used to estimate reliability. Among the estimation methods investigated in this study, the generalizability theory model with the ‘students nested within schools crossed with subject areas’ design produced the lowest reliability estimates. Fully nested designs such as (students:schools) or (subject areas:students:schools) would not have any significant impact on reliability estimates of school-level scores. Both methods provide very similar reliability estimates of school-level scores.

8.
This study assesses several policy implications of within‐school, between‐classroom variability in pupil achievement. It diverges from current school effect studies by directly modelling pupil achievement in the Jerusalem public primary school system. This three‐level study includes pupils, classrooms and schools, thus allowing an appropriate estimate of the variations between these three levels. The findings show that between‐classroom variability is consistently greater than the estimated variation between schools. These findings contrast with traditional school‐level analyses that usually ignore within‐school variability. In the light of these findings we address three educational and policy issues. First, we probe into the moral consequences of between‐classroom, within‐school variability, specifically focusing on issues of choice and commitment. Second, we scrutinize the administrative policy of ‘social integration’ and reflect on some educational consequences that result from our findings. Third, we assess the Israeli version of ‘school league tables’ and discuss their usefulness as a means of resource allocation.

9.
School education in Australia is a complex interplay between federal and state governments, and between government and non-government schools. This article explores the supervision of schools in Australia through school accountability systems. Utilising publicly available documents, a systematic analysis of the state and territory systems for government schools is provided. It is a paper that attempts to document rather than critique school accountability, although a conceptual framework utilising contractual, moral and professional accountability is used to analyse the different accountability processes reported upon. Contractual and moral accountability are supported by most systems, whilst there is potential to foster professional accountability in only two systems. Fostering professional accountability is important because this is where the internal motivation of teachers helps to drive school improvement. When compared to leading-edge systems, Australian accountability systems are lacking in judgements on teaching practice in individual classrooms, and the use of sophisticated measures of learning and value-added analysis.

10.
Despite considerable interest in research and practice in the effect of classroom disciplinary climate of schools on academic achievement, little is known about the generalizability of this effect over countries. Using hierarchical linear analyses, the present study reveals that a better classroom disciplinary climate in a school is significantly associated with better school reading performance in 53 of the 65 Programme for International Student Assessment (PISA) 2009 participant countries. The classroom disciplinary climate of schools can explain 11% of the between-school differences in reading achievement over countries. Controlling for economic, social, and cultural variables and student gender-related variables at student and school levels, the between-country differences in the effect of classroom disciplinary climate of schools shrank by three quarters. These findings can inform countries that face educational inequality issues (e.g., Argentina) and gender gap issues (e.g., Trinidad and Tobago), suggesting the possibility of tackling these issues via intervening on classroom disciplinary climate of schools.

11.
Previous studies in higher education have shown that the reliability of student ratings of teaching skill increases if multiple ratings by different students are aggregated. This study examines the generalizability of these findings to the context of secondary education. Also, it seeks to validate these findings by comparing reliability levels estimated by the routinely used nested design with those estimated using a more complex design. The sample consisted of 410 students from 17 classes rating 63 teachers working at eight schools across the Netherlands. Using the nested design, the study replicates findings of previous studies in higher education. The findings illustrate how the reliability level of secondary school students’ ratings increases with an increasing number of students. However, these replicated reliability levels were not validated by the more complex design which provided lower estimates. This indicates that the nested design may not provide accurate estimations of rating reliability.

12.
ABSTRACT

A more theoretical approach to effective schools research is needed, and a political systems model is an appropriate starting‐point since it directs attention to power issues, which are critical to school improvement. The model suggests that both internal and external influences on schools are important. There are four main classes of external influence: administrative; professional; societal; and familial. Each has the potential for strengthening or weakening school effectiveness. Studies of family influence on student learning and attitudes emphasize the potential of collaborative arrangements in which families and schools work together.

We argue that classroom and school improvement cannot be attained without changing the relationships between the three central figures ‐ teacher, student, and parent; this triad model is an ‘inside out’ version of school improvement, in which classroom and school improvement occurs as fundamental relationships between the triad members become more collaborative.

Our web metaphor suggests that those interested in research on effective schools should be sensitive to the impact of external influences; and that effective schools link participants together into a collaborative and responsive mutual influence system, the integrated school environment, the school level version of a political systems model, in which all gain.

13.
Abstract

Prior to the 2012–13 school year, New York and many other states underwent changes to their accountability systems as a result of applying for and being granted waivers from the requirements of the No Child Left Behind Act of 2001. A key component of these new accountability systems, under what is known as ESEA Flexibility or NCLB Waivers, was the designation of the lowest performing 5% of Title I schools as priority schools with the goal of improved performance within three years of receiving their designation. The priority school policy included elements of both accountability and school turnaround to try to improve student outcomes in low performing schools. This study examines the extent to which elementary and middle priority schools in New York State improved in the three years since being designated priority schools. By the end of the 2014–15 school year, the third year of three to show improvement, I find elementary and middle priority schools did not show improvement and, in fact, performed worse than schools just above the cutoff for determining priority school eligibility.

14.
Previous studies indicate that school climate is important for student health and academic achievement. This study concerns the validity and reliability of the student edition of a Swedish instrument for measuring pedagogical and social school climate (PESOC). Data were collected from 5,745 students at 97 Swedish secondary schools. Multilevel confirmatory factor analyses were conducted, and multilevel composite reliability estimates, as well as correlations with school-level achievement indicators, were calculated. The results supported an 8-factor structure at the student level and 1 general factor at the school level. Factor loadings and composite reliability estimates were acceptable at both levels. The school-level factor was moderately and positively correlated with school-level academic achievement. The student PESOC is a promising instrument for studying school climate.

15.
This study examines teachers’ conceptions of assessment and related contextual factors at the classroom, school and national levels. A representative survey of Singaporean secondary school teachers resulted in a final sample consisting of 229 teachers from 9 secondary schools. Findings show that teachers endorse views of assessment for school accountability, student accountability and student improvement, but show little endorsement of assessment as irrelevant. Teachers report feeling capable and qualified to use assessments, but are concerned about how much they are trusted as assessors at school and national levels. Follow-up latent class analysis identified groups of teachers based on their responses to the irrelevance of assessment; teachers who found assessment irrelevant were present across all schools and subjects, but showed a lower sense of preparation for assessment, school-level support and importance of academic success in society.

16.
This paper estimates the long-run effects of school accountability on educational attainment by exploiting two sources of variation: staggered implementation of accountability across states and individuals’ exposure to accountability. I find 12 years of exposure to school accountability leads to an increase in the likelihood of graduating high school by 2.3 percentage points but has no statistically significant effect on college attendance or the likelihood of receiving a Bachelor's degree. However, racial heterogeneity shows Hispanic students experience a significant increase in the likelihood of attending college. I rule out changes in school expenditures and teacher characteristics as potential mechanisms and present suggestive evidence that schools are classifying more students as learning disabled. Lastly, accountability is more effective in conjunction with promotion gates.

17.
This article describes the inter‐relationship between school organization and classroom instructional style. Two distinct models of school organization, the bureaucratic and open‐systems models, are characterized in terms of three major dimensions of school life: (a) the behavior of administrators, teachers and students, (b) work design and tasks, and (c) space‐time allocations. It is shown that the bureaucratic model of school organization parallels, and sustains, the traditional whole‐class method of teaching in all three dimensions. An open‐systems model of staff organization at the school level is required to sustain an alternative form of classroom instruction such as cooperative learning. The approach presented here emphasizes the inter‐relatedness of all three dimensions of schooling at the organizational and classroom levels. It also claims that the implementation of genuine instructional change that entails new patterns of interpersonal relations in the classroom is contingent upon similar changes being made at the level of the school as an organization. Lack of attention to school organizational change may explain why efforts at changing instruction at the classroom level frequently fail to yield results.

18.
The Ford score     
Abstract

We combined data from the Office for Standards in Education with those from a large national survey of child and adolescent mental health and developed a simple score that schools or LEAs could use to predict the level of emotional and behavioural difficulties that they are likely to encounter. The final Ford score is based on the rates of free school meals, exclusions, unauthorized absence and children with special educational needs. These data are collected routinely, so the Ford score could easily be calculated to provide estimates of the level of emotional and behavioural problems in mainstream schools without the use of additional resources. It needs further reliability and validity testing but could provide a means of allocating resources.

19.
In the external evaluation of schools, classroom observation belongs to the standard methodological repertoire. Nevertheless, measuring the quality of classroom teaching from selected lesson sequences, which are as a rule inspected only briefly, is fraught with methodological problems. Substantiated quality assurance therefore requires that such measurement problems be revealed by applying adequate empirical methods. Generalizability theory and the many-facet Rasch model make this possible. Analyses based on data from the Hamburg school inspection indicate that, with an appropriate data collection procedure, rater effects in classroom observations turn out comparatively low, at about nine percent of total variance. The analyses further show that it is insufficient to quantify agreement among raters with global reliability measures alone; intra-rater consistency must also be checked to obtain valid, and in this way reliable, results from classroom observations for practice.

20.
A key intent of the NCLB growth pilot is to reward low‐status schools that are closing the gap to proficiency. In this article, we demonstrate that the capability of proposed models to identify those schools depends on how the growth model is incorporated into accountability decisions. Six pilot‐approved growth models were applied to vertically scaled mathematics assessment data from a single state collected over 2 years. Student and school classifications were compared across models. Accountability classifications using status and growth to proficiency as defined by each model were considered from two perspectives. The first involved adding the number of students moving toward proficiency to the count of proficient students, while the second involved a multitier accountability system where each school was first held accountable for status and then held accountable for the growth of its nonproficient students. Our findings emphasize the importance of evaluating status and growth independently when attempting to identify low‐status schools with insufficient growth among nonproficient students.
