Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
2.
This study evaluates four growth prediction models (projection, student growth percentile, trajectory, and transition table) commonly used to forecast (and give schools credit for) middle school students' future proficiency. Analyses focused on vertically scaled summative mathematics assessments, and two performance standards conditions (high rigor and low rigor) were examined. Results suggest that, when “status plus growth” is the accountability metric a state uses to reward or sanction schools, growth prediction models offer value above and beyond status‐only accountability systems in most, but not all, circumstances. Predictive growth models offer little value beyond status‐only systems if the future target proficiency cut score is rigorous. Conversely, certain models (e.g., projection) provide substantial additional value when the future target cut score is relatively low. In general, growth prediction models' predictive value is limited by a lack of power to detect students who are truly on‐track. Limitations and policy implications are discussed, including the utility of growth projection models in assessment and accountability systems organized around ambitious college‐readiness goals.

3.
States participating in the Growth Model Pilot Program reference individual student growth against “proficiency” cut scores that conform with the original No Child Left Behind Act (NCLB). Although achievement results from conventional NCLB models are also cut‐score dependent, the functional relationships between cut‐score location and growth results are more complex and are not currently well described. We apply cut‐score scenarios to longitudinal data to demonstrate the dependence of state‐ and school‐level growth results on cut‐score choice. This dependence is examined along three dimensions: 1) rigor, as states set cut scores largely at their discretion, 2) across‐grade articulation, as the rigor of proficiency standards may vary across grades, and 3) the time horizon chosen for growth to proficiency. Results show that the selection of plausible alternative cut scores within a growth model can change the percentage of students “on track to proficiency” by more than 20 percentage points and reverse accountability decisions for more than 40% of schools. We contribute a framework for predicting these dependencies, and we argue that the cut‐score dependence of large‐scale growth statistics must be made transparent, particularly for comparisons of growth results across states.
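
To make the cut‐score dependence described above concrete, the following is a minimal sketch, not the authors' method. It assumes a simple linear trajectory rule (the most recent one‐year gain is extended to the time horizon), and the function names, student scores, and cut scores are all hypothetical illustration values.

```python
def on_track(prior, current, target_cut, years_remaining):
    """Assumed linear trajectory rule: extend the latest one-year gain to the horizon."""
    projected = current + (current - prior) * years_remaining
    return projected >= target_cut


def percent_on_track(students, target_cut, years_remaining):
    """Percentage of (prior, current) score pairs projected to reach the target cut."""
    hits = sum(on_track(p, c, target_cut, years_remaining) for p, c in students)
    return 100.0 * hits / len(students)


# Hypothetical (prior-year, current-year) scale scores for five students.
students = [(280, 300), (300, 310), (310, 330), (250, 270), (320, 318)]

# The same students and the same growth, evaluated against three candidate cut scores.
for cut in (310, 330, 350):
    print(cut, percent_on_track(students, cut, years_remaining=2))
```

In this toy data the growth is identical in every run; only the target cut score changes, yet the share of students counted as on track moves substantially.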

4.
Educational policy‐making in the Commonwealth of Independent States (CIS) is still building upon the ambivalences and uncertainties of post‐communist transformation. The international support, expertise and discourses – coupled with communist legacies, stalled democratic developments and national discourses – produce unique effects on education in each of these countries. This paper is an attempt to conceptualise educational policy‐making (with its disparities between ‘democratised’ discourses and ‘Sovietised’ practices) as a form of emerging governmentality or governmentality‐in‐the‐making on the level of the state, using Ukraine as a case study. Analysing policy‐making through the perspective of emerging governmentality brings into focus the genealogy of post‐independence reforms, which is (as a part of the technologies of government) threaded into a broader governmental project of restructuring the state and legitimising its rationality. The final empirical part of the paper presents a discourse analysis of selected curriculum choice and assessment policy documents (1999–2003) and of the complex interplay of internal and external discourses embedded in them, which work together to construct and justify the emerging governmental rationality of post‐communist Ukraine.

5.
The academic efficiency and social justice of entry procedures at Oxford and Cambridge Universities are examined over the past quarter of a century. For each major subject the mean A‐level scores of males and females entering from state and independent schools are compared with mean final examination scores in the major subjects. In any comparison of state and independent cohorts of the same gender, within the bounds of normal statistical fluctuation, the difference in A‐level score is a good predictor of the difference in finals score. For example, when the difference in A‐level score between state men and independent men is zero, the difference in mean finals score is also zero. The origin of female under‐achievement is examined. In most subjects there is pronounced gender inequality due to the following chain of circumstances: (1) to break even in finals women require at entry better grades at Advanced Level than men; (2) women used to have much the better A‐levels and so, in finals a quarter of a century ago, they matched and even, in some subjects, surpassed the men; (3) the A‐levels of women entering Oxford and Cambridge Universities fell off during the 1970s; (4) today female A‐level scores are slightly worse than male A‐level scores, and so female finals scores are much worse, in most subjects, than male finals scores. The concept of an ideal subject is defined; this is a subject in which zero difference in A‐level score between male and female yields zero difference in finals score. Law at Cambridge and chemistry at Oxford are ideal subjects. Ideal subjects are rare at Oxbridge: most subjects exhibit a significant male lead in finals when male and female have equal A‐level scores.
The most non‐ideal subject at Oxford is mathematics, in which zero difference in A‐level score between males and females yields a male lead in finals score of 13%. The other non‐ideal subjects at Oxford are physics (male lead at equal A‐levels 11%), philosophy, politics and economics (9%), history (8%), modern languages (8%) and English (5%). An ideal subject is a paradigm which requires even‐handedness between male and female cohorts in the following parameters: (1) efficiency of course selection from school; (2) efficiency of teaching; (3) efficiency of finals assessment; (4) latent ability. A pronounced relative decline in the A‐level scores of girls educated in state maintained schools entering English and Welsh universities occurred in the 1970s; it is attributed to the reform of the state school system, particularly the growth in mixed‐sex comprehensive schools and the decline in the number of female single‐sex grammar schools. A peculiar aspect of the admissions filters at both Oxford and Cambridge ensures that state‐school educated men gaining entry do so with A‐level scores markedly superior to those of the other three cohorts.

6.
Many U.S. students must pass a standards-based exit exam to earn a high school diploma. The degree to which exit exams and state standards properly signal to students their preparedness for postsecondary schooling has been questioned. The alignment of test scores with college grades for students at the University of Arizona (n = 2,667) who took the Arizona high school exams was ascertained in this study. The pass/fail signal accuracy of test scores varied depending on subject: The writing cut score was well aligned with collegiate performance, the reading cut score was below expectations, and the mathematics cut score was set quite rigorously. High school content and performance standards might not be as diluted as prior research has suggested.

7.
Large‐scale assessment results for schools, school boards/districts, and entire provinces or states are commonly reported as the percentage of students achieving a standard, that is, the percentage of students scoring above the cut score that defines the standard on the assessment scale. Recent research has shown that this method of reporting is sensitive to small changes in the cut score, especially when comparing results across years or between groups. This study builds on that work, investigating the effects of reporting group size on the stability of results. In Part 1 of this study, Grade 6 students’ results on Ontario's 2008 and 2009 Junior Assessments of Reading, Writing and Mathematics were compared, by school, for different sizes of schools. In Part 2, samples of students’ results on the 2009 assessment were randomly drawn and compared, for 10 group sizes, to estimate the variability in results due to sampling error. The results showed that the percentage of students above a cut score (PAC) was unstable for small schools and small randomly drawn groups.
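
The resampling idea in Part 2 can be sketched as follows. This is an illustration under stated assumptions, not the study's procedure: the score population is simulated from a normal distribution, the cut score and group sizes are hypothetical, and scores at or above the cut are counted as meeting the standard.

```python
import random
import statistics


def percent_above_cut(scores, cut):
    """Percentage of scores at or above the cut score (PAC)."""
    return 100.0 * sum(s >= cut for s in scores) / len(scores)


def pac_spread(population, cut, group_size, draws=2000, seed=0):
    """Standard deviation of PAC across repeated random groups of one size."""
    rng = random.Random(seed)
    pacs = [
        percent_above_cut(rng.sample(population, group_size), cut)
        for _ in range(draws)
    ]
    return statistics.pstdev(pacs)


# Hypothetical population of scale scores (simulated, not real assessment data).
pop_rng = random.Random(42)
population = [pop_rng.gauss(300, 50) for _ in range(20000)]
cut = 300  # hypothetical cut score

# Smaller reporting groups give a wider spread of PAC values from sampling alone.
for n in (10, 50, 200):
    print(n, round(pac_spread(population, cut, n), 1))
```

The spread shrinks as group size grows, which is the pattern the abstract reports for small schools and small randomly drawn groups.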

8.
This study evaluated the classification accuracy of a second grade oral reading fluency curriculum‐based measure (R‐CBM) in predicting third grade state test performance. It also compared the long‐term classification accuracy of local and publisher‐recommended R‐CBM cut scores. Participants were 266 students who were divided into a calibration sample (n = 170) and two cross‐validation samples (n = 46; n = 50). Using calibration sample data, local fall, winter, and spring R‐CBM cut scores for predicting students’ state test performance were developed using three methods: discriminant analysis (DA), logistic regression (LR), and receiver operating characteristic curve analysis (ROC). The classification accuracy of local and publisher‐recommended cut scores was evaluated across subsamples. Only DA and ROC produced cut scores that maintained adequate sensitivity (≥.70) across cohorts; however, LR and publisher‐recommended scores had higher levels of specificity and overall correct classification. Implications for developing local cut scores are discussed.
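
The accuracy indices named above (sensitivity, specificity, overall correct classification) can be computed for any candidate cut score as in the sketch below. The function name, the data, and the cut score are hypothetical, and the sketch assumes a student is flagged as at risk when the screening score falls below the cut, with "positive" meaning the student did not pass the state test.

```python
def classification_accuracy(screen_scores, passed_state_test, cut):
    """Return (sensitivity, specificity, overall accuracy) for one cut score.

    Assumes both outcomes (pass and fail) occur in the data, so neither
    denominator is zero.
    """
    tp = fp = tn = fn = 0
    for score, passed in zip(screen_scores, passed_state_test):
        flagged = score < cut  # flagged as at risk by the screener
        if flagged and not passed:
            tp += 1  # correctly flagged: failed the state test
        elif flagged and passed:
            fp += 1  # flagged, but passed the state test
        elif not flagged and passed:
            tn += 1  # correctly not flagged
        else:
            fn += 1  # missed: not flagged, but failed
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical words-correct-per-minute scores and state test outcomes.
scores = [40, 55, 62, 70, 85, 90, 95, 110]
passed = [False, False, True, False, True, True, True, True]

print(classification_accuracy(scores, passed, cut=75))
```

Lowering the cut in this toy data (for example to 60) misses one failing student, trading sensitivity for specificity; that trade-off is what separates the methods compared in the abstract.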

9.
In simple terms, competency‐based assessment is the assessment of a person's competence against prescribed standards of performance. Thus, if an occupation has established a set of, say, entry‐level competency standards, then these prescribe the standards of performance required of all new entrants to that occupation. Competency‐based assessment is the process of determining whether a candidate meets the prescribed standards of performance, i.e. whether they demonstrate competence.

It is probably a truism that there is no such thing as a process of assessment that is without its critics. Whatever efforts are made to improve an instance of assessment, someone is bound to be unhappy with the process. Competency‐based assessment is therefore at a particular disadvantage since it is both new and unfamiliar to many people. This has meant that competency‐based assessment has aroused numerous and varied worries and objections from many quarters. In the process of researching assessment methods of professions in Australia, as the prelude to writing a guide on competency‐based assessment (Gonczi et al., 1993), the authors identified a number of worries and objections. Competency‐based assessment: (a) only assesses what is trivial or superficial; (b) is inherently unreliable in that it involves inference; (c) is inherently invalid; (d) represents a departure from traditional proven methods of assessment; (e) neglects the importance of knowledge; (f) focuses on outcomes to the neglect of processes; (g) relies on professional judgement, and hence is too subjective; (h) vainly tries to assess attitudes. This paper discusses each of these worries and objections and shows that none of them is decisive. While each of them points to an important issue about competency‐based assessment, the discussion will show that in each case a well‐designed competency‐based assessment system can overcome the worry or objection.


10.
The educational psychology literature is replete with references to higher-order cognitive constructs, such as critical thinking and creativity. Presumably, these constructs represent the primary processes and outcomes that educators should promote in students. For these constructs to be maximally useful, they must be transformed into specific operational definitions that lead to reliable and valid assessment strategies. Minimizing overlap in the definitions and assessment of different concepts would contribute to an orderly accumulation of knowledge about the constructs in question. The ideal would be for each construct to have a definition that is distinct from the definitions of other cognitive constructs. Although higher-order cognitive constructs have much surface appeal, their utility is tied to the clarity and fidelity of their definitions and assessment procedures.

11.
The Bases of Competence model provides a general framework for learner‐centred skill development and programme‐focused outcomes assessment. Based on previous research, the Bases of Competence model describes 17 skills and four base competencies that graduates need to achieve high performance in the workplace. Taking this work from research to relevant educational application as a tool for student self‐assessment and institutional outcomes assessment is the focus of this paper. Results from a multi‐year, multi‐course assessment initiative indicate that students rate themselves stronger in the foundation base competencies of Communicating and Managing Self, and weaker in the more complex competencies of Managing People and Tasks and Mobilising Innovation and Change. Comparisons of skill confidence within each base competence, as well as between year, student level, gender and beginning versus end of semester, are also presented. These results are discussed and suggestions made for programme design.

12.
Test scores matter these days. Test‐takers want to understand how they performed, and test score reports, particularly those for individual examinees, are the vehicles by which most people get the bulk of this information. Historically, score reports have not always met the examinees’ information or usability needs, but this is clearly changing for the better due to recent, much‐needed additions to the psychometric literature as well as improved efforts in reporting practices. This paper provides an overview of score reports from a development perspective, focusing on current practices and emerging efforts in content of reports as well as the process by which reports are designed, evaluated, and ultimately used to communicate with the public.

13.
Interdisciplinarity is rapidly becoming a norm within both the professional and academic worlds, and the ability to collaborate is becoming an essential skill for all graduates. Chemistry Is in the News (CIITN) is a curriculum that aims to teach students this skill by engaging student collaborative groups in a project that ties real world events and topics to the content taught in the classroom. While the collaborative activity has been successful in many ways, the challenge of maintaining individual accountability within the collaborative activity has persisted. The need to balance the tension between promoting collaboration and maintaining individual performance standards drove the development of an intra‐group peer review system. In developing this peer review system, four goals guided the design: to promote collaboration, to produce a differentiated score among group members reflecting the contribution each person made, to improve student perception of fairness and accuracy in the assessment process of CIITN, and to avoid artificially inflating students’ grades. The system was assessed in the winter semester of 2004 in a large lecture course at a major Midwestern university via student questionnaires and the CIITN scores. Evidence is provided to suggest that the intra‐group peer review system has met its core goals.

14.
This study examines the effectiveness of three approaches for maintaining equivalent performance standards across test forms with small samples: (1) common‐item equating, (2) resetting the standard, and (3) rescaling the standard. Rescaling the standard (i.e., applying common‐item equating methodology to standard setting ratings to account for systematic differences between standard setting panels) has received almost no attention in the literature. Identity equating was also examined to provide context. Data from a standard setting form of a large national certification test (N examinees = 4,397; N panelists = 13) were split into content‐equivalent subforms with common items, and resampling methodology was used to investigate the error introduced by each approach. Common‐item equating (circle‐arc and nominal weights mean) was evaluated at samples of size 10, 25, 50, and 100. The standard setting approaches (resetting and rescaling the standard) were evaluated by resampling (N = 8) and by simulating panelists (N = 8, 13, and 20). Results were inconclusive regarding the relative effectiveness of resetting and rescaling the standard. Small‐sample equating, however, consistently produced new form cut scores that were less biased and less prone to random error than new form cut scores based on resetting or rescaling the standard.

15.
A discussion of the need for a consumer‐centred, that is pupil‐ and student‐centred, education service. At present, the state (central and local) is the client who buys the education service it wants, without much regard for the consumer. But what is the place and role of teachers in this situation? Are they as much the victims of the centralist client‐state as is the consumer, or do they more resemble the organised labour force of a nationalised industry, whose ‘muscle’ enables them to make inroads into the decision‐making process, especially as it concerns the level at which the service is received by the ‘consumer’? The writer inclines to the latter view.

16.
Careful analysis of an organization's needs is the first step in developing programs for improving performance that are relevant to the organization and to the individuals who use the programs. The purpose of this article is to examine the perspectives of needs assessment that are described in the performance technology and human resource development literature. The first section of the article focuses on the terms used to describe the process of analyzing needs. The perspectives embodied in the definitions and conceptualizations of the terms need, needs assessment, needs analysis, front end analysis, and performance analysis are examined. The second section of the article focuses on various models of needs assessment. It presents different views about where the needs assessment process starts, where it ends, and what results are produced.

17.
Contemporary educational accountability systems, including state‐level systems prescribed under No Child Left Behind as well as those envisioned under the “Race to the Top” comprehensive assessment competition, rely on school‐level summaries of student test scores. The precision of these score summaries is almost always evaluated using models that ignore the classroom‐level clustering of students within schools. This paper reports balanced and unbalanced generalizability analyses investigating the consequences of ignoring variation at the level of classrooms within schools when analyzing the reliability of such school‐level accountability measures. Results show that the reliability of school means cannot be determined accurately when classroom‐level effects are ignored. Failure to take between‐classroom variance into account biases generalizability (G) coefficient estimates downward and standard errors (SEs) upward if classroom‐level effects are regarded as fixed, and biases G‐coefficient estimates upward and SEs downward if they are regarded as random. These biases become more severe as the difference between the school‐level intraclass correlation (ICC) and the class‐level ICC increases. School‐accountability systems should be designed so that classroom (or teacher) level variation can be taken into consideration when quantifying the precision of school rankings, and statistical models for school mean score reliability should incorporate this information.

18.
Changes to federal guidelines for the identification of children with disabilities have supported the use of multi‐tiered models of service delivery. This study investigated the impact of measurement methodology as used across numerous tiers in determining special education eligibility. Four studies were completed using a sample of inner‐city children (N = 150) who were administered a reading screener twice and a reading measure adapted from the state high‐stakes reading test. A sub‐sample of children identified as At‐Risk were administered a comprehensive reading assessment and compared with a randomly selected control group, who were also administered a comprehensive reading assessment (n = 14). A model was developed to estimate the likelihood of special education eligibility based on both theoretical and empirical measurement parameters. Special education eligibility outcomes varied from a low of 0.2% to as high as 11%, depending on the measurement assumptions of the multi‐tiered model: the type of measure used, the decision‐making criteria used at each tier, and the number of tiers in the model. This study highlights the importance of measurement specification, explicit decision‐making criteria, and empirical investigation to fully understand outcomes associated with the implementation of multi‐tiered models. Implications for special education eligibility policy and practical implications for implementing comprehensive measurement practice in multi‐tiered systems at the school level are discussed. © 2012 Wiley Periodicals, Inc.

19.
Primary school reception baseline assessment was designed to produce a single ‘baseline’ data figure on the basis of which young children's progress across primary school could be measured and accounted for. This paper suggests that within the context of punitive performativity, head teachers might be considered ‘irresponsible’ if not engaging with the new accountability measure in its voluntary year. Using DfE‐accredited baseline assessment providers blurred the distinctions between not‐for‐profit social enterprises, digital policy innovation labs, edu‐business, and the state. It is argued that through a process of networked governance, these cross‐sectoral organisations successfully enticed some primary schools with the ‘moral economy’ of using baseline assessment. It is further argued that baseline's simplistic reductionism allowed for the economisation of early years education assessment and for its commercialisation of comparison. This paper reports on a sample of five head teachers, taken from a much larger study that used a mixed‐methods approach involving a nationwide survey (n = 1131) and in‐depth interviews with reception staff and head teachers in five geographically disparate primary schools. Baseline assessment was ‘withdrawn’ by the DfE in April 2016, quite possibly because of campaigns by early years organisations, the government's own report showing that the three separate baseline datasets were incompatible, and national research funded by the teachers’ unions, a small part of which is reported upon here.

20.
The purpose of this study was to develop a standard‐setting method appropriate for use with a diagnostic assessment that produces profiles of student mastery rather than a single raw or scale score value. The condensed mastery profile method draws from established holistic standard‐setting methods to use rounds of range finding and pinpointing to specify cut points between performance levels. Panelists are convened to review profiles of mastery and specify cut points between performance levels based on the total number of skills mastered. Following panelist specification of cut points, a statistical method is implemented to smooth cut points over grades to decrease between‐grade variability. Procedural evidence, including convergence plots, standard errors of pinpointing ratings, and panelist feedback, suggests the condensed mastery profile method is a useful and technically sound approach for setting performance standards for diagnostic assessment systems.
