Similar literature
20 similar articles found (search time: 46 ms)
1.
The Bookmark Standard-Setting Method: A Literature Review   Total citations: 1 (self-citations: 0; citations by others: 1)
The Bookmark method for setting standards on educational tests is currently one of the most popular standard-setting methods. However, research to support the method is scarce. In this report, we review the published and unpublished literature on this method as well as some seminal work in the area of evaluating standard-setting studies. Our review highlights both strengths and limitations of the method. Strengths include its wide acceptance and panelist confidence in the method. Limitations include a potential bias to produce lower-than-intended standards and problems in selecting the most appropriate response probability value for ordering the items presented to panelists. It is clear that more research on this method is needed to support its wide use. Several areas for future research to better understand the validity of the Bookmark method for setting standards on educational tests are presented.

2.
This module describes some common standard-setting procedures used to derive performance levels for achievement tests in education, licensure, and certification. Upon completing the module, readers will be able to: describe what standard setting is; understand why standard setting is necessary; recognize some of the purposes of standard setting; calculate cut scores using various methods; and identify elements to be considered when evaluating standard-setting procedures. A self-test and annotated bibliography are provided at the end of the module. Teaching aids to accompany the module are available through NCME.

3.
Different standard-setting procedures usually produce different cut points even if each has a rational basis. In 2000, three standard-setting procedures were implemented to set cut scores in each of the 18 grade/content areas comprising Kentucky's state assessment system: the Contrasting Groups, Bookmark, and Jaeger-Mills procedures. Subsequently, participants from each of the three procedures worked together in each grade/content area to synthesize the results. These synthesis participants considered the results of, and examined the materials and information provided by, each of the three separate procedures. In this article the synthesis processes are described and discussed.

4.
European Education, 2013, 45(3): 45-55
"French educational planning is planning based on manpower considerations; reforms in the school system are based on calculations of the needs of the labor market. Social components are subordinated to economic considerations. Changes in the school system place the accent primarily on creating new organizational units. Curricular changes are undertaken only if the findings of manpower studies indicate that they are useful. From a pedagogical and purely educational standpoint, French educational planning has initiated no reform tendencies. Since such a position remains beyond the scope of educational planning, reforms of this order have not taken place." (39)

5.
One commonly used compromise standard-setting method is the Beuk (1984) method. A key assumption of the Beuk method is that the emphasis given to the pass rate and the percent correct ratings should be proportional to the extent that the panelists agree on their ratings. However, whether the slope of the Beuk line reflects the emphasis that panelists believe should be assigned to the pass rate and the percentage correct ratings has not been fully tested. In this article, I evaluate this critical assumption of the Beuk method by asking panelists to assign importance weights to their percentage correct and pass rate judgments. I show that in several cases the emphasis suggested by the Beuk slope is noticeably different from what one would expect and is inconsistent with importance weight ratings. I also suggest two ways that the importance weights can be used to calculate alternate cut scores, and I show that one of the ways of calculating cut scores using the importance weights leads to larger potential differences in cut score estimates. I suggest that practitioners should consider collecting importance weights when the Beuk method is used for determining cut scores.

6.
Pass rates are key assessment statistics which are calculated for nearly all high-stakes examinations. In this article, we define the terminal, first attempt, total attempts, and repeat attempts pass rates, and discuss the uses of each statistic. We also explain why in many situations one should expect the terminal pass rate to be the highest, first attempt pass rate to be the second highest, total attempts pass rate to be the third highest, and repeat attempts pass rate to be the lowest when repeat attempts are allowed. Analyses of data from 14 credentialing programs showed that the expected relationship held for 13 out of 14 of the programs. Additional analyses of pass rates for educational programs in radiography in one state showed that the general relationship held at the state level, but only held for 6 out of 34 educational programs. It is suggested that credentialing programs need to clearly state their pass rate definitions and carefully consider how repeat examinees may influence pass rate statistics. It is also suggested that credentialing programs need to think carefully about the meaning and uses of different pass rate statistics when choosing which pass rates to report to stakeholders.

7.
The relationship between quality and demand is analysed using data from various countries, with special emphasis on Burkina Faso, Mali, and Tanzania. Four types of educational quality are postulated: value, output, process and input quality. The relative importance of quality compared to external efficiency and costs is assessed. The paper is a reanalysis of existing studies. Qualitative data are complemented by simple analysis of educational statistics. The studies had different though overlapping foci: one study explored reasons for non-enrolment, drop-out and exclusion from school under the umbrella theme of the quality of education. Another one emphasised social demand in rural areas, with quality one of a number of topics. A third study looked at attitudes towards education and educational strategies, restricting itself to parents. At primary level, the quality of education influences the demand for education. The relative importance of quality varies from one context to another. Quality influences the decision to enrol less than the decision to carry on. However, it affects enrolment to such an extent that moderate correlations have been observed between pass rates and repeater rates on the one hand, and enrolment rates on the other. Value quality is mainly related to enrolment. Output quality is the criterion for selecting a school or a school system. Output, process and input quality affect dropping out and irregular attendance. Repetition, justified by unsatisfactory output quality, is related to input quality. The decision to participate in education combines considerations of educational quality with an evaluation of costs, both direct costs and opportunity costs.

8.
Some writers in the measurement literature have been skeptical of the meaningfulness of achievement standards and described the standard-setting process as blatantly arbitrary. We argue that standard setting is more appropriately conceived of as a measurement process similar to student assessment. The construct being measured is the panelists' representation of student performance at the threshold of an achievement level. In the first section of this paper, we argue that standard setting is an example of stimulus-centered measurement. In the second section, we elaborate on this idea by comparing some popular standard-setting methods to the stimulus-centered scaling methods known as psychophysical scaling. In the third section, we use the lens of standard setting as a measurement process to take a fresh look at the two criticisms of standard setting: the role of judgment and the variability of results. In the fourth section, we offer a vision of standard-setting research and practice as grounded in the theory and practice of educational measurement.

9.
In the modern era, the prevailing model of public education has been that of “one size fits all”, with private schooling being a small but notable exception. Language (of instruction) was generally viewed as a minor variable readily overcome by standard classroom instruction. As researchers have sharpened their focus on the reasons for educational failure, language has begun to emerge as a significant variable in producing gains in educational efficiency. This paper reports the intermediate result of a controlled study in a very rural area of a developing country designed to examine the effect of language of instruction on educational outcomes. In the experimental schools, children are taught to read first in the local language (via the local language) and are taught other key subjects via the local language as well. English is taught as a subject. Teachers in the control or standard schools continue the standard national practice of teaching all subjects in either English or Filipino, neither of which is spoken by children when they begin school. Year-end standardised testing was done in all subjects throughout grades one to three as a means of comparing the two programme methodologies.

10.
Knowing that grades can have long-term consequences for students, teachers voice concern about being fair in the grading process. However, their interpretations of fairness are varied and sometimes contradictory. This study looked at how teachers in one standards-based educational system determined secondary students’ grades, focusing specifically on the extent to which they followed a specific set of principles for grading. The results support previous research, and suggest that a better understanding of essential principles is needed for grades to accurately reflect students’ achievement.

11.
Many education policies require estimating whether students in different grades are on track for achieving certain educational standards. One approach for constructing these cut scores is to estimate the values on tests that predict reaching targets on subsequent tests. Whether a student is deemed on target can affect the student’s course counselling, and aggregate statistics can affect school closures and funding and teacher employment. Seven different regression procedures for estimating cut scores are compared with 15 different data scenarios. In some situations, all the methods provided fairly accurate estimates, but in other situations, some estimates were poor. The choice of which regression procedure to use can make a difference. Overall, a method based on a loess regression performed well.

12.
This paper examines how one particular class of educational leader – international school Heads – relates to managerialism. Representing a novel site of new theorisation, the independence enjoyed by these leaders allows a ‘purer’ view of managerialism as experienced ‘in here’ (inside the subject), not just as a reaction to what is ‘out there’ (i.e. to policy). Through analysis of twenty-five face-to-face interviews, they were found to have relationships to managerialism that are not compliant or transgressive, educational or managerial, but hybridic. Some Heads relate to managerialism pragmatically; they reluctantly ‘do’ managerialism but avoid, segment and/or moderate managerial influences on their identities. Other Heads proactively use managerialism to discipline their staff and organisations; they draw power from managerial discourses; and they claim its values as their own. Seen through the lens of hybridity, educational identifications remain important, indeed they remain paramount, but for some subjects, they have been conjoined with complementary managerial ones.

13.
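New teaching methods were an important part of primary education reform in modern China. As these methods were introduced into China, they passed through several stages: the transplanting and imitation of the new methods; rational reflection on whether they should be Westernized or localized; and, through the reconciliation of traditional Chinese teaching methods with new Western ones, a preliminary search in practice for teaching methods suited to China's social environment. Overall, as the new teaching methods were introduced in both theory and practice, the process moved from simple imitation to internalized absorption, and also spread gradually from the coastal regions, where modern education was relatively developed, to inland rural areas.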

14.
The apprenticeship system in Germany is carried out both by companies and vocational schools (the Dual System). The question of whether the German Dual System is transferable is currently being asked in vocational education and training research. The analysis of current transfer discourses alludes to a research desideratum: current approaches consider either the input or the output of an educational transfer, but the transfer process in relation to its input and output has not been investigated to date. We focus on this desideratum. In the present case study, the process of emergence and implementation of dual apprenticeship structures is analysed in relation to its input and output in a German automotive transplant in the United States. Transplant organisations provide an ideal case to explore the transfer phenomenon because they have been transferred from a familiar context to a foreign context. The research questions are: firstly, why and how did the need emerge to implement dual apprenticeship structures in the German transplant in the United States (input); secondly, how and in which way have these structures been implemented (process); and thirdly, how can the implemented structures be characterised: as an imitation, adaptation or transformation of the original model (output)? The central findings of the case study are: firstly, that growing contradictions in the production system triggered the implementation process; secondly, that the original Dual System was transformed within the implementation process; and thirdly, that this transformation led to innovative solutions. These findings may not be valid for every transfer at any time, but they reflect that educational transfer of dual apprenticeship structures can be more than just a more or less successful imitation or adaptation.

15.
In educational systems, concern has been expressed about the accuracy of classification when marks are aligned to grades or levels. In particular, it has been claimed that a school assessment‐based grading would have much greater levels of accuracy than one based on examination scores. This paper investigates classification consistency by analysing five years of examination and assessment data in the subject areas of English and mathematics, and creating simulated parallel‐test observed scores at varying reliabilities (based on classical test theory assumptions). While grades created from moderated school assessments did show greater agreement than those from examination scores, the improvement was only modest.

16.
Standard-setting procedures are a key component within many large-scale educational assessment systems. They are consensual approaches in which committees of experts set cut-scores on continuous proficiency scales, which facilitate communication of proficiency distributions of students to a wide variety of stakeholders. This communicative function makes standard-setting studies a key gateway for validity concerns at the intersection of evidentiary and consequential aspects of score interpretations. This short review paper describes the conceptual and empirical basis of validity arguments for standard-setting procedures in light of recent research on validity theory. It specifically demonstrates how procedural and internal evidence for the validity of standard-setting procedures can be collected to form part of the consequential basis of validity evidence for test use.

17.
Although researchers have discovered a great deal about who uses Twitter for educational purposes, what they post about, when they post and why they participate, there has so far been little work to explore where participants in educational Twitter contexts are located. In this paper, we establish a methodological foundation that can support the exploration of geographical issues in educational Twitter research. We surveyed 46 participants in one educational Twitter hashtag, #michED, to determine where they lived; we then compared these responses to results from three digital methods for geolocating Twitter users (human coding, machine coding and GPS coding) to explore these methods’ affordances and constraints. Human coding of Twitter profiles allowed us to analyze more participants with higher levels of accuracy but also has disadvantages compared to other digital—and traditional—methods. We discuss the additional insights obtained through geolocating #michED participants as well as considerations for using geolocation and other digital methods in educational research.

18.
The present article addresses the question of how social origin affects access to higher education. The roles of class-specific differences in school performance (primary effect) and cost-benefit considerations (secondary effect) are considered as well as the way in which changes in the institutional setup of the educational system may interact with social origin. The review shows that educational decisions at this later transition are mainly influenced by secondary effects. In particular, differences between social classes are explained by group-specific investment costs and expectations of the social context associated with continuing higher education. Although expansion of institutional pathways in upper secondary education has reduced social inequality in acquiring higher education entrance qualifications, secondary effects at the transition to university have been found to increase over time. Thus, within younger cohorts, high-school graduates with lower social status are more often diverted from higher education at universities by attractive vocational or nonuniversity pathways. The article discusses approaches to reduce secondary effects of social origin.

19.
Setting motor performance standards has long been a process of interest to physical educators. Theoretical advances in the measurement technology appropriate for standard-setting, however, have occurred only in the last decade. The first portion of this paper is devoted to a discussion of issues in setting standards and a brief review of procedures for standard-setting. In the latter section, gender differences in motor performance are examined and the impact of these differences on standard-setting is considered.

20.
This article draws on three assessment paradigms – psychometrics, outcomes-based and curriculum-based assessment – to discuss paradigmatic changes in senior school assessment and achievement standard-setting in Queensland, Australia, over the last 50 years. These include radical reforms in 1970 from university-controlled examinations to school-based assessments applying normative standard-setting, to subsequent reforms in 1978 introducing competence(curriculum)-based assessment and standards. From 2019, a new reform introduces a combination of school-based and external assessment with procedures for establishing standards still in progress.

Changes to Queensland assessment and standard-setting are discussed in terms of three preconditions for paradigm change – dissatisfaction, an alternative acceptable paradigm, and majority acceptance of change. Influence of paradigmatic origins of reformers is discussed. The amalgam of curriculum-based assessment and psychometric paradigms in the new Queensland system is considered in terms of theoretical compatibility and potential impact on the new standards.
