1.

On Scoring Multiple Choice Exams Allowing for Partial Knowledge

J. C. Arnold P. L. Arnold 《Journal of Experimental Education》2013,81(1):8-13

A scoring procedure for multiple choice exams which allows for partial knowledge and also allows the examiner to control the expected gain due to guessing is considered in this paper. The procedure is considered from an elementary game theory approach. Comparisons are given with other scoring methods.

2.

Can We Learn From Student Mistakes in a Formative, Reading Comprehension Assessment?

Bowen Liu Patrick C. Kennedy Ben Seipel Sarah E. Carlson Gina Biancarosa Mark L. Davison 《Journal of Educational Measurement》2019,56(4):815-835

This article describes an ongoing project to develop a formative, inferential reading comprehension assessment of causal story comprehension. It has three features to enhance classroom use: equated scale scores for progress monitoring within and across grades, a scale score to distinguish among low‐scoring students based on patterns of mistakes, and a reading efficiency index. Instead of two response types for each multiple‐choice item, correct and incorrect, each item has three response types: correct and two incorrect response types. Prior results on reliability, convergent and discriminant validity, and predictive utility of mistake subscores are briefly described. The three‐response‐type structure of items required rethinking the item response theory (IRT) modeling. IRT‐modeling results are presented, and implications for formative assessments and instructional use are discussed.

3.

Elicitation of personal probabilities and their assessment

Emir Shuford Thomas A. Brown 《Instructional Science》1975,4(2):137-188

A student's choice of an answer to a test question is a coarse measure of his knowledge about the subject matter of the question. Much finer measurement might be achieved if the student were asked to estimate, for each possible answer, the probability that it is the correct one. Such a procedure could yield two classes of benefits: (a) students could learn the language of numerical probability and use it to communicate uncertainty, and (b) the learning of other subjects could be facilitated. This report describes the rationale underlying a procedure for eliciting personal estimates of probabilities utilizing a proper scoring rule, and illustrates some new techniques for calibrating those probabilities and providing better feedback to students learning to assess uncertainty. In addition, new results are presented comparing the incentive for study, rehearsal, and practice provided by the proper scoring rule with that provided by the simple choice procedure, and concerning the potential effect of cutoff scores and prizes upon student behavior.

4.

Modeling Partial Knowledge on Multiple‐Choice Items Using Elimination Testing

Qian Wu Tinne De Laet Rianne Janssen 《Journal of Educational Measurement》2019,56(2):391-414

Single‐best answers to multiple‐choice items are commonly dichotomized into correct and incorrect responses, and modeled using either a dichotomous item response theory (IRT) model or a polytomous one if differences among all response options are to be retained. The current study presents an alternative IRT‐based modeling approach to multiple‐choice items administered with the procedure of elimination testing, which asks test‐takers to eliminate all the response options they consider to be incorrect. The partial credit model is derived for the obtained responses. By extracting more information pertaining to test‐takers’ partial knowledge on the items, the proposed approach has the advantage of providing more accurate estimation of the latent ability. In addition, it may shed some light on the possible answering processes of test‐takers on the items. As an illustration, the proposed approach is applied to a classroom examination of an undergraduate course in engineering science.

5.

The Effects of Using Different Procedures to Score Maze Measures

Rebecca L. Pierce Kristen L. McMaster Stanley L. Deno 《Learning Disabilities Research & Practice》2010,25(3):151-160

The purpose of this study was to examine how different scoring procedures affect interpretation of maze curriculum‐based measurements. Fall and spring data were collected from 199 students receiving supplemental reading instruction. Maze probes were scored first by counting all correct maze choices, followed by four scoring variations designed to reduce the effect of random guessing. Pearson's r correlation coefficients were calculated among scoring procedures and between maze scores and a standardized measure of reading. In addition, t tests were conducted to compare fall to spring growth for each scoring procedure. Results indicated that scores derived from the different procedures are highly correlated, demonstrate criterion‐related validity, and show fall‐to‐spring growth. Educators working with struggling readers may use any of the five scoring procedures to obtain technically sound scores.

6.

Comparison of student performance using web and paper‐based homework in college‐level physics

Scott W. Bonham Duane L. Deardorff Robert J. Beichner 《Journal of Research in Science Teaching》2003,40(10):1050-1071

Homework gives students an opportunity to practice important college‐level physics skills. A switch to Web‐based homework alters the nature of feedback received, potentially changing the pedagogical benefit. Calculus‐ and algebra‐based introductory physics students enrolled in large paired lecture sections at a public university completed homework of standard end‐of‐the‐chapter exercises using either the Web or paper. Comparison of their performances on regular exams, conceptual exams, quizzes, laboratory, and homework showed no significant differences between groups; other measures were found to be strong predictors of performance. This indicates that the change in medium itself has limited effect on student learning. Ways in which Web‐based homework could enable exercises with greater pedagogical value are discussed. © 2003 Wiley Periodicals, Inc. J Res Sci Teach 40: 1050–1071, 2003

7.

Multiple True-False Items: A Study of Interitem Correlations, Scoring Alternatives, and Reliability Estimation

Mark A. Albanese Darrell L. Sabers 《Journal of Educational Measurement》1988,25(2):111-123

Intercorrelations among multiple true-false items were examined to determine to what extent each true-false option can be treated as independent. Results from 157 health science students and 170 medical students showed that correlations between true-false options associated with the same stem were from 2.6 to 7.0 times larger than those from different stems. This suggests that results from previous research indicating that each true-false option could be treated as an independent item cannot be generalized to other tests and examinee populations without supporting evidence. Four scoring methods were explored which varied chance success levels and scoring for partial knowledge. The results showed that scoring methods incorporating partial knowledge were more reliable and possessed greater concurrent and predictive validity than those minimizing chance success. Methods for computing reliability estimates were compared and suggestions were offered regarding practical use.

8.

Item Response Models for Multiple Attempts With Incomplete Data

Yoav Bergner Ikkyu Choi Katherine E. Castellano 《Journal of Educational Measurement》2019,56(2):415-436

Allowance for multiple chances to answer constructed response questions is a prevalent feature in computer‐based homework and exams. We consider the use of item response theory in the estimation of item characteristics and student ability when multiple attempts are allowed but no explicit penalty is deducted for extra tries. This is common practice in online formative assessments, where the number of attempts is often unlimited. In these environments, some students may not always answer‐until‐correct, but may rather terminate a response process after one or more incorrect tries. We contrast the cases of graded and sequential item response models, both unidimensional models which do not explicitly account for factors other than ability. These approaches differ not only in terms of log‐odds assumptions but, importantly, in terms of handling incomplete data. We explore the consequences of model misspecification through a simulation study and with four online homework data sets. Our results suggest that model selection is insensitive for complete data, but quite sensitive to whether missing responses are regarded as informative (of inability) or not (e.g., missing at random). Under realistic conditions, a sequential model with similar parametric degrees of freedom to a graded model can account for more response patterns and outperforms the latter in terms of model fit.

9.

When Teachers Misspell

Carlton E. Beck 《College Teaching》2013,61(2):114-115

Multiple-choice exams are often the standard in large, introductory college courses. Although students sometimes report that multiple-choice exams are easier than essay exams, the multiple-choice format often proves to be more difficult. This may be true because multiple-choice exams in college are often composed predominantly of application questions. They ask students to grapple with scenarios and recognize concepts in context, which proves to be difficult for many students. The author details the changes she has made in her introductory sociology curriculum and discusses some of the indicators of success.

10.

Partial Credit in Answer-Until-Correct Multiple-Choice Tests Deployed in a Classroom Setting

Aaron D. Slepkov Alan T. K. Godfrey 《Applied Measurement in Education》2019,32(2):138-150

The answer-until-correct (AUC) method of multiple-choice (MC) testing involves test respondents making selections until the keyed answer is identified. Despite attendant benefits that include improved learning, broad student adoption, and facile administration of partial credit, the use of AUC methods for classroom testing has been extremely limited. This study presents scoring properties and item analysis for 26 AUC university course examinations, administered using a commercial scratch-card response system. Here, we show that beyond the traditional pedagogical advantages of AUC, the availability of partial credit adds psychometric advantages by boosting both the mean item discrimination and overall test-score reliability, when compared to tests scored dichotomously upon initial response. Furthermore we also find a strong correlation between students’ initial-response successes and the likelihood that they would obtain partial credit when they make incorrect initial responses. Thus, partial credit is being granted based on partial knowledge that remains latent in traditional MC tests. The fact that these advantages are realized in real-life classroom tests may motivate further expansion of the use of AUC MC tests in higher education.

11.

Using argumentation as a learning strategy to improve student performance in engineering Statics

Timothy L. Foutz 《European Journal of Engineering Education》2019,44(3):312-329

Research suggests that a significant reason that a large number of students earn low grades in the fundamental engineering science course Statics is that they may be entering the course with incorrect conceptual knowledge of mathematics and physics. The self-explanation learning approach called collective argumentation helps k-12 students to understand their misconceptions of mathematical principles that often appear abstract to them. This study investigated collective argumentation as an instructional approach that helps engineering students identify and correct their misconceptions of topics taught in Statics. Results suggest that argumentation improves student performance as measured by grades earned on semester exams. Survey and focus group results suggest that students did not understand the argumentation process. Therefore, the students did not like using it as a learning approach.

12.

Using clickers to facilitate development of problem-solving skills

Levesque AA 《CBE Life Sciences Education》2011,10(4):406-417

Classroom response systems, or clickers, have become pedagogical staples of the undergraduate science curriculum at many universities. In this study, the effectiveness of clickers in promoting problem-solving skills in a genetics class was investigated. Students were presented with problems requiring application of concepts covered in lecture and were polled for the correct answer. A histogram of class responses was displayed, and students were encouraged to discuss the problem, which enabled them to better understand the correct answer. Students were then presented with a similar problem and were again polled. My results indicate that those students who were initially unable to solve the problem were then able to figure out how to solve similar types of problems through a combination of trial and error and class discussion. This was reflected in student performance on exams, where there was a statistically significant positive correlation between grades and the percentage of clicker questions answered. Interestingly, there was no clear correlation between exam grades and the percentage of clicker questions answered correctly. These results suggest that students who attempt to solve problems in class are better equipped to solve problems on exams.

13.

Bridging University Inorganic Chemistry and High School Chemistry Teaching Under the New Gaokao: The Case of Wenzhou University

Zhao Xiaowei Hu Mei Zhou Maohong Li Jun 《Journal of Ningbo University (Educational Science Edition)》2017,(6):107-112

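University inorganic chemistry teaching currently builds on content equivalent to the elective-examination (xuankao) material of the new gaokao. After the reform, most majors (or major categories) whose elective subject requirements include chemistry will, for some time, admit a considerable number of students who did not take the chemistry elective examination; these students have mastered the academic proficiency test (xuekao) content but are unfamiliar with the elective-examination content. The new situation calls for new bridging measures. With a view to students' lifelong sustainable development, and with the goals of fostering a rigorous attitude toward learning, cultivating sound study methods, and improving students' own learning ability, instructors should stimulate interest in learning, connect with the xuekao content, teach according to students' aptitudes, and ensure that students genuinely master the course content. They should also adopt implicit tiered teaching to cope with the large differences in students' chemistry backgrounds.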

14.

Measuring knowledge integration: Validation of four‐year assessments

Ou Lydia Liu Hee‐Sun Lee Marcia C. Linn 《Journal of Research in Science Teaching》2011,48(9):1079-1107

Science education needs valid, authentic, and efficient assessments. Many typical science assessments primarily measure recall of isolated information. This paper reports on the validation of assessments that measure knowledge integration ability among middle school and high school students. The assessments were administered to 18,729 students in five states. Rasch analyses of the assessments demonstrated satisfactory item fit, item difficulty, test reliability, and person reliability. The study showed that, when appropriately designed, knowledge integration assessments can be balanced between validity and reliability, authenticity and generalizability, and instructional sensitivity and technical quality. Results also showed that, when paired with multiple‐choice items and scored with an effective scoring rubric, constructed‐response items can achieve high reliabilities. Analyses showed that English language learner status and computer use significantly impacted students' science knowledge integration abilities. Students who took the assessment online, which matched the format of content delivery, performed significantly better than students who took the paper‐and‐pencil version. Implications and future directions of research are noted, including refining curriculum materials to meet the needs of diverse students and expanding the range of topics measured by knowledge integration assessments. © 2011 Wiley Periodicals, Inc. J Res Sci Teach 48: 1079–1107, 2011

15.

Studying the Relationships Between the Number of APs, AP Performance, and College Outcomes

Jonathan J. Beard Julian Hsu Maureen Ewing Kelly E. Godfrey 《Educational Measurement》2019,38(4):42-54

16.

Ability, demography, learning style, and personality trait correlates of student preference for assessment method

Adrian Furnham Andrew Christopher Jeanette Garwood Neil G. Martin 《Educational Psychology》2008,28(1):15-27

More than 400 students from four universities in America and Britain completed measures of learning style preference, general knowledge (as a proxy for intelligence), and preference for examination method. Learning style was consistently associated with preferences: surface learners preferred multiple choice and group work options, and viewed essay‐type and dissertation options less favourably. Deep learners, on the other hand, favoured essay‐type and oral exams as well as final dissertations. Males favoured oral (viva voce) exams and females coursework assessment. Extraverts preferred multiple choice, oral, and group work assessment, while openness was positively associated with essays and oral exams but negatively associated with multiple choice and group work. Regression analysis showed that personality, learning style, general knowledge, and demographic factors accounted for 5–10% of the variance in preferred examination technique. Results in part replicate earlier studies and are discussed in terms of changes in examination methods.

17.

Modified use of team‐based learning for effective delivery of medical gross anatomy and embryology

Nagaswami S. Vasan David O. DeFouw Bart K. Holland 《Anatomical Sciences Education》2008,1(1):3-9

Team‐based learning (TBL) is an instructional strategy that combines independent out‐of‐class preparation for in‐class discussion in small groups. This approach has been successfully adopted by a number of medical educators. This strategy allowed us to eliminate anatomy lectures and incorporate small‐group active learning. Although our strategy is a modified use of classical TBL, in the text, we use the standard terminology of TBL for simplicity. We have modified classical TBL to fit our curricular needs and approach. Anatomy lectures were replaced with TBL activities that required pre‐class reading of assigned materials, an individual self‐assessment quiz, discussion of learning issues derived from the reading assignments, and then the group retaking the same quiz for discussion and deeper learning. Students' performances and their educational experiences in the TBL format were compared with the traditional lecture approach. We offer several in‐house unit exams and a final comprehensive subject exam provided by the National Board of Medical Examiners. The students performed better in all exams following the TBL approach compared to traditional lecture‐based teaching. Students acknowledged that TBL encouraged them to study regularly, allowed them to actively teach and learn from peers, and this served to improve their own exam performances. We found that a TBL approach in teaching anatomy allowed us to create an active learning environment that helped to improve students' performances. Based on our experience, other preclinical courses are now piloting TBL. Anat Sci Ed 1:3–9, 2008. © 2007 American Association of Anatomists.

18.

The consequences of central examinations on educational quality standards and labour market outcomes

Uschi Backes‐Gellner Stephan Veen 《Oxford Review of Education》2013,39(5):569-588

Central examinations—that is, centrally set and marked exams—have often been discussed as an instrument for improving educational outcomes. The aim of our study was to determine whether central exams have an impact not only on educational but also on labour market outcomes. We explain school quality choice through the incentives created by central exams vs. non‐central exams and model the resulting students’ schooling decisions and employers’ wage decisions. We use the German Abitur and the variation among the German federal states with respect to central exams as a quasi‐experimental design for alternative educational quality regimes. As hypothesised from our theoretical analysis, the percentage of Abitur holders increases more quickly in quality regimes without central exams than with central exams. However, as theoretically expected in the case of a pooled labour market, the wage premium decreases not only for Abitur‐holders without central exams but also for all Abitur‐holders. This is due to the quality deterioration in the states without central exams which spills over into a pooled labour market. Thus, graduates from states with central exams and higher educational standards ‘pay’ for the quality deterioration of educational standards in states without central exams.

19.

Assessing Knowledge Integration in Science: Construct, Measures, and Evidence

Ou Lydia Liu Hee-Sun Lee Carolyn Hofstetter Marcia C. Linn 《Educational Assessment》2013,18(1):33-55

In response to the demand for sound science assessments, this article presents the development of a latent construct called knowledge integration as an effective measure of science inquiry. Knowledge integration assessments ask students to link, distinguish, evaluate, and organize their ideas about complex scientific topics. The article focuses on assessment topics commonly taught in 6th- through 12th-grade classes. Items from both published standardized tests and previous knowledge integration research were examined in 6 subject-area tests. Results from Rasch partial credit analyses revealed that the tests exhibited satisfactory psychometric properties with respect to internal consistency, item fit, weighted likelihood estimates, discrimination, and differential item functioning. Compared with items coded using dichotomous scoring rubrics, those coded with the knowledge integration rubrics yielded significantly higher discrimination indexes. The knowledge integration assessment tasks, analyzed using knowledge integration scoring rubrics, demonstrate strong promise as effective measures of complex science reasoning in varied science domains.

20.

Teaching as Story‐telling: A Non‐mechanistic Approach to Planning Teaching

Kieran Egan 《Journal of Curriculum Studies》2013,45(4):397-406
