Found 20 similar documents (search time: 234 ms)
1.
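Hong Kong's Diploma of Secondary Education examination, which also serves as a university entrance examination, combines public examinations with school-based assessment. It strives for objective and standardized evaluation methods, emphasizes standard-setting at the level of examination practice, and seeks to ensure the fairness of the examination through the scientific rigor of educational measurement theory and technology. It offers a valuable reference for gaokao reform in mainland China.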
2.
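Examination evaluation is an important component of educational evaluation and has a significant guiding effect on teaching practice. In the context of comprehensive gaokao reform, examination evaluation work should be both goal-oriented and problem-oriented: it should provide advice for education policy-making and reform decisions, help item-writing departments improve the quality of test items, and serve frontline teaching by improving teaching quality and promoting students' all-round development. The evaluation process should achieve "four combinations": combining educational measurement theory with big-data technology; combining theoretical research on educational evaluation with practical exploration in teaching; combining process evaluation, outcome evaluation, value-added evaluation, and comprehensive evaluation; and combining categorized design with diversified provision.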
3.
4.
5.
6.
《中学语文(读写新空间)》2020,(3)
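China's gaokao assessment system is the foundational project, theoretical support, and practical guide for deepening gaokao content reform in the new era. It is of great significance for developing quality-oriented education, promoting educational equity, achieving educational modernization, building a strong education nation, and providing education that satisfies the people. Comprehensively grasping the gaokao assessment system's overall…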
7.
熊永祥 《教育测量与评价(理论版)》2011,(8):45-51
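《教育测量与评价》 is the only journal in China devoted specifically to researching the theory of educational measurement and evaluation and guiding its practice. Its examinations and admissions column reflects the intentions of the journal's sponsors and is distinctive in using theory to guide practice. Using bibliometric methods, this paper classifies by topic and reviews the articles on the gaokao and admissions published in 《教育测量与评价》 from September 2008 to June 2011, in the hope of exploring the mission and direction of the gaokao together with fellow education researchers.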
8.
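The examination and admissions system is a basic education system and a major national public policy. Reviewing existing research through bibliometric analysis can inform future improvements to the new gaokao reform and offer new ideas to researchers in the field. Over the past ten years, research on the new gaokao has centered on four areas: the policy measures, theory and practice, and problems and improvements of the reform; changes in high school teaching models and approaches to educating students, and the evaluation system, under the reform; university admissions and enrollment models under the reform; and the application of new examination technologies. Future research hotspots are using interdisciplinary perspectives to enrich research findings on the new gaokao, using big data to strengthen empirical research on the reform, and strengthening research that compares old and new models of the reform and makes cross-sectional comparisons.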
9.
10.
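Einstein's educational thought encompasses his views on the aims of education, educational content, the relationship between education and culture, education and the environment, teachers, educational methods, and educational evaluation. In today's gaokao reform, his educational thought offers important insights, prompting us to study how to carry out reform in terms of its values, goals, content, informatization, and theoretical research.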
11.
When a computerized adaptive testing (CAT) version of a test co-exists with its paper-and-pencil (P&P) version, it is important for scores from the CAT version to be comparable to scores from its P&P version. The CAT version may require multiple item pools for test security reasons, and CAT scores based on alternate pools also need to be comparable to each other. In this paper, we review research literature on CAT comparability issues and synthesize issues specific to these two settings. A framework of criteria for evaluating comparability was developed that contains the following three categories of criteria: validity criterion, psychometric property/reliability criterion, and statistical assumption/test administration condition criterion. Methods for evaluating comparability under these criteria as well as various algorithms for improving comparability are described and discussed. Focusing on the psychometric property/reliability criterion, an example using an item pool of ACT Assessment Mathematics items is provided to demonstrate a process for developing comparable CAT versions and for evaluating comparability. This example illustrates how simulations can be used to improve comparability at the early stages of the development of a CAT. The effects of different specifications of practical constraints, such as content balancing and item exposure rate control, and the effects of using alternate item pools are examined. One interesting finding from this study is that a large part of incomparability may be due to the change from number-correct score-based scoring to IRT ability estimation-based scoring. In addition, changes in components of a CAT, such as exposure rate control, content balancing, test length, and item pool size were found to result in different levels of comparability in test scores.
12.
《校园英语(教研版)》2015,(2)
College English teaching reform has been one of the hot issues in higher education. The reform of college English teaching in recent years has made some achievements, but many problems remain. This paper mainly focuses on the following question: what are the problems in college English teaching? Corresponding measures are proposed to address these issues.
13.
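Focusing on subject-selection combinations and scaled grade scores in the new round of gaokao reform, this paper mainly examines the complex mapping between subject-selection combinations and the subject requirements of admitting majors, the conversion of grade scores and its application, and the coordinated development of subject-selection combinations and diversified admissions, with the aim of promoting the continued deepening and sound development of research on gaokao reform.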
14.
《教育实用测度》2013,26(4):297-312
Certain potential benefits of using item response theory in test construction are discussed and evaluated using the experience and evidence accumulated during 9 years of using a three-parameter model in the construction of major achievement batteries. We also discuss several cautions and limitations in realizing these benefits as well as issues in need of further research. The potential benefits considered are those of getting "sample-free" item calibrations and "item-free" person measurement, automatically equating various tests, decreasing the standard errors of scores without increasing the number of items used by using item pattern scoring, assessing item bias (or differential item functioning) independently of difficulty in a manner consistent with item selection, being able to determine just how adequate a tryout pool of items may be, setting up computer-generated "ideal" tests drawn from pools as targets for test developers, and controlling the standard error of a selected test at any desired set of score levels.
15.
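Physics is an experimental science, and physics experiments are the foundation of physics; they play an important role in cultivating students' scientific literacy, hands-on ability, and innovative ability. In light of some problems that arise in offering physics experiment courses at universities, this paper proposes several ideas for reforming university physics experiment teaching.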
16.
《Educational Assessment》2013,18(2):71-93
As the number of students with disabilities applying for admission and enrolling in educational institutions continues to increase, educators and measurement experts face the challenge of determining whether and how to offer accommodations in admissions tests and how to report and utilize the results of modified tests. This article discusses the provision of accommodations in admissions testing and in educational programs, the test score flagging practices that impact admissions testing, validity concerns, and issues surrounding fairness and compliance with the federal disability laws for such practices. It offers some conclusions about the legality of the use of flagged test scores, as well as a call for further research concerning testing and evaluating students with disabilities.
17.
Adam E. Wyse 《Educational Measurement》2015,34(2):47-54
This article uses data from a large‐scale assessment program to illustrate the potential issue of range restriction with the Bookmark method in the context of trying to set cut scores to closely align with a set of college and career readiness benchmarks. Analyses indicated that range restriction issues existed across different response probability (RP) values and item response theory (IRT) models if one were to apply the Bookmark procedure using intact test forms. Results also suggested that range restriction may still be present if one had access to additional data from an item bank. This demonstration critically highlights challenges that may exist in some practical applications of the Bookmark method due to items not being designed to cover the full range of examinee abilities.
18.
Evaluating the Comparability of Paper‐ and Computer‐Based Science Tests Across Sex and SES Subgroups
As access to and reliance on technology continue to increase, so does the use of computerized testing for admissions, licensure/certification, and accountability exams. Nonetheless, full computer‐based test (CBT) implementation can be difficult due to limited resources. As a result, some testing programs offer both CBT and paper‐based test (PBT) administration formats. In such situations, evidence that scores obtained from different formats are comparable must be gathered. In this study, we illustrate how contemporary statistical methods can be used to provide evidence regarding the comparability of CBT and PBT scores at the total test score and item levels. Specifically, we looked at the invariance of test structure and item functioning across test administration modes and across subgroups of students defined by SES and sex. Multiple replications of both confirmatory factor analysis and Rasch differential item functioning analyses were used to assess invariance at the factorial and item levels. Results revealed a unidimensional construct with moderate statistical support for strong factorial‐level invariance across SES subgroups, and moderate support for invariance across sex. Issues involved in applying these analyses to future evaluations of the comparability of scores from different versions of a test are discussed.
19.
In recent years, students’ test scores have been used to evaluate teachers’ performance. The assumption underlying this practice is that students’ test performance reflects teachers’ instruction. However, this assumption is generally not empirically tested. In this study, we examine the effect of teachers’ instruction on test performance at the item level using a hierarchical differential item functioning approach. The items are from the U.S. TIMSS 2011 4th-grade math test. Specifically, we tested whether students who had received instruction on a given item performed significantly better on that item compared with students who had not received such instruction when their overall math ability was controlled for, whether with or without controlling for student-level and class-level covariates. This study provides preliminary findings regarding why some items show instructional sensitivity and sheds light on how to develop instructionally sensitive items. Implications and directions for further research are also discussed.
20.
John E. Lothes II Sara Matney Zayne Naseer Riley Pfyffer 《Mind, Brain, and Education》2023,17(1):61-69
Research shows that mindfulness interventions for test anxiety in a college student population are beneficial (Lothes, Matney, & Naseer, 2022). This study assessed the effects of online mindfulness practices over a 5-week period on anxiety and test anxiety in college students. Participants included 20 students who were randomly assigned to either a sitting meditation or a wait list control (WLC). A weekly schedule of mindfulness practices was given to participants to complete on their own for 5 weeks. The WLC did not do any mindfulness for the first 5 weeks. Participants in both conditions showed significant within-group reductions in test anxiety, overall anxiety, and DASS scores during their mindfulness interventions. Both groups also showed significant increases in FFMQ scores. Mindfulness may play a role in the reduction of anxiety and test anxiety. Further research is needed to assess how mindfulness may affect anxiety and test anxiety in college students.