Similar articles
20 similar articles found.
1.
How does a testing component function in an integrated learning system? How can you customize computerized tests to meet local specifications? How are computerized tests implemented and evaluated? Is the pay-off of computerized testing justified?

2.
How does the use of computerized adaptive testing affect the performance of students from different groups? How consistent were the results of computerized adaptive and “conventional” tests? What did the students think about the test experience? What advice do the authors have for test developers and users?

3.
Successful administration of computerized adaptive testing (CAT) programs in educational settings requires that test security and item exposure control issues be taken seriously. Developing an item selection algorithm that strikes the right balance between test precision and level of item pool utilization is the key to successful implementation and long‐term quality control of CAT. This study proposed a new item selection method using the “efficiency balanced information” criterion to address issues with the maximum Fisher information method and stratification methods. According to the simulation results, the new efficiency balanced information method had desirable advantages over the other studied item selection methods in terms of improving the optimality of CAT assembly and utilizing items with low a‐values while eliminating the need for item pool stratification.
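
For readers unfamiliar with the baseline this abstract refers to, the following is a minimal sketch of maximum Fisher information item selection under a three-parameter logistic model. It is not the "efficiency balanced information" method of the study, whose formula the abstract does not give. The function names, the tuple layout (a, b, c), and the omission of a scaling constant are assumptions made for illustration.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under a 3PL model (no scaling constant)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b, c):
    """Item information for the 3PL model at ability theta."""
    p = p_3pl(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def select_max_info(theta, pool, administered):
    """Return the index of the unused item with the largest information at theta.

    pool is a list of hypothetical (a, b, c) tuples; administered is a set of indices.
    """
    candidates = [i for i in range(len(pool)) if i not in administered]
    return max(candidates, key=lambda i: fisher_info(theta, *pool[i]))

# Hypothetical three-item pool: low-a item, high-a item, high-a but off-target item.
pool = [(0.5, 0.0, 0.2), (1.5, 0.0, 0.2), (1.5, 3.0, 0.2)]
first = select_max_info(0.0, pool, set())
second = select_max_info(0.0, pool, {first})
```

Because information grows with the square of the discrimination parameter, this rule tends to pick high-a items first, which is consistent with the abstract's concern about under-used low-a items.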

4.
Pupil monitoring systems support the teacher in tailoring teaching to the individual level of a student and in comparing the progress and results of teaching with national standards. The systems are based on the availability of an item bank calibrated using item response theory. The assessment of the students’ progress and results can be further supported by using computerized adaptive testing where the items selected from the item bank are targeted at the specific ability level of the student. The present article discusses psychometric issues of pupil monitoring systems, such as ability estimation, the optimal construction of tests from the item bank and monitoring of progress.

5.
What conditions stimulated the Learning Outcome Testing Program and what approaches were adopted? What issues need to be addressed in the selection of computer software? What steps should be taken to train teachers to develop item banks and use testing software?

6.
Applied Measurement in Education (《教育实用测度》), 2013, 26(4): 359-375
Many procedures have been developed for selecting the "best" items for a computerized adaptive test. There is a trend toward the use of adaptive testing in applied settings such as licensure tests, program entrance tests, and educational tests. It is useful to consider procedures for item selection and the special needs of applied testing settings to facilitate test design. The current study reviews several classical approaches and alternative approaches to item selection and discusses their relative merit. This study also describes procedures for constrained computerized adaptive testing (C-CAT) that may be added to classical item selection approaches to allow them to be used for applied testing, while maintaining the high measurement precision and short test length that made adaptive testing attractive to practitioners initially.

7.
An item bank is defined as a relatively large collection of easily accessible test questions. A wide variety of item bank schemes that meet this relatively unrestricted definition is illustrated. Advantages and disadvantages of item banking and the conditions under which item banks have the most potential value are identified. An extensive list of questions to be asked in designing item banking systems is provided. The following five questions were singled out for further discussion: How many items should be in the bank? Should users develop their own item collections or use the collections of others? How should the items be classified? Should items be calibrated? Will each test have different items or will the same test be administered to all?

8.
The Role of Consequences in Validity Theory   Cited by: 1 (self-citations: 0, citations by others: 1)
How do individuals make sense of and use the products and practices of testing in their everyday lives? What is the responsibility of the educational measurement community to take these issues into consideration in assessing what it is that we do?

9.
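This study used a construct map to design a set of 30 test items in six parts. Item types included spelling, multiple-choice, and short-answer items. A total of 175 children aged 6 to 14 took the test. Rasch analysis showed that local item dependence within testlets was not serious. Reliability was 0.85. Item difficulty was well matched to examinee ability. Because the items were written according to the construct map, the test has a degree of content validity. However, the difficulty of 9 items differed slightly from what was originally expected. Five items did not fit the expectations of the Rasch model well, and no obvious differential item functioning by gender was found. Examinee ability was positively correlated with time spent learning English. Finally, technical issues of remote computerized adaptive testing based on information and communication technology are discussed.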

10.
Many computerized testing algorithms require the fitting of some item response theory (IRT) model to examinees' responses to facilitate item selection, the determination of test stopping rules, and classification decisions. Some IRT models are thought to be particularly useful for small volume certification programs that wish to make the transition to computerized adaptive testing (CAT). The one-parameter logistic model (1-PLM) is usually assumed to require a smaller sample size than the three-parameter logistic model (3-PLM) for item parameter calibrations. This study examined the effects of model misspecification on the precision of the decisions made using the sequential probability ratio test (SPRT). For this comparison, the 1-PLM was used to estimate item parameters, even though the items' characteristics were represented by a 3-PLM. Results demonstrated that the 1-PLM produced considerably more decision errors under simulation conditions similar to a real testing environment, compared to the true model and to a fixed-form standard reference set of items.
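
To make the SPRT decision rule concrete, here is a minimal sketch using Wald's standard thresholds and a 3PL response function. It does not reproduce the study's simulation design; the indifference-region bounds theta_lo and theta_hi, the error rates, and the item parameters below are hypothetical values chosen for illustration.

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under a 3PL model (no scaling constant)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def sprt_decision(responses, items, theta_lo, theta_hi, alpha=0.05, beta=0.05):
    """Classify an examinee with Wald's sequential probability ratio test.

    responses: list of 0/1 scores; items: list of (a, b, c) tuples.
    Returns "pass", "fail", or "undecided" if the responses run out first.
    """
    upper = math.log((1.0 - beta) / alpha)   # cross this to decide "pass"
    lower = math.log(beta / (1.0 - alpha))   # cross this to decide "fail"
    llr = 0.0  # running log-likelihood ratio of theta_hi versus theta_lo
    for x, (a, b, c) in zip(responses, items):
        p_hi = p_3pl(theta_hi, a, b, c)
        p_lo = p_3pl(theta_lo, a, b, c)
        if x == 1:
            llr += math.log(p_hi / p_lo)
        else:
            llr += math.log((1.0 - p_hi) / (1.0 - p_lo))
        if llr >= upper:
            return "pass"
        if llr <= lower:
            return "fail"
    return "undecided"

# Hypothetical items; with a = 1 and c = 0 the 3PL reduces to a one-parameter form.
items = [(1.0, 0.0, 0.0)] * 20
result_correct = sprt_decision([1] * 6, items, theta_lo=-0.5, theta_hi=0.5)
result_wrong = sprt_decision([0] * 6, items, theta_lo=-0.5, theta_hi=0.5)
```

In this sketch, model misspecification of the kind the abstract describes would correspond to computing p_hi and p_lo from item parameters estimated under the wrong model, which shifts the log-likelihood ratio and therefore the decision.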

11.
In computerized adaptive testing (CAT), ensuring the security of test items is a crucial practical consideration. A common approach to reducing item theft is to define maximum item exposure rates, i.e., to limit the proportion of examinees to whom a given item can be administered. Numerous methods for controlling exposure rates have been proposed for tests employing the unidimensional 3-PL model. The present article explores the issues associated with controlling exposure rates when a multidimensional item response theory (MIRT) model is utilized and exposure rates must be controlled conditional upon ability. This situation is complicated by the exponentially increasing number of possible ability values in multiple dimensions. The article introduces a new procedure, called the generalized Stocking-Lewis method, that controls the exposure rate for students of comparable ability as well as with respect to the overall population. A realistic simulation set compares the new method with three other approaches: Kullback-Leibler information with no exposure control, Kullback-Leibler information with unconditional Sympson-Hetter exposure control, and random item selection.
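
As a rough illustration of the unconditional Sympson-Hetter comparison condition named above, the sketch below applies the probabilistic administration step: a selected item is actually administered only with probability equal to its exposure control parameter, and otherwise the next-ranked item is tried. This is not the generalized Stocking-Lewis method introduced in the article. The item names and the parameter values in k are hypothetical, and the sketch assumes the parameters have already been calibrated (in practice this is usually done through iterative simulation).

```python
import random

def administer_with_sympson_hetter(ranked_items, k, rng=random):
    """Walk down a ranked candidate list and apply the exposure filter.

    ranked_items: item ids ordered from most to least preferred by the
                  selection criterion (for example, information).
    k: dict mapping item id to its exposure control parameter in [0, 1].
    Returns (administered_item, rejected_items); the item is None if every
    candidate was filtered out.
    """
    rejected = []
    for item in ranked_items:
        if rng.random() < k[item]:
            return item, rejected
        # Rejected items are set aside for this examinee.
        rejected.append(item)
    return None, rejected

# Hypothetical parameters: item "A" is popular and therefore heavily throttled.
k = {"A": 0.3, "B": 0.8, "C": 1.0}
rng = random.Random(0)
counts = {"A": 0, "B": 0, "C": 0}
for _ in range(10000):
    item, _rejected = administer_with_sympson_hetter(["A", "B", "C"], k, rng)
    counts[item] += 1
```

With these values the top-ranked item "A" should be administered to roughly 30% of simulated examinees rather than to all of them, which is the point of the filter.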

12.
Preventing items in adaptive testing from being over- or underexposed is one of the main problems in computerized adaptive testing. Though the problem of overexposed items can be solved using a probabilistic item-exposure control method, such methods are unable to deal with the problem of underexposed items. Using a system of rotating item pools, on the other hand, is a method that potentially solves both problems. In this method, a master pool is divided into (possibly overlapping) smaller item pools, which are required to have similar distributions of content and statistical attributes. These pools are rotated among the testing sites to realize desirable exposure rates for the items. A test assembly model, motivated by Gulliksen's matched random subtests method, was explored to help solve the problem of dividing a master pool into a set of smaller pools. Different methods to solve the model are proposed. An item pool from the Law School Admission Test was used to evaluate the performances of computerized adaptive tests from systems of rotating item pools constructed using these methods.

13.
How should we think about the concept of the testlet? How can testlets be better incorporated into test score analysis? Can there be a one‐item testlet?

14.
What are the practical implications of small decreases in reliability coefficients? How does increased item local dependence decrease reliability? How does the new format of more “authentic” reading tests affect reliability?

15.
The use of computerized adaptive testing algorithms for ranking items (e.g., college preferences, career choices) involves two major challenges: unacceptably high computation times (selecting from a large item pool with many dimensions) and biased results (enhanced preferences or intensified examinee responses because of repeated statements across items). To address these issues, we introduce subpool partition strategies for item selection and within-person statement exposure control procedures. Simulations showed that the multinomial method reduces computation time while maintaining measurement precision. Both the freeze and revised Sympson-Hetter online (RSHO) methods controlled the statement exposure rate; RSHO sacrificed some measurement precision but increased pool use. Furthermore, preventing a statement's repetition on consecutive items neither hindered the effectiveness of the freeze or RSHO method nor reduced measurement precision.

16.
How should the amount of testing be defined? On average, how many hours does a U.S. student spend on testing? How does this compare with testing time in other countries? How do the type and purpose of testing vary from the U.S. to other countries?

17.
What standard reports are included in TAP and SPP? What information is provided in item analysis output? How may teachers use the information in their instruction?

18.
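Computerized adaptive testing is the product of combining item response theory with computer technology. Based on item response theory, this paper discusses in some depth the design of the key modules of an adaptive testing system, namely ability estimation, item selection strategy, and termination rules, and proposes a model framework for a J2EE-based system implementation.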
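
The abstract names ability estimation and termination rules as key modules but does not say which estimators or rules the paper uses. The sketch below is therefore only one plausible instance: expected a posteriori (EAP) estimation on a grid with a standard normal prior under a two-parameter logistic model, plus a stopping rule based on posterior standard deviation or a maximum test length. All names, thresholds, and item parameters are assumptions, and the sketch is written in Python rather than on the J2EE platform the paper targets.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under a 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def eap_estimate(responses, items, grid_step=0.1, bound=4.0):
    """Return (posterior mean, posterior SD) of ability on a fixed grid.

    responses: list of 0/1 scores; items: list of hypothetical (a, b) tuples.
    The prior is a standard normal truncated to [-bound, bound].
    """
    n_points = int(round(2 * bound / grid_step)) + 1
    grid = [-bound + i * grid_step for i in range(n_points)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalised N(0, 1) prior density
        for x, (a, b) in zip(responses, items):
            p = p_2pl(theta, a, b)
            w *= p if x == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)

def should_stop(posterior_sd, n_administered, sd_target=0.3, max_items=30):
    """Stop when the estimate is precise enough or the test is long enough."""
    return posterior_sd <= sd_target or n_administered >= max_items

prior_mean, prior_sd = eap_estimate([], [])
mean_after_correct, sd_after_correct = eap_estimate([1], [(1.0, 0.0)])
```

In a full adaptive loop, the estimate returned here would feed the item selection step, and should_stop would be checked after each response.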

19.
How did the early work of Binet and Thurstone foreshadow item response theory? What connections to work in other areas are not widely recognized? What are current application trends?

20.
Studies have shown that item difficulty can vary significantly based on the context of an item within a test form. In particular, item position may be associated with practice and fatigue effects that influence item parameter estimation. The purpose of this research was to examine the relevance of item position specifically for assessments used in early education, an area of testing that has received relatively limited psychometric attention. In an initial study, multilevel item response models fit to data from an early literacy measure revealed statistically significant increases in difficulty for items appearing later in a 20‐item form. The estimated linear change in logits for an increase of 1 in position was .024, resulting in a predicted change of .46 logits for a shift from the beginning to the end of the form. A subsequent simulation study examined impacts of item position effects on person ability estimation within computerized adaptive testing. Implications and recommendations for practice are discussed.
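
The arithmetic behind the reported .46 logits can be checked with a short sketch. It assumes the shift from the first to the last position of a 20-item form spans 19 position steps, and it uses the rounded slope of .024 from the abstract, so the result matches only after rounding. The conversion to a probability assumes a Rasch-type model and an item that the examinee would answer correctly half the time in the first position; neither assumption comes from the abstract.

```python
import math

slope_per_position = 0.024   # logits per one-step increase in position (from the abstract)
form_length = 20             # items on the form (from the abstract)

# Moving from position 1 to position 20 is 19 steps.
total_shift = slope_per_position * (form_length - 1)

# Illustrative assumption: under a Rasch-type model, an item with a 0.50 chance
# of success in the first position becomes harder by total_shift logits.
p_first = 0.5
logit_first = math.log(p_first / (1.0 - p_first))
p_last = 1.0 / (1.0 + math.exp(-(logit_first - total_shift)))

print(round(total_shift, 2))
print(round(p_last, 2))
```

Under these assumptions, the same item placed last rather than first would have a success probability of a little under 0.4 for that examinee.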
