Similar Articles
20 similar articles found
1.
Testing is not only an important means of evaluating teaching outcomes but also provides feedback for future English instruction, so research on English testing is necessary. Taking English reading tests as its object, this paper examines how two item formats, multiple-choice questions and short-answer questions, affect reading comprehension test results. The participants were 112 first-year non-English majors. In addition to administering two tests with the different item formats, the study used a questionnaire to survey the participants' test-taking behavior on each format and their everyday English reading habits. These two methods were used to demonstrate the respective strengths and weaknesses of the two formats and their influence on reading comprehension results, in the hope that the findings will inform subsequent college English reading instruction.

2.
王萌 《考试周刊》2009,(1):10-11
Multiple-choice items are widely used because of their high reliability, but they are also criticized for low validity and for failing to measure students' productive language ability. This paper therefore proposes some methods for improving the validity of multiple-choice items, for test writers' reference.

3.
Tests of reading ability can comprehensively assess learners' integrated command of language knowledge. Multiple-choice questions are the most common format for testing reading ability, and their design directly affects the validity of a reading test. This paper discusses, from three perspectives, the principles underlying the design of multiple-choice items for English reading tests, along with the advantages and disadvantages of the format.

4.
To ensure item quality and strengthen item-bank construction, this paper applies classical test theory, using Gitest III to conduct an item analysis of the reading section of a college entrance examination (gaokao) paper. The results show that the difficulty and discrimination of the reading items are satisfactory, but the distribution of difficulty is not. It is recommended that test forms assembled from the item bank be pretested before use, so as to improve the difficulty distribution and the quality of some items' options, thereby raising the reliability and validity of the test.
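Abstract 4's item analysis rests on two classical test theory statistics: item difficulty (the p-value, the proportion answering correctly) and discrimination (how well an item separates strong from weak examinees). A minimal sketch of how these are computed; the response matrix below is invented for illustration, and the study itself used Gitest III rather than this code:

```python
import numpy as np

# Invented response matrix: rows = examinees, columns = items, 1 = correct.
responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
])

# Item difficulty (p-value): proportion of examinees answering the item correctly.
difficulty = responses.mean(axis=0)

# Discrimination: corrected item-total (point-biserial) correlation, i.e. the
# correlation between an item and the total score on the remaining items.
total = responses.sum(axis=1)
discrimination = np.array([
    np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
    for j in range(responses.shape[1])
])
```

An uneven difficulty distribution of the kind the abstract reports would show up here as p-values clustered at one end of the `difficulty` array rather than spread across the intended range.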

5.
Reading is an effective way to acquire language knowledge, and reading ability is one of the key indicators of a learner's overall language proficiency. The purpose of a reading test is to measure students' reading ability, and achieving that purpose requires ensuring the test's validity; improving validity is thus an important part of successful reading assessment. In writing items, test constructors should understand the factors affecting the validity of reading tests and the ways to improve it, follow these methods as far as possible, and thereby carry out the task of writing English reading test items satisfactorily.

6.
In recent years, the choice questions on regional mathematics senior-high-school entrance examinations (zhongkao) have mostly taken the single-answer multiple-choice form. To assess candidates' divergent thinking and their ability to distinguish easily confused knowledge points, item writers have experimented boldly, producing many items that are single-answer in form but whose options each combine two or more results for candidates to weigh, making such items among the more difficult on the examination…

7.
付英  李鉴 《海外英语》2013,(11X):47-49
Reading comprehension is an important component of in-house college English tests, and its reliability and validity deserve attention. This paper discusses, from the perspective of content validity, how to design high-quality reading test items. The College English Curriculum Requirements (Trial) clearly specify the content and ability range of college English tests, so item writers should consider the following when designing reading items: passage length, number of unfamiliar words, sentence types, reading speed, readability, passage genre and topic, coverage of different reading skills, and item format and construction.

8.
Item response theory was applied to analyze the quality of the reading comprehension items in Form 1 of the multi-form December 2015 CET-4. The results show that the reading items are unidimensional; they are of upper-intermediate difficulty, with satisfactory discrimination and low guessing parameters, and measure upper-intermediate examinees most accurately. However, a substantial proportion of the careful-reading multiple-choice items fall short on information, and the quality of these items merits further study and improvement.
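The guessing parameter and item information mentioned in abstract 8 come from the three-parameter logistic (3PL) IRT model. A minimal sketch of the two standard formulas (parameter values in the usage note are hypothetical, not taken from the CET-4 analysis):

```python
import math

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the 3PL model:
    a = discrimination, b = difficulty, c = guessing parameter."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b, c):
    """Fisher information of a 3PL item at ability level theta."""
    p = p_3pl(theta, a, b, c)
    q = 1.0 - p
    return a ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2
```

A hypothetical item with a = 1.2, b = 0.5, c = 0.2 delivers most of its information near θ ≈ b and very little at low ability, which is the mechanism behind the abstract's finding that the items measure upper-intermediate examinees most accurately.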

9.
Using retrospective verbal reports and interviews, this study analyzed quantitatively and qualitatively how picture options in multiple-choice items affect listening comprehension on the Japanese-Language Proficiency Test (JLPT). The quantitative analysis showed that, overall, examinees did not score well on items with picture options. The qualitative analysis showed that picture options affected both top-down and bottom-up information processing; examinees differed individually in how they understood and analyzed the pictures, and their picture-reading ability may influence their listening comprehension and even become a latent threat to test validity.

10.
2006 was again a year of gaokao reform: in addition to the two integrated humanities (wenke zonghe) test papers set by the national testing center, Sichuan and Guangdong joined the regions setting their own papers. The multiple gaokao papers have different item writers and draw on different source materials, but the guiding ideology behind them is the same and the item-writing approach is essentially consistent, and the social issues the item writers attend to…

11.
This paper investigates whether two testing methods, multiple choice and information transfer, produce a test-method effect in reading comprehension examinations. Beyond analyzing students' test scores, the study also analyzed item difficulty values, which were estimated using Item Response Theory. The results show that the testing method does affect item difficulty and examinee performance: in terms of item difficulty, information transfer is harder than multiple choice.

12.
Linguistic complexity of test items is one test format element that has been studied in the context of struggling readers and their participation in paper-and-pencil tests. The present article presents findings from an exploratory study on the potential relationship between linguistic complexity and test performance for deaf readers. A total of 64 students completed 52 multiple-choice items, 32 in mathematics and 20 in reading. These items were coded for linguistic complexity components of vocabulary, syntax, and discourse. Mathematics items had higher linguistic complexity ratings than reading items, but there were no significant relationships between item linguistic complexity scores and student performance on the test items. The discussion addresses issues related to the subject area, student proficiency levels in the test content, factors to look for in determining a "linguistic complexity effect," and areas for further research in test item development and deaf students.

13.
While previous research has identified numerous factors that contribute to item difficulty, studies involving large-scale reading tests have provided mixed results. This study examined five selected-response item types used to measure reading comprehension in the Pearson Test of English Academic: a) multiple-choice (choose one answer), b) multiple-choice (choose multiple answers), c) re-order paragraphs, d) reading (fill-in-the-blanks), and e) reading and writing (fill-in-the-blanks). In a multiple regression approach, the criterion measure consisted of item difficulty scores for 172 items, with 18 passage, passage-question, and response-format variables serving as predictors. Overall, four significant predictors were identified for the entire group (i.e., sentence length, falsifiable distractors, number of correct options, and abstractness of information requested) and five variables were found to be significant for high-performing readers (the four listed above plus passage coherence); only the number of falsifiable distractors was a significant predictor for low-performing readers. Implications for assessing reading comprehension are discussed.
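Abstract 13's approach, regressing item difficulty scores on passage and format variables, can be sketched with ordinary least squares. Everything below is synthetic: four hypothetical predictors stand in for the study's 18 (e.g. sentence length, falsifiable distractors), and only the item count of 172 mirrors the study:

```python
import numpy as np

rng = np.random.default_rng(0)
n_items = 172  # matches the study's item count; the data itself is invented

# Four hypothetical predictors (standardized), e.g. sentence length,
# falsifiable distractors, number of correct options, abstractness.
X = rng.normal(size=(n_items, 4))
true_beta = np.array([0.5, 0.3, -0.2, 0.4])
y = X @ true_beta + rng.normal(scale=0.1, size=n_items)  # "difficulty" scores

# Ordinary least squares with an intercept column.
X1 = np.column_stack([np.ones(n_items), X])
beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)

# Proportion of difficulty variance the predictors explain.
resid = y - X1 @ beta_hat
r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
```

In the study's setting, a predictor such as falsifiable distractors being "significant" corresponds to its estimated coefficient being reliably nonzero relative to its standard error.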

14.
Whether the passages selected for a reading comprehension test are fair, in content, to examinee subgroups from different academic backgrounds is important evidence for test validity and a key step in validation. Taking Chinese language and literature majors as the target group, with economics majors and biomedical majors as reference groups, this study examined the fairness of passage difficulty in the HSK (Advanced) reading comprehension test for the three groups, combining criterion measurement with implicational scale analysis. The results show that although the two reference groups each had their own relative disciplinary advantages, the difficulty ordering they produced for the six reading passages was identical to that of the target group. Although the target group had no disciplinary advantage beyond knowledge of Chinese, the passages selected for the HSK involve no special disciplinary demands beyond language knowledge itself, so the test is reasonably fair to examinees from all three backgrounds.

15.
Test Wiseness     
Test wiseness (TW) has been defined as the ability to respond advantageously to multiple-choice items containing extraneous clues and, therefore, to obtain credit without knowledge of the subject matter being tested. The capacity of examinees to develop cue-using strategies was examined, and the results suggest that students profit from knowledge of a particular test constructor's idiosyncrasies. The findings also lend weight to the argument that TW is not a general ability but rather that performance on TW items is cue-specific. Findings with respect to particular item flaws (longer correct alternative, grammar) generally support the research literature.

16.
Test items become easier when a representational picture visualizes the text item stem; this is referred to as the multimedia effect in testing. To uncover the processes underlying this effect and to understand how pictures affect students' item-solving behavior, we recorded the eye movements of sixty-two schoolchildren solving multiple-choice (MC) science items either with or without a representational picture. Results show that the time students spent fixating the picture was compensated for by less time spent reading the corresponding text. In text-picture items, students also spent less time fixating incorrect answer options, a behavior that was associated with better test scores in general. Detailed gaze likelihood analyses revealed that the picture received particular attention right after item onset and in the later phase of item solving. Hence, comparable to learning, pictures in tests seemingly boost students' performance because they may serve as mental scaffolds, supporting comprehension and decision making.

17.
This article presents a study of ethnic Differential Item Functioning (DIF) for 4th-, 7th-, and 10th-grade reading items on a state criterion-referenced achievement test. The tests, administered from 1997 to 2001, were composed of multiple-choice and constructed-response items. Item performance by focal groups (i.e., students from Asian/Pacific Islander, Black/African American, Native American, and Latino/Hispanic origins) was compared with the performance of White students using simultaneous item bias and Rasch procedures. Flagged multiple-choice items generally favored White students, whereas flagged constructed-response items generally favored students from Asian/Pacific Islander, Black/African American, and Latino/Hispanic origins. Content analysis of flagged reading items showed that positively and negatively flagged items typically measured inference, interpretation, or analysis of text in multiple-choice and constructed-response formats. Items that were not flagged for DIF generally measured very easy reading skills (e.g., literal comprehension) and reading skills that require higher-level thinking (e.g., developing interpretations across texts and analyzing graphic elements).
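Abstract 17 detects DIF with simultaneous item bias (SIBTEST) and Rasch procedures. A related but simpler screen, not the method used in that study, is the Mantel-Haenszel common odds ratio, which compares the odds of a correct response for the reference and focal groups within matched-score strata. A sketch on an invented toy data set:

```python
import numpy as np

def mantel_haenszel_dif(item, group, total):
    """Mantel-Haenszel common odds ratio (alpha_MH) across score strata.

    item: 1 = correct, 0 = incorrect; group: 1 = reference, 0 = focal;
    total: matching (stratifying) score for each examinee.
    """
    num = den = 0.0
    for s in np.unique(total):
        m = total == s
        a = np.sum((group == 1) & (item == 1) & m)  # reference, correct
        b = np.sum((group == 1) & (item == 0) & m)  # reference, incorrect
        c = np.sum((group == 0) & (item == 1) & m)  # focal, correct
        d = np.sum((group == 0) & (item == 0) & m)  # focal, incorrect
        n = a + b + c + d
        if n:
            num += a * d / n
            den += b * c / n
    return num / den if den else float("nan")

# Toy data: two score strata, identical pass rates in both groups -> no DIF.
total = np.array([1] * 8 + [2] * 8)
group = np.array([1, 1, 1, 1, 0, 0, 0, 0] * 2)
item  = np.array([1, 1, 0, 0, 1, 1, 0, 0] * 2)
alpha = mantel_haenszel_dif(item, group, total)  # -> 1.0 (no DIF)
```

An alpha near 1 indicates no DIF; values well above 1 suggest the item favors the reference group, and values well below 1 the focal group, after matching on ability.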

18.
Difficulty is not an inherent property of a test item but the outcome of interaction between examinee factors and item characteristics. Many item analysts tend to attribute excessive item difficulty solely to students' failure to master the relevant knowledge or skills, overlooking the characteristics of the items themselves. This study analyzed 60 gaokao English items with difficulty values below 0.6 to explore the sources of their difficulty. The results show that, beyond examinee factors, the difficulty of hard or overly hard items is also related to item-writing technique, for example problems with the uniqueness and acceptability of answers, content beyond the syllabus, and flawed test-point placement and scoring criteria. Testing agencies should therefore raise their item-writing standards and strengthen quality control of test items to ensure that large-scale examinations select talent scientifically.

19.
This empirical study aimed to investigate the impact of easy-first vs. hard-first ordering of the same items in a paper-and-pencil multiple-choice exam on the performances of low, moderate, and high achiever examinees, as well as on the item statistics. Data were collected from 554 Turkish university students using two test forms, which included the same multiple-choice items ordered reversely, i.e. easy first vs. hard first. Tests included 26 multiple-choice items about the introductory unit of a "Measurement and Assessment" course. The results suggested that sequencing the multiple-choice items in either direction, from easy to hard or vice versa, did not affect the test performances of the examinees, regardless of whether they were low, moderate, or high achievers. Finally, no statistically significant difference was observed between the item statistics of the two forms, i.e. the difficulty (p), discrimination (d), point-biserial (r), and adjusted point-biserial (adj. r) coefficients.

20.
The Practical Chinese Proficiency Test (C.TEST) measures the ability of non-native speakers of Chinese to use the language in international social settings and everyday work. Because C.TEST items are public and the item bank is small, it is difficult to obtain item difficulty parameters through the field tests on samples of target examinees that standardized tests normally use. Artificial neural networks, a product of modern artificial-intelligence research, have however proved powerful for prediction. Taking the reading comprehension items of C.TEST (Levels A-D) as material, this study used an artificial neural network to predict item difficulty and found that the network's predicted difficulty values correlated significantly with the actual test difficulty values. This result suggests that using artificial neural network models to predict item parameters such as difficulty for language tests is feasible.
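Abstract 20's idea, training a network on item features to predict difficulty without field testing, can be sketched with a one-hidden-layer network in plain NumPy. Everything here is invented: the features, the target function, and the architecture are illustrative stand-ins, not the study's actual model or C.TEST data:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic item features (e.g. passage length, vocabulary load, syntax score)
# and synthetic "observed" difficulty values in (0, 1).
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = 1.0 / (1.0 + np.exp(-(X @ np.array([1.0, -0.5, 0.8]))))

# One hidden layer of 8 tanh units, sigmoid output, full-batch gradient descent.
W1 = rng.normal(scale=0.5, size=(3, 8)); b1 = np.zeros(8)
W2 = rng.normal(scale=0.5, size=(8, 1)); b2 = np.zeros(1)

def forward(X):
    h = np.tanh(X @ W1 + b1)
    out = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
    return h, out.ravel()

lr = 0.5
for _ in range(3000):
    h, pred = forward(X)
    # Gradient of mean squared error through the sigmoid output.
    g_out = ((pred - y) * pred * (1.0 - pred))[:, None] / len(y)
    g_h = (g_out @ W2.T) * (1.0 - h ** 2)  # backprop through tanh
    W2 -= lr * (h.T @ g_out); b2 -= lr * g_out.sum(axis=0)
    W1 -= lr * (X.T @ g_h);   b1 -= lr * g_h.sum(axis=0)

_, pred = forward(X)  # predicted difficulties for the training items
```

In the study's setting, the quality of such a model would be judged exactly as the abstract describes: by the correlation between the network's predicted difficulty values and the difficulties observed in actual administration.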
