Similar literature
 20 similar articles found (search time: 671 ms)
1.
This study established a Chinese-language scale for measuring high school students’ ocean literacy and tested its reliability, validity, and differential item functioning (DIF), with the aim of addressing the lack of DIF testing in existing scales. Construct validity and reliability were examined by analyzing the scale’s items with the Rasch model, and a gender DIF test was conducted to ensure that results are fair when distinct groups are compared. The results indicated that the scale is unidimensional and has favorable internal consistency and construct validity. The gender DIF test indicated that several items were more difficult for either female or male students to answer correctly; however, experts and scholars discussed these items individually and suggested retaining them. The final Chinese version of the ocean literacy scale comprises 48 items that reflect high school students’ understanding of ocean literacy, which helps students understand the marine science topics encountered in real life.

2.
Greek pre-service teachers’ level of ocean literacy was assessed using a revised questionnaire concerning ocean content knowledge and an instrument about ocean stewardship. Rasch analyses showed that the items of both measures were well targeted to the sample. Pre-service teachers possessed a moderate knowledge of ocean sciences issues and positive attitudes toward ocean stewardship; they obtained most information on ocean content from the Internet and mass media and less from formal education, nongovernmental organizations, books, and out-of-school settings. Students who mostly preferred the Internet and mass media scored significantly higher on the knowledge questionnaire. The results could contribute to the enhancement of teachers’ ocean literacy.

3.
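The RCMLM model is a general extension model based on Rasch measurement theory. In this study, the RCMLM model was used to conduct a gender DIF analysis of a regular senior high school mathematics test paper. The results show that the model can run a DIF analysis on dichotomously scored and polytomously scored items at the same time. This avoids the drawback of earlier practice, in which items with the two scoring formats were analyzed separately, preserves the integrity of the test paper, and makes the DIF results more valid.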

4.
The purpose of this study was to conduct a validation analysis of an SET and provide a validation framework of SETs that can be included when designing complete evaluations of teaching within higher education institutions. A series of Rasch analyses was conducted on the results of the SET, examining the responses of students within a college and three departments. Results show the majority of items were moderately difficult to endorse in the college and departments, there were issues with DIF, and two items did not consistently fit the model. The study provides an analysis framework that may aid policymakers and institutional administrators in developing higher quality SETs, and demonstrates the need for validating SETs being implemented in higher education settings.

5.
The performance of English language learners (ELLs) has been a concern given the rapidly changing demographics in US K-12 education. This study aimed to examine whether students' English language status has an impact on their inquiry science performance. Differential item functioning (DIF) analysis was conducted with regard to ELL status on an inquiry-based science assessment, using a multifaceted Rasch DIF model. A total of 1,396 seventh- and eighth-grade students took the science test, including 313 ELL students. The results showed that, overall, non-ELLs significantly outperformed ELLs. Of the four items that showed DIF, three favored non-ELLs while one favored ELLs. The item that favored ELLs provided a graphic representation of a science concept within a family context. There is some evidence that constructed-response items may help ELLs articulate scientific reasoning using their own words. Assessment developers and teachers should pay attention to the possible interaction between linguistic challenges and science content when designing assessment for and providing instruction to ELLs.

6.
In a previous simulation study of methods for assessing differential item functioning (DIF) in computer-adaptive tests (Zwick, Thayer, & Wingersky, 1993, 1994), modified versions of the Mantel-Haenszel and standardization methods were found to perform well. In that study, data were generated using the 3-parameter logistic (3PL) model and this same model was assumed in obtaining item parameter estimates. In the current study, the 3PL data were used but the Rasch model was assumed in obtaining the item parameter estimates, which determined the information table used for item selection. Although the obtained DIF statistics were highly correlated with the generating DIF values, they tended to be smaller in magnitude than in the 3PL analysis, resulting in a lower probability of DIF detection. This reduced sensitivity appeared to be related to a degradation in the accuracy of matching. Expected true scores from the Rasch-based computer-adaptive test tended to be biased downward, particularly for lower-ability examinees.
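As a rough illustration of the Mantel-Haenszel approach named in this abstract, the sketch below computes the common odds ratio across matching-score strata and converts it to the MH D-DIF value on the delta scale (-2.35 times the natural log of the odds ratio). It is the textbook statistic, not the modified computer-adaptive version used in the study, which the abstract does not specify. The function name, the record layout, and the "ref"/"focal" labels are assumptions made for this example.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(records):
    """Return (alpha_MH, MH D-DIF) for one studied item.

    records: iterable of (group, matching_score, correct), where group is
    "ref" or "focal" and correct is 0 or 1. This layout is illustrative only.
    """
    # One 2x2 table per matching-score stratum: counts[group] = [wrong, right].
    strata = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for group, score, correct in records:
        strata[score][group][correct] += 1

    num = den = 0.0
    for counts in strata.values():
        ref_wrong, ref_right = counts["ref"]
        foc_wrong, foc_right = counts["focal"]
        n = ref_wrong + ref_right + foc_wrong + foc_right
        num += ref_right * foc_wrong / n
        den += ref_wrong * foc_right / n

    if num == 0 or den == 0:
        raise ValueError("odds ratio undefined for these counts")

    alpha = num / den  # alpha > 1 means the item favors the reference group
    return alpha, -2.35 * math.log(alpha)
```

With this sign convention, a negative MH D-DIF indicates that the item is harder for the focal group than for matched reference-group examinees.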

7.
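To overcome the test dependence and sample dependence of classical test theory, this study applied the Rasch model to the quality analysis of a science literacy assessment for sixth-grade primary school students. It describes the use of the Rasch model in quality analysis from several angles: overall quality checks, unidimensionality testing, the Wright map, single-item quality analysis, and bubble charts. The study also reports that the items designed for the assessment show high reliability and validity and reasonable discrimination, and that the vast majority of items met measurement expectations. Applying the Rasch model in assessment design provides measurement-quality data that can serve as a reference for that design.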

8.
In recent years, large-scale international assessments have been increasingly used to evaluate and compare the quality of education across regions and countries. However, measurement variance between different versions of these assessments often poses threats to the validity of such cross-cultural comparisons. In this study, we investigated the cross-language, cross-cultural validity of the Programme for International Student Assessment 2006 Science assessment via three differential item functioning (DIF) analyses: between the USA and Canada, between Chinese Hong Kong and mainland China, and between the USA and mainland China. Furthermore, we explored three plausible causes of DIF via content analysis, namely language, curriculum and cultural differences. Our results revealed that differential curriculum coverage was the most serious cause of DIF among the three factors we investigated in this study, and differential content familiarity also contributed to DIF here. We discussed the implications of the findings for future international assessment development, and for how to best define ‘scientific literacy’ for students around the world.

9.
The development of reliable tools for assessing digital competences is of great importance. This is why we set out to re-evaluate the measurement quality of the D21-Digital-Index assessment instrument. The D21-Digital-Index is a biannual and influential study held in Germany. The instrument used in the D21-surveys is based on the theoretical framework DigComp. In our analyses we used data of 1142 participants from vocational training and higher education institutions to estimate item parameters and the quality of the instrument using Item Response Theory. Because choosing an appropriate IRT-model is crucial for instrument evaluation, we calculated and compared two types of models, the Rasch and the Birnbaum model, of which the latter turned out to achieve the better fit. In a unidimensional analysis the five scales of the instrument with 24 items in total yield acceptable measures. Multidimensional analysis shows a dimensional separation and hence confirms the construct validity of the instrument.

10.
The purpose of this study was to make use of proposed definitions of environmental literacy to (1) guide the application of Rasch analysis and (2) utilize the developed instrumentation to further inform the work of environmental educators. A total of 2311 preservice teachers attending Faculty of Education departments of four public universities located in the capital city of Turkey provided data for this study. The instrument used included a knowledge scale, an attitude scale, an attitude towards environmental responsibility scale and a concern scale. Rasch analysis revealed that the items addressing environmental knowledge widely broadcast by mass media were also answered correctly by most participants. Generally, instrument items that addressed the understanding of the interrelated nature of environmental knowledge were answered incorrectly by participants. Analysis of attitude and attitude towards environmental responsibility scales indicated that the preservice teachers exhibited the most support for plant and animal rights, environmental protection laws and ecological balance. Results of the concern scale suggested that the preservice teachers were most concerned with regard to issues of poor drinking-water quality. Gender analysis revealed different orientations among females and males in terms of knowledge, attitudes, attitude towards environmental responsibility and concern scales.

11.
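Rasch Measurement Principles and an Empirical Study of Their Use in Evaluating Gaokao (College Entrance Examination) Item Writing   Total citations: 1 (self-citations: 1, citations by others: 1)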
王蕾 《中国考试》2008,(1):32-39
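Rasch measurement is a form of measurement in current educational and psychological testing that offers an objective, equal-interval scale. It overcomes two limitations of classical measurement: dependence on the test instrument and dependence on the sample. By introducing the principles of Rasch measurement and their concrete application to analyzing sampled examinee data for the evaluation of Gaokao item writing, this article gives education policymakers and item writers intuitive, quantitative, graphical representations of how Rasch measurement evaluates Gaokao items. The author hopes that Rasch measurement can offer a new and valuable way of thinking about the quantitative evaluation of item writing in the analysis of sampled Gaokao data, and that education policymakers and item writers will accept it and use it effectively.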
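For readers unfamiliar with the model behind this abstract, the standard dichotomous Rasch model can be written as follows. The notation (person ability θ_n, item difficulty b_i, response X_ni) is the conventional one and is assumed here; the abstract itself gives no formula.

```latex
% Probability that person n answers item i correctly
P(X_{ni} = 1 \mid \theta_n, b_i) = \frac{\exp(\theta_n - b_i)}{1 + \exp(\theta_n - b_i)}

% Log-odds form: ability and difficulty sit on one additive (logit) scale
\ln \frac{P(X_{ni} = 1)}{P(X_{ni} = 0)} = \theta_n - b_i

% Comparing persons 1 and 2 on the same item i: the item difficulty cancels
\ln \frac{P(X_{1i} = 1)}{P(X_{1i} = 0)} - \ln \frac{P(X_{2i} = 1)}{P(X_{2i} = 0)} = \theta_1 - \theta_2
```

The last line is one way to read the claim of freedom from instrument dependence: when the model holds, the comparison between two persons does not depend on which item is used, and by symmetry the comparison between two items does not depend on which person answers them.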

12.
Given the central importance of the Nature of Science (NOS) and Scientific Inquiry (SI) in national and international science standards and science learning, empirical support for the theoretical delineation of these constructs is of considerable significance. Furthermore, tests of the effects of varying magnitudes of NOS knowledge on domain‐specific science understanding and belief require the application of instruments validated in accordance with AERA, APA, and NCME assessment standards. Our study explores three interrelated aspects of a recently developed NOS instrument: (1) validity and reliability; (2) instrument dimensionality; and (3) item scales, properties, and qualities within the context of Classical Test Theory and Item Response Theory (Rasch modeling). A construct analysis revealed that the instrument did not match published operationalizations of NOS concepts. Rasch analysis of the original instrument, as well as of a reduced item set, indicated that a two‐dimensional Rasch model fit significantly better than a one‐dimensional model in both cases. Thus, our study revealed that NOS and SI are supported as two separate dimensions, corroborating theoretical distinctions in the literature. To identify items with unacceptable fit values, item quality analyses were used. A Wright Map revealed that few items sufficiently distinguished high performers in the sample and excessive numbers of items were present at the low end of the performance scale. Overall, our study outlines an approach for how Rasch modeling may be used to evaluate and improve Likert‐type instruments in science education.

13.
Loglinear latent class models are used to detect differential item functioning (DIF). These models are formulated in such a manner that the attribute to be assessed may be continuous, as in a Rasch model, or categorical, as in Latent Class Mastery models. Further, an item may exhibit DIF with respect to a manifest grouping variable, a latent grouping variable, or both. Likelihood-ratio tests for assessing the presence of various types of DIF are described, and these methods are illustrated through the analysis of a "real world" data set.

14.
This article describes the development, validation and application of a Rasch-based instrument, the Elementary School Science Classroom Environment Scale (ESSCES), for measuring students’ perceptions of constructivist practices within the elementary science classroom. The instrument, designed to complement the Reformed Teaching Observation Protocol (RTOP), is conceptualised using the RTOP’s three construct domains: Lesson Design and Implementation; Content; and Classroom Culture. Data from 895 elementary students was used to develop the Rasch scale, which was assessed for item fit, invariance and dimensionality. Overall, the data conformed to the assumptions of the Rasch model. In addition, the structural relationships among the retained items of the Rasch model supported and validated the instrument for measuring the reformed science classroom environment theoretical construct. The application of the ESSCES in a research study involving fourth grade students provides evidence that educators and researchers have a reliable instrument for understanding the elementary science classroom environment through the lens of the students.

15.
Student teachers are expected to develop their teaching skills sooner and more rapidly. However, sound evaluation instruments that can be used to diagnose and monitor skill levels in support of formative assessment of student teachers are still limited. This article aims to calibrate and validate a teaching skill evaluation instrument for use in secondary education. A total of 264 student teachers in the Netherlands participated in the study. Rasch and multilevel analyses were used. Results suggest that the evaluation instrument meets the restrictive assumptions of the Rasch model and has predictive value for academic engagement. This adds validation evidence and justifies the calibration of the evaluation instrument to be used for monitoring the development of teachers' teaching skills.

16.
In this paper we present a new methodology for detecting differential item functioning (DIF). We introduce a DIF model, called the random item mixture (RIM), that is based on a Rasch model with random item difficulties (besides the common random person abilities). In addition, a mixture model is assumed for the item difficulties such that the items may belong to one of two classes: a DIF or a non-DIF class. The crucial difference between the DIF class and the non-DIF class is that the item difficulties in the DIF class may differ according to the observed person groups while they are equal across the person groups for the items from the non-DIF class. Statistical inference for the RIM is carried out in a Bayesian framework. The performance of the RIM is evaluated using a simulation study in which it is compared with traditional procedures, like the likelihood ratio test, the Mantel-Haenszel procedure and the standardized p-DIF procedure. In this comparison, the RIM performs better than the other methods. Finally, the usefulness of the model is also demonstrated on a real-life data set.

17.
《Africa Education Review》2013,10(3):365-385
Are South Africans financially literate, and how can this be measured? Until 2009 there was no South African financial literacy measure and, therefore, the aim was to develop a South African measurement instrument that is scientific, socially acceptable, valid and reliable. To achieve this aim, a contextual and conceptual analysis of financial literacy was applied; it indicated the importance of financial literacy and the scope and impact of financial literacy education, and it uncovered an acceptable definition of financial literacy and its constituent concepts. A rigorous five-step process was then followed in developing a questionnaire that measures financial literacy knowledge, behaviour and attitude. This draft questionnaire was applied at the South African Military Academy (SAMA), firstly to determine and improve its validity and reliability, and secondly to measure the financial literacy levels of school leavers. Experts and users found this measurement instrument to be valid, and internal consistency levels of above 0.7 registered its reliability. On average the first-year SAMA students achieved scores of 55.55%, 69.85%, and 77.11% for financial literacy knowledge, behaviour and attitude, respectively. As a result it is postulated that there is now a scientific and socially relevant, valid and reliable South African financial literacy measurement instrument available.

18.
Efficacy of the Measure of Understanding of Macroevolution (MUM) as a measurement tool has been a point of contention among scholars needing a valid measure for knowledge of macroevolution. We explored the structure and construct validity of the MUM using Rasch methodologies in the context of a general education biology course designed with an emphasis on macroevolution content. The Rasch model was utilized to quantify item- and test-level characteristics, including dimensionality, reliability, and fit with the Rasch model. Contrary to previous work, we found that the MUM provides a valid, reliable, and unidimensional scale for measuring knowledge of macroevolution in introductory non-science majors, and that its psychometric behavior does not exhibit large changes across time. While we found that all items provide productive measurement information, several depart substantially from ideal behavior, warranting a collective effort to improve these items. Suggestions for improving the measurement characteristics of the MUM at the item and test levels are put forward and discussed.

19.
Multilevel Rasch models are increasingly used to estimate the relationships between test scores and student and school factors. Response data were generated to follow one-, two-, and three-parameter logistic (1PL, 2PL, 3PL) models, but the Rasch model was used to estimate the latent regression parameters. When the response functions followed 2PL or 3PL models, the proportion of variance explained in test scores by the simulated student or school predictors was estimated accurately with a Rasch model. Proportion of variance within and between schools was also estimated accurately. The regression coefficients were misestimated unless they were rescaled out of logit units. However, item-level parameters, such as DIF effects, were biased when the Rasch model was violated, similar to single-level models.
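To make the data-generation step in this abstract concrete, the sketch below shows the 1PL, 2PL, and 3PL item response functions as one Python function and draws simulated 0/1 responses from it. It is a minimal illustration under stated assumptions: the function names, parameter names (`a`, `b`, `c`), and the omission of the optional 1.7 scaling constant are choices made for this example, and the study's multilevel structure is not reproduced.

```python
import math
import random


def p_correct(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under the 3PL model.

    theta: person ability; b: item difficulty; a: discrimination;
    c: lower asymptote (guessing). With a=1 and c=0 this reduces to the
    Rasch/1PL form; with c=0 only, to the 2PL form.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def simulate_responses(thetas, items, seed=0):
    """Draw a persons-by-items matrix of 0/1 responses.

    items: list of dicts holding the keyword arguments of p_correct,
    for example {"b": 0.5, "a": 1.5, "c": 0.2} (illustrative values).
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [
        [int(rng.random() < p_correct(theta, **item)) for item in items]
        for theta in thetas
    ]
```

Fitting a Rasch model to data generated with `a` different from 1 or `c` above 0 is the kind of misspecification the abstract examines.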

20.
Understanding infectious diseases such as influenza is an important element of health literacy. We present a fully validated knowledge instrument called the Assessment of Knowledge of Influenza (AKI) and use it to evaluate knowledge of influenza, with a focus on misconceptions, in Midwestern United States high-school students. A two-phase validation process was used. In phase 1, an initial factor structure was calculated based on 205 students of grades 9–12 at a rural school. In phase 2, one- and two-dimensional factor structures were analyzed from the perspectives of classical test theory and the Rasch model using structural equation modeling and principal components analysis (PCA) on Rasch residuals, respectively. Rasch knowledge measures were calculated for 410 students from 6 school districts in the Midwest, and misconceptions were verified through the χ² test. Eight items measured knowledge of flu transmission, and seven measured knowledge of flu management. While alpha reliability measures for the subscales were acceptable, Rasch person reliability measures and PCA on residuals advocated for a single-factor scale. Four misconceptions were found, which have not been previously documented in high-school students. The AKI is the first validated influenza knowledge assessment, and can be used by schools and health agencies to provide a quantitative measure of impact of interventions aimed at increasing understanding of influenza. This study also adds significantly to the literature on misconceptions about influenza in high-school students, a necessary step toward strategic development of educational interventions for these students.
