Similar articles
 Found 20 similar articles; search took 31 ms
1.
Abstract

This study concerns the initial development of a scale to measure teachers' attitudes toward teaching as a profession.

A modification of the W-technique, a combination of the equal-appearing interval and paired-comparison methods, was utilized to construct the scales. Teachers were used to judge the statements in the preliminary stages of scaling. An alternate scale was constructed for correlational purposes and later a revised scale was also developed from the same statements.

An estimate of test-retest reliability was obtained. In one college class a correlation coefficient of .92 was obtained between the original and alternate scales. In another class, the test-retest coefficient for the original scale was .99, and the correlation between the alternate and original scales was .97.

The high correlation gives evidence for the reliability of the original scale and both scales appear ready for further research purposes.
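
The reliability coefficients reported above are correlations between two sets of scores from the same respondents. As a minimal sketch of that computation, assuming the coefficient is a Pearson product-moment correlation (the abstract does not name the estimator) and using invented scores, one could write:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scale scores for the same six respondents on two occasions
# (illustrative values only, not data from the study).
first_administration = [4.1, 6.3, 5.0, 7.8, 3.2, 6.9]
second_administration = [4.4, 6.0, 5.3, 7.5, 3.0, 7.1]
reliability = pearson_r(first_administration, second_administration)
print(round(reliability, 2))
```

The same function would apply to an alternate-forms coefficient by passing scores on the original scale and on the alternate scale instead of two administrations of one scale.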

2.
In automated test assembly (ATA), the methodology of mixed‐integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different cases are discussed: (i) computerized test forms in which the items are presented on a screen one at a time and only their optimal order has to be determined; (ii) paper forms in which the items need to be ordered and paginated and the typical goal is to minimize paper use; and (iii) published test forms with the same requirements but a more sophisticated layout (e.g., double‐column print). For each case, a menu of possible test‐form specifications is identified, and it is shown how they can be modeled as linear constraints using 0–1 decision variables. The methodology is demonstrated using two empirical examples.
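
To make the idea of 0–1 decision variables with linear constraints concrete, the sketch below selects items from a tiny invented bank. It is an illustration of the item-selection step only, not the article's formatting models, and it replaces a mixed-integer programming solver with exhaustive enumeration, which is workable only for a handful of items. The bank, the content areas, and the function name `assemble` are all hypothetical.

```python
from itertools import combinations

# Hypothetical item bank: (item id, content area, information at the target ability).
BANK = [
    ("i1", "algebra", 0.42),
    ("i2", "algebra", 0.35),
    ("i3", "geometry", 0.51),
    ("i4", "geometry", 0.28),
    ("i5", "number", 0.47),
    ("i6", "number", 0.31),
]

def assemble(bank, length, min_per_area):
    """Return the feasible item subset with maximum total information.

    Choosing a subset is equivalent to setting x_i = 1 for selected items and
    x_i = 0 otherwise; the constraints below are linear in those x_i.
    """
    best_items, best_info = None, float("-inf")
    for subset in combinations(bank, length):      # length constraint: sum_i x_i = length
        counts = {}
        for _, area, _ in subset:
            counts[area] = counts.get(area, 0) + 1
        # Content constraint: the sum of x_i within each area must reach its minimum.
        if any(counts.get(area, 0) < k for area, k in min_per_area.items()):
            continue
        info = sum(item[2] for item in subset)     # objective: sum_i I_i * x_i
        if info > best_info:
            best_items, best_info = subset, info
    return best_items, best_info

form, total_info = assemble(
    BANK, length=4, min_per_area={"algebra": 1, "geometry": 1, "number": 1}
)
print([item[0] for item in form], round(total_info, 2))
```

In practice the same objective and constraints would be handed to a MIP solver, since enumeration grows combinatorially with the size of the bank.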

3.
An essential question when computing test–retest and alternate forms reliability coefficients is how many days there should be between tests. This article uses data from reading and math computerized adaptive tests to explore how the number of days between tests impacts alternate forms reliability coefficients. Results suggest that the highest alternate forms reliability coefficients were obtained when the second test was administered at least 2 to 3 weeks after the first test. Even though reliability coefficients after this amount of time were often similar, results suggested a potential tradeoff in waiting longer to retest as student ability tended to grow with time. These findings indicate that, if keeping student ability similar is a concern, the best time to retest is shortly after 3 weeks have passed since the first test. Additional analyses suggested that alternate forms reliability coefficients were lower when tests were shorter and that narrowing the first test ability distribution of examinees also impacted estimates. Results did not appear to be largely impacted by differences in first test average ability, student demographics, or whether the student took the test under standard or extended time. It is suggested that for math and reading tests, like the ones analyzed in this article, the optimal retest interval would be shortly after 3 weeks have passed since the first test.

4.
Equivalent forms of a ten-item completion test were constructed. The same test items then were rewritten in matching format and in multiple-choice format, resulting in two forms (A and B) of each of three types of test. All tests were administered to 73 examinees, and parallel-forms reliability coefficients (correlation between scores on A and B) were calculated. These empirically obtained values were compared to the values of the reliability coefficient predicted from theoretically derived equations which indicate the influence of chance success due to guessing on test reliability. In accordance with theory it was found that the completion test was more reliable than the matching test and that the matching test was more reliable than the multiple-choice test. The empirically obtained reliability coefficients were very close to those predicted from the mathematically derived formulas.

5.
The concept of invariance in equating and linking is traced from the 1950s to the present. A number of research studies that examined population invariance are reviewed. Theory and research suggest that linkings other than equatings are population dependent. Theory also indicates that equatings are population dependent, although when test forms are built to detailed tables of content and statistical specifications and alternate forms are very similar to one another, the research suggests that equatings might be approximately population invariant. Suggestions are made about further research that should be conducted on methodology for examining population invariance and on empirical research to better understand the conditions under which equatings are sufficiently population invariant for practical purposes.

6.
Abstract

Previous researchers having established the equivalence of a group administered version of the PPVT with the standard procedure of individual administration and the reliability between alternate forms of the PPVT, an attempt was made to establish the concurrent validity of a group administered version of the PPVT in terms of two criterion variables. An r of .62 was obtained between the Otis, a group test of intelligence, and the PPVT. An r of .55 was found between the PPVT and the Stanford Achievement Test. Both r’s were significant beyond the .01 level. The concurrent validity of the PPVT was established and suggestions for additional research were made.

7.
Background: Many international science curriculum documents mandate that students should be able to participate in argument, debate and decision-making about contemporary science issues affecting society. Termed socioscientific issues, these topics provide students with opportunities to use their scientific knowledge to discuss, debate and defend their decisions and to evaluate the arguments of their peers.

Purpose: This study describes the development and trialling of scenarios based on the socioscientific issue of climate change. The scenarios required students to make and justify a decision and were designed to assess students’ argumentation skills.

Sample: A sample of 162 Year 10 students from five schools in Perth, Western Australia participated in this study.

Design and methods: Recent media articles were reviewed to identify relevant contexts for scenarios related to climate change that could be used to develop and assess students’ argumentation skills. In the first phase, students trialled scenarios about wind farms and hydrogen fuel buses using writing frames with scaffolding questions to generate as many reasons as possible to justify their decision. The responses were categorised into themes which were used to prepare a scoring rubric. In the second phase, students generated written arguments about the scenarios to support their decision. The arguments were analysed using both the scoring rubric developed from the first phase and Toulmin’s argumentation pattern of claim, data, backing, qualifier and rebuttal.

Results: Students’ responses to the scaffolded questions were categorised into themes of agriculture, economy, energy, environment, human impact and ethical factors. The themes of economy and the environment predominated with ethical justifications cited infrequently. An analysis of the arguments generated revealed a majority of students’ responses consisted of a claim and data with backings, qualifiers and rebuttals rarely provided.

Conclusions: Scenarios about climate change socioscientific issues can be used by teachers to both develop and assess students’ argumentation skills in classroom settings.


8.

The essay has been called the 'default genre' in high school and university education. This paper examines the nature, history and function of the essay in this role, including feminist critiques of the genre. It explores in particular the dialogic or multi-voiced character of most academic essays, and suggests that it is through dialogic structuring that new forms of academic writing might be generated. Excerpts from five student essays, and other forms of coursework and examination work are studied. The paper suggests that the handing in of essays and their role in the assessment of student performance is an elaborate game that students and teachers/lecturers have to learn to play well in order for both sides to enjoy and gain from the experience; it also concludes that it is time to recognise more formally the diverse forms of student expression as valid contributions to the demonstration of emerging knowledge.

9.
ABSTRACT

Construct-irrelevant cognitive complexity of some items in the statewide grade-level assessments may impose performance barriers for students with disabilities who are ineligible for alternate assessments based on alternate achievement standards. This has spurred research into whether items can be modified to reduce complexity without affecting item construct. This study uses a generalized linear mixed modeling analysis to investigate the effects of item modifications on improving test accessibility by reducing construct-irrelevant cognitive barriers for persistently low-performing fifth-grade students with cognitive disabilities. The results showed item scaffolding was an effective modification for both mathematics and reading. Other modifications, such as bolding/underlining of key words, hindered test performance for low-performing students. We discuss the findings’ potential impact on test development with universal design.

10.
While curriculum-based measurement (CBM) tools for screening decisions in reading, mathematics, and written language have been well examined, tools for use in content areas (e.g., science and social studies) remain in the beginning stages of research. In this study, two alternate forms of a new CBM tool (Statement Verification for Science; SV-S), for screening decisions regarding students’ science content knowledge, are examined for technical adequacy. A total of 1,545 students across Grades 7 (n = 799) and 8 (n = 746) completed two alternate forms of SV-S concurrently with a statewide high-stakes test of accountability. Promising results were found for reliability, in particular internal consistency, while results related to evidence of criterion- and construct-related validity were less than desired. Such results, along with additional exploratory analyses, provide support for future research of SV-S as a CBM tool to assist teachers and other educators with making screening decisions.

11.
When a computerized adaptive testing (CAT) version of a test co-exists with its paper-and-pencil (P&P) version, it is important for scores from the CAT version to be comparable to scores from its P&P version. The CAT version may require multiple item pools for test security reasons, and CAT scores based on alternate pools also need to be comparable to each other. In this paper, we review research literature on CAT comparability issues and synthesize issues specific to these two settings. A framework of criteria for evaluating comparability was developed that contains the following three categories of criteria: validity criterion, psychometric property/reliability criterion, and statistical assumption/test administration condition criterion. Methods for evaluating comparability under these criteria as well as various algorithms for improving comparability are described and discussed. Focusing on the psychometric property/reliability criterion, an example using an item pool of ACT Assessment Mathematics items is provided to demonstrate a process for developing comparable CAT versions and for evaluating comparability. This example illustrates how simulations can be used to improve comparability at the early stages of the development of a CAT. The effects of different specifications of practical constraints, such as content balancing and item exposure rate control, and the effects of using alternate item pools are examined. One interesting finding from this study is that a large part of incomparability may be due to the change from number-correct score-based scoring to IRT ability estimation-based scoring. In addition, changes in components of a CAT, such as exposure rate control, content balancing, test length, and item pool size were found to result in different levels of comparability in test scores.

12.
13.
Abstract

This study sought to determine if an objective measurement of instrumental music achievement could be obtained on sight-reading rhythms and if it could differentiate degrees of attainment. Through the utilization of concepts introduced by Joseph Schillinger, mathematical constructs were used to develop equivalent forms of an individual instrumental music performance test. Complete test protocols for 771 subjects at the 5th grade level or above revealed a high degree of consistency for the two forms of the test. Randomly selected tape recordings of testing sessions reflected a high degree of scorer reliability. Multiple and partial correlations revealed the importance of experience to performance, and t-test results indicated differences between any two levels of experience when experience was defined in years.

14.
Purpose: This article assesses a non-traditional training methodology for extension agents, focused on the exchange of experiences among peers and the reflection on practice, with the aim of exploring its potential as a training strategy.

Design/Methodology/approach: A quali-quantitative investigation was conducted, which included interviews with extension agents, the use of different questionnaires, and recordings of the evaluation sessions carried out during each workshop.

Findings: This research allowed us to understand the importance of effective group coordination, a participatory climate, working in small groups, and the feedback loop between theory and practice for processes of experience sharing and reflection on practice. Some of the positive effects of the training observed were that extension agents acquired new knowledge and methodologies, reflected critically upon their practice, and put into question their own extension approach.

Practical Implications: Given its potential, implementing training processes focused on experience sharing and reflection on practice for rural extension workers seems advisable.

Theoretical Implications: This article contributes to the understanding of how experience sharing and reflection on practice can generate transformations in rural extension agents’ approaches and positioning.

Originality/Value: This study systematically assesses the impacts that training has on extension workers, as well as the underlying processes that made it possible to generate them.


15.
A critical component of test speededness is the distribution of the test taker’s total time on the test. A simple set of constraints on the item parameters in the lognormal model for response times is derived that can be used to control the distribution when assembling a new test form. As the constraints are linear in the item parameters, they can easily be included in a mixed integer programming model for test assembly. The use of the constraints is demonstrated for the problems of assembling a new test form to be equally speeded as a reference form, test assembly in which the impact of a change in the content specifications on speededness is to be neutralized, and the assembly of test forms with a revised level of speededness.
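
The following sketch suggests why quantities derived from a lognormal response-time model can enter a test-assembly model additively. It assumes one common parameterization, ln T ~ Normal(beta - tau, 1/alpha^2), with beta an item's time intensity, alpha its discrimination, and tau the test taker's speed; the abstract does not spell out the parameterization or the exact constraints, so both the form and the item values here are assumptions for illustration.

```python
from math import exp

def expected_item_time(alpha, beta, tau=0.0):
    """Mean of a lognormal response time with ln T ~ Normal(beta - tau, 1 / alpha**2)."""
    return exp(beta - tau + 1.0 / (2.0 * alpha ** 2))

def expected_total_time(items, tau=0.0):
    """Expected total time on a form: a plain sum over the selected items."""
    return sum(expected_item_time(alpha, beta, tau) for alpha, beta in items)

# Hypothetical (alpha, beta) pairs; times are in arbitrary units (e.g. seconds).
reference_form = [(1.8, 3.9), (2.0, 4.1), (1.5, 4.3)]
candidate_form = [(1.9, 4.0), (1.7, 4.0), (1.6, 4.3)]

# Difference in expected total time between the candidate and the reference form.
gap = expected_total_time(candidate_form) - expected_total_time(reference_form)
print(round(gap, 1))
```

Under this parameterization each item contributes a fixed term exp(beta + 1 / (2 * alpha**2)) that is scaled by exp(-tau) for every test taker, so matching the sum of those terms to a reference form is a constraint that is linear in the 0–1 selection variables.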

16.
Two types of answer-copying statistics for detecting copiers in small-scale examinations are proposed. One statistic identifies the "copier-source" pair, and the other in addition suggests who is copier and who is source. Both types of statistics can be used when the examination has alternate test forms. A simulation study shows that the statistics do not depend on the total-test score. Another simulation study compares the statistics with two known statistics, and shows that they have substantial power. The new statistics are applied to data from a small-scale examination (N = 230) with two alternate test forms. Auxiliary information on the seat location of the examinees and the test scores of the examinees was used to determine whether or not examinees could be suspected.

17.
Background: Enhancing students’ metacognitive abilities will help to facilitate their understanding of science concepts.

Purpose: The study was designed to conduct and evaluate the effectiveness of a repertoire of interventions aimed at enhancing secondary school students’ metacognitive capabilities and their achievements in science.

Sample: A class of 35 Year 9 students participated in the study.

Design and methods: The study involved a pre-post design, conducted by the first author as part of the regular designated science programme in a class taught by him.

In order to enhance the students’ metacognitive capabilities, the first author employed clearly stated focused outcomes, engaging them in collaborative group work, reading scientific texts and using concept mapping techniques during classroom instruction. The data to evaluate the effectiveness of the metacognitive interventions were obtained from pre- and post-test results of two metacognitive questionnaires, the Metacognitive Support Questionnaire (MSpQ) and the Metacognitive Strategies Questionnaire (MStQ), and data from interviews. In addition, pre-test and post-test scores were used from a two-tier multiple-choice test on Light.

Results: The results showed gains in the MSpQ but not in the MStQ. However, the qualitative data from interviews suggested high metacognitive capabilities amongst the high- and average-achieving students at the end of the study. Students’ gains were also evident from the test scores in the Light test.

Conclusion: Although the quantitative data obtained from the Metacognitive Strategies Questionnaire did not show significant gains in the students’ metacognitive strategies, the qualitative data from interviews suggested positive perceptions of students’ metacognitive strategies amongst the high- and average-achieving students. Data from the Metacognitive Support Questionnaire showed that there were significant gains in the students’ perceptions of their metacognitive support implying that the majority of the students perceived that their learning environment was oriented towards the development of their metacognitive capabilities. The effect of the metacognitive interventions on students’ achievement in the Light test resulted in students displaying the correct declarative knowledge, but quite often they lacked the procedural knowledge by failing to explain their answers correctly.


18.
Purpose: In this study, we use a self-determination theory (SDT) approach to understand farmers’ attitudes toward, and intentions for, participation in competence development projects (CDP).

Design/methodology/approach: By applying SDT, we developed two measures. The first one assessed the degree to which the three basic human psychological needs motivate farmers to engage in CDP, and the second concerned farmers’ intrinsic and extrinsic motivation to seek knowledge through participation in CDP. Using data from two samples of farmers, we examined the effect of SDT needs and the influence of the different regulatory styles on individuals’ decision to participate in CDP.

Findings: Our findings indicated that participation in CDP is guided by the most internal forms of human motivation (identified, integrated, and intrinsic motivation), and that deficits in the needs for autonomy and competence predict farmers’ decision to participate in CDP.

Practical implications: These results stress the importance of designing CDP that promote self-directedness, emphasise choice rather than rewards, and generate the conditions that support farmers’ autonomy.

Theoretical implications: Our work suggests that the integration of social psychology into extension/education research can paint a more detailed picture of the way farmers interact with extension/education services.

Originality/value: To the best of our knowledge, this is the first study that uses an SDT framework to examine farmers’ motivation toward participation in CDP. Hence, this research opens a new realm for extension/education research, while it also contributes to the SDT literature by examining the role of self-determined motivation in a different life domain.


19.
This study explored the role of learner-generated and instructor-provided visuals in learning from scientific text. A total of 134 college students studied a lesson on the human circulatory system and then completed recall and transfer tests. Across two consecutive study periods, students were randomly assigned to either view a provided illustration twice (provided-provided), generate a drawing from the text and then revise their drawing (generated-revised), view a provided illustration and then generate a drawing (provided-generated), or generate a drawing and then view a provided illustration (generated-provided). Results indicated a group by learning outcome interaction: the generated-provided and provided-generated groups performed higher on the transfer test and lower on the recall test compared to the provided-provided group. Furthermore, spatial ability was positively associated with learning outcomes among students who generated drawings but not among students in the provided-provided group. Finally, the relationship between spatial ability and learning outcomes among students who generated drawings was mediated by drawing quality. These findings suggest that provided and generated visuals have unique effects on different learning outcomes, and spatial ability plays an important role in supporting learner-generated visuals.

20.
An important assumption of item response theory is item parameter invariance. Sometimes, however, item parameters are not invariant across different test administrations due to factors other than sampling error; this phenomenon is termed item parameter drift. Several methods have been developed to detect drifted items. However, most of the existing methods were designed to detect drifts in individual items, which may not be adequate for test characteristic curve–based linking or equating. One example is the item response theory–based true score equating, whose goal is to generate a conversion table to relate number‐correct scores on two forms based on their test characteristic curves. This article introduces a stepwise test characteristic curve method to detect item parameter drift iteratively based on test characteristic curves without needing to set any predetermined critical values. Comparisons are made between the proposed method and two existing methods under the three‐parameter logistic item response model through simulation and real data analysis. Results show that the proposed method produces a small difference in test characteristic curves between administrations, an accurate conversion table, and a good classification of drifted and nondrifted items and at the same time keeps a large amount of linking items.
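
As a rough illustration of the quantities involved, the sketch below computes a test characteristic curve under the three-parameter logistic model and the largest gap between the curves from two administrations. It is not the stepwise procedure the article proposes; the item parameters are invented, the logistic form omits the optional scaling constant D = 1.7, and the function names are hypothetical.

```python
from math import exp

def p_3pl(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + exp(-a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: expected number-correct score at ability theta."""
    return sum(p_3pl(theta, a, b, c) for a, b, c in items)

def max_tcc_gap(items_1, items_2, grid):
    """Largest absolute TCC difference between two administrations over a theta grid."""
    return max(abs(tcc(t, items_1) - tcc(t, items_2)) for t in grid)

# Hypothetical (a, b, c) estimates for three linking items at two administrations;
# the third item's difficulty has drifted upward in the second administration.
admin_1 = [(1.2, -0.5, 0.20), (0.9, 0.0, 0.25), (1.5, 0.8, 0.15)]
admin_2 = [(1.2, -0.5, 0.20), (0.9, 0.0, 0.25), (1.5, 1.3, 0.15)]
grid = [x / 10.0 for x in range(-40, 41)]   # theta from -4.0 to 4.0

print(round(max_tcc_gap(admin_1, admin_2, grid), 3))          # all three items
print(round(max_tcc_gap(admin_1[:2], admin_2[:2], grid), 3))  # drifted item dropped
```

Dropping the drifted item removes the gap entirely in this toy case because the remaining parameters are identical; with real estimates the gap would shrink rather than vanish.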
