School of Languages and Linguistics - Theses

  • Item
    The process of the assessment of writing performance: the rater's perspective
    Lumley, Thomas James Nathaniel (2000)
    The primary purpose of this study is to investigate the process by which raters of texts written by ESL learners make their scoring decisions. The context is the Special Test of English Proficiency (STEP), used by the Australian government to assist in immigration decisions. Four trained, experienced and reliable STEP raters took part in the study, providing scores for two sets of 24 texts. The first set was scored as in an operational rating session. Raters then provided think-aloud protocols describing the rating process as they rated the second set. Scores under the two conditions were compared, both with each other and with the raters' operational rating behaviour. Both similarities and differences were observed. A coding scheme developed to describe the think-aloud data allowed analysis of the sequence of rating, the interpretations the raters made of the scoring categories in the analytic rating scale, and the difficulties raters faced in rating. Findings demonstrate that raters follow a fundamentally similar rating process, in three stages. With some exceptions, they appear to hold similar interpretations of the scale categories and descriptors, but the relationship between scale contents and text quality remains obscure. A model is presented describing the rating process. This shows that rating is at one level a rule-bound, socially governed procedure that relies upon a rating scale and the rater training which supports it, but it retains an indeterminate component as a result of the complexity of raters' reactions to individual texts. The task raters face is to reconcile their impression of the text, the specific features of the text, and the wordings of the rating scale, thereby producing a set of scores. The rules and the scale do not cover all eventualities, forcing the raters to develop various strategies to help them cope with problematic aspects of the rating process. In doing this, they try to remain close to the scale, but are also heavily influenced by the complex intuitive impression of the text obtained when they first read it. This sets up a tension between the rules and the intuitive impression, which raters resolve by what is ultimately a somewhat indeterminate process. In spite of this tension and indeterminacy, rating can succeed in yielding consistent scores provided raters are supported by adequate training, with additional guidelines to assist them in dealing with problems. Rating requires such constraining procedures to produce reliable measurement.
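    A minimal sketch of the kind of score comparison described above, assuming a simple list-of-scores data layout (the values and names are invented for illustration; the thesis does not specify an implementation):

      # Hypothetical sketch: one rater's scores for two different sets
      # of 24 texts, one set rated operationally and one while thinking
      # aloud. Example values are invented, not taken from the study.
      from statistics import mean, stdev

      operational_scores = [5, 4, 6, 3, 5, 4, 6, 5, 3, 4, 5, 6,
                            4, 5, 3, 6, 5, 4, 5, 6, 3, 4, 5, 4]
      think_aloud_scores = [5, 4, 5, 3, 5, 5, 6, 5, 3, 4, 5, 6,
                            4, 4, 3, 6, 5, 4, 5, 5, 3, 4, 5, 4]

      # Because each condition uses a different set of texts, a paired
      # comparison is unavailable; compare the score distributions
      # instead: a shift in means suggests a change in severity, a
      # change in spread suggests a change in score dispersion.
      for label, scores in [("operational", operational_scores),
                            ("think-aloud", think_aloud_scores)]:
          print(f"{label}: mean={mean(scores):.2f}, sd={stdev(scores):.2f}")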
  • Item
    Assessing the second language proficiency of health professionals
    McNamara, Timothy Francis (1990)
    This thesis reports on the development of an Australian Government English as a Second Language test for health professionals, the Occupational English Test (OET), and its validation using Rasch Item Response Theory (IRT) models. The test contains sub-tests of the four macroskills, each based on workplace communication tasks. The thesis reports on the creation of test specifications, the trialling of test materials and the analysis of data from full test sessions. The main research issues dealt with are as follows:
    1. The nature of the constructs involved in communicative language testing. The term proficiency is analysed, and its relationship to a number of models of communicative competence examined. The difficulty of incorporating into these models factors underlying test performance is identified.
    2. The nature of performance tests. A distinction is introduced between strong and weak senses of the term performance test, and related to the discussion in 1 above.
    3. The content validity of the OET. This is established on the basis of a questionnaire survey, interviews, examination of relevant literature, workplace observation and test data.
    4. The role of classical and Rasch IRT analysis in establishing the qualities of the test. Classical and Rasch IRT analyses are used to establish the basic reliability of the OET sub-tests. The Writing sub-test is shown to be somewhat problematic for raters because of the nature of the writing task involved. Analysis of data from the Reading sub-test demonstrates the superiority of the Rasch analysis in the creation of short tests with a specific screening function.
    5. The role of Rasch IRT analysis in investigating the construct and content validity of the test, and hence of communicatively oriented tests in general. Rasch analysis reveals that the sub-tests are satisfactory operationalizations of the constructs 'ESL listening/speaking/reading/writing ability in health professional contexts'. For the Speaking and Writing sub-tests, the analysis reveals that responses of raters in categories associated with perceptions of grammatical accuracy have a more important role in the determination of the candidate's total score than was anticipated in the design of the test. This finding has implications for the validity of communicatively oriented tests in general, and illustrates the potential of IRT analysis for the investigation of the construct validity of tests.
    6. The appropriateness of the use of Rasch IRT in the analysis of language tests. The nature of the debate about 'unidimensionality' in Rasch analysis is reviewed. It is argued that the issue has been substantially misunderstood. Data from the two parts of the Listening sub-test are analysed, and statistical tests are used to confirm the unidimensionality of the data set. It is concluded that Rasch analysis is appropriate for a language test of this type.
    7. The behaviour of raters in the rating of oral and written production in a second language. The findings reported in 5 above suggest that the behaviour of raters is crucial to understanding what is being measured in a communicative test of the productive language skills.
    The research demonstrates the value of Rasch IRT analysis in the empirical validation of communicatively oriented language tests, and the potential of large-scale test development projects for theoretical work on language testing.
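    For orientation, the Rasch model invoked throughout this abstract has, in its basic dichotomous form, a single standard formulation: the probability of success depends only on the difference between person ability and item difficulty. A minimal sketch of that textbook formulation (not code from the thesis):

      # Dichotomous Rasch model: probability that a person of ability
      # theta succeeds on an item of difficulty b, where theta and b
      # are expressed on the same logit scale.
      import math

      def rasch_probability(theta: float, b: float) -> float:
          """P(success) = exp(theta - b) / (1 + exp(theta - b))."""
          return 1.0 / (1.0 + math.exp(-(theta - b)))

      # Equal ability and difficulty gives exactly 0.5; an advantage
      # of one logit gives roughly 0.73.
      print(rasch_probability(0.0, 0.0))  # 0.5
      print(rasch_probability(1.0, 0.0))  # ~0.731

    The unidimensionality debate reviewed in issue 6 above amounts to asking whether a single ability parameter per person, as in this formulation, is sufficient to account for the observed response patterns.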