School of Languages and Linguistics - Theses

  • Item
    The process of the assessment of writing performance: the rater's perspective
    Lumley, Thomas James Nathaniel (2000)
    The primary purpose of this study is to investigate the process by which raters of texts written by ESL learners make their scoring decisions. The context is the Special Test of English Proficiency (step), used by the Australian government to assist in immigration decisions. Four trained, experienced and reliable step raters took part in the study, providing scores for two sets of 24 texts. The first set was scored as in an operational rating session. Raters then provided think-aloud protocols describing the rating process as they rated the second set. Scores were compared under the two conditions and comparisons made with the raters' operational rating behaviour. Both similarities and differences were observed. A coding scheme developed to describe the think-aloud data allowed analysis of the sequence of rating, the interpretations the raters made of the scoring categories in the analytic rating scale, and the difficulties raters faced in rating. Findings demonstrate that raters follow a fundamentally similar rating process, in three stages. With some exceptions, they appear to hold similar interpretations of the scale categories and descriptors, but the relationship between scale contents and text quality remains obscure. A model is presented describing the rating process. This shows that rating is at one level a rule-bound, socially governed procedure that relies upon a rating scale and the rater training which supports it, but it retains an indeterminate component as a result of the complexity of raters' reactions to individual texts. The task raters face is to reconcile their impression of the text, the specific features of the text, and the wordings of the rating scale, thereby producing a set of scores. The rules and the scale do not cover all eventualities, forcing the raters to develop various strategies to help them cope with problematic aspects of the rating process. 
In doing this they try to remain close to the scale, but are also heavily influenced by the complex intuitive impression of the text obtained when they first read it. This sets up a tension between the rules and the intuitive impression, which raters resolve by what is ultimately a somewhat indeterminate process. In spite of this tension and indeterminacy, rating can succeed in yielding consistent scores provided raters are supported by adequate training, with additional guidelines to assist them in dealing with problems. Rating requires such constraining procedures to produce reliable measurement.
  • Item
    An investigation of the relationship between L2 learners' goals and their attitudes towards their learning
    da Silva, Ronivaldo Braz (2006)
    This thesis investigates the students' reactions to a specific pedagogical approach to second language (L2) writing, termed the Enhanced Genre Approach (EGA), looking not just at gain scores and students' evaluative comments, but also at how individual students differ in their classroom behaviour. The approach entails the teaching of one specific genre, the argumentative essay, through the use of model texts organised according to the elements of Toulmin's (1958; see also Toulmin et al., 1984) framework of argumentation. It emphasises the communicative purpose of writing and the importance of having an audience in mind during the writing process. Other features of the approach include handouts and exercises derived from the model texts and Toulmin's argumentative framework, written feedback provided by the teacher with a focus on Toulmin's elements of claim, grounds, and warrants, the opportunity to re-write essays, and pair-work activities. This study presents my perspective as the teacher and designer of the EGA and investigates the students' reactions to, and engagement with, this approach in the classroom. The investigation of the students' perspectives is framed within the socio-cultural theoretical framework. The investigation of the students' reactions to the EGA covers their likes and dislikes about the specific features of the approach, as well as their improvement over the length of the course. The investigation also explores similarities and differences observed across the students in terms of outcome and behaviour. Through the framework of activity theory, this thesis also examines how the students' goals may help explain their individual actions in the classroom, i.e., their attitudes and behaviours towards class activities, such as lectures, teaching methodology, course materials, tasks, teacher's feedback, and collaborative work. 
Further, the investigation explores the robustness of activity theory in explaining the students' performance and outcome. The investigation took place in an eight-week elective composition course (Composition 1) at an English language institute in the USA. The participants were nine adult intermediate second language learners from various backgrounds: Togo, South America, Central America, and Sweden. The Composition 1 course was specifically structured for the teaching of argumentative essay writing using the enhanced genre approach, that is, all tasks pertaining to Composition 1 formed part of the approach to second language writing instruction devised for this study. These included individual and pair-work tasks extracted from three "default" model texts organised according to the elements of Toulmin's model of argumentation, and the writing of short argumentative essays.
  • Item
    Introducing EFL speaking tests into a Japanese senior high school entrance examination
    Akiyama, Tomoyasu (2004)
    This thesis investigates the feasibility of introducing speaking tests into the existing English test of the senior high school entrance examination in Japan by employing Messick's (1989) validity framework. The study demonstrates that validity investigations need to include not only psychometric analysis but also a consideration of the competing values of stakeholders. The teaching guidelines for English issued by the Japanese Ministry of Education (1998) state that speaking is one of the most important skills for junior high school students. An entry decision to senior high school is based on both school-based assessment implemented by junior high school teachers and the existing external standardized English test. Despite the emphasis on the development of speaking skills, the existing English test does not include the assessment of speaking skills. There is a clear discrepancy between the aims of the guidelines and the skills tested in the entrance examination. A way to bridge this gap could be to introduce speaking tests into the English test of the senior high school entrance examination, a step that would necessitate considering the validity of such a test. The major issue of test validity relates to the meaning, relevance and utility of test scores as well as the value implications of test scores and the social consequences of test use (e.g. Messick, 1989; Bachman, 1990; McNamara, 2001). A questionnaire survey of teachers and students, and interviews with government officials and academics responsible for the test, were used to ascertain stakeholders' attitudes towards the introduction of speaking tests and their view of possible washback effects on the teaching of English (Study 1). 
In order to respond to concerns expressed by stakeholders in Study 1 about the reliability, validity and practicality of tests assessing oral skills, a possible oral skills component for the existing test was developed and trialled, and test scores were analysed, focusing on the practicality of the administration and the psychometric adequacy of the test, investigating student ability, raters, tasks and items via Rasch measurement (Study 2). Study 1 revealed that while most stakeholders were positive about the introduction of speaking tests, two stakeholder groups, the Education Board and senior high school teachers, were not. The former, the test developers, took a conservative approach in wanting to maintain the status quo, and the latter, the test administrators, were resistant to the introduction of speaking tests for complex reasons, both internal and external. The views held by these two stakeholder groups are major obstacles to introducing such a test. Preliminary findings from Study 2 showed that the speaking tests developed were psychometrically adequate to measure junior high school students' oral skills. This study demonstrates that careful consideration needs to be given to the possible psychological fear aroused in stakeholders by the changes that would occur if speaking tests were included in the senior high school entrance examination. These changes can also challenge the values that underpin the existing educational system, both at the institutional and individual level, with different groups of stakeholders holding competing values. Clearly, taking these values into account is important in investigating the feasibility of introducing speaking tests into the entrance examination, in that any future oral skills component would challenge the existing examination, which embodies ideological, political, and educational values; for, as Messick argues, validity needs to be viewed in terms that go beyond psychometric rigour. 
The thesis concludes with a discussion on the implications for validity theory and the development of language assessment policy.
  • Item
    The importance and effectiveness of moderation training on the reliability of teacher assessments of ESL writing samples
    McIntyre, Philip N. (1993)
    This thesis reports the findings of a study of the inter-rater reliability of assessment of ESL writing by teachers in the Australian Adult Migrant Education Program, using the ASLPR, a language proficiency scale used throughout the program. The study investigates the individual ratings assigned to 15 writing samples by 83 teachers, both before and after training aimed at moderation of raters' perceptions of descriptors in the scale by reference to features of other 'anchor' writing samples. The thesis argues the necessity for on-going training of assessors of ESL writing, at a time of change in the program from assessment of language proficiency to that of language competencies, since both forms of assessment increasingly have consequences which affect the lives of the candidates. The importance of, and necessity for, moderation training are established by reference to the problems of validity in the scale itself and in its use in the program, and by reference to the literature of assessor training and features of writing which influence rater judgements. The findings indicate that training is effective in substantially increasing the inter-rater reliability of the subjects, by reducing the range of levels assigned to the samples and increasing the percentages of ratings at the mode (most accurate) level and at the mode +/- 1 level (an allowance for 'error' due to the subjective nature of the assessment), after training. The paper concludes that on-going training is effective in achieving greater consensus, i.e. inter-rater reliability, amongst the assessors, but suggests that variability needs to be further reduced and offers suggestions for further research aimed at other assessors and variables.
  • Item
    The predictive validity of the IELTS and TOEFL: a comparison
    Broadstock, Harvey James (1994)
    This study compared two groups of overseas students who entered Melbourne and Monash universities in Melbourne in semester 1, 1993. One group entered on the basis of an IELTS score and the other group entered on the basis of a TOEFL score. Their academic performance at the end of semester 1, 1993 was compared. Predictive validity coefficients were also compared. Differences were minimal, with a slight tendency for the TOEFL to correlate more strongly than the IELTS with undergraduate academic performance. The assumption, made by admissions officers who use the two tests to make admissions decisions, that the two tests are equivalent in their predictive validity was not refuted.