School of Languages and Linguistics - Research Publications

  • Item
    Voice quality in Australian English
    Loakes, D ; Gregory, A (Acoustical Society of America (ASA), 2022-08)
    This study is an acoustic investigation of voice quality in Australian English. The speech of 33 Indigenous Australians (Aboriginal English speakers) is compared to that of 28 Anglo Australians [Mainstream Australian English (MAE) speakers] from two rural locations in Victoria. Analysis of F0 and H1*-H2* reveals that pitch and voice quality differ significantly for male speakers according to dialect and for female speakers according to location. This study highlights previously undescribed phonetic and sociophonetic variability in voice quality in Australian English.
  • Item
    Does Automatic Speech Recognition (ASR) Have a Role in the Transcription of Indistinct Covert Recordings for Forensic Purposes?
    Loakes, D (FRONTIERS MEDIA SA, 2022-06-14)
    The transcription of covert recordings used as evidence in court is a huge issue for forensic linguistics. Covert recordings are typically made under conditions in which the device needs to be hidden, and so the resulting speech is generally indistinct, with overlapping voices and background noise, and in many cases the acoustic record cannot be analyzed via conventional phonetic techniques (i.e., phonetic segments are unclear, or there are no cues at all present acoustically). In the case of indistinct audio, the transcripts produced, often by police working on the case, are frequently questionable and, despite their unreliable nature, can be provided as evidence in court. Injustices can, and have, occurred. Given the growing performance of automatic speech recognition (ASR) technologies, and growing reliance on such technologies in everyday life, a common question asked, especially by lawyers and other legal professionals, is whether ASR can solve the problem of what was said in indistinct forensic audio, and this is the main focus of the current paper. The paper also looks at forced alignment, a way of automatically aligning an existing transcription to audio. This is an area that needs to be explored in the context of forensic linguistics because transcripts can technically be “aligned” with any audio, making it seem as if the transcript is “correct” even if it is not. The aim of this research is to demonstrate how automatic transcription systems fare using forensic-like audio, and with more than one system. Forensic-like audio is most appropriate for research, because there is greater certainty about what the speech material consists of (unlike in forensic situations where it cannot be verified). Examples of how various ASR systems cope with indistinct audio are shown, highlighting that when a good-quality recording is used ASR systems cope well, with the resulting transcript being usable and, for the most part, accurate. When a poor-quality, forensic-like recording is used, on the other hand, the resulting transcript is effectively unusable, with numerous errors and very few words recognized (and in some cases, no words recognized). The paper also demonstrates some of the problems that arise when forced alignment is used with indistinct forensic-like audio: the transcript is simply “forced” onto an audio signal, giving completely wrong alignment. This research shows that, as things currently stand, computational methods are not suitable for solving the issue of transcription of indistinct forensic audio for a range of reasons. Such systems cannot transcribe what was said in indistinct covert recordings, nor can they determine who uttered the words and phrases in such recordings, nor prove that a transcript is “right” (or wrong). These systems can indeed be used advantageously in research, and for various other purposes, and the reasons they do not work for forensic transcription stem from the nature of the recording conditions, as well as the nature of the forensic context.
  • Item
    Acoustic injustice: The experience of listening to indistinct covert recordings presented as evidence in court
    Fraser, H ; Loakes, D (University of Wollongong, 2020)
    Audio recorded by hidden listening devices can provide powerful evidence in criminal trials. Unfortunately these covert recordings are often indistinct, to the extent the court needs a transcript to understand the content. Australian law allows police to provide transcripts as ‘ad hoc experts’. Legal procedures incorporate safeguards intended to ensure the transcripts are not misleading. The problem is that these safeguards have been shown to be ineffective, with multiple examples of inaccurate transcripts being provided to ‘assist’ the jury in determining what is said and who is saying it. The present paper explains the problem, provides an accessible overview of the nature of speech and how speech perception works, and outlines the solution proposed by the Research Hub for Language in Forensic Evidence to the ‘acoustic injustice’ embodied in current legal procedures.
  • Item
    New insights into /el/-/æl/ merging in Australian English
    Schmidt, P ; Diskin-Holdaway, C ; Loakes, D (ROUTLEDGE JOURNALS, TAYLOR & FRANCIS LTD, 2021-01-02)
    A merger exists in Australian English in which /el/ is realized as [æl] for a number of speakers, particularly in Victoria. There have also been some observations of /æl/ raising to [el], termed “transposition”. Although thought to be characteristic of older speakers, empirical evidence for transposition is scant. Here we report the discovery of substantive degrees of merging in thirteen older speakers, aged between 51 and 80, from Ocean Grove, Victoria. Auditory and acoustic methods showed bidirectional vowel movement, with speakers converging on both the /æ/ and /e/ phonemes. Increasing velarization of the lateral has been posited as a factor in the development of the merger in Victoria, and thus /l/ quality was also investigated, with null results in terms of direct factors. The lateral, however, was shown to be dark in both syllable onset and coda positions, with evidence for /l/ being clearer in this age group when compared with younger speakers. Lexical frequency and orthography were also investigated as factors, the latter showing a significant effect and suggesting a role for velarization as a contrast maintenance strategy.
  • Item
    They Talk Muṯumuṯu: Variable Elision of Tense Suffixes in Contemporary Pitjantjatjara
    Wilmoth, S ; Defina, R ; Loakes, D (MDPI AG, 2021)
    Vowel elision is common in Pitjantjatjara and Yankunytjatjara connected speech. It also appears to be a locus of language change, with young people extending elision to new contexts, resulting in a distinctive style of speech which speakers refer to as muṯumuṯu (‘short’ speech). This study examines the productions of utterance-final past tense suffixes /-nu, -ɳu, -ŋu/ by four older and four younger Pitjantjatjara speakers in spontaneous speech. This is a context where elision tends not to be sociolinguistically or perceptually salient. We find extensive variance within and between speakers in the realization of both the vowel and nasal segments. We also find evidence of a change in progress, with a mixed effects model showing that among the older speakers, elision is associated with both the place of articulation of the nasal segment and the metrical structure of the verbal stem, while among the younger speakers, elision is associated with place of articulation but metrical structure plays little role. This is in line with a reanalysis of the conditions for elision by younger speakers based on the variability present in the speech of older people. Such a reanalysis would also account for many of the sociolinguistically marked extended contexts of elision.