School of Languages and Linguistics - Research Publications

  • Item
    Building Speech Recognition Systems for Language Documentation: The CoEDL Endangered Language Pipeline and Inference System (ELPIS)
    Foley, B ; Arnold, J ; Coto-Solano, R ; Durantin, G ; Ellison, TM ; van Esch, D ; Heath, S ; Kratochvíl, F ; Maxwell-Smith, Z ; Nash, D ; Olsson, O ; Richards, M ; San, N ; Stoakes, H ; Thieberger, N ; Wiles, J (ISCA, 2018)
    Machine learning has revolutionized speech technologies for major world languages, but these technologies have generally not been available for the roughly 4,000 languages with populations of fewer than 10,000 speakers. This paper describes the development of ELPIS, a pipeline which language documentation workers with minimal computational experience can use to build their own speech recognition models, resulting in models being built for 16 languages from the Asia-Pacific region. ELPIS puts machine learning speech technologies within reach of people working with languages with scarce data, in a scalable way. This is impactful since it enables language communities to cross the digital divide, and speeds up language documentation. Complete automation of the process is not feasible for languages with small quantities of data and potentially large vocabularies. Hence our goal is not full automation, but rather to make a practical and effective workflow that integrates machine learning technologies.
  • Item
    Nasal aerodynamics and coarticulation in Bininj Kunwok: Smoothing Spline Analysis of Variance
    STOAKES, H ; Fletcher, J ; Butcher, AR ; Carignan, C ; Tyler, M (ASSTA, 2016-12-06)
    Nasal phonemes are well represented within the lexicon of Bininj Kunwok. This study examines intervocalic, word-medial nasals and reports patterns of coarticulation using a Smoothing Spline Analysis of Variance (SSANOVA). This allows for detailed comparisons of peak nasal airflow across six female speakers of the language. Results show that in a VNV sequence there is very little anticipatory vowel nasalisation and greater carryover into a following vowel. The maximum peak nasal flow is delayed for coronals when compared to the onset of oral closure in the nasal, indicating a delayed velum opening gesture. The velar place of articulation is the exception to this pattern, with some limited anticipatory nasalisation. The SSANOVA has been shown to be an appropriate technique for quantifying these patterns and dynamic speech data in general.
  • Item
    The Pacific Expansion: Optimizing phonetic transcription of archival corpora
    Billington, R ; Stoakes, H ; Thieberger, N (ISCA-INT SPEECH COMMUNICATION ASSOC, 2021-01-01)
    For most of the world’s languages, detailed phonetic analyses across different aspects of the sound system do not exist, due in part to limitations in available speech data and tools for efficiently processing such data for low-resource languages. Archival language documentation collections offer opportunities to extend the scope and scale of phonetic research on low-resource languages, and developments in methods for automatic recognition and alignment of speech facilitate the preparation of phonetic corpora based on these collections. We present a case study applying speech modelling and forced alignment methods to narrative data for Nafsan, an Oceanic language of central Vanuatu. We examine the accuracy of the forced-aligned phonetic labelling based on limited speech data used in the modelling process, and compare acoustic and durational measures of 17,851 vowel tokens for 11 speakers with previous experimental phonetic data for Nafsan. Results point to the suitability of archival data for large-scale studies of phonetic variation in low-resource languages, and also suggest that this approach can feasibly be used as a starting point in expanding to phonetic comparisons across closely-related Oceanic languages.
  • Item
    Scaling processes of clause chains in Pitjantjatjara
    Defina, R ; Torres, C ; Stoakes, H (Interspeech, 2020)
    Clause chains are a syntactic strategy for combining multiple clauses into a single unit. They are reported in many languages, including Korean and Turkish. However, they have seen relatively little focused research. In particular, prosodic features are often mentioned in descriptions of clause chaining, but very few studies have investigated them. Corpus-based studies of the prosody of clause chains in two unrelated languages of Papua New Guinea report that they are typically produced as a sequence of Intonation phrases united by pitch-scaling of the L% boundary tones in each clause, with only the final, finite clause descending to a full L%. The present study is the first experimental investigation of the prosody of clause chains in Pitjantjatjara. This paper focuses on one type of clause chain found in the Australian Indigenous language Pitjantjatjara. We examine a set of 120 clause chains read out by three native Pitjantjatjara speakers. Prosodic analysis reveals that these Pitjantjatjara clause chains are produced within a single Intonational Phrase. Speakers do not pause between the clauses in the chain; there is consistent linear downstep throughout the phrase, and phrase-final lowering occurs at the end of the utterance. This differs from previous impressionistic studies of the prosody of clause chains.
  • Item
    Building Speech Recognition Systems for Language Documentation: The CoEDL Endangered Language Pipeline and Inference System (ELPIS)
    Foley, B ; Arnold, J ; Coto-Solano, R ; Durantin, G ; Mark, E ; van Esch, D ; Heath, S ; Kratochvíl, F ; Maxwell-Smith, Z ; Nash, D ; Olsson, O ; Richards, M ; San, N ; Stoakes, H ; Thieberger, N ; Wiles, J (International Speech Communication Association, 2018-08-30)
    Machine learning has revolutionised speech technologies for major world languages, but these technologies have generally not been available for the roughly 4,000 languages with populations of fewer than 10,000 speakers. This paper describes the development of Elpis, a pipeline which language documentation workers with minimal computational experience can use to build their own speech recognition models, resulting in models being built for 16 languages from the Asia-Pacific region. Elpis puts machine learning speech technologies within reach of people working with languages with scarce data, in a scalable way. This is impactful since it enables language communities to cross the digital divide, and speeds up language documentation. Complete automation of the process is not feasible for languages with small quantities of data and potentially large vocabularies. Hence our goal is not full automation, but rather to make a practical and effective workflow that integrates machine learning technologies.
  • Item
    Intonational correlates of subject and object realisation in Mawng (Australian)
    FLETCHER, J ; Stoakes, H ; Singer, R ; Loakes, D ; BARNES, J ; VEILLEUX, N ; SHATTUCK-HUFNAGEL, S ; BRUGOS, A (ISCA, 2016)
    A range of intonational devices can be used in the grammar of information and corrective focus marking in languages with relatively free word order. In this paper we explore whether nouns in the Australian Indigenous language Mawng are realised differently depending on syntactic function and focus. Results show that the pitch level associated with Subjects is higher in conditions of corrective focus compared to other utterance contexts and there is a strong correlation between focus and utterance position. Placing a word in a corrective focus context does not appear to have an effect on word duration in this corpus confirming that pitch register variation and intonational phrasing are the major prosodic cues associated with corrective focus in Mawng.
  • Item
    Accentual prominence and consonant lengthening and strengthening in Mawng
    Fletcher, J ; Stoakes, H ; Loakes, D ; Singer, R ; Wolters, M ; Livingstone, J ; Beattie, B ; Smith, R ; MacMahon, M ; Stuart-Smith, J ; Scobbie, J (University of Glasgow, 2015)
  • Item
    SPECTRAL AND DURATIONAL PROPERTIES OF VOWELS IN KUNWINJKU
    FLETCHER, J ; STOAKES, HM ; LOAKES, D ; Butcher, (UNIVERSITY OF SAARLAND, 2007)