School of Languages and Linguistics - Research Publications

  • Item
    Nyingarn: Supporting Australian Indigenous languages from textual sources
    Thieberger, N ; Lewincamp, S ; Rosa, ML (IEEE, 2023-01-01)
  • Item
    Doing it for Ourselves: The New Archive Built by and Responsive to the Researcher
    Thieberger, N (Alliance of Digital Humanities Organizations, 2023)
    In this paper I address the following research questions in the context of having built a research data repository to safeguard cultural research data. How can the PARADISEC team ensure the records we create in the course of our research will exist into the future and remain citable? How can our research data be made available for a wider public, most importantly for the people recorded and their descendants? How can we prepare our students for this new approach to curation of primary research data so that they can build good methodology into their normal research practice, with much more productive outcomes?
  • Item
    Customary song in Christian clothing
    Thieberger, N ; Barwick, L (Presses universitaires de la Nouvelle‐Calédonie, 2023)
    In this paper, we illustrate the maintenance of archaic forms of Nafsan (a language spoken in Efate, Vanuatu) in song, and take one particular song as an example. Nafsan is known for having lost medial and final vowels in everyday language, but these can be, as in many languages, retained in song. One of the very few books written in Nafsan by Nafsan speakers was produced in 1983 in Port Vila (Wai et al.). It contains twelve stories, and ends with a cryptic inscription, M‐dd‐M‐dd‐ddl‐S‐dl‐s‐dd. All the stories were transcribed and translated as part of Thieberger’s research, but he was not sure what to do with this collection of letters. By chance, a copy of a hymnal on Lelepa island had the same cryptic letters that were evidently a form of musical notation known as solfa, Tonic Sol‐fa, or Solfege. Translations of Christian hymns into Nafsan were first made in the 1840s, but none of these hymnals includes solfa notation. As Stevens (2005) notes, solfa “often resulted in the emergence of a school of indigenous composers writing in Tonic Sol-fa notation and using the tonal harmonic style”. That is clearly the case in this Nafsan story. In this paper, we will look in more detail at the Ririal song, noting its archaic content. Early translations of hymns often maintain vowels that are now lost in Nafsan, and the same appears to be the case with the Ririal song. It is indicative of the syncretism with which Christianity has been received in Efate that a method of transcription originally intended to make Christian hymns more accessible has been adapted in a monolingual set of kastom stories to present a traditional song.
  • Item
    Hypothetically Speaking: Ethics in linguistic fieldwork, a provocation
    Musgrave, S ; Thieberger, N ; Derhemi, E ; Moseley, C (Routledge, 2023-03-06)
    Ethical issues are not always easily resolved. In the case of language documentation work, such issues require careful thought to ensure that all parties to a research process are informed and are able to participate equally, or to the level that they want, in the research process. While there is a considerable literature on ethics and fieldwork, here we present some of the issues in the form of an entertaining hypothetical discussion, presented as part of the social program at a conference of the Australian Linguistic Society with a cast who were given an outline of their roles, but not the scenarios that they would have to address in the course of the event. At the request of cast members, and in keeping with the topic, we did not record the presentation, but do offer the script here in the hope that it provides a less didactic coverage of some ethical issues than may be found elsewhere. We are pleased to be able to offer this chapter in celebration of Nick Ostler’s career and of his support for many language projects around the world. We hope this chapter’s entertainment can live up to Nick’s entertaining conversation in conference presentations and dinners.
  • Item
    Community-Led Documentation of Nafsan (Erakor, Vanuatu)
    Krajinovic, A ; Billington, R ; Emil, L ; Kaltapau, G ; Thieberger, N ; Vetulani, Z ; Paroubek, P ; Kubis, M (Springer International Publishing AG, 2022)
    We focus on a collaboration between community members and visiting linguists in Erakor, Vanuatu, aiming to build the capacity of community-based researchers to undertake and sustain documentation of Nafsan, the local indigenous language. We focus on the technical and procedural skills required to collect, manage, and work with audio and video data, and give an overview of the outcomes of community-led documentation after initial training. We discuss the benefits and challenges of this type of project from the perspective of the community researchers and the external linguists. We show that community-led documentation such as this project in Erakor, in which data management and archiving are incorporated into the documentation process, has crucial benefits for both the community and the linguists. The two most salient benefits are: a) long-term documentation of linguistic and cultural practices calibrated towards the community’s needs, and b) collection of larger quantities of data by community members, and often of better quality and scope than those collected by visiting linguists, which, besides being readily available for research, have a great potential for training and testing emerging language technologies for less-resourced languages, such as Automatic Speech Recognition (ASR).
  • Item
    Reflections on software and technology for language documentation
    Arkhipov, A ; Thieberger, N (2020-01-01)
    Technological developments in recent decades have enabled unprecedented growth in the volume and quality of collected language data. Emerging challenges include ensuring the longevity of the records and making them accessible and reusable for fellow researchers as well as for the speech communities. These records are robust research data on which verifiable claims can be based and on which future research can be built, and are the basis for revitalization of cultural practices, including language and music performance. Recording, storage and analysis technologies are becoming more lightweight and portable, allowing language speakers to actively participate in documentation activities. This also results in growing needs for training and support, and thus more interaction and collaboration between linguists, developers and speakers. Both cutting-edge speech technologies and crowdsourcing methods can be effectively used to overcome bottlenecks between different stages of analysis. While the endeavour to develop a single all-purpose integrated workbench for documentary linguists may not be achievable, investing in robust open interchange formats that can be accessed and enriched by independent pieces of software seems more promising for the near future.
  • Item
    When Your Data is My Grandparents Singing. Digitisation and Access for Cultural Records, the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC)
    Thieberger, N ; Harris, A (Ubiquity Press, Ltd., 2022-04-04)
    In this paper we discuss the Pacific and Regional Archive for Digital Sources in Endangered Cultures (PARADISEC), a research repository that explicitly aims to act as a conduit for research outputs to a range of audiences, both within and outside of academia. PARADISEC has been operating for 19 years, and has grown to hold over 390,000 files currently totaling 150 terabytes and representing 1,312 languages, many of them from Papua New Guinea and the Pacific. Our focus is on recordings and transcripts in the many small languages of the world, the songs and stories that are unique cultural expressions. While this research data is created for a particular project, it has huge value beyond academic research as it is typically oral tradition recorded in places where little else has been recorded. There is an increasing focus in academia on reproducible research and research data management, and repositories are the key to successful data management. We discuss the importance for research practice of having discipline-specific repositories. The data in our work is also cultural material that has value to the people recorded and their descendants; it is their grandparents, and so we, as outsider researchers, have special responsibilities to treat the materials with respect and to ensure they are accessible to the people we have worked with.
  • Item
    Digital curation and access to recordings of traditional cultural performance.
    Thieberger, N ; Harris, A (UNESCO, 2021)
    Being home to over a quarter of the world’s languages, the Pacific is a particularly good place to focus on how language records can be made accessible. The creation and description of research records has not always been a priority for humanities academics and any records that are created have typically not been provided with good archival solutions. This is despite these records often being of cultural or historical relevance beyond academia. Many cultural agencies struggle to keep track of recordings they have made, and it is the same for many researchers. Often it is only when researchers prepare recordings for archiving that they realize how many (or few) are described adequately, or have been transcribed or translated.
  • Item
    The Pacific Expansion: Optimizing phonetic transcription of archival corpora
    Billington, R ; Stoakes, H ; Thieberger, N (International Speech Communication Association (ISCA), 2021)
    For most of the world’s languages, detailed phonetic analyses across different aspects of the sound system do not exist, due in part to limitations in available speech data and tools for efficiently processing such data for low-resource languages. Archival language documentation collections offer opportunities to extend the scope and scale of phonetic research on low-resource languages, and developments in methods for automatic recognition and alignment of speech facilitate the preparation of phonetic corpora based on these collections. We present a case study applying speech modelling and forced alignment methods to narrative data for Nafsan, an Oceanic language of central Vanuatu. We examine the accuracy of the forced-aligned phonetic labelling based on limited speech data used in the modelling process, and compare acoustic and durational measures of 17,851 vowel tokens for 11 speakers with previous experimental phonetic data for Nafsan. Results point to the suitability of archival data for large-scale studies of phonetic variation in low-resource languages, and also suggest that this approach can feasibly be used as a starting point in expanding to phonetic comparisons across closely-related Oceanic languages.
  • Item
    Breathing digital life into Oceanic language corpora
    Vernaudon, J ; Thieberger, N ; Bambridge, T ; Parent, T (OpenEdition, 2021-01-01)