School of Languages and Linguistics - Research Publications

  • Item
    Linguistic Data Management
    Thieberger, N ; Berez, AL ; Thieberger, N (Oxford University Press, 2012-09-18)
  • Item
    Documentation in practice: developing a linked media corpus of South Efate
    Thieberger, N (Hans Rausing Endangered Languages Project, School of Oriental and African Studies, University of London, 2004)
    There is a growing need for linguists working with endangered languages to provide documentation of those languages that serves two functions: not only the analysis and presentation of examples and texts, but also the means for accessing the material in the future. In this paper I describe a workflow for building documentation into a language description, developed in the course of writing a grammar of South Efate, an Oceanic language of Vanuatu, for a PhD dissertation. I suggest that, with appropriate tools, the effort of recording and transcribing documentary field recordings can result in a media corpus from which we can produce instant links between text and media, which in turn enriches our analysis. Further, these annotations are in an ideal form for archiving and for giving speakers of the language access to the data. I take it as axiomatic that we must archive our recordings and associated material and that this step is integral to the larger project of language documentation.
  • Item
    Building an interactive corpus of field recordings
    Thieberger, N (Paris: ELRA, 2004)
    There is a growing need for linguists working with small and endangered languages to provide documentation of those languages that serves two functions: not only the analysis and presentation of examples and texts, but also the means for others to access the material in the future. In this presentation I describe the workflow developed in the course of writing a description of South Efate, an Oceanic language of Vanuatu, for a PhD dissertation. This workflow steps through (i) field recording; (ii) digitising or capturing media data as citable objects for archival purposes; (iii) transcribing those objects with time-alignment; (iv) establishing a media corpus indexed by the transcript; (v) instantiating links between text and media using a purpose-built tool (Audiamus); (vi) exporting from Audiamus to interlinearise while maintaining timecodes; (vii) extracting citable example sentences for use in a grammatical description; (viii) exporting from Audiamus in XML, QuickTime or other formats.
  • Item
    Steps toward a grammar embedded in data
    Thieberger, N ; Epps, P ; Arkhipov, A (Walter de Gruyter, 2009-06-05)
    This volume continues the tradition of presenting the latest findings by typologists and field linguists, relevant to general linguistic theory and research methodology.