Computing and Information Systems - Research Publications

  • Item
    Querying linguistic trees
    Lai, C ; Bird, S (Springer Science and Business Media LLC, 2010-01-01)
  • Item
    A Scalable Method for Preserving Oral Literature from Small Languages
    Bird, S ; Chowdhury, G ; Koo, C ; Hunter, J (SPRINGER-VERLAG BERLIN, 2010-08-16)
    Can the speakers of small languages, which may be remote, unwritten, and endangered, be trained to create an archival record of their oral literature, with only limited external support? This paper describes the model of "Basic Oral Language Documentation", as adapted for use in remote village locations, far from digital archives but close to endangered languages and cultures. Speakers of a small Papuan language were trained and observed during a six-week period. Linguistic performances were collected using digital voice recorders. Careful speech versions of selected items, together with spontaneous oral translations into a language of wider communication, were also recorded and curated. A smaller selection was transcribed. This paper describes the method, and shows how it is able to address linguistic, technological and sociological obstacles, and how it can be used to collect a sizeable corpus. We conclude that Basic Oral Language Documentation is a promising technique for expediting the task of preserving endangered linguistic heritage.
  • Item
    Natural Language Processing
    Verspoor, K ; Cohen, KB (Springer Nature, 2013)
  • Item
    The Human Language Project: Building a Universal Corpus of the World's Languages
    Abney, S ; Bird, S (ASSOC COMPUTATIONAL LINGUISTICS, 2010)
  • Item
    Fast query for large treebanks
    Ghodke, Sumukh ; Bird, Steven (Association for Computational Linguistics, 2010)
    A variety of query systems have been developed for interrogating parsed corpora, or treebanks. With the arrival of efficient, wide-coverage parsers, it is feasible to create very large databases of trees. However, existing approaches that use in-memory search, or relational or XML database technologies, do not scale up. We describe a method for storage, indexing, and query of treebanks that uses an information retrieval engine. Several experiments with a large treebank demonstrate excellent scaling characteristics for a wide range of query types. This work facilitates the curation of much larger treebanks, and enables them to be used effectively in a variety of scientific and engineering tasks.