Computing and Information Systems - Research Publications

Search Results

Now showing 1 - 5 of 5

Querying linguistic trees

Lai, C ; Bird, S (Springer Science and Business Media LLC, 2010-01-01)
A Scalable Method for Preserving Oral Literature from Small Languages

Bird, S ; Chowdhury, G ; Koo, C ; Hunter, J (SPRINGER-VERLAG BERLIN, 2010-08-16)

Can the speakers of small languages, which may be remote, unwritten, and endangered, be trained to create an archival record of their oral literature, with only limited external support? This paper describes the model of "Basic Oral Language Documentation", as adapted for use in remote village locations, far from digital archives but close to endangered languages and cultures. Speakers of a small Papuan language were trained and observed during a six-week period. Linguistic performances were collected using digital voice recorders. Careful speech versions of selected items, together with spontaneous oral translations into a language of wider communication, were also recorded and curated. A smaller selection was transcribed. This paper describes the method, and shows how it is able to address linguistic, technological and sociological obstacles, and how it can be used to collect a sizeable corpus. We conclude that Basic Oral Language Documentation is a promising technique for expediting the task of preserving endangered linguistic heritage.
Natural Language Processing

Verspoor, K ; Cohen, KB (Springer Nature, 2013)
The Human Language Project: Building a Universal Corpus of the World's Languages

Abney, S ; Bird, S (ASSOC COMPUTATIONAL LINGUISTICS, 2010)
Fast query for large treebanks

GHODKE, SUMUKH ; BIRD, STEVEN (Association for Computational Linguistics, 2010)

A variety of query systems have been developed for interrogating parsed corpora, or treebanks. With the arrival of efficient, wide-coverage parsers, it is feasible to create very large databases of trees. However, existing approaches that use in-memory search, or relational or XML database technologies, do not scale up. We describe a method for storage, indexing, and query of treebanks that uses an information retrieval engine. Several experiments with a large treebank demonstrate excellent scaling characteristics for a wide range of query types. This work facilitates the curation of much larger treebanks, and enables them to be used effectively in a variety of scientific and engineering tasks.
