Computing and Information Systems - Research Publications

  • Item
    Towards a General Model of Linguistic Paradigms
    Penton, D.; Bow, C.; Bird, S. G.; Hughes, B. (2004-07)
    Linguistic forms are inherently multi-dimensional. They exhibit a variety of phonological, orthographic, morphosyntactic, semantic and pragmatic properties. Accordingly, linguistic analysis involves multi-dimensional exploration, a process in which the same collection of forms is laid out in many ways until clear patterns emerge. Equally, language documentation usually contains tabulations of linguistic forms to illustrate systematic patterns and variations. In all such cases, multi-dimensional data is projected onto a two-dimensional table known as a linguistic paradigm, the most widespread format for linguistic data presentation. In this paper we survey a representative sample of paradigms and develop a simple relational data model. We show how XML technologies can be used to store and render paradigms. The result is a flexible and extensible model for the storage, interchange and delivery of linguistic paradigms.
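The projection of multi-dimensional forms onto a two-dimensional table can be illustrated with a minimal sketch. It assumes each form is stored as one record of a relation. The attribute names (`person`, `number`, `tense`), the `project` helper, and the Spanish sample data are illustrative assumptions and are not taken from the paper's model.

```python
from collections import defaultdict

# One record per form; attribute names are hypothetical, not the paper's schema.
FORMS = [
    {"form": "hablo",    "person": "1", "number": "sg", "tense": "present"},
    {"form": "hablas",   "person": "2", "number": "sg", "tense": "present"},
    {"form": "habla",    "person": "3", "number": "sg", "tense": "present"},
    {"form": "hablamos", "person": "1", "number": "pl", "tense": "present"},
    {"form": "habláis",  "person": "2", "number": "pl", "tense": "present"},
    {"form": "hablan",   "person": "3", "number": "pl", "tense": "present"},
    {"form": "hablé",    "person": "1", "number": "sg", "tense": "preterite"},
]


def project(records, row_attr, col_attr, cell_attr="form", **fixed):
    """Lay records out as a 2-D table, keeping only records that match `fixed`."""
    table = defaultdict(dict)
    for rec in records:
        if all(rec.get(key) == value for key, value in fixed.items()):
            # A cell holds a list because several forms may share one cell.
            table[rec[row_attr]].setdefault(rec[col_attr], []).append(rec[cell_attr])
    return {row: dict(cols) for row, cols in table.items()}


# Same data, two different layouts.
by_person_number = project(FORMS, "person", "number", tense="present")
by_tense_person = project(FORMS, "tense", "person", number="sg")
```

Choosing different row and column attributes re-projects the same records into a different paradigm, which is the kind of exploration the abstract describes.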
  • Item
    A Four-Level Model for Interlinear Text
    Bow, C.; Hughes, B.; Bird, S. G. (2003)
    Interlinear text has long been a valuable device in language documentation and linguistic description. However, the task of creating, editing and publishing interlinear text is an onerous one. Interlinear text is governed by simple rules, yet laborious manual formatting in a word processor is the norm. A handful of specialized software tools facilitate the creation of interlinear text, permitting customizable views and alignment to audio and video. However, word processors and specialized software alike fail to deliver on a key promise of digitization, namely reusability. In order to facilitate reusability, we have developed a general-purpose conceptual model of interlinear text consisting of four levels: text, phrase, word and morph. The details of the model are informed by our analysis of a representative sample of current practice. We have implemented the model using standard XML technologies.
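The four levels named in the abstract (text, phrase, word, morph) can be sketched as a nested data structure serialized to XML. This is only an illustration under stated assumptions: the class fields, the element and attribute names, and the English sample sentence are hypothetical and do not reproduce the authors' actual XML format.

```python
from dataclasses import dataclass, field
from typing import List
import xml.etree.ElementTree as ET


@dataclass
class Morph:
    form: str
    gloss: str


@dataclass
class Word:
    form: str
    morphs: List[Morph] = field(default_factory=list)


@dataclass
class Phrase:
    translation: str
    words: List[Word] = field(default_factory=list)


@dataclass
class Text:
    title: str
    phrases: List[Phrase] = field(default_factory=list)


def to_xml(text: Text) -> ET.Element:
    """Serialize the four levels as nested elements (element names are assumed)."""
    root = ET.Element("text", title=text.title)
    for phrase in text.phrases:
        p = ET.SubElement(root, "phrase", translation=phrase.translation)
        for word in phrase.words:
            w = ET.SubElement(p, "word", form=word.form)
            for morph in word.morphs:
                ET.SubElement(w, "morph", form=morph.form, gloss=morph.gloss)
    return root


# Invented sample data, for illustration only.
EXAMPLE = Text(
    title="sample",
    phrases=[
        Phrase(
            translation="Dogs barked.",
            words=[
                Word("dogs", [Morph("dog", "dog"), Morph("-s", "PL")]),
                Word("barked", [Morph("bark", "bark"), Morph("-ed", "PST")]),
            ],
        )
    ],
)

root = to_xml(EXAMPLE)
xml_string = ET.tostring(root, encoding="unicode")
```

Because the structure is stored separately from any layout, the same data could be rendered as aligned interlinear lines, exported, or queried, which is the reusability the abstract argues for.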