Computing and Information Systems - Theses

    XSLT as a linguistic query language
    Taylor, Claire Louise (2003-11)
    With the growing use of linguistic data, suitable storage techniques and query languages need to be developed. A traditional relational database management system is inappropriate for linguistic data, because such data typically has an associated structure that represents hierarchical or sequential relationships. Although there are many different forms of linguistic annotation, few query languages serve the data succinctly by providing the necessary features, such as data accessibility, transformation and integration. The current challenge facing the creators of linguistic corpora and the corresponding query languages is to find a query language that is expressive enough to support these features while still providing an interface that allows the corpus to be queried in terms of the user’s conceptual model. Previous work in this area has suggested that the hierarchical nature of XML is well suited to linguistic data and that an existing XML query language could be applied to linguistic queries. This thesis represented two linguistic corpora, TIMIT and the Penn Treebank, in XML. Two possible XML representations for TIMIT were explored to illustrate that a change in the structure of the data has a significant effect on the ease of writing queries for it. Queries were easier to write when the data structure was closely related to the user’s conceptual model of the data for the given query. It was concluded that the final XML representation for a given corpus would depend on the possible uses of the data.
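
    The point about representation and query difficulty can be illustrated with a minimal sketch. It is written in Python with the standard library's ElementTree rather than XSLT, and it is not taken from the thesis: the element names, attribute names, offsets, and both layouts are invented toy data, loosely modelled on a time-aligned speech corpus. The same question ("which phones belong to a given word?") is asked of a flat, time-aligned layout and of a nested layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical toy data: names and offsets are invented for illustration
# and are not the representations used in the thesis.

# Layout 1: words and phones are parallel tiers linked only by time offsets.
FLAT = """
<utterance>
  <words>
    <word start="0" end="20" text="she"/>
    <word start="20" end="50" text="had"/>
  </words>
  <phones>
    <phone start="0" end="10" label="sh"/>
    <phone start="10" end="20" label="iy"/>
    <phone start="20" end="30" label="hv"/>
    <phone start="30" end="40" label="ae"/>
    <phone start="40" end="50" label="d"/>
  </phones>
</utterance>
"""

# Layout 2: phones are nested inside the word they belong to.
NESTED = """
<utterance>
  <word text="she">
    <phone label="sh"/>
    <phone label="iy"/>
  </word>
  <word text="had">
    <phone label="hv"/>
    <phone label="ae"/>
    <phone label="d"/>
  </word>
</utterance>
"""


def phones_flat(xml_text, word):
    """Find a word's phones by comparing time spans across the two tiers."""
    root = ET.fromstring(xml_text)
    w = root.find(f"./words/word[@text='{word}']")
    if w is None:
        return []
    start, end = int(w.get("start")), int(w.get("end"))
    # A phone belongs to the word if its span lies inside the word's span.
    return [
        p.get("label")
        for p in root.iterfind("./phones/phone")
        if start <= int(p.get("start")) and int(p.get("end")) <= end
    ]


def phones_nested(xml_text, word):
    """Find a word's phones with a single path expression."""
    root = ET.fromstring(xml_text)
    return [p.get("label") for p in root.iterfind(f"./word[@text='{word}']/phone")]


if __name__ == "__main__":
    print(phones_flat(FLAT, "had"))
    print(phones_nested(NESTED, "had"))
```

    In this sketch the nested layout answers the question with one path expression because it mirrors the "words contain phones" view of the data, while the flat layout needs an explicit comparison of time spans. A query about timing alone would favour the flat layout instead, which is consistent with the conclusion that the preferred representation depends on the intended use.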