Computing and Information Systems - Research Publications

Search Results

Structuring Documents Efficiently

MARSHALL, RGJ ; BIRD, SG ; STUCKEY, PJ (University of Sydney, 2005)
The ACL Anthology Reference Corpus: A Reference Dataset for Bibliographic Research in Computational Linguistics

Bird, S ; Dale, R ; Dorr, BJ ; Gibson, B ; Joseph, MT ; Kan, M-Y ; Lee, D ; Powley, B ; Radev, DR ; Tan, YF (EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA, 2008)
Defining a core body of knowledge for the introductory computational linguistics curriculum

BIRD, STEVEN (2008)

Discourse in and about computational linguistics depends on a shared body of knowledge. However, little content is shared across the introductory courses in this field. Instead, they typically cover a diverse assortment of topics tailored to the capabilities of the students and the interests of the instructor. If the core body of knowledge could be agreed and incorporated into introductory courses, several benefits would ensue, such as the proliferation of instructional materials, software support, and extension modules building on a common foundation. This paper argues that it is worthwhile to articulate a core body of knowledge, and proposes a starting point based on the ACM Computer Science Curriculum. A variety of issues specific to the multidisciplinary nature of computational linguistics are explored.
Multidisciplinary instruction with the Natural Language Toolkit

Bird, S ; Klein, E ; Loper, E ; Baldridge, J (Association for Computational Linguistics, 2008)
Graphical query for linguistic treebanks

BIRD, STEVEN ; Lee, Haejoong (2007)

Databases of hierarchically annotated text occupy a central place in linguistic research and language technology development. We describe a new approach to tree query which we call "Query by Annotation". Users express a query by annotating a tree, and the annotation is compiled into an expression in a path language. The result trees are overlaid with the original query, permitting the user to see why they match. Since queries and results are annotated trees, users can easily refine and resubmit their queries. The approach to Query by Annotation is motivated and exemplified using databases of linguistic trees, or treebanks.
Collecting low-density language materials on the Web

Baldwin, Timothy ; BIRD, STEPHEN ; HUGHES, BADEN (Southern Cross University, 2006)

Most web content exists in a few dozen languages. Hundreds of other languages, the 'low-density languages', are represented only in scarce quantities on the web. How can we locate, store and describe these low-density resources? In particular, how can we identify linguistically interesting resources, such as translation sets and multilingual documents? In this paper we describe ongoing research in which we integrate a number of discrete systems (language data crawler, automated metadata generation tools, language data repositories and federated search services) to address the identification, retrieval, description, storage and access issues for low-density language materials from the web.
Analysis and prediction of user behaviour in a museum environment

Grieser, Karl ; Baldwin, Timothy ; Bird, Steven (Australasian Language Technology Association, 2006)
Reconsidering language identification for written language resources

HUGHES, BADEN ; BALDWIN, TIMOTHY ; BIRD, STEVEN ; NICHOLSON, JEREMY ; MACKINLAY, ANDREW (European Language Resources Association, 2006)

The task of identifying the language in which a given document (ranging from a sentence to thousands of pages) is written has been relatively well studied over several decades. Automated approaches to written language identification are used widely throughout research and industrial contexts, over both oral and written source materials. Despite this widespread acceptance, a review of previous research in written language identification reveals a number of questions which remain open and ripe for further investigation.
Building a Search Engine to Drive Problem-Based Learning

BIRD, STEVEN ; Curran, James (ACM, 2006)

Search engines pervade the digital world, mediating most access to information instantaneously. We have found that students can build search engine components, and even entire search engines, in the context of problem-based learning in introductory and intermediate computer science courses. The courses cover a broad range of topics in algorithms, data structures, and web design, with a heavy emphasis on programming. Additionally, the internet is coupled with the syllabus at many places, from web design and HTML to graph algorithms and pattern matching. This connection enlivens the discussion of otherwise dry topics like searching, sorting, indexing and hashing. Moreover, the challenge of web-scale computing motivates the continuing students in their later study of formal topics like algorithmic complexity, while non-continuing students acquire transferable analytical skills. We report on the experience in search engine projects for driving problem-based learning in computer science courses, for both high school and university students. Our experience shows that such projects are effective in both introductory and intermediate courses, and readily encompass student groups with diverse programming abilities.
TalkBank: Building an open unified multimodal database of communicative interaction

MacWhinney, B ; Bird, S ; Cieri, C ; Martell, C (Evaluations and Language resources Distribution Agency, 2004-01-01)
