    Extraction of neologisms from Japanese corpora

    Author
    Breen, James
    Date
    2017
    Affiliation
    Computing and Information Systems
    Document Type
    PhD thesis
    Access Status
    Open Access
    URI
    http://hdl.handle.net/11343/211675
    Description

    © 2017 Dr. James Breen

    Abstract
    This thesis explores the application of natural-language processing techniques to the extraction of neologisms from Japanese corpora. The research aim was to establish techniques that can be developed and exploited to assist significantly in neologism extraction for compiling Japanese monolingual and bilingual dictionaries. The particular challenge of the task is the lack of word boundaries in Japanese text, which makes unrecorded words difficult to identify. Three broad approaches were explored, using a variety of language processing and artificial intelligence techniques and drawing on large-scale Japanese corpora and reference lexicons: (1) synthesis of possible Japanese words by mimicking Japanese morphological processes, followed by testing for the presence of candidate words in Japanese corpora; (2) analysis of morpheme sequences in Japanese texts to determine the presence of potential new or unrecorded terms; and (3) analysis of language patterns that are often used in Japanese in association with new and emerging terms. The research has identified a number of processes that can assist lexicographers in identifying unrecorded lexical items in Japanese texts.
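
    As a rough illustration of the first approach named in the abstract (synthesising candidate words, then testing for them in a corpus), the following minimal Python sketch may help. It is not taken from the thesis: the function names, the toy stems, suffix, lexicon, and corpus lines are all hypothetical, and plain substring counting stands in for whatever corpus lookup the author actually used.

```python
from collections import Counter
from itertools import product


def synthesize_candidates(stems, suffixes, lexicon):
    """Join every stem with every suffix and drop forms already in the lexicon.

    Hypothetical helper: a stand-in for "mimicking morphological processes".
    """
    generated = {stem + suffix for stem, suffix in product(stems, suffixes)}
    return sorted(generated - set(lexicon))


def count_in_corpus(candidates, corpus_lines):
    """Count occurrences of each candidate by substring search.

    Japanese text has no word boundaries, so this sketch matches raw
    substrings; candidates that never occur are left out of the result.
    """
    counts = Counter()
    for line in corpus_lines:
        for candidate in candidates:
            counts[candidate] += line.count(candidate)
    return {cand: n for cand, n in counts.items() if n > 0}


# Toy data, assumed purely for illustration.
stems = ["可視", "見える"]
suffixes = ["化"]
lexicon = {"可視化"}  # forms treated as already recorded
corpus = [
    "業務の見える化を進める。",
    "データの可視化と見える化は違う。",
]

candidates = synthesize_candidates(stems, suffixes, lexicon)
attested = count_in_corpus(candidates, corpus)
print(candidates)
print(attested)
```

    In this toy setup, a candidate that is absent from the lexicon but attested in the corpus would be passed to a lexicographer for review. Substring matching can over-count when a candidate happens to occur inside a longer word, so a realistic pipeline would need additional filtering.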
