    GeneRIF indexing: sentence selection based on machine learning

    Author
    Jimeno-Yepes, AJ; Sticco, JC; Mork, JG; Aronson, AR
    Date
    2013-05-31
    Source Title
    BMC Bioinformatics
    Publisher
    BMC
    University of Melbourne Author/s
    Jimeno Yepes, Antonio
    Affiliation
    Computing and Information Systems
    Document Type
    Journal Article
    Citations
    Jimeno-Yepes, A. J., Sticco, J. C., Mork, J. G. & Aronson, A. R. (2013). GeneRIF indexing: sentence selection based on machine learning. BMC Bioinformatics, 14(1). https://doi.org/10.1186/1471-2105-14-171
    Access Status
    Open Access
    URI
    http://hdl.handle.net/11343/259046
    DOI
    10.1186/1471-2105-14-171
    Abstract
    BACKGROUND: A Gene Reference Into Function (GeneRIF) describes novel functionality of a gene. GeneRIFs are available from the National Center for Biotechnology Information (NCBI) Gene database. GeneRIF indexing is performed manually, and our work aims to provide methods that support the creation of GeneRIF entries. Creating a GeneRIF entry involves identifying the genes mentioned in MEDLINE® citations and the sentences that describe a novel function.

    RESULTS: We compared several learning algorithms and several features extracted or derived from MEDLINE sentences to determine whether a sentence should be selected for GeneRIF indexing. Features are taken directly from the sentences or obtained through mechanisms that augment the information they provide, for example by assigning a discourse label with a previously trained model. We show that machine learning approaches with specific feature combinations achieve results close to those of one of the annotators. In particular, Naïve Bayes achieves better performance with a selection of features similar to one used in related work, which considers the location of the sentence, the discourse of the sentence and the functional terminology in it.

    CONCLUSIONS: Current performance is at a level similar to human annotation, which shows that machine learning can be used to automate sentence selection for GeneRIF annotation. The current experiments are limited to the human species. We would like to see how the methodology can be extended to other species, specifically with regard to the normalization of gene mentions in other species.
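
    The abstract names three kinds of features (sentence location, discourse label, functional terminology) and a Naïve Bayes classifier, but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' system: the cue-word list, the feature names, the toy training sentences and the GeneA/GeneB/GeneC placeholders are all invented for illustration, and the discourse labels are supplied by hand where the paper uses a previously trained model.

```python
import math
from collections import defaultdict

# Hypothetical cue words for "functional terminology"; not taken from the paper.
FUNCTIONAL_TERMS = {"regulates", "inhibits", "activates", "binds", "mediates"}


def extract_features(sentence, position, total, discourse_label):
    """Map one sentence to a set of binary feature names (all names are assumptions)."""
    tokens = {t.strip(".,;:()").lower() for t in sentence.split()}
    features = {"discourse=" + discourse_label}
    if position == 0:
        features.add("loc=title")
    elif position == total - 1:
        features.add("loc=last")
    else:
        features.add("loc=middle")
    if tokens & FUNCTIONAL_TERMS:
        features.add("has_functional_term")
    return features


class BernoulliNB:
    """Tiny Bernoulli Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, feature_sets, labels):
        self.vocab = sorted(set().union(*feature_sets))
        self.classes = sorted(set(labels))
        self.n = len(labels)
        self.class_count = {c: 0 for c in self.classes}
        self.feat_count = {c: defaultdict(int) for c in self.classes}
        for feats, label in zip(feature_sets, labels):
            self.class_count[label] += 1
            for f in feats:
                self.feat_count[label][f] += 1
        return self

    def log_scores(self, feats):
        scores = {}
        for c in self.classes:
            score = math.log(self.class_count[c] / self.n)  # class prior
            for f in self.vocab:
                p = (self.feat_count[c][f] + 1) / (self.class_count[c] + 2)
                score += math.log(p if f in feats else 1.0 - p)
            scores[c] = score
        return scores

    def predict(self, feats):
        scores = self.log_scores(feats)
        return max(scores, key=scores.get)


# Invented toy data: (sentence, position, total sentences, discourse label, selected?)
TOY = [
    ("GeneA regulates a repair pathway.", 0, 5, "TITLE", True),
    ("This disease is common worldwide.", 1, 5, "BACKGROUND", False),
    ("We used blotting on cell lysates.", 2, 5, "METHODS", False),
    ("Samples were collected from volunteers.", 3, 5, "METHODS", False),
    ("These data show that GeneA activates the pathway.", 4, 5, "CONCLUSIONS", True),
    ("GeneB inhibits growth in a cell model.", 0, 4, "TITLE", True),
    ("Little is known about this pathway.", 1, 4, "BACKGROUND", False),
    ("Our results suggest GeneB mediates the response.", 3, 4, "CONCLUSIONS", True),
]

model = BernoulliNB().fit(
    [extract_features(s, pos, total, label) for s, pos, total, label, _ in TOY],
    [selected for *_, selected in TOY],
)


def select(sentence, position, total, discourse_label):
    """Return True if the toy model would pick this sentence as a GeneRIF candidate."""
    return model.predict(extract_features(sentence, position, total, discourse_label))


print(select("GeneC binds the receptor and activates signalling.", 4, 5, "CONCLUSIONS"))
print(select("Cells were cultured for two days.", 2, 5, "METHODS"))
```

    With this toy data the first call selects the concluding sentence and the second rejects the methods sentence. Any real use would need an annotated GeneRIF corpus and the feature definitions given in the paper itself.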


