University Library
  • Login
A gateway to Melbourne's research publications
Minerva Access is the University's Institutional Repository. It aims to collect, preserve, and showcase the intellectual output of staff and students of the University of Melbourne for a global audience.
View Item 
  • Minerva Access
  • Engineering
  • Computing and Information Systems
  • Computing and Information Systems - Research Publications
  • View Item
  • Minerva Access
  • Engineering
  • Computing and Information Systems
  • Computing and Information Systems - Research Publications
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

    Multi-field query expansion is effective for biomedical dataset retrieval

    Thumbnail
    Download
    published version (1.161Mb)

    Citations
    Scopus
    Altmetric
    7
    Author
    Bouadjenek, MR; Verspoor, K
    Date
    2017-09-07
    Source Title
    Database: the journal of biological databases and curation
    Publisher
    OXFORD UNIV PRESS
    University of Melbourne Author/s
    Verspoor, Cornelia; Bouadjenek, Mohamed Reda
    Affiliation
    Computing and Information Systems
    Metadata
    Show full item record
    Document Type
    Journal Article
    Citations
    Bouadjenek, M. R. & Verspoor, K. (2017). Multi-field query expansion is effective for biomedical dataset retrieval. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION, 2017, https://doi.org/10.1093/database/bax062.
    Access Status
    Open Access
    URI
    http://hdl.handle.net/11343/257401
    DOI
    10.1093/database/bax062
    ARC Grant code
    ARC/DP150101550
    Abstract
    In the context of the bioCADDIE challenge addressing information retrieval of biomedical datasets, we propose a method for retrieval of biomedical data sets with heterogenous schemas through query reformulation. In particular, the method proposed transforms the initial query into a multi-field query that is then enriched with terms that are likely to occur in the relevant datasets. We compare and evaluate two query expansion strategies, one based on the Rocchio method and another based on a biomedical lexicon. We then perform a comprehensive comparative evaluation of our method on the bioCADDIE dataset collection for biomedical retrieval. We demonstrate the effectiveness of our multi-field query method compared to two baselines, with MAP improved from 0.2171 and 0.2669 to 0.2996. We also show the benefits of query expansion, where the Rocchio expanstion method improves the MAP for our two baselines from 0.2171 and 0.2669 to 0.335. We show that the Rocchio query expansion method slightly outperforms the one based on the biomedical lexicon as a source of terms, with an improvement of roughly 3% for MAP. However, the query expansion method based on the biomedical lexicon is much less resource intensive since it does not require computation of any relevance feedback set or any initial execution of the query. Hence, in term of trade-off between efficiency, execution time and retrieval accuracy, we argue that the query expansion method based on the biomedical lexicon offers the best performance for a prototype biomedical data search engine intended to be used at a large scale. In the official bioCADDIE challenge results, although our approach is ranked seventh in terms of the infNDCG evaluation metric, it ranks second in term of P@10 and NDCG. Hence, the method proposed here provides overall good retrieval performance in relation to the approaches of other competitors. Consequently, the observations made in this paper should benefit the development of a Data Discovery Index prototype or the improvement of the existing one.

    Export Reference in RIS Format     

    Endnote

    • Click on "Export Reference in RIS Format" and choose "open with... Endnote".

    Refworks

    • Click on "Export Reference in RIS Format". Login to Refworks, go to References => Import References


    Collections
    • Minerva Elements Records [45770]
    • Computing and Information Systems - Research Publications [1456]
    Minerva AccessDepositing Your Work (for University of Melbourne Staff and Students)NewsFAQs

    BrowseCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects
    My AccountLoginRegister
    StatisticsMost Popular ItemsStatistics by CountryMost Popular Authors