dc.contributor.author: Kuo, C-J
dc.contributor.author: Ling, MHT
dc.contributor.author: Lin, K-T
dc.contributor.author: Hsu, C-N
dc.date.accessioned: 2020-09-18T04:46:15Z
dc.date.available: 2020-09-18T04:46:15Z
dc.date.issued: 2009-01-01
dc.identifier: pii: 1471-2105-10-S15-S7
dc.identifier.citation: Kuo, C. -J., Ling, M. H. T., Lin, K. -T. & Hsu, C. -N. (2009). BIOADI: a machine learning approach to identifying abbreviations and definitions in biological literature. BMC BIOINFORMATICS, 10 (SUPPL. 15), https://doi.org/10.1186/1471-2105-10-S15-S7.
dc.identifier.issn: 1471-2105
dc.identifier.uri: http://hdl.handle.net/11343/242923
dc.description.abstract: BACKGROUND: Text mining tools are becoming essential for automatically processing large quantities of biological literature for knowledge discovery and information curation. Abbreviation recognition is related to named entity recognition (NER) and can be treated as the task of recognizing pairs, each consisting of a term and its corresponding abbreviation, in free text. Successfully identifying an abbreviation and its corresponding definition is not only a prerequisite for indexing the terms of text databases to retrieve articles of related interest, but also a building block for improving existing gene mention tagging and gene normalization tools. RESULTS: Our approach to abbreviation recognition (AR) is based on machine learning and exploits a novel set of rich features to learn rules from training data. Tested on the AB3P corpus, our system achieved an F-score of 89.90% with 95.86% precision at 84.64% recall, higher than the result achieved by the best-performing existing AR system. We also annotated a new corpus of 1200 PubMed abstracts derived from the BioCreative II gene normalization corpus. On our annotated corpus, our system achieved an F-score of 86.20% with 93.52% precision at 79.95% recall, which also outperforms all tested systems. CONCLUSION: By applying our system to extract all short form-long form pairs from all available PubMed abstracts, we have constructed BIOADI. Mining BIOADI reveals many interesting trends in biomedical research. We also provide offline AR software in the download section at http://bioagent.iis.sinica.edu.tw/BIOADI/.
dc.language: English
dc.publisher: BMC
dc.title: BIOADI: a machine learning approach to identifying abbreviations and definitions in biological literature
dc.type: Journal Article
dc.identifier.doi: 10.1186/1471-2105-10-S15-S7
melbourne.affiliation.department: Chancellery Research
melbourne.source.title: BMC BIOINFORMATICS
melbourne.source.volume: 10
melbourne.source.issue: SUPPL. 15
dc.rights.license: CC BY
melbourne.elementsid: 317130
melbourne.contributor.author: LING, HAN
dc.identifier.eissn: 1471-2105
melbourne.accessrights: Open Access
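
Note on the reported scores: the F-scores quoted in the abstract are consistent with the balanced F1 measure, the harmonic mean of precision and recall. The abstract does not say which F-measure variant was used, so F1 is an assumption here, and the function name `f_score` is purely illustrative. A minimal sketch of the arithmetic:

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall.

    Assumption: the abstract's "F-score" is the balanced F1 measure.
    Inputs and output are percentages, as in the abstract.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall figures as quoted in the abstract.
ab3p = f_score(95.86, 84.64)       # AB3P corpus
annotated = f_score(93.52, 79.95)  # 1200-abstract annotated corpus

print(f"AB3P corpus:      {ab3p:.2f}%")       # prints 89.90%
print(f"Annotated corpus: {annotated:.2f}%")  # prints 86.20%
```

Both results match the reported values of 89.90% and 86.20% to two decimal places.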