dc.contributor.author: Jimeno-Yepes, AJ
dc.contributor.author: Plaza, L
dc.contributor.author: Mork, JG
dc.contributor.author: Aronson, AR
dc.contributor.author: Diaz, A
dc.date.accessioned: 2021-02-04T00:11:01Z
dc.date.available: 2021-02-04T00:11:01Z
dc.date.issued: 2013-06-26
dc.identifier: pii: 1471-2105-14-208
dc.identifier.citation: Jimeno-Yepes, A. J., Plaza, L., Mork, J. G., Aronson, A. R. & Diaz, A. (2013). MeSH indexing based on automatically generated summaries. BMC BIOINFORMATICS, 14 (1), https://doi.org/10.1186/1471-2105-14-208.
dc.identifier.issn: 1471-2105
dc.identifier.uri: http://hdl.handle.net/11343/259048
dc.description.abstract: BACKGROUND: MEDLINE citations are manually indexed at the U.S. National Library of Medicine (NLM) using as reference the Medical Subject Headings (MeSH) controlled vocabulary. For this task, the human indexers read the full text of the article. Due to the growth of MEDLINE, the NLM Indexing Initiative explores indexing methodologies that can support the task of the indexers. Medical Text Indexer (MTI) is a tool developed by the NLM Indexing Initiative to provide MeSH indexing recommendations to indexers. Currently, the input to MTI is MEDLINE citations, title and abstract only. Previous work has shown that using full text as input to MTI increases recall, but decreases precision sharply. We propose using summaries generated automatically from the full text for the input to MTI to use in the task of suggesting MeSH headings to indexers. Summaries distill the most salient information from the full text, which might increase the coverage of automatic indexing approaches based on MEDLINE. We hypothesize that if the results were good enough, manual indexers could possibly use automatic summaries instead of the full texts, along with the recommendations of MTI, to speed up the process while maintaining high quality of indexing results. RESULTS: We have generated summaries of different lengths using two different summarizers, and evaluated the MTI indexing on the summaries using different algorithms: MTI, individual MTI components, and machine learning. The results are compared to those of full text articles and MEDLINE citations. Our results show that automatically generated summaries achieve similar recall but higher precision compared to full text articles. Compared to MEDLINE citations, summaries achieve higher recall but lower precision. CONCLUSIONS: Our results show that automatic summaries produce better indexing than full text articles. Summaries produce similar recall to full text but much better precision, which seems to indicate that automatic summaries can efficiently capture the most important contents within the original articles. The combination of MEDLINE citations and automatically generated summaries could improve the recommendations suggested by MTI. On the other hand, indexing performance might be dependent on the MeSH heading being indexed. Summarization techniques could thus be considered as a feature selection algorithm that might have to be tuned individually for each MeSH heading.
dc.language: English
dc.publisher: BMC
dc.rights.uri: https://creativecommons.org/licenses/by/4.0
dc.title: MeSH indexing based on automatically generated summaries
dc.type: Journal Article
dc.identifier.doi: 10.1186/1471-2105-14-208
melbourne.affiliation.department: Computing and Information Systems
melbourne.affiliation.faculty: Engineering and Information Technology
melbourne.source.title: BMC Bioinformatics
melbourne.source.volume: 14
melbourne.source.issue: 1
dc.rights.license: CC BY
melbourne.elementsid: 1197639
melbourne.contributor.author: Jimeno Yepes, Antonio
dc.identifier.eissn: 1471-2105
melbourne.accessrights: Open Access

