Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text
Author: South, BR; Mowery, D; Suo, Y; Leng, J; Ferrandez, O; Meystre, SM; Chapman, WW
Source Title: Journal of Biomedical Informatics
Publisher: ACADEMIC PRESS INC ELSEVIER SCIENCE
University of Melbourne Author/s: Chapman, Wendy
Affiliation: Medicine Dentistry & Health Sciences
Document Type: Journal Article
Citations: South, B. R., Mowery, D., Suo, Y., Leng, J., Ferrandez, O., Meystre, S. M. & Chapman, W. W. (2014). Evaluating the effects of machine pre-annotation and an interactive annotation interface on manual de-identification of clinical text. Journal of Biomedical Informatics, 50, pp.162-172. https://doi.org/10.1016/j.jbi.2014.05.002.
Access Status: Open Access
Under the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor method, 18 types of protected health information (PHI) must be removed from clinical documents before they can be considered "de-identified" and used for research purposes. Human review of PHI elements in a large corpus of clinical documents can be tedious and error-prone. Indeed, multiple annotators may be required to consistently redact information that represents each PHI class. Automated de-identification has the potential to improve annotation quality and reduce annotation time. One approach is machine-assisted annotation, which combines de-identification system outputs used as pre-annotations with an interactive annotation interface, so that annotators "curate" existing PHI annotations rather than annotating raw clinical documents from "scratch". To assess whether machine-assisted annotation improves the reliability and accuracy of the reference standard and reduces annotation effort, we conducted an annotation experiment. In this study, we assessed the generalizability of the VA Consortium for Healthcare Informatics Research (CHIR) annotation schema and guidelines applied to a corpus of publicly available clinical documents called MTSamples. Specifically, our goals were to (1) characterize a heterogeneous corpus of clinical documents manually annotated for risk-ranked PHI and other annotation types (clinical eponyms and person relations), (2) evaluate how well annotators apply the CHIR schema to the heterogeneous corpus, (3) determine whether machine-assisted annotation (experiment) improves annotation quality and reduces annotation time compared to manual annotation (control), and (4) assess the change in quality of reference standard coverage with each added annotator's annotations.