Computing and Information Systems - Research Publications

Search Results

Now showing 1 - 10 of 92
  • Item
    Focused Contrastive Loss for Classification With Pre-Trained Language Models
    He, J ; Li, Y ; Zhai, Z ; Fang, B ; Thorne, C ; Druckenbrodt, C ; Akhondi, S ; Verspoor, K (Institute of Electrical and Electronics Engineers (IEEE), 2023-01-01)
  • Item
    Attention-based multimodal fusion with contrast for robust clinical prediction in the face of missing modalities
    Liu, J ; Capurro, D ; Nguyen, A ; Verspoor, K (ACADEMIC PRESS INC ELSEVIER SCIENCE, 2023-09)
    OBJECTIVE: With the increasing amount and growing variety of healthcare data, multimodal machine learning supporting integrated modeling of structured and unstructured data is an increasingly important tool for clinical machine learning tasks. However, it is non-trivial to manage the differences in dimensionality, volume, and temporal characteristics of data modalities in the context of a shared target task. Furthermore, patients can have substantial variations in the availability of data, while existing multimodal modeling methods typically assume data completeness and lack a mechanism to handle missing modalities. METHODS: We propose a Transformer-based fusion model with modality-specific tokens that summarize the corresponding modalities to achieve effective cross-modal interaction accommodating missing modalities in the clinical context. The model is further refined by inter-modal, inter-sample contrastive learning to improve the representations for better predictive performance. We denote the model as Attention-based cRoss-MOdal fUsion with contRast (ARMOUR). We evaluate ARMOUR using two input modalities (structured measurements and unstructured text), six clinical prediction tasks, and two evaluation regimes, either including or excluding samples with missing modalities. RESULTS: Our model shows improved performance over unimodal or multimodal baselines in both evaluation regimes, including or excluding patients with missing modalities in the input. The contrastive learning improves the representation power and is shown to be essential for better results. The simple setup of modality-specific tokens enables ARMOUR to handle patients with missing modalities and allows comparison with existing unimodal benchmark results. CONCLUSION: We propose a multimodal model for robust clinical prediction to achieve improved performance while accommodating patients with missing modalities. This work could inspire future research to study the effective incorporation of multiple, more complex modalities of clinical data into a single model.
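The core idea of the abstract above, per-modality summaries fused by attention with missing modalities excluded rather than imputed, can be sketched in plain Python. This is an illustrative toy, not the authors' ARMOUR implementation; the `fuse` function, its norm-based scoring, and the example modality names are all invented for illustration, and the vector norm merely stands in for learned attention scores.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(modality_summaries):
    """Attention-style fusion over per-modality summary vectors.

    `modality_summaries` maps a modality name to its summary vector, or
    to None when that modality is missing for a patient. Missing
    modalities are simply dropped from the attention pool, mirroring how
    modality-specific summary tokens let a fusion model skip absent
    inputs instead of requiring complete data.
    """
    present = {k: v for k, v in modality_summaries.items() if v is not None}
    if not present:
        raise ValueError("patient has no observed modalities")
    names = list(present)
    vecs = [present[n] for n in names]
    # Vector norms stand in for learned attention scores in this sketch.
    scores = [math.sqrt(sum(x * x for x in v)) for v in vecs]
    weights = softmax(scores)
    dim = len(vecs[0])
    fused = [sum(w * v[i] for w, v in zip(weights, vecs)) for i in range(dim)]
    return names, weights, fused
```

A patient with only structured measurements still yields a fused representation: `fuse({"measurements": [1.0, 0.0], "text": None})` places all attention weight on the observed modality.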
  • Item
    Graph embedding-based link prediction for literature-based discovery in Alzheimer's Disease
    Pu, Y ; Beck, D ; Verspoor, K (ACADEMIC PRESS INC ELSEVIER SCIENCE, 2023-09)
    OBJECTIVE: We explore the framing of literature-based discovery (LBD) as link prediction and graph embedding learning, with Alzheimer's Disease (AD) as our focus disease context. The key link prediction setting of prediction window length is specifically examined in the context of a time-sliced evaluation methodology. METHODS: We propose a four-stage approach to explore literature-based discovery for Alzheimer's Disease, creating and analyzing a knowledge graph tailored to the AD context, and predicting and evaluating new knowledge based on time-sliced link prediction. The first stage is to collect an AD-specific corpus. The second stage involves constructing an AD knowledge graph with identified AD-specific concepts and relations from the corpus. In the third stage, 20 pairs of training and testing datasets are constructed with the time-slicing methodology. Finally, we infer new knowledge with graph embedding-based link prediction methods. We compare different link prediction methods in this context. The impact of limiting prediction evaluation of LBD models in the context of short-term and longer-term knowledge evolution for Alzheimer's Disease is assessed. RESULTS: We constructed an AD corpus of over 16 k papers published in 1977-2021, and automatically annotated it with concepts and relations covering 11 AD-specific semantic entity types. The knowledge graph of Alzheimer's Disease derived from this resource consisted of ∼11 k nodes and ∼394 k edges, among which 34% were genotype-phenotype relationships, 57% were genotype-genotype relationships, and 9% were phenotype-phenotype relationships. A Structural Deep Network Embedding (SDNE) model consistently showed the best performance in terms of returning the most confident set of link predictions as time progresses over 20 years. 
A substantial improvement in model performance was observed when the link prediction evaluation setting was changed to consider a more distant future, reflecting the time required for knowledge accumulation. CONCLUSION: Neural network graph-embedding link prediction methods show promise for the literature-based discovery context, although the prediction setting is extremely challenging, with graph densities of less than 1%. Varying the prediction window length in the time-sliced evaluation methodology leads to markedly different results and interpretations of LBD studies. Our approach can be generalized to enable knowledge discovery for other diseases. AVAILABILITY: Code, AD ontology, and data are available at https://github.com/READ-BioMed/readbiomed-lbd.
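The time-sliced evaluation setup described in the abstract above can be sketched in a few lines of Python: edges up to a cut year form the training graph, and pairs first co-occurring within a prediction window after the cut form the test set. The concept pairs below are invented placeholders, not findings from the study's AD knowledge graph.

```python
def time_sliced_split(edges, cut_year, window):
    """Split (node_a, node_b, year) edges for time-sliced link prediction.

    Training uses everything published up to `cut_year`; the test set
    contains pairs that first co-occur within `window` years after the
    cut. Longer windows give knowledge more time to accumulate, the
    setting the study finds strongly affects measured performance.
    """
    train = {(a, b) for a, b, y in edges if y <= cut_year}
    future = {(a, b) for a, b, y in edges
              if cut_year < y <= cut_year + window}
    # Only genuinely new pairs count as discoveries.
    test = future - train
    return train, test

# Hypothetical concept co-occurrences, for illustration only.
edges = [
    ("APOE", "amyloid", 1995),
    ("APOE", "tau", 2001),
    ("tau", "inflammation", 2006),
]
train, test = time_sliced_split(edges, cut_year=2000, window=10)
```

With `window=3` only the 2001 pair is a test target; widening the window to 10 adds the 2006 pair, illustrating how window length changes what counts as a successful prediction.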
  • Item
    Overcoming challenges in extracting prescribing habits from veterinary clinics using big data and deep learning
    Hur, B ; Hardefeldt, LY ; Verspoor, K ; Baldwin, T ; Gilkerson, JR (WILEY, 2022-05)
    Understanding antimicrobial usage patterns and encouraging appropriate antimicrobial usage is a critical component of antimicrobial stewardship. Studies using VetCompass Australia and Natural Language Processing (NLP) have demonstrated antimicrobial usage patterns in companion animal practices across Australia. Doing so has highlighted the many obstacles and barriers to the task of converting raw clinical notes into a format that can be readily queried and analysed. We developed NLP systems using rules-based algorithms and machine learning to automate the extraction of data describing the key elements needed to assess appropriate antimicrobial use: the clinical indication, antimicrobial agent selection, dose, and duration of therapy. Our methods were applied to over 4.4 million companion animal clinical records across Australia, covering all consultations with antimicrobial use, to help us understand, at a population level, which antibiotics are being given and why. Of these, only approximately 40% recorded the reason why antimicrobials were prescribed, along with the dose and duration of treatment. NLP and deep learning may be able to overcome the difficulties of harvesting free-text data from clinical records, but when the essential data are not recorded in the clinical records at all, this becomes an insurmountable obstacle.
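A rules-based extraction step of the kind the abstract above describes can be sketched with regular expressions. The patterns and the `extract_prescription` helper below are hypothetical simplifications, not the study's actual rule set, which would need to cover far more phrasing variants.

```python
import re

# Hypothetical patterns; real clinical notes need far broader coverage.
DOSE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mg/kg|mg)", re.IGNORECASE)
DURATION_RE = re.compile(r"(?:for\s+)?(\d+)\s*(day|week)s?", re.IGNORECASE)

def extract_prescription(note):
    """Pull dose and duration mentions out of a free-text clinical note.

    Returns None for fields the note never records, mirroring the ~60%
    of consultations where the reason, dose, or duration was absent and
    no NLP system can recover them.
    """
    dose = DOSE_RE.search(note)
    duration = DURATION_RE.search(note)
    return {
        "dose": dose.group(0) if dose else None,
        "duration": duration.group(0) if duration else None,
    }
```

The None-returning behaviour matters: when the essential data were never written down, extraction fails by design rather than by model error.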
  • Item
    Predicting Publication of Clinical Trials Using Structured and Unstructured Data: Model Development and Validation Study.
    Wang, S ; Šuster, S ; Baldwin, T ; Verspoor, K (JMIR Publications, 2022-12-23)
    BACKGROUND: Publication of registered clinical trials is a critical step in the timely dissemination of trial findings. However, a significant proportion of completed clinical trials are never published, motivating the need to analyze the factors behind success or failure to publish. This could inform study design, help regulatory decision-making, and improve resource allocation. It could also enhance our understanding of bias in the publication of trials and publication trends based on the research direction or strength of the findings. Although the publication of clinical trials has been addressed in several descriptive studies at an aggregate level, there is a lack of research on the predictive analysis of a trial's publishability given an individual (planned) clinical trial description. OBJECTIVE: We aimed to conduct a study that combined structured and unstructured features relevant to publication status in a single predictive approach. Established natural language processing techniques as well as recent pretrained language models enabled us to incorporate information from the textual descriptions of clinical trials into a machine learning approach. We were particularly interested in whether and which textual features could improve the classification accuracy for publication outcomes. METHODS: In this study, we used metadata from ClinicalTrials.gov (a registry of clinical trials) and MEDLINE (a database of academic journal articles) to build a data set of clinical trials (N=76,950) that contained the description of a registered trial and its publication outcome (27,702/76,950, 36% published and 49,248/76,950, 64% unpublished). This is the largest data set of its kind, which we released as part of this work. The publication outcome in the data set was identified from MEDLINE based on clinical trial identifiers. 
We carried out a descriptive analysis and predicted the publication outcome using 2 approaches: a neural network with a large domain-specific language model and a random forest classifier using a weighted bag-of-words representation of text. RESULTS: First, our analysis of the newly created data set corroborates several findings from the existing literature regarding attributes associated with a higher publication rate. Second, a crucial observation from our predictive modeling was that the addition of textual features (eg, eligibility criteria) offers consistent improvements over using only structured data (F1-score=0.62-0.64 vs F1-score=0.61 without textual features). Both pretrained language models and more basic word-based representations provide high-utility text representations, with no significant empirical difference between the two. CONCLUSIONS: Different factors affect the publication of a registered clinical trial. Our approach to predictive modeling combines heterogeneous features, both structured and unstructured. We show that methods from natural language processing can provide effective textual features to enable more accurate prediction of publication success, which has not been explored for this task previously.
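The weighted bag-of-words representation mentioned in the abstract above can be illustrated with a minimal TF-IDF computation in plain Python. This is a generic sketch, with no claim that it matches the authors' exact weighting scheme or tokenization.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Build weighted bag-of-words vectors for a small corpus.

    TF-IDF down-weights terms that appear in most trial descriptions
    and up-weights discriminative ones; vectors of this kind are the
    sort of text representation a random forest classifier consumes.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))  # document frequency: one count per document
    idf = {t: math.log(n_docs / df[t]) for t in df}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: tf[t] * idf[t] for t in tf})
    return vectors
```

A term occurring in every document (here "eligibility") gets zero weight, while document-specific terms keep positive weight, which is exactly the discriminative signal the classifier needs.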
  • Item
    Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study
    Suster, S ; Baldwin, T ; Lau, JH ; Yepes, AJ ; Iraola, DM ; Otmakhova, Y ; Verspoor, K (JMIR PUBLICATIONS, INC, 2023-03-13)
    BACKGROUND: Assessment of the quality of medical evidence available on the web is a critical step in the preparation of systematic reviews. Existing tools that automate parts of this task validate the quality of individual studies but not of entire bodies of evidence and focus on a restricted set of quality criteria. OBJECTIVE: We proposed a quality assessment task that provides an overall quality rating for each body of evidence (BoE), as well as finer-grained justification for different quality criteria according to the Grading of Recommendation, Assessment, Development, and Evaluation formalization framework. For this purpose, we constructed a new data set and developed a machine learning baseline system (EvidenceGRADEr). METHODS: We algorithmically extracted quality-related data from all summaries of findings found in the Cochrane Database of Systematic Reviews. Each BoE was defined by a set of population, intervention, comparison, and outcome criteria and assigned a quality grade (high, moderate, low, or very low) together with quality criteria (justification) that influenced that decision. Different statistical data, metadata about the review, and parts of the review text were extracted as support for grading each BoE. After pruning the resulting data set with various quality checks, we used it to train several neural-model variants. The predictions were compared against the labels originally assigned by the authors of the systematic reviews. RESULTS: Our quality assessment data set, Cochrane Database of Systematic Reviews Quality of Evidence, contains 13,440 instances, or BoEs labeled for quality, originating from 2252 systematic reviews published on the internet from 2002 to 2020. 
On the basis of a 10-fold cross-validation, the best neural binary classifiers for quality criteria detected risk of bias at 0.78 F1 (P=.68; R=0.92) and imprecision at 0.75 F1 (P=.66; R=0.86), while the performance on inconsistency, indirectness, and publication bias criteria was lower (F1 in the range of 0.3-0.4). The prediction of the overall quality grade into 1 of the 4 levels resulted in 0.5 F1. When casting the task as a binary problem by merging the Grading of Recommendation, Assessment, Development, and Evaluation classes (high+moderate vs low+very low-quality evidence), we attained 0.74 F1. We also found that the results varied depending on the supporting information that is provided as an input to the models. CONCLUSIONS: Different factors affect the quality of evidence in the context of systematic reviews of medical evidence. Some of these (risk of bias and imprecision) can be automated with reasonable accuracy. Other quality dimensions such as indirectness, inconsistency, and publication bias prove more challenging for machine learning, largely because they are much rarer. This technology could substantially reduce reviewer workload in the future and expedite quality assessment as part of evidence synthesis.
  • Item
    Detecting evidence of invasive fungal infections in cytology and histopathology reports enriched with concept-level annotations
    Rozova, V ; Khanina, A ; Teng, JC ; Teh, JSK ; Worth, LJ ; Slavin, MA ; Thursky, KA ; Verspoor, K (ACADEMIC PRESS INC ELSEVIER SCIENCE, 2023-03)
    Invasive fungal infections (IFIs) are particularly dangerous to high-risk patients with haematological malignancies and are responsible for excessive mortality and delays in cancer therapy. Surveillance of IFI in clinical settings offers an opportunity to identify potential risk factors and evaluate new therapeutic strategies. However, manual surveillance is both time- and resource-intensive. As part of a broader project aimed at developing a system for automated IFI surveillance by leveraging electronic medical records, we present our approach to detecting evidence of IFI in the key diagnostic domain of histopathology. Using natural language processing (NLP), we analysed cytology and histopathology reports to identify IFI-positive reports. We compared a conventional bag-of-words classification model to a method that relies on concept-level annotations. Although the investment required to prepare data supporting concept annotations is substantial, extracting targeted information specific to IFI as a pre-processing step increased the performance of the classifier from a PR AUC of 0.84 to 0.92 and enabled model interpretability. We have made publicly available the annotated dataset of 283 reports, the Cytology and Histopathology IFI Reports corpus (CHIFIR), to allow the clinical NLP research community to further build on our results.
  • Item
    Analysis of predictive performance and reliability of classifiers for quality assessment of medical evidence revealed important variation by medical area
    Suster, S ; Baldwin, T ; Verspoor, K (ELSEVIER SCIENCE INC, 2023-07)
    OBJECTIVES: A major obstacle to the deployment of models for automated quality assessment is their reliability. We analyze the calibration and selective classification performance of two such models. STUDY DESIGN AND SETTING: We examine two systems for assessing the quality of medical evidence, EvidenceGRADEr and RobotReviewer, both developed from the Cochrane Database of Systematic Reviews (CDSR), which measure the strength of bodies of evidence and the risk of bias (RoB) of individual studies, respectively. We report their calibration error and Brier scores, present their reliability diagrams, and analyze the risk-coverage trade-off in selective classification. RESULTS: The models are reasonably well calibrated on most quality criteria (expected calibration error [ECE] 0.04-0.09 for EvidenceGRADEr, 0.03-0.10 for RobotReviewer). However, we discover that both calibration and predictive performance vary significantly by medical area. This has ramifications for the application of such models in practice, as average performance is a poor indicator of group-level performance (e.g., health and safety at work, allergy and intolerance, and public health see much worse performance than cancer, pain and anesthesia, and neurology). We explore the reasons behind this disparity. CONCLUSION: Practitioners adopting automated quality assessment should expect large fluctuations in system reliability and predictive performance depending on the medical area. Prospective indicators of such behavior should be further researched.
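Expected calibration error, one of the reliability metrics the abstract above reports, is straightforward to compute. The equal-width binning below is the standard formulation, not necessarily the exact binning the study used.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error (ECE) over equal-width confidence bins.

    `confidences` holds the predicted probability of the chosen class
    and `correct` flags whether each prediction was right. ECE is the
    per-bin gap between average confidence and accuracy, weighted by
    how many predictions fall in each bin.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += (len(bucket) / total) * abs(avg_conf - accuracy)
    return ece
```

A model that is 90% confident but only 50% accurate in a bin contributes a 0.4 gap for that bin, which is exactly the over-confidence that makes average performance a poor guide for individual medical areas.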
  • Item
    Stratification of keratoconus progression using unsupervised machine learning analysis of tomographical parameters
    Cao, K ; Verspoor, K ; Chan, E ; Daniell, M ; Sahebjada, S ; Baird, PN (Elsevier BV, 2023-01-01)
    Purpose: This study aimed to stratify eyes with keratoconus (KC) based on longitudinal changes in all Pentacam parameters into clusters using unsupervised machine learning, with the broader objective of more clearly defining the characteristics of KC progression. Methods: A data-driven cluster analysis (hierarchical clustering) was undertaken on a retrospective cohort of 1017 KC eyes and 128 control eyes. Clusters were derived using 6-month tomographical change in individual eyes from analysis of the reduced dimensionality parameter space using all available Pentacam parameters (406 principal components). The optimal number of clusters was determined by the clustering's capacity to discriminate progression between KC and control eyes based on change across parameters. One-way ANOVA was used to compare parameters between inferred clusters. Complete Pentacam data changes at 6-, 12- and 18-month time points provided validation datasets to determine the generalizability of the clustering model. Results: We identified three clusters in KC progression patterns. Eyes designated within cluster 3 had the most rapidly changing tomographical parameters compared to eyes in either cluster 1 or 2. Eyes designated within cluster 1 reflected minimal changes in tomographical parameters, closest to the tomographical changes of control (non-KC) eyes. Thirty-nine corneal curvature parameters were identified and associated with these stratified clusters, with each of these parameters changing at significantly different rates between the three clusters. Similar clusters were identified at the 6-, 12- and 18-month follow-ups. Conclusions: The clustering model developed was able to automatically detect and categorize KC tomographical features into fast, slow, or limited change at different time points. This new KC stratification tool may provide an opportunity to apply a precision medicine approach to KC.
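Hierarchical (agglomerative) clustering of the kind used in the study above can be sketched in plain Python. This single-linkage toy on one-dimensional change rates is only illustrative: the study clustered 406 principal components of Pentacam parameter changes, and the sample values below are invented.

```python
def agglomerative_clusters(points, n_clusters):
    """Single-linkage agglomerative clustering on scalar change rates.

    Each point stands in for a per-eye summary of 6-month tomographical
    change; repeatedly merging the two closest clusters until
    `n_clusters` remain mimics hierarchical clustering cut at a chosen
    level to stratify progression speed.
    """
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between closest members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```

Cutting the hierarchy at three clusters separates slow, fast, and extreme change rates, paralleling the fast/slow/limited-change strata the study reports. This quadratic implementation is for clarity only; production code would use an optimized linkage routine.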
  • Item
    Propagation, detection and correction of errors using the sequence database network
    Goudey, B ; Geard, N ; Verspoor, K ; Zobel, J (OXFORD UNIV PRESS, 2022-11)
    Nucleotide and protein sequences stored in public databases are the cornerstone of many bioinformatics analyses. The records containing these sequences are prone to a wide range of errors, including incorrect functional annotation, sequence contamination and taxonomic misclassification. One source of information that can help to detect errors is the strong interdependency between records. Novel sequences in one database draw their annotations from existing records, may generate new records in multiple other locations and will have varying degrees of similarity with existing records across a range of attributes. A network perspective of these relationships between sequence records, within and across databases, offers new opportunities to detect, or even correct, erroneous entries and more broadly to make inferences about record quality. Here, we describe this novel perspective of sequence database records as a rich network, which we call the sequence database network, and illustrate the opportunities this perspective offers for quantification of database quality and detection of spurious entries. We provide an overview of the relevant databases and describe how the interdependencies between sequence records across these databases can be exploited by network analyses. We review the process of sequence annotation and provide a classification of sources of error, highlighting propagation as a major source. We illustrate the value of a network perspective through three case studies that use network analysis to detect errors, and explore the quality and quantity of critical relationships that would inform such network analyses. This systematic description of a network perspective of sequence database records provides a novel direction to combat the proliferation of errors within these critical bioinformatics resources.
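The propagation mechanism highlighted in the abstract above, where a record's annotations are drawn from existing records so an upstream error taints downstream entries, can be illustrated with a simple graph traversal. The record IDs and the `derived_from` structure below are hypothetical, not drawn from any real database.

```python
from collections import deque

def affected_records(derived_from, erroneous):
    """Find records whose annotations may be tainted by an upstream error.

    `derived_from` maps each record ID to the IDs it drew annotations
    from. A breadth-first walk over the inverted links collects every
    descendant of the erroneous record, showing how a single bad
    annotation can proliferate through the sequence database network.
    """
    # Invert the links: parent -> records that copied from it.
    children = {}
    for rec, parents in derived_from.items():
        for p in parents:
            children.setdefault(p, []).append(rec)
    tainted = set()
    queue = deque([erroneous])
    while queue:
        rec = queue.popleft()
        for child in children.get(rec, []):
            if child not in tainted:
                tainted.add(child)
                queue.append(child)
    return tainted
```

Correcting the error at its source record and re-running the traversal gives exactly the set of entries that need re-annotation, which is the practical payoff of making the derivation network explicit.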
    Nucleotide and protein sequences stored in public databases are the cornerstone of many bioinformatics analyses. The records containing these sequences are prone to a wide range of errors, including incorrect functional annotation, sequence contamination and taxonomic misclassification. One source of information that can help to detect errors are the strong interdependency between records. Novel sequences in one database draw their annotations from existing records, may generate new records in multiple other locations and will have varying degrees of similarity with existing records across a range of attributes. A network perspective of these relationships between sequence records, within and across databases, offers new opportunities to detect-or even correct-erroneous entries and more broadly to make inferences about record quality. Here, we describe this novel perspective of sequence database records as a rich network, which we call the sequence database network, and illustrate the opportunities this perspective offers for quantification of database quality and detection of spurious entries. We provide an overview of the relevant databases and describe how the interdependencies between sequence records across these databases can be exploited by network analyses. We review the process of sequence annotation and provide a classification of sources of error, highlighting propagation as a major source. We illustrate the value of a network perspective through three case studies that use network analysis to detect errors, and explore the quality and quantity of critical relationships that would inform such network analyses. This systematic description of a network perspective of sequence database records provides a novel direction to combat the proliferation of errors within these critical bioinformatics resources.