Computing and Information Systems - Research Publications

Search Results

Now showing 1 - 10 of 75
  • Item
    Disease progression modelling of Alzheimer's disease using probabilistic principal components analysis
    Saint-Jalmes, M ; Fedyashov, V ; Beck, D ; Baldwin, T ; Faux, NG ; Bourgeat, P ; Fripp, J ; Masters, CL ; Goudey, B (ACADEMIC PRESS INC ELSEVIER SCIENCE, 2023-09)
The recent biological redefinition of Alzheimer's Disease (AD) has spurred the development of statistical models that relate changes in biomarkers to neurodegeneration and the worsening condition linked to AD. The ability to measure such changes may facilitate earlier diagnosis for affected individuals and help in monitoring the evolution of their condition. Amongst such statistical tools, disease progression models (DPMs) are quantitative, data-driven methods that specifically attempt to describe the temporal dynamics of biomarkers relevant to AD. Due to the heterogeneous nature of this disease, with patients of similar age experiencing different AD-related changes, a challenge facing longitudinal mixed-effects-based DPMs is the estimation of the time-shifts that realign patients on a common disease timeline. These time-shifts are indispensable for meaningful biomarker modelling, but in jointly estimated models they may increase fitting time or vary with missing data. In this work, we estimate an individual's progression through Alzheimer's disease by combining multiple biomarkers into a single value using a probabilistic formulation of principal components analysis. Our results show that this variable, which summarises AD through observable biomarkers, is remarkably similar to jointly estimated time-shifts when our scores are computed for the baseline visit on cross-sectional data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Reproducing the expected properties of clinical datasets, we confirm that the estimated scores are robust to missing data or unavailable biomarkers. Beyond cross-sectional insights, the latent variable can be modelled as an individual progression score by repeating the estimation at follow-up examinations and refining long-term estimates as more data are gathered, which would be ideal in a clinical setting. Finally, we verify that our score can be used as a pseudo-temporal scale in place of age, factoring out some patient heterogeneity in cohort data and highlighting the general trend in expected biomarker evolution in affected individuals.
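The abstract above compresses several biomarkers into a single latent progression score via a probabilistic formulation of principal components analysis. As a rough illustration of the core idea only, the sketch below projects standardized, synthetic biomarker values onto a first principal component using ordinary (non-probabilistic) PCA; the biomarker matrix is invented, and unlike the paper's probabilistic formulation this simplification does not handle missing values.

```python
# Minimal sketch: summarising several biomarkers into one progression-like
# score via the first principal component. Synthetic data; the paper's
# probabilistic PCA additionally copes with missing biomarkers.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical biomarker matrix: rows = patients, columns = biomarkers
# (e.g. a cognitive score, a CSF measure, an imaging volume).
X = rng.normal(size=(200, 3))
X[:, 1] += 0.8 * X[:, 0]  # induce correlation so the first component is meaningful
X[:, 2] += 0.5 * X[:, 0]

X_std = StandardScaler().fit_transform(X)

pca = PCA(n_components=1)
score = pca.fit_transform(X_std).ravel()  # one scalar "progression" value per patient

print("explained variance ratio:", pca.explained_variance_ratio_[0])
print("first five scores:", np.round(score[:5], 3))
```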
  • Item
    Overcoming challenges in extracting prescribing habits from veterinary clinics using big data and deep learning
    Hur, B ; Hardefeldt, LY ; Verspoor, K ; Baldwin, T ; Gilkerson, JR (WILEY, 2022-05)
Understanding antimicrobial usage patterns and encouraging appropriate antimicrobial usage is a critical component of antimicrobial stewardship. Studies using VetCompass Australia and Natural Language Processing (NLP) have demonstrated antimicrobial usage patterns in companion animal practices across Australia, and in doing so have highlighted the many obstacles and barriers to converting raw clinical notes into a format that can be readily queried and analysed. We developed NLP systems using rules-based algorithms and machine learning to automate the extraction of the key elements needed to assess appropriate antimicrobial use: the clinical indication, antimicrobial agent selection, dose, and duration of therapy. Our methods were applied to over 4.4 million companion animal clinical records across Australia, covering all consultations with antimicrobial use, to help us understand, at a population level, which antibiotics are being given and why. Of these records, only approximately 40% documented the reason why antimicrobials were prescribed, along with the dose and duration of treatment. NLP and deep learning might be able to overcome the difficulties of harvesting free-text data from clinical records, but when the essential data are never recorded in the first place, this becomes an insurmountable obstacle.
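As a toy illustration of the rules-based half of such a pipeline, the sketch below pulls an agent, dose, and duration out of a single invented consultation note with regular expressions; the patterns, lexicon, and note are all hypothetical and far simpler than a production system.

```python
# Toy rule-based extraction of antimicrobial agent, dose and duration from
# a free-text veterinary note. Patterns, lexicon and note are illustrative.
import re

NOTE = "Rx: amoxicillin 250 mg PO BID for 7 days for suspected UTI."

DOSE_RE = re.compile(r"(?P<value>\d+(?:\.\d+)?)\s*(?P<unit>mg/kg|mg|g)", re.I)
DURATION_RE = re.compile(r"for\s+(?P<days>\d+)\s+days?", re.I)
AGENTS = ("amoxicillin", "cephalexin", "doxycycline")  # tiny illustrative lexicon

def extract(note: str) -> dict:
    agent = next((a for a in AGENTS if a in note.lower()), None)
    dose = DOSE_RE.search(note)
    duration = DURATION_RE.search(note)
    return {
        "agent": agent,
        "dose": f"{dose['value']} {dose['unit']}" if dose else None,
        "duration_days": int(duration["days"]) if duration else None,
    }

print(extract(NOTE))
# {'agent': 'amoxicillin', 'dose': '250 mg', 'duration_days': 7}
```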
  • Item
    Predicting Publication of Clinical Trials Using Structured and Unstructured Data: Model Development and Validation Study.
    Wang, S ; Šuster, S ; Baldwin, T ; Verspoor, K (JMIR Publications, 2022-12-23)
    BACKGROUND: Publication of registered clinical trials is a critical step in the timely dissemination of trial findings. However, a significant proportion of completed clinical trials are never published, motivating the need to analyze the factors behind success or failure to publish. This could inform study design, help regulatory decision-making, and improve resource allocation. It could also enhance our understanding of bias in the publication of trials and publication trends based on the research direction or strength of the findings. Although the publication of clinical trials has been addressed in several descriptive studies at an aggregate level, there is a lack of research on the predictive analysis of a trial's publishability given an individual (planned) clinical trial description. OBJECTIVE: We aimed to conduct a study that combined structured and unstructured features relevant to publication status in a single predictive approach. Established natural language processing techniques as well as recent pretrained language models enabled us to incorporate information from the textual descriptions of clinical trials into a machine learning approach. We were particularly interested in whether and which textual features could improve the classification accuracy for publication outcomes. METHODS: In this study, we used metadata from ClinicalTrials.gov (a registry of clinical trials) and MEDLINE (a database of academic journal articles) to build a data set of clinical trials (N=76,950) that contained the description of a registered trial and its publication outcome (27,702/76,950, 36% published and 49,248/76,950, 64% unpublished). This is the largest data set of its kind, which we released as part of this work. The publication outcome in the data set was identified from MEDLINE based on clinical trial identifiers. We carried out a descriptive analysis and predicted the publication outcome using 2 approaches: a neural network with a large domain-specific language model and a random forest classifier using a weighted bag-of-words representation of text. RESULTS: First, our analysis of the newly created data set corroborates several findings from the existing literature regarding attributes associated with a higher publication rate. Second, a crucial observation from our predictive modeling was that the addition of textual features (eg, eligibility criteria) offers consistent improvements over using only structured data (F1-score=0.62-0.64 vs F1-score=0.61 without textual features). Both pretrained language models and more basic word-based representations provide high-utility text representations, with no significant empirical difference between the two. CONCLUSIONS: Different factors affect the publication of a registered clinical trial. Our approach to predictive modeling combines heterogeneous features, both structured and unstructured. We show that methods from natural language processing can provide effective textual features to enable more accurate prediction of publication success, which has not been explored for this task previously.
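One of the two reported baselines is a random forest over a weighted bag-of-words representation of the trial text combined with structured metadata. The sketch below shows a hybrid feature pipeline of that general shape in scikit-learn; the field names and toy records are invented and do not come from the released data set.

```python
# Minimal sketch of a text + structured-feature classifier, in the spirit
# of the random-forest baseline. Toy records; field names invented.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

df = pd.DataFrame({
    "eligibility_text": [
        "adults aged 18-65 with type 2 diabetes",
        "children with asthma, exclusion: smokers",
        "healthy volunteers, any age",
        "adults with chronic pain, prior surgery excluded",
    ],
    "enrollment": [120, 45, 300, 80],  # structured feature
    "published": [1, 0, 1, 0],         # outcome label
})

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "eligibility_text"),  # weighted bag of words
    ("num", "passthrough", ["enrollment"]),
])

model = Pipeline([
    ("features", features),
    ("clf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

model.fit(df, df["published"])
print(model.predict(df.iloc[:2]))
```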
  • Item
    Automating Quality Assessment of Medical Evidence in Systematic Reviews: Model Development and Validation Study
    Suster, S ; Baldwin, T ; Lau, JH ; Yepes, AJ ; Iraola, DM ; Otmakhova, Y ; Verspoor, K (JMIR PUBLICATIONS, INC, 2023-03-13)
BACKGROUND: Assessment of the quality of medical evidence available on the web is a critical step in the preparation of systematic reviews. Existing tools that automate parts of this task validate the quality of individual studies but not of entire bodies of evidence and focus on a restricted set of quality criteria. OBJECTIVE: We proposed a quality assessment task that provides an overall quality rating for each body of evidence (BoE), as well as finer-grained justification for different quality criteria according to the Grading of Recommendation, Assessment, Development, and Evaluation (GRADE) formalization framework. For this purpose, we constructed a new data set and developed a machine learning baseline system (EvidenceGRADEr). METHODS: We algorithmically extracted quality-related data from all summaries of findings found in the Cochrane Database of Systematic Reviews. Each BoE was defined by a set of population, intervention, comparison, and outcome criteria and assigned a quality grade (high, moderate, low, or very low) together with the quality criteria (justification) that influenced that decision. Different statistical data, metadata about the review, and parts of the review text were extracted as support for grading each BoE. After pruning the resulting data set with various quality checks, we used it to train several neural-model variants. The predictions were compared against the labels originally assigned by the authors of the systematic reviews. RESULTS: Our quality assessment data set, Cochrane Database of Systematic Reviews Quality of Evidence, contains 13,440 instances, or BoEs labeled for quality, originating from 2252 systematic reviews published on the internet from 2002 to 2020. On the basis of a 10-fold cross-validation, the best neural binary classifiers detected risk of bias at 0.78 F1 (precision 0.68, recall 0.92) and imprecision at 0.75 F1 (precision 0.66, recall 0.86), while the performance on the inconsistency, indirectness, and publication bias criteria was lower (F1 in the range of 0.3-0.4). The prediction of the overall quality grade into 1 of the 4 levels resulted in 0.5 F1. When casting the task as a binary problem by merging the GRADE classes (high+moderate vs low+very low quality evidence), we attained 0.74 F1. We also found that the results varied depending on the supporting information provided as input to the models. CONCLUSIONS: Different factors affect the quality of evidence in the context of systematic reviews of medical evidence. Some of these (risk of bias and imprecision) can be automated with reasonable accuracy. Other quality dimensions such as indirectness, inconsistency, and publication bias prove more challenging for machine learning, largely because they are much rarer. This technology could substantially reduce reviewer workload in the future and expedite quality assessment as part of evidence synthesis.
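The binary evaluation mentioned at the end of the results (high+moderate vs low+very low) amounts to collapsing the four GRADE levels into two classes before scoring. A minimal sketch of that step, with invented labels and scikit-learn's f1_score standing in for the authors' evaluation code:

```python
# Sketch: collapsing four GRADE quality levels into a binary label and
# scoring with F1, as in the paper's binary evaluation. Labels invented.
from sklearn.metrics import f1_score

GRADE_TO_BINARY = {"high": 1, "moderate": 1, "low": 0, "very low": 0}

gold = ["high", "low", "moderate", "very low", "low", "high"]
pred = ["moderate", "low", "low", "very low", "moderate", "high"]

y_true = [GRADE_TO_BINARY[g] for g in gold]
y_pred = [GRADE_TO_BINARY[p] for p in pred]

print("binary F1:", round(f1_score(y_true, y_pred), 2))
```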
  • Item
    Analysis of predictive performance and reliability of classifiers for quality assessment of medical evidence revealed important variation by medical area
    Suster, S ; Baldwin, T ; Verspoor, K (ELSEVIER SCIENCE INC, 2023-07)
OBJECTIVES: A major obstacle to the deployment of models for automated quality assessment is their reliability. We analyze the calibration and selective classification performance of two such systems. STUDY DESIGN AND SETTING: We examine two systems for assessing the quality of medical evidence, EvidenceGRADEr and RobotReviewer, both developed from the Cochrane Database of Systematic Reviews (CDSR), which measure the strength of bodies of evidence and the risk of bias (RoB) of individual studies, respectively. We report their calibration error and Brier scores, present their reliability diagrams, and analyze the risk-coverage trade-off in selective classification. RESULTS: The models are reasonably well calibrated on most quality criteria (expected calibration error [ECE] 0.04-0.09 for EvidenceGRADEr, 0.03-0.10 for RobotReviewer). However, we discover that both calibration and predictive performance vary significantly by medical area. This has ramifications for the application of such models in practice, as average performance is a poor indicator of group-level performance (e.g., health and safety at work, allergy and intolerance, and public health see much worse performance than cancer, pain and anesthesia, and neurology). We explore the reasons behind this disparity. CONCLUSION: Practitioners adopting automated quality assessment should expect large fluctuations in system reliability and predictive performance depending on the medical area. Prospective indicators of such behavior should be further researched.
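The calibration metrics named here, the Brier score and expected calibration error (ECE), can be computed directly from predicted probabilities and outcomes. A minimal sketch with synthetic predictions rather than either system's outputs; the 10-bin equal-width ECE below is one common variant.

```python
# Minimal sketch: Brier score and a 10-bin expected calibration error (ECE)
# for binary predicted probabilities. Synthetic data, not either system's.
import numpy as np

def brier(y_true, p):
    return np.mean((p - y_true) ** 2)

def ece(y_true, p, n_bins=10):
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    total = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (p >= lo) & (p <= hi) if hi == 1.0 else (p >= lo) & (p < hi)
        if mask.any():
            conf = p[mask].mean()      # mean confidence in the bin
            acc = y_true[mask].mean()  # empirical frequency in the bin
            total += mask.mean() * abs(acc - conf)
    return total

rng = np.random.default_rng(0)
p = rng.uniform(size=1000)
y = (rng.uniform(size=1000) < p).astype(float)  # well calibrated by construction

print("Brier:", round(brier(y, p), 3))
print("ECE  :", round(ece(y, p), 3))
```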
  • Item
    Overview of ChEMU 2022 Evaluation Campaign: Information Extraction in Chemical Patents
Li, Y ; Fang, B ; He, J ; Yoshikawa, H ; Akhondi, SA ; Druckenbrodt, C ; Thorne, C ; Afzal, Z ; Zhai, Z ; Baldwin, T ; Verspoor, K ; Barrón-Cedeño, A ; Da San Martino, G ; Esposti, MD ; Sebastiani, F ; Macdonald, C ; Pasi, G ; Hanbury, A ; Potthast, M ; Faggioli, G ; Ferro, N (SPRINGER INTERNATIONAL PUBLISHING AG, 2022)
  • Item
    Cloze Evaluation for Deeper Understanding of Commonsense Stories in Indonesian
    Koto, F ; Baldwin, T ; Lau, JH (ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2022-01-01)
  • Item
One Country, 700+ Languages: NLP Challenges for Underrepresented Languages and Dialects in Indonesia
    Aji, AF ; Winata, GI ; Koto, F ; Cahyawijaya, S ; Romadhony, A ; Mahendra, R ; Kurniawan, K ; Moeljadi, D ; Prasojo, RE ; Baldwin, T ; Lau, JH ; Ruder, S (ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2022)
  • Item
    The patient is more dead than alive: exploring the current state of the multi-document summarization of the biomedical literature
    Otmakhova, Y ; Verspoor, K ; Baldwin, T ; Lau, JH (ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2022)
  • Item
    Can Pretrained Language Models Generate Persuasive, Faithful, and Informative Ad Text for Product Descriptions?
    Koto, F ; Lau, JH ; Baldwin, T (ASSOC COMPUTATIONAL LINGUISTICS-ACL, 2022)