Clinical Pathology - Theses

    Clinical outcome prediction using biomedical data and machine learning approaches
    Liu, Yang (2022)
    Identifying asymptomatic individuals with increased susceptibility to disease provides substantial opportunities for preventative interventions. Over the past few years, advances in sequencing and computing technologies have enabled omics-driven disease prediction modelling, which may aid the exploration of new biomarkers and future clinical utility. Recent studies have revealed evidence linking the human gut microbiota with the pathogenesis of various complex diseases. However, previous studies have been limited by cross-sectional designs, and there are limited data on the longitudinal association between the baseline gut microbiome and incident diseases. In addition, few published studies have examined incident disease prediction combining genetic and gut microbial risk factors. To address this, we designed a longitudinal study to examine the predictive utility of clinical metadata, gut metagenomic data and genomic data for a series of complex diseases, using statistical and machine learning approaches in a large population-based cohort with ~15 years of electronic health record follow-up. Chapter 1 provides a comprehensive review of advances and challenges in complex disease prediction. Emerging prediction methods and novel biomarkers are highlighted, including polygenic risk scores, gut metagenomics, and machine learning approaches in the context of disease prediction. Recent progress in the clinical utility of multi-omics-based prediction, future challenges, and potential opportunities for clinical translation are discussed. In Chapter 2, the potential of the gut microbiota for prospective risk prediction of liver disease was investigated using machine learning approaches. The predictive capacity of the baseline gut microbiota was evaluated individually and in combination with conventional risk factors.
The results demonstrated that augmenting conventional risk factors with microbiome data, using gradient boosting classifiers, significantly improved prediction performance. Investigation of predictive microbial signatures revealed bacterial taxa not previously linked to incident liver disease, as well as taxa previously associated with hepatic function and disease. In Chapter 3, associations between the baseline gut microbiome and incident respiratory diseases, including chronic obstructive pulmonary disease (COPD) and adult-onset asthma, were tested. Gut microbial alterations and variations at each taxonomic level were compared between disease cases and non-cases. Machine learning models demonstrated moderate predictive capacity of the baseline gut microbiome for incident asthma/COPD. Subgroup analyses indicated that the gut microbiome was significantly associated with incident COPD in both current smokers and non-smokers, as well as in individuals who reported never smoking. In Chapter 4, the predictive utility of genetic, gut microbial, and lifestyle risk factors was investigated for multiple complex diseases, including myocardial infarction, coronary heart disease, prostate cancer, type 2 diabetes and Alzheimer’s disease. Since the gut microbiome is involved in numerous host physiological processes and linked to all vital organs, it was hypothesized that the gut microbiome could reflect host environmental risk factors for relevant diseases. It was also hypothesized that including genetic susceptibility could improve prediction performance over clinical risk factors for complex diseases. The findings demonstrated the individual and combined impact of polygenic predisposition and variations in the baseline gut microbiota on disease incidence. This thesis presents a comprehensive investigation of the integrative use of clinical metadata and multi-omics data, human gut metagenomics in particular, for incident disease prediction.
The findings of this work provide an evidence base for the translation of omics and machine learning to risk prediction of multiple diseases, and support further investigation into identification of new biomarkers for disease risk assessment and prevention.