School of Mathematics and Statistics - Theses

    Identification of molecular phenotypes and their regulation in cancer
    Bhuva, Dharmesh Dinesh (2020)
    Complex diseases manifest through the dysregulation of otherwise finely regulated transcriptional programs, resulting in functional alterations. Insight into altered transcriptional processes and their subsequent functional consequences allows molecular characterisation of disease phenotypes and can lead to the identification of potential therapeutic targets. The altered regulation of transcriptional programs can be identified using computational and statistical methods to infer gene regulatory networks that are changed between biological contexts. Numerous methods have been developed to infer conditional relationships; however, extensive evaluations remain scarce because of a lack of validation data. I developed an evaluation framework that simulates transcriptomic data from a dynamical systems model of gene regulation. I used 812 simulated datasets with varying model parameters to evaluate 14 different context-specific inference methods. The evaluation revealed that context-specific causative regulatory relationships were difficult to infer, while inferring context-specific co-expression was comparatively easier. Some variability in performance was attributed to properties of the global regulatory network structure. Applying the best-performing approach, a z-score method, to identify estrogen receptor-specific regulatory relationships in a breast cancer dataset revealed an immune-related program regulated in basal-like breast cancers and dysregulated in all other subtypes. I identified a key gene in this network that was associated with immune infiltration in basal-like breast cancers. The result of any regulatory cascade is a molecular phenotype, such as the immune infiltration phenotype described above. Assessing these phenotypes aids in the characterisation of disease and can be used to guide therapies. Most methods to assess molecular phenotypes are incapable of acting on individual samples and therefore cannot be used in personalised medicine.
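To make the idea of a z-score test for context-specific co-expression concrete, the following is a minimal sketch and not the thesis implementation. It assumes the common formulation in which the correlation between two genes is computed separately in each context, Fisher-transformed, and the difference is scaled by its approximate standard error. The function name `differential_coexpression_z`, the use of Pearson correlation, and the clipping constant are all illustrative assumptions.

```python
import numpy as np

def differential_coexpression_z(x_a, y_a, x_b, y_b):
    """z-score for the change in correlation of two genes between contexts A and B.

    Hypothetical helper: x_* and y_* are expression vectors of two genes
    measured across the samples of each context.
    """
    # Correlation of the gene pair within each context (Pearson is an assumption).
    r_a = np.corrcoef(x_a, y_a)[0, 1]
    r_b = np.corrcoef(x_b, y_b)[0, 1]

    # Clip to avoid infinite values when a correlation is exactly +/-1.
    r_a, r_b = np.clip([r_a, r_b], -1 + 1e-12, 1 - 1e-12)

    # Fisher transformation makes the correlations approximately normal.
    z_a, z_b = np.arctanh(r_a), np.arctanh(r_b)

    # Approximate standard error of the difference of two Fisher-transformed values.
    se = np.sqrt(1.0 / (len(x_a) - 3) + 1.0 / (len(x_b) - 3))
    return (z_a - z_b) / se
```

Under these assumptions, a large absolute z-score for a gene pair suggests that its co-expression differs between the two contexts; each context needs more than three samples for the standard error to be defined.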
With colleagues, I developed a novel rank-based method, singscore, that assesses molecular phenotypes using transcriptomic measurements from an individual sample. I evaluated the new method in a variety of applications, ranging from molecular phenotyping to sample stratification, and benchmarked it against other single-sample methods. I then demonstrated three applications of this flexible rank-based approach: inferring and investigating the epithelial-mesenchymal landscape in breast cancer, inferring NPM1c mutation status in acute myeloid leukemia, and prioritising gene sets with stably expressed genes. While the transcriptome clearly contains abundant information that might be of clinical use, translating molecular phenotyping into clinical applications that can be readily adopted would require reducing the number of transcriptomic measurements to tens or hundreds of transcripts. This would reduce both the cost of potential clinical assays and the amount of transcriptomic material required for molecular phenotyping. I developed a method that uses genes with stable expression to drastically reduce the number of transcriptomic measurements required for molecular phenotyping. I showed that molecular phenotype assessments using these reduced numbers of measurements are comparable to those performed using transcriptome-wide measurements. Stable genes identified from this analysis provide enhanced scope for use compared with similar previously identified sets, thereby promoting other applications such as correction of batch effects and normalisation across a wide range of transcriptomic and other data types. In summary, my PhD developed methodology to understand the molecular state of biological systems in a context-specific manner.
This was done by identifying context-specific gene regulatory networks and then assessing the molecular phenotypes that result from context-specific regulation in a clinical setting. This work highlights the importance of context-specific analysis in disease and demonstrates the utility of comprehensive benchmarks. It also highlights the need to develop clinically applicable analysis methods to achieve the eventual goal of personalised medicine.
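As an illustration of how a rank-based score can be computed from a single sample, the sketch below ranks all genes within one sample and reports the centred, normalised mean rank of the genes in a gene set. It is a simplified sketch of the general idea only and is not the singscore package or its API; the function name `rank_score`, the tie handling, and the normalisation to the range [-0.5, 0.5] are assumptions made for this example.

```python
import numpy as np

def rank_score(expression, gene_names, gene_set):
    """Score one sample by the normalised mean rank of a gene set (hypothetical helper)."""
    expression = np.asarray(expression, dtype=float)
    n_genes = expression.size

    # Rank genes within the sample: 1 = lowest expression.
    # Ties are broken by input order, which is a simplification.
    ranks = np.empty(n_genes)
    ranks[np.argsort(expression, kind="stable")] = np.arange(1, n_genes + 1)

    in_set = np.isin(gene_names, list(gene_set))
    k = int(in_set.sum())
    if k == 0:
        raise ValueError("no gene-set members found in this sample")
    if k == n_genes:
        return 0.0  # every gene is in the set, so the score carries no information

    mean_rank = ranks[in_set].mean()
    min_mean = (k + 1) / 2                # set occupies the k lowest ranks
    max_mean = (2 * n_genes - k + 1) / 2  # set occupies the k highest ranks

    # Rescale to [0, 1], then centre so 0 means "no enrichment either way".
    return (mean_rank - min_mean) / (max_mean - min_mean) - 0.5
```

Because the score depends only on ranks within one sample, it can be computed without reference to any other sample, which is the property that makes this family of methods suitable for individual patients.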