Computing and Information Systems - Theses

    Optimisation of phasing: towards improved haplotype-based genetic investigations
    Al Bkhetan, Ziad (2020)
    Haplotype, or phase, information significantly adds to the ability to resolve genetic problems and is important for elucidating and interpreting the genetic basis underlying certain diseases or traits. The common approach to deriving phase information is computational haplotype phasing, also called haplotype estimation. Recent developments in phasing have paved the way for the wide use of haplotype information in population genetic investigations.

    This PhD thesis explores various ways to use haplotype information effectively in precise haplotype-based genetic investigations. It provides evidence of the important role haplotypes play in detecting significant genetic associations with phenotypes that can otherwise be missed. In particular, it provides a comprehensive evaluation of state-of-the-art phasing approaches and of different haplotype block determination methods (sliding-window and linkage-disequilibrium-based). A rigorous analysis is conducted to improve phasing accuracy through a consensus haplotype estimator across datasets with different characteristics. Furthermore, phasing optimisation is used to develop a new approach to haplotype-based Expression Quantitative Trait Loci (eQTL) analysis. The approach is assessed against genotype-based eQTL methods (using both single SNPs and combinations of SNPs).

    The main contributions of this PhD study are:
    1. Novel evaluations and comparisons of haplotype phasing that consider accuracy at block scale, which is the most popular way to use phase information in genetic studies.
    2. An improvement in phasing accuracy of up to 10% when using the proposed consensus approach.
    3. A demonstration that the consensus approach leads to the most accurate genotype imputation when imputation is performed with the well-known tools Minimac3, pbwt and Beagle5.
    4. An approach to haplotype-based eQTL analysis that is demonstrated to outperform standard eQTL methods when the causal genetic architecture involves multiple variations.

    Finally, the work in this PhD thesis highlights the fundamental role of haplotype information in genetic problems and provides guidance for other researchers interested in performing haplotype-related investigations. Two tools (consHap and eQTLHap) are also released publicly with this thesis to support other research studies.