School of Mathematics and Statistics - Research Publications

  • Item
    Inference of haplotypic phase and missing genotypes in polyploid organisms and variable copy number genomic regions
    Su, S-Y ; White, J ; Balding, DJ ; Coin, LJM (BMC, 2008-12-01)
    BACKGROUND: The power of haplotype-based methods for association studies, identification of regions under selection, and ancestral inference is well established for diploid organisms. For polyploids, however, the difficulty of determining phase has limited such approaches. Polyploidy is common in plants and is also observed in animals. Partial polyploidy is sometimes observed in humans (e.g. trisomy 21; Down's syndrome), and it arises more frequently in some human tissues. Local changes in ploidy, known as copy number variations (CNV), arise throughout the genome. Here we present a method, implemented in the software polyHap, for the inference of haplotype phase and missing observations from polyploid genotypes. PolyHap allows each individual to have a different ploidy, but ploidy cannot vary over the genomic region analysed. It employs a hidden Markov model (HMM) and a sampling algorithm to infer haplotypes jointly in multiple individuals and to obtain a measure of uncertainty in its inferences. RESULTS: In the simulation study, we combine real haplotype data to create artificial diploid, triploid, and tetraploid genotypes, and use these to demonstrate that polyHap performs well, in terms of both switch error rate in recovering phase and imputation error rate for missing genotypes. To our knowledge, there is no comparable software for phasing a large, densely genotyped region of chromosome from triploids and tetraploids, while for diploids we found polyHap to be more accurate than fastPhase. We also compare the results of polyHap to SATlotyper on an experimentally haplotyped tetraploid dataset of 12 SNPs, and show that polyHap is more accurate. CONCLUSION: With the availability of large SNP data in polyploids and CNV regions, we believe that polyHap, our proposed method for inferring haplotypic phase from genotype data, will be useful in enabling researchers analysing such data to exploit the power of haplotype-based analyses.
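The abstract above evaluates phasing by switch error rate. As a minimal sketch of that metric for the diploid case only, the function below counts how often the inferred phase flips between consecutive heterozygous sites. The function name, the string encoding of haplotypes, and the restriction to diploids are assumptions made for illustration; this is not polyHap's code, and the polyploid definition used in the paper may differ.

```python
def switch_error_rate(true_pair, inferred_pair):
    """Diploid switch error rate between a true and an inferred haplotype pair.

    Each pair is two equal-length strings of '0'/'1' alleles. The inferred
    pair is assumed to carry the same genotype as the true pair at every site.
    """
    t1, t2 = true_pair
    i1, _ = inferred_pair
    # Only heterozygous sites carry phase information.
    het_sites = [pos for pos in range(len(t1)) if t1[pos] != t2[pos]]
    if len(het_sites) < 2:
        return 0.0
    # Phase label: 0 if the first inferred haplotype follows t1, 1 if it follows t2.
    phase = [0 if i1[pos] == t1[pos] else 1 for pos in het_sites]
    # A switch occurs whenever the label changes between consecutive het sites.
    switches = sum(1 for a, b in zip(phase, phase[1:]) if a != b)
    return switches / (len(het_sites) - 1)


if __name__ == "__main__":
    truth = ("0011", "1100")
    guess = ("0000", "1111")  # phase flips once, between sites 2 and 3
    print(switch_error_rate(truth, guess))
```

Swapping the order of the two inferred haplotypes does not change the result, since only changes in phase between neighbouring sites are counted.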
  • Item
    Pathway Analysis of GWAS Provides New Insights into Genetic Susceptibility to 3 Inflammatory Diseases
    Eleftherohorinou, H ; Wright, V ; Hoggart, C ; Hartikainen, A-L ; Jarvelin, M-R ; Balding, D ; Coin, L ; Levin, M ; Weedon, MN (PUBLIC LIBRARY SCIENCE, 2009-11-30)
    Although the introduction of genome-wide association studies (GWAS) has greatly increased the number of genes associated with common diseases, only a small proportion of the predicted genetic contribution has so far been elucidated. Studying the cumulative variation of polymorphisms in multiple genes acting in functional pathways may provide a complementary approach to the more common single-SNP association approach in understanding genetic determinants of common disease. We developed a novel pathway-based method to assess the combined contribution of multiple genetic variants acting within canonical biological pathways and applied it to data from 14,000 UK individuals with 7 common diseases. We tested inflammatory pathways for association with Crohn's disease (CD), rheumatoid arthritis (RA) and type 1 diabetes (T1D) with 4 non-inflammatory diseases as controls. Using a variable selection algorithm, we identified variants responsible for the pathway association and evaluated their use for disease prediction using a 10-fold cross-validation framework to calculate the out-of-sample area under the receiver operating characteristic curve (AUC). The generalisability of these predictive models was tested on an independent birth cohort from Northern Finland. Multiple canonical inflammatory pathways showed highly significant associations (p-values ranging from 10^-3 to 10^-20) with CD, T1D and RA. Variable selection identified on average a set of 205 SNPs (149 genes) for T1D, 350 SNPs (189 genes) for RA and 493 SNPs (277 genes) for CD. The pattern of polymorphisms at these SNPs was found to be highly predictive of T1D (91% AUC) and RA (85% AUC), and weakly predictive of CD (60% AUC). The T1D model, applied without any parameter refitting, showed good predictive ability (79% AUC) in the Finnish cohort. Our analysis suggests that genetic contribution to common inflammatory diseases operates through multiple genes interacting in functional pathways.
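The evaluation described above, out-of-sample AUC from k-fold cross-validation, can be sketched in a few lines of plain Python. Everything here is illustrative: `auc`, `kfold_indices`, `cross_validated_auc` and the toy one-feature model `fit_mean_difference` are hypothetical names, and the toy model stands in for the paper's variable selection algorithm, which is not reproduced.

```python
def auc(labels, scores):
    """Area under the ROC curve as the probability that a case outscores a control."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Ties between a case and a control count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def cross_validated_auc(xs, labels, fit, k=10):
    """Score every sample with a model trained without it, then pool the scores."""
    scores = [0.0] * len(xs)
    for test_idx in kfold_indices(len(xs), k):
        held_out = set(test_idx)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train], [labels[i] for i in train])
        for i in test_idx:
            scores[i] = model(xs[i])
    return auc(labels, scores)


def fit_mean_difference(train_x, train_y):
    """Toy stand-in model: orient a single feature so that cases score higher."""
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    direction = 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0
    return lambda x: direction * x


if __name__ == "__main__":
    xs = [i / 20 for i in range(20)]
    labels = [0] * 10 + [1] * 10
    print(cross_validated_auc(xs, labels, fit_mean_difference, k=10))
```

Pooling the held-out scores before computing a single AUC is one common convention; averaging per-fold AUCs is another, and the abstract does not say which was used.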
  • Item
    Dysregulation of Complement System and CD4+ T Cell Activation Pathways Implicated in Allergic Response
    Alves, AC ; Bruhn, S ; Ramasamy, A ; Wang, H ; Holloway, JW ; Hartikainen, A-L ; Jarvelin, M-R ; Benson, M ; Balding, DJ ; Coin, LJM ; Tran, DQ (PUBLIC LIBRARY SCIENCE, 2013-10-08)
    Allergy is a complex disease that is likely to involve dysregulated CD4+ T cell activation. Here we propose a novel methodology to gain insight into how coordinated behaviour emerges between disease-dysregulated pathways in response to pathophysiological stimuli. Using peripheral blood mononuclear cells of allergic rhinitis patients and controls cultured with and without pollen allergens, we integrate CD4+ T cell gene expression from microarray data and genetic markers of allergic sensitisation from GWAS data at the pathway level using enrichment analysis, implicating the complement system in both cellular and systemic response to pollen allergens. We delineate a novel disease network linking T cell activation to the complement system that is significantly enriched for genes exhibiting correlated gene expression and protein-protein interactions, suggesting a tight biological coordination that is dysregulated in the disease state in response to pollen allergen but not to diluent. This novel disease network has high predictive power for the gene and protein expression of the Th2 cytokine profile (IL-4, IL-5, IL-10, IL-13) and of the Th2 master regulator (GATA3), suggesting its involvement in the early stages of CD4+ T cell differentiation. Dissection of the complement system gene expression identifies 7 genes specifically associated with atopic response to pollen, including C1QR1, CFD, CFP, ITGB2 and ITGAX, and confirms the role of C3AR1 and C5AR1. Two of these genes (ITGB2 and C3AR1) are also implicated in the network linking complement system to T cell activation, which comprises 6 differentially expressed genes. C3AR1 is also significantly associated with allergic sensitisation in GWAS data.