School of Mathematics and Statistics - Research Publications

  • Item
    Investigating and Correcting Plasma DNA Sequencing Coverage Bias to Enhance Aneuploidy Discovery
    Chandrananda, D ; Thorne, NP ; Ganesamoorthy, D ; Bruno, DL ; Benjamini, Y ; Speed, TP ; Slater, HR ; Bahlo, M ; Zhou, F (PUBLIC LIBRARY SCIENCE, 2014-01-29)
    Pregnant women carry a mixture of cell-free DNA fragments from self and fetus (non-self) in their circulation. In recent years multiple independent studies have demonstrated the ability to detect fetal trisomies such as trisomy 21, the cause of Down syndrome, by Next-Generation Sequencing of maternal plasma. The current clinical tests based on this approach show very high sensitivity and specificity, although as yet they have not become the standard diagnostic test. Here we describe improvements to the analysis of the sequencing data through reduced GC bias and better handling of genomic repeats. We show substantial improvements in the sensitivity of the standard trisomy 21 statistical tests, which we measure by artificially reducing read coverage. We also explore the bias stemming from the natural cleavage of plasma DNA by examining DNA motifs and position-specific base distributions. We propose a model to correct this fragmentation bias and observe that incorporating this correction does not lead to any further improvements in the detection of fetal trisomy. The improved bias corrections that we demonstrate in this work can be readily adopted into existing fetal trisomy detection protocols and should also lead to improvements in sub-chromosomal copy number variation detection.
  • Item
    Systematic noise degrades gene co-expression signals but can be corrected
    Freytag, S ; Gagnon-Bartsch, J ; Speed, TP ; Bahlo, M (BMC, 2015-09-24)
    BACKGROUND: In the past decade, the identification of gene co-expression has become a routine part of the analysis of high-dimensional microarray data. Gene co-expression, which is mostly detected via the Pearson correlation coefficient, has played an important role in the discovery of molecular pathways and networks. Unfortunately, the presence of systematic noise in high-dimensional microarray datasets corrupts estimates of gene co-expression. Removing systematic noise from microarray data is therefore crucial. Many cleaning approaches for microarray data exist; however, these methods are aimed at improving differential expression analysis, and their performance has been tested primarily for this application. To our knowledge, the performance of these approaches has never been systematically compared in the context of gene co-expression estimation. RESULTS: Using simulations, we demonstrate that standard cleaning procedures, such as background correction and quantile normalization, fail to adequately remove systematic noise that affects gene co-expression and at times further degrade true gene co-expression. Instead, we show that a global version of removal of unwanted variation (RUV), a data-driven approach, not only removes systematic noise but also allows the estimation of the true underlying gene-gene correlations. We compare the performance of all noise removal methods when applied to five large published datasets on gene expression in the human brain. RUV not only retrieves the highest gene co-expression values for sets of genes known to interact but also provides the greatest consistency across all five datasets. We apply the method to prioritize epileptic encephalopathy candidate genes. CONCLUSIONS: Our work raises serious concerns about the quality of many published gene co-expression analyses. RUV provides an efficient and flexible way to remove systematic noise from high-dimensional microarray datasets when the objective is gene co-expression analysis. The RUV method, as applied to gene-gene correlation estimation, is available as a Bioconductor R package: RUVcorr.
  • Item
    Multiple sclerosis risk variants regulate gene expression in innate and adaptive immune cells
    Gresle, MM ; Jordan, MA ; Stankovich, J ; Spelman, T ; Johnson, LJ ; Laverick, L ; Hamlett, A ; Smith, LD ; Jokubaitis, VG ; Baker, J ; Haartsen, J ; Taylor, B ; Charlesworth, J ; Bahlo, M ; Speed, TP ; Brown, MA ; Field, J ; Baxter, AG ; Butzkueven, H (LIFE SCIENCE ALLIANCE LLC, 2020-07)
    At least 200 single-nucleotide polymorphisms (SNPs) are associated with multiple sclerosis (MS) risk. A key function that could mediate SNP-encoded MS risk is their regulatory effects on gene expression. We performed microarray analyses using RNA extracted from purified immune cell types from 73 untreated MS cases and 97 healthy controls and then performed cis expression quantitative trait loci mapping studies using additive linear models. We describe MS risk expression quantitative trait loci associations for 129 distinct genes. By extending these models to include an interaction term between genotype and phenotype, we identify MS risk SNPs with opposing effects on gene expression in cases compared with controls, namely, rs2256814 MYT1 in CD4 cells (q = 0.05) and rs12087340 RF00136 in monocyte cells (q = 0.04). The rs703842 SNP was also associated with a differential effect size on the expression of the METTL21B gene in CD8 cells of MS cases relative to controls (q = 0.03). Our study provides a detailed map of MS risk loci that function by regulating gene expression in cell types relevant to MS.
  • Item
    Genes implicated in multiple sclerosis pathogenesis from consilience of genotyping and expression profiles in relapse and remission
    Arthur, AT ; Armati, PJ ; Bye, C ; Heard, RNS ; Stewart, GJ ; Pollard, JD ; Booth, DR (BMC, 2008-03-19)
    BACKGROUND: Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system (CNS). Although the pathogenesis of MS remains unknown, it is widely regarded as an autoimmune disease mediated by T-lymphocytes directed against myelin proteins and/or other oligodendrocyte epitopes. METHODS: In this study we investigated the gene expression profiles of peripheral blood cells from patients with relapsing-remitting MS (RRMS) during the relapse and the remission phases utilizing gene microarray technology. Dysregulated genes encoded in regions associated with MS susceptibility from genomic screens or previous transcriptomic studies were identified. The proximal promoter region polymorphisms of two genes were tested for association with disease and expression level. RESULTS: Distinct sets of dysregulated genes during the relapse and remission phases were identified, including genes involved in apoptosis and inflammation. Three of these dysregulated genes have been previously implicated in MS susceptibility in genomic screens: TGFbeta1, CD58 and DBC1. TGFbeta1 has one common SNP in the proximal promoter: -508 T>C (rs1800469). Genotyping two Australian trio sets (total 620 families) found a trend for over-transmission of the T allele in MS in females (p < 0.13). Upregulation of CD58 and DBC1 in remission is consistent with their putative roles in promoting regulatory T cells and reducing cell proliferation, respectively. A fourth gene, ALOX5, is consistently found over-expressed in MS. Two common genetic variants were confirmed in the ALOX5 putative promoter: -557 T>C (rs12762303) and a 6 bp tandem repeat polymorphism (GGGCGG) between position -147 and -176; but no evidence for transmission distortion was found. CONCLUSION: The dysregulation of these genes tags their metabolic pathways for further investigation for potential therapeutic intervention.
  • Item
    Variants of ST8SIA1 Are Associated with Risk of Developing Multiple Sclerosis
    Husain, S ; Yildirim-Toruner, C ; Rubio, JP ; Field, J ; Schwalb, M ; Cook, S ; Devoto, M ; Vitale, E ; Reitsma, PH (PUBLIC LIBRARY SCIENCE, 2008-07-09)
    Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system of unknown etiology, with both genetic and environmental factors playing a role in susceptibility. To date, the HLA DR15/DQ6 haplotype within the major histocompatibility complex on chromosome 6p is the strongest genetic risk factor associated with MS susceptibility. Additional alleles of IL7 and IL2 have been identified as risk factors for MS with small effect. Here we present two independent studies supporting an allelic association of MS with polymorphisms in the ST8SIA1 gene, located on chromosome 12p12 and encoding ST8 alpha-N-acetyl-neuraminide alpha-2,8-sialyltransferase 1. The initial association was made in a single three-generation family where a single-nucleotide polymorphism (SNP), rs4762896, was segregating together with HLA DR15/DQ6 in MS patients. A study of 274 family trios (affected child and both unaffected parents) from Australia validated the association of ST8SIA1 in individuals with MS, showing transmission disequilibrium of the paternal alleles for three additional SNPs, namely rs704219, rs2041906, and rs1558793, with p = 0.001, p = 0.01 and p = 0.01, respectively. These findings implicate ST8SIA1 as a possible novel susceptibility gene for MS.