School of Mathematics and Statistics - Research Publications

  • Item
    Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation
    McCarthy, DJ ; Chen, Y ; Smyth, GK (OXFORD UNIV PRESS, 2012-05)
    A flexible statistical framework is developed for the analysis of read counts from RNA-Seq gene expression studies. It provides the ability to analyse complex experiments involving multiple treatment conditions and blocking variables while still taking full account of biological variation. Biological variation between RNA samples is estimated separately from the technical variation associated with sequencing technologies. Novel empirical Bayes methods allow each gene to have its own specific variability, even when there are relatively few biological replicates from which to estimate such variability. The pipeline is implemented in the edgeR package of the Bioconductor project. A case study analysis of carcinoma data demonstrates the ability of generalized linear model methods (GLMs) to detect differential expression in a paired design, and even to detect tumour-specific expression changes. The case study demonstrates the need to allow for gene-specific variability, rather than assuming a common dispersion across genes or a fixed relationship between abundance and variability. Genewise dispersions de-prioritize genes with inconsistent results and allow the main analysis to focus on changes that are consistent between biological replicates. Parallel computational approaches are developed to make non-linear model fitting faster and more reliable, making the application of GLMs to genomic data more convenient and practical. Simulations demonstrate the ability of adjusted profile likelihood estimators to return accurate estimators of biological variability in complex situations. When variation is gene-specific, empirical Bayes estimators provide an advantageous compromise between the extremes of assuming common dispersion or separate genewise dispersion. The methods developed here can also be applied to count data arising from DNA-Seq applications, including ChIP-Seq for epigenetic marks and DNA methylation analyses.
  • Item
    edgeR: a Bioconductor package for differential expression analysis of digital gene expression data
    Robinson, MD ; McCarthy, DJ ; Smyth, GK (OXFORD UNIV PRESS, 2010-01-01)
    SUMMARY: It is expected that emerging digital gene expression (DGE) technologies will overtake microarray technologies in the near future for many functional genomics applications. One of the fundamental data analysis tasks, especially for gene expression studies, involves determining whether there is evidence that counts for a transcript or exon are significantly different across experimental conditions. edgeR is a Bioconductor software package for examining differential expression of replicated count data. An overdispersed Poisson model is used to account for both biological and technical variability. Empirical Bayes methods are used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, provided at least one phenotype or experimental condition is replicated. The software may have other applications beyond sequencing data, such as proteome peptide count data. AVAILABILITY: The package is freely available under the LGPL licence from the Bioconductor web site (http://bioconductor.org).
  • Item
    Testing significance relative to a fold-change threshold is a TREAT
    McCarthy, DJ ; Smyth, GK (OXFORD UNIV PRESS, 2009-03-15)
    MOTIVATION: Statistical methods are used to test for the differential expression of genes in microarray experiments. The most widely used methods successfully test whether the true differential expression is different from zero, but give no assurance that the differences found are large enough to be biologically meaningful. RESULTS: We present a method, t-tests relative to a threshold (TREAT), that allows researchers to test formally the hypothesis (with associated p-values) that the differential expression in a microarray experiment is greater than a given (biologically meaningful) threshold. We have evaluated the method using simulated data, a dataset from a quality control experiment for microarrays and data from a biological experiment investigating histone deacetylase inhibitors. When the magnitude of differential expression is taken into account, TREAT improves upon the false discovery rate of existing methods and identifies more biologically relevant genes. AVAILABILITY: R code implementing our methods is contributed to the software package limma available at http://www.bioconductor.org.
  • Item
    MOZ and BMI1 play opposing roles during Hox gene activation in ES cells and in body segment identity specification in vivo
    Sheikh, BN ; Downer, NL ; Phipson, B ; Vanyai, HK ; Kueh, AJ ; McCarthy, DJ ; Smyth, GK ; Thomas, T ; Voss, AK (NATL ACAD SCIENCES, 2015-04-28)
    Hox genes underlie the specification of body segment identity in the anterior-posterior axis. They are activated during gastrulation and undergo a dynamic shift from a transcriptionally repressed to an active chromatin state in a sequence that reflects their chromosomal location. Nevertheless, the precise role of chromatin modifying complexes during the initial activation phase remains unclear. In the current study, we examined the role of chromatin regulators during Hox gene activation. Using embryonic stem cell lines lacking the transcriptional activator MOZ and the polycomb-family repressor BMI1, we showed that MOZ and BMI1, respectively, promoted and repressed Hox genes during the shift from the transcriptionally repressed to the active state. Strikingly, however, MOZ but not BMI1 was required to regulate Hox mRNA levels after the initial activation phase. To determine the interaction of MOZ and BMI1 in vivo, we interrogated their role in regulating Hox genes and body segment identity using Moz;Bmi1 double-deficient mice. We found that the homeotic transformations and shifts in Hox gene expression boundaries observed in single Moz and Bmi1 mutant mice were rescued to a wild-type identity in Moz;Bmi1 double knockout animals. Together, our findings establish that MOZ and BMI1 play opposing roles during the onset of Hox gene expression in the ES cell model and during body segment identity specification in vivo. We propose that chromatin-modifying complexes have a previously unappreciated role during the initiation phase of Hox gene expression, which is critical for the correct specification of body segment identity.