School of Mathematics and Statistics - Research Publications

  • Item
    Polycomb repressive complex 2 (PRC2) restricts hematopoietic stem cell activity
    Majewski, IJ ; Blewitt, ME ; de Graaf, CA ; McManus, EJ ; Bahlo, M ; Hilton, AA ; Hyland, CD ; Smyth, GK ; Corbin, JE ; Metcalf, D ; Alexander, WS ; Hilton, DJ ; Goodell, MA (PUBLIC LIBRARY SCIENCE, 2008-04)
    Polycomb group proteins are transcriptional repressors that play a central role in the establishment and maintenance of gene expression patterns during development. Using mice with an N-ethyl-N-nitrosourea (ENU)-induced mutation in Suppressor of Zeste 12 (Suz12), a core component of Polycomb Repressive Complex 2 (PRC2), we show here that loss of Suz12 function enhances hematopoietic stem cell (HSC) activity. In addition to these effects on a wild-type genetic background, mutations in Suz12 are sufficient to ameliorate the stem cell defect and thrombocytopenia present in mice that lack the thrombopoietin receptor (c-Mpl). To investigate the molecular targets of the PRC2 complex in the HSC compartment, we examined changes in global patterns of gene expression in cells deficient in Suz12. We identified a distinct set of genes that are regulated by Suz12 in hematopoietic cells, including eight genes that appear to be highly responsive to PRC2 function within this compartment. These data suggest that PRC2 is required to maintain a specific gene expression pattern in hematopoiesis that is indispensable to normal stem cell function.
  • Item
    Statistical analysis of an RNA titration series evaluates microarray precision and sensitivity on a whole-array basis
    Holloway, AJ ; Oshlack, A ; Diyagama, DS ; Bowtell, DDL ; Smyth, GK (BMC, 2006-11-22)
    BACKGROUND: Concerns are often raised about the accuracy of microarray technologies and the degree of cross-platform agreement, but there are yet no methods which can unambiguously evaluate precision and sensitivity for these technologies on a whole-array basis. RESULTS: A methodology is described for evaluating the precision and sensitivity of whole-genome gene expression technologies such as microarrays. The method consists of an easy-to-construct titration series of RNA samples and an associated statistical analysis using non-linear regression. The method evaluates the precision and responsiveness of each microarray platform on a whole-array basis, i.e., using all the probes, without the need to match probes across platforms. An experiment is conducted to assess and compare four widely used microarray platforms. All four platforms are shown to have satisfactory precision but the commercial platforms are superior for resolving differential expression for genes at lower expression levels. The effective precision of the two-color platforms is improved by allowing for probe-specific dye-effects in the statistical model. The methodology is used to compare three data extraction algorithms for the Affymetrix platforms, demonstrating poor performance for the commonly used proprietary algorithm relative to the other algorithms. For probes which can be matched across platforms, the cross-platform variability is decomposed into within-platform and between-platform components, showing that platform disagreement is almost entirely systematic rather than due to measurement variability. CONCLUSION: The results demonstrate good precision and sensitivity for all the platforms, but highlight the need for improved probe annotation. They quantify the extent to which cross-platform measures can be expected to be less accurate than within-platform comparisons for predicting disease progression or outcome.
  • Item
    Empirical array quality weights in the analysis of microarray data
    Ritchie, ME ; Diyagama, D ; Neilson, J ; van Laar, R ; Dobrovic, A ; Holloway, A ; Smyth, GK (BMC, 2006-05-19)
    BACKGROUND: Assessment of array quality is an essential step in the analysis of data from microarray experiments. Once detected, less reliable arrays are typically excluded or "filtered" from further analysis to avoid misleading results. RESULTS: In this article, a graduated approach to array quality is considered based on empirical reproducibility of the gene expression measures from replicate arrays. Weights are assigned to each microarray by fitting a heteroscedastic linear model with shared array variance terms. A novel gene-by-gene update algorithm is used to efficiently estimate the array variances. The inverse variances are used as weights in the linear model analysis to identify differentially expressed genes. The method successfully assigns lower weights to less reproducible arrays from different experiments. Down-weighting the observations from suspect arrays increases the power to detect differential expression. In smaller experiments, this approach outperforms the usual method of filtering the data. The method is available in the limma software package which is implemented in the R software environment. CONCLUSION: This method complements existing normalisation and spot quality procedures, and allows poorer quality arrays, which would otherwise be discarded, to be included in an analysis. It is applicable to microarray data from experiments with some level of replication.
  • Item
    Molecular networks involved in mouse cerebral corticogenesis and spatio-temporal regulation of Sox4 and Sox11 novel antisense transcripts revealed by transcriptome profiling
    Ling, K-H ; Hewitt, CA ; Beissbarth, T ; Hyde, L ; Banerjee, K ; Cheah, P-S ; Cannon, PZ ; Hahn, CN ; Thomas, PQ ; Smyth, GK ; Tan, S-S ; Thomas, T ; Scott, HS (BMC, 2009)
    BACKGROUND: Development of the cerebral cortex requires highly specific spatio-temporal regulation of gene expression. It is proposed that transcriptome profiling of the cerebral cortex at various developmental time points or regions will reveal candidate genes and associated molecular pathways involved in cerebral corticogenesis. RESULTS: Serial analysis of gene expression (SAGE) libraries were constructed from C57BL/6 mouse cerebral cortices of age embryonic day (E) 15.5, E17.5, postnatal day (P) 1.5 and 4 to 6 months. Hierarchical clustering analysis of 561 differentially expressed transcripts showed regionalized, stage-specific and co-regulated expression profiles. SAGE expression profiles of 70 differentially expressed transcripts were validated using quantitative RT-PCR assays. Ingenuity pathway analyses of validated differentially expressed transcripts demonstrated that these transcripts possess distinctive functional properties related to various stages of cerebral corticogenesis and human neurological disorders. Genomic clustering analysis of the differentially expressed transcripts identified two highly transcribed genomic loci, Sox4 and Sox11, during embryonic cerebral corticogenesis. These loci feature unusual overlapping sense and antisense transcripts with alternative polyadenylation sites and differential expression. The Sox4 and Sox11 antisense transcripts were highly expressed in the brain compared to other mouse organs and are differentially expressed in both the proliferating and differentiating neural stem/progenitor cells and P19 (embryonal carcinoma) cells. CONCLUSIONS: We report validated gene expression profiles that have implications for understanding the associations between differentially expressed transcripts, novel targets and related disorders pertaining to cerebral corticogenesis. The study reports, for the first time, spatio-temporally regulated Sox4 and Sox11 antisense transcripts in the brain, neural stem/progenitor cells and P19 cells, suggesting they have an important role in cerebral corticogenesis and neuronal/glial cell differentiation.
  • Item
    Integrative analysis of RUNX1 downstream pathways and target genes
    Michaud, J ; Simpson, KM ; Escher, R ; Buchet-Poyau, K ; Beissbarth, T ; Carmichael, C ; Ritchie, ME ; Schuetz, F ; Cannon, P ; Liu, M ; Shen, X ; Ito, Y ; Raskind, WH ; Horwitz, MS ; Osato, M ; Turner, DR ; Speed, TP ; Kavallaris, M ; Smyth, GK ; Scott, HS (BMC, 2008-07-31)
    BACKGROUND: The RUNX1 transcription factor gene is frequently mutated in sporadic myeloid and lymphoid leukemia through translocation, point mutation or amplification. It is also responsible for a familial platelet disorder with predisposition to acute myeloid leukemia (FPD-AML). The disruption of the largely unknown biological pathways controlled by RUNX1 is likely to be responsible for the development of leukemia. We have used multiple microarray platforms and bioinformatic techniques to help identify these biological pathways to aid in the understanding of why RUNX1 mutations lead to leukemia. RESULTS: Here we report genes regulated either directly or indirectly by RUNX1 based on the study of gene expression profiles generated from 3 different human and mouse platforms. The platforms used were global gene expression profiling of: 1) cell lines with RUNX1 mutations from FPD-AML patients, 2) over-expression of RUNX1 and CBFbeta, and 3) Runx1 knockout mouse embryos using either cDNA or Affymetrix microarrays. We observe that our datasets (lists of differentially expressed genes) significantly correlate with published microarray data from sporadic AML patients with mutations in either RUNX1 or its cofactor, CBFbeta. A number of biological processes were identified among the differentially expressed genes and functional assays suggest that heterozygous RUNX1 point mutations in patients with FPD-AML impair cell proliferation, microtubule dynamics and possibly genetic stability. In addition, analysis of the regulatory regions of the differentially expressed genes has for the first time systematically identified numerous potential novel RUNX1 target genes. CONCLUSION: This work is the first large-scale study attempting to identify the genetic networks regulated by RUNX1, a master regulator in the development of the hematopoietic system and leukemia. The biological pathways and target genes controlled by RUNX1 will have considerable importance in disease progression in both familial and sporadic leukemia as well as therapeutic implications.
  • Item
    Illumina WG-6 BeadChip strips should be normalized separately
    Shi, W ; Banerjee, A ; Ritchie, ME ; Gerondakis, S ; Smyth, GK (BMC, 2009-11-11)
    BACKGROUND: Illumina Sentrix-6 Whole-Genome Expression BeadChips are relatively new microarray platforms which have been used in many microarray studies in the past few years. These Chips have a unique design in which each Chip contains six microarrays and each microarray consists of two separate physical strips, posing special challenges for precise between-array normalization of expression values. RESULTS: None of the normalization strategies proposed so far for this microarray platform allow for the possibility of systematic variation between the two strips comprising each array. That this variation can be substantial is illustrated by a data example. We demonstrate that normalizing at the strip-level rather than at the array-level can effectively remove this between-strip variation, improve the precision of gene expression measurements and discover more differentially expressed genes. The gain is substantial, yielding a 20% increase in statistical information and doubling the number of genes detected at a 5% false discovery rate. Functional analysis reveals that the extra genes found tend to have interesting biological meanings, dramatically strengthening the biological conclusions from the experiment. Strip-level normalization still outperforms array-level normalization when non-expressed probes are filtered out. CONCLUSION: Plots are proposed which demonstrate how the need for strip-level normalization relates to inconsistent intensity range variation between the strips. Strip-level normalization is recommended for the preprocessing of Illumina Sentrix-6 BeadChips whenever the intensity range is seen to be inconsistent between the strips. R code is provided to implement the recommended plots and normalization algorithms.
  • Item
    Gene Network Disruptions and Neurogenesis Defects in the Adult Ts1Cje Mouse Model of Down Syndrome
    Hewitt, CA ; Ling, K-H ; Merson, TD ; Simpson, KM ; Ritchie, ME ; King, SL ; Pritchard, MA ; Smyth, GK ; Thomas, T ; Scott, HS ; Voss, AK ; Aziz, SA (PUBLIC LIBRARY SCIENCE, 2010-07-16)
    BACKGROUND: Down syndrome (DS) individuals suffer mental retardation with further cognitive decline and early onset Alzheimer's disease. METHODOLOGY/PRINCIPAL FINDINGS: To understand how trisomy 21 causes these neurological abnormalities we investigated changes in gene expression networks combined with a systematic cell lineage analysis of adult neurogenesis using the Ts1Cje mouse model of DS. We demonstrated downregulation of a number of key genes involved in proliferation and cell cycle progression including Mcm7, Brca2, Prim1, Cenpo and Aurka in trisomic neurospheres. We found that trisomy did not affect the number of adult neural stem cells but resulted in reduced numbers of neural progenitors and neuroblasts. Analysis of differentiating adult Ts1Cje neural progenitors showed a severe reduction in numbers of neurons produced with a tendency for less elaborate neurites, whilst the number of astrocytes was increased. CONCLUSIONS/SIGNIFICANCE: We have shown that trisomy affects a number of elements of adult neurogenesis likely to result in a progressive pathogenesis and consequently providing the potential for the development of therapies to slow progression of, or even ameliorate, the neuronal deficits suffered by DS individuals.
  • Item
    Copy Number Analysis Identifies Novel Interactions Between Genomic Loci in Ovarian Cancer
    Gorringe, KL ; George, J ; Anglesio, MS ; Ramakrishna, M ; Etemadmoghadam, D ; Cowin, P ; Sridhar, A ; Williams, LH ; Boyle, SE ; Yanaihara, N ; Okamoto, A ; Urashima, M ; Smyth, GK ; Campbell, IG ; Bowtell, DDL ; Jordan, IK (PUBLIC LIBRARY SCIENCE, 2010-09-10)
    Ovarian cancer is a heterogeneous disease displaying complex genomic alterations, and consequently, it has been difficult to determine the most relevant copy number alterations with the scale of studies to date. We obtained genome-wide copy number alteration (CNA) data from four different SNP array platforms, with a final data set of 398 ovarian tumours, mostly of the serous histological subtype. Frequent CNA aberrations targeted many thousands of genes. However, high-level amplicons and homozygous deletions enabled filtering of this list to the most relevant. The large data set enabled refinement of minimal regions and identification of rare amplicons such as at 1p34 and 20q11. We performed a novel co-occurrence analysis to assess cooperation and exclusivity of CNAs and analysed their relationship to patient outcome. Positive associations were identified between gains on 19 and 20q, gain of 20q and loss of X, and between several regions of loss, particularly 17q. We found weak correlations of CNA at genomic loci such as 19q12 with clinical outcome. We also assessed genomic instability measures and found a correlation of the number of higher amplitude gains with poorer overall survival. By assembling the largest collection of ovarian copy number data to date, we have been able to identify the most frequent aberrations and their interactions.
  • Item
    Amplicon-Dependent CCNE1 Expression Is Critical for Clonogenic Survival after Cisplatin Treatment and Is Correlated with 20q11 Gain in Ovarian Cancer
    Etemadmoghadam, D ; George, J ; Cowin, PA ; Cullinane, C ; Kansara, M ; Gorringe, KL ; Smyth, GK ; Bowtell, DDL ; Wong, N (PUBLIC LIBRARY SCIENCE, 2010-11-12)
    Genomic amplification of 19q12 occurs in several cancer types including ovarian cancer where it is associated with primary treatment failure. We systematically attenuated expression of genes within the minimally defined 19q12 region in ovarian cell lines using short-interfering RNAs (siRNA) to identify driver oncogene(s) within the amplicon. Knockdown of CCNE1 resulted in G1/S phase arrest, reduced cell viability and apoptosis only in amplification-carrying cells. Although CCNE1 knockdown increased cisplatin resistance in short-term assays, clonogenic survival was inhibited after treatment. Gain of 20q11 was highly correlated with 19q12 amplification and spanned a 2.5 Mb region including TPX2, a centromeric protein required for mitotic spindle function. Expression of TPX2 was highly correlated with gene amplification and with CCNE1 expression in primary tumors. siRNA inhibition of TPX2 reduced cell viability but this effect was not amplicon-dependent. These findings demonstrate that CCNE1 is a key driver in the 19q12 amplicon required for survival and clonogenicity in cells with locus amplification. Co-amplification at 19q12 and 20q11 implies the presence of a cooperative mutational network. These observations have implications for the application of targeted therapies in CCNE1 dependent ovarian cancers.
  • Item
    Estimating the proportion of microarray probes expressed in an RNA sample
    Shi, W ; de Graaf, CA ; Kinkel, SA ; Achtman, AH ; Baldwin, T ; Schofield, L ; Scott, HS ; Hilton, DJ ; Smyth, GK (OXFORD UNIV PRESS, 2010-04)
    A fundamental question in microarray analysis is the estimation of the number of expressed probes in different RNA samples. Negative control probes available in the latest microarray platforms, such as Illumina whole genome expression BeadChips, provide a unique opportunity to estimate the number of expressed probes without setting a threshold. A novel algorithm was proposed in this study to estimate the number of expressed probes in an RNA sample by utilizing these negative controls to measure background noise. The performance of the algorithm was demonstrated by comparing different generations of Illumina BeadChips, comparing the set of probes targeting well-characterized RefSeq NM transcripts with other probes on the array and comparing pure samples with heterogeneous samples. Furthermore, hematopoietic stem cells were found to have a larger transcriptome than progenitor cells. Aire knockout medullary thymic epithelial cells were shown to have significantly fewer expressed probes than matched wild-type cells.