School of Mathematics and Statistics - Research Publications

Search Results

  • Item
    Evolving gene/transcript definitions significantly alter the interpretation of GeneChip data
    Dai, MH ; Wang, PL ; Boyd, AD ; Kostov, G ; Athey, B ; Jones, EG ; Bunney, WE ; Myers, RM ; Speed, TP ; Akil, H ; Watson, SJ ; Meng, F (OXFORD UNIV PRESS, 2005)
    Genome-wide expression profiling is a powerful tool for implicating novel gene ensembles in cellular mechanisms of health and disease. The most popular platform for genome-wide expression profiling is the Affymetrix GeneChip. However, its selection of probes relied on earlier genome and transcriptome annotation which is significantly different from current knowledge. The resultant informatics problems have a profound impact on analysis and interpretation of the data. Here, we address these critical issues and offer a solution. We identified several classes of problems at the individual probe level in the existing annotation, under the assumption that current genome and transcriptome databases are more accurate than those used for GeneChip design. We then reorganized probes on more than a dozen popular GeneChips into gene-, transcript- and exon-specific probe sets in light of up-to-date genome, cDNA/EST clustering and single nucleotide polymorphism information. Comparing analysis results between the original and the redefined probe sets reveals approximately 30-50% discrepancy in the genes previously identified as differentially expressed, regardless of analysis method. Our results demonstrate that the original Affymetrix probe set definitions are inaccurate, and many conclusions derived from past GeneChip analyses may be significantly flawed. It will be beneficial to re-analyze existing GeneChip data with updated probe set definitions.
  • Item
    Lineage-specific expansion of proteins exported to erythrocytes in malaria parasites
    Sargeant, TJ ; Marti, M ; Caler, E ; Carlton, JM ; Simpson, K ; Speed, TP ; Cowman, AF (BMC, 2006)
    BACKGROUND: The apicomplexan parasite Plasmodium falciparum causes the most severe form of malaria in humans. After invasion into erythrocytes, asexual parasite stages drastically alter their host cell and export remodeling and virulence proteins. Previously, we have reported identification and functional analysis of a short motif necessary for export of proteins out of the parasite and into the red blood cell. RESULTS: We have developed software for the prediction of exported proteins in the genus Plasmodium, and identified exported proteins conserved between malaria parasites infecting rodents and the two major causes of human malaria, P. falciparum and P. vivax. This conserved 'exportome' is confined to a few subtelomeric chromosomal regions in P. falciparum and the synteny of these and surrounding regions is conserved in P. vivax. We have identified a novel gene family PHIST (for Plasmodium helical interspersed subtelomeric family) that shares a unique domain with 72 paralogs in P. falciparum and 39 in P. vivax; however, there is only one member in each of the three species studied from the P. berghei lineage. CONCLUSION: These data suggest radiation of genes encoding remodeling and virulence factors from a small number of loci in a common Plasmodium ancestor, and imply a closer phylogenetic relationship between the P. vivax and P. falciparum lineages than previously believed. The presence of a conserved 'exportome' in the genus Plasmodium has important implications for our understanding of both common mechanisms and species-specific differences in host-parasite interactions, and may be crucial in developing novel antimalarial drugs for this infectious disease.
  • Item
    Integrative analysis of RUNX1 downstream pathways and target genes
    Michaud, J ; Simpson, KM ; Escher, R ; Buchet-Poyau, K ; Beissbarth, T ; Carmichael, C ; Ritchie, ME ; Schuetz, F ; Cannon, P ; Liu, M ; Shen, X ; Ito, Y ; Raskind, WH ; Horwitz, MS ; Osato, M ; Turner, DR ; Speed, TP ; Kavallaris, M ; Smyth, GK ; Scott, HS (BMC, 2008-07-31)
    BACKGROUND: The RUNX1 transcription factor gene is frequently mutated in sporadic myeloid and lymphoid leukemia through translocation, point mutation or amplification. It is also responsible for a familial platelet disorder with predisposition to acute myeloid leukemia (FPD-AML). The disruption of the largely unknown biological pathways controlled by RUNX1 is likely to be responsible for the development of leukemia. We have used multiple microarray platforms and bioinformatic techniques to help identify these biological pathways to aid in the understanding of why RUNX1 mutations lead to leukemia. RESULTS: Here we report genes regulated either directly or indirectly by RUNX1 based on the study of gene expression profiles generated from 3 different human and mouse platforms. The platforms used were global gene expression profiling of: 1) cell lines with RUNX1 mutations from FPD-AML patients, 2) over-expression of RUNX1 and CBFbeta, and 3) Runx1 knockout mouse embryos using either cDNA or Affymetrix microarrays. We observe that our datasets (lists of differentially expressed genes) significantly correlate with published microarray data from sporadic AML patients with mutations in either RUNX1 or its cofactor, CBFbeta. A number of biological processes were identified among the differentially expressed genes and functional assays suggest that heterozygous RUNX1 point mutations in patients with FPD-AML impair cell proliferation, microtubule dynamics and possibly genetic stability. In addition, analysis of the regulatory regions of the differentially expressed genes has for the first time systematically identified numerous potential novel RUNX1 target genes. CONCLUSION: This work is the first large-scale study attempting to identify the genetic networks regulated by RUNX1, a master regulator in the development of the hematopoietic system and leukemia. 
    The biological pathways and target genes controlled by RUNX1 will have considerable importance in disease progression in both familial and sporadic leukemia as well as therapeutic implications.
  • Item
    Differential splicing using whole-transcript microarrays
    Robinson, MD ; Speed, TP (BMC, 2009-05-22)
    BACKGROUND: The latest generation of Affymetrix microarrays is designed to interrogate expression over the entire length of every locus, thus giving the opportunity to study alternative splicing genome-wide. The Exon 1.0 ST (sense target) platform, with versions for Human, Mouse and Rat, is designed primarily to probe every known or predicted exon. The smaller Gene 1.0 ST array is designed as an expression microarray but still interrogates expression with probes along the full length of each well-characterized transcript. We explore the possibility of using the Gene 1.0 ST platform to identify differential splicing events. RESULTS: We propose a strategy to score differential splicing by using the auxiliary information from fitting the statistical model, RMA (robust multichip analysis). RMA partitions the probe-level data into probe effects and expression levels, operating robustly so that if a small number of probes behave differently from the rest, they are downweighted in the fitting step. We argue that adjacent poorly fitting probes for a given sample can be evidence of differential splicing and have designed a statistic to search for this behaviour. Using a public tissue panel dataset, we show many examples of tissue-specific alternative splicing. Furthermore, we show that evidence for putative alternative splicing has a strong correspondence between the Gene 1.0 ST and Exon 1.0 ST platforms. CONCLUSION: We propose a new approach, FIRMAGene, to search for differentially spliced genes using the Gene 1.0 ST platform. Such an analysis complements the search for differential expression. We validate the method by illustrating several known examples and we note some of the challenges in interpreting the probe-level data. Software implementing our methods is freely available as an R package.
  • Item
    Evolution and comparative analysis of the MHC Class III inflammatory region
    Deakin, JE ; Papenfuss, AT ; Belov, K ; Cross, JGR ; Coggill, P ; Palmer, S ; Sims, S ; Speed, TP ; Beck, S ; Graves, JAM (BMC, 2006-11-02)
    BACKGROUND: The Major Histocompatibility Complex (MHC) is essential for immune function. Historically, it has been subdivided into three regions (Class I, II, and III), but a cluster of functionally related genes within the Class III region has also been referred to as the Class IV region or "inflammatory region". This group of genes is involved in the inflammatory response, and includes members of the tumour necrosis family. Here we report the sequencing, annotation and comparative analysis of a tammar wallaby BAC containing the inflammatory region. We also discuss the extent of sequence conservation across the entire region and identify elements conserved in evolution. RESULTS: Fourteen Class III genes from the tammar wallaby inflammatory region were characterised and compared to their orthologues in other vertebrates. The organisation and sequence of genes in the inflammatory region of both the wallaby and South American opossum are highly conserved compared to known genes from eutherian ("placental") mammals. Some minor differences separate the two marsupial species. Eight genes within the inflammatory region have remained tightly clustered for at least 360 million years, predating the divergence of the amphibian lineage. Analysis of sequence conservation identified 354 elements that are conserved. These range in size from 7 to 431 bases and cover 15.6% of the inflammatory region, representing approximately a 4-fold increase compared to the average for vertebrate genomes. About 5.5% of this conserved sequence is marsupial-specific, including three cases of marsupial-specific repeats. Highly Conserved Elements were also characterised. CONCLUSION: Using comparative analysis, we show that a cluster of MHC genes involved in inflammation, including TNF, LTA (or its putative teleost homolog TNF-N), APOM, and BAT3 have remained together for over 450 million years, predating the divergence of mammals from fish. 
    The observed enrichment in conserved sequences within the inflammatory region suggests conservation at the transcriptional regulatory level, in addition to the functional level.
  • Item
    Proximal genomic localization of STAT1 binding and regulated transcriptional activity
    Wormald, S ; Hilton, DJ ; Smyth, GK ; Speed, TP (BMC, 2006-10-11)
    BACKGROUND: Signal transducer and activator of transcription (STAT) proteins are key regulators of gene expression in response to the interferon (IFN) family of anti-viral and anti-microbial cytokines. We have examined the genomic relationship between STAT1 binding and regulated transcription using multiple tiling microarray and chromatin immunoprecipitation microarray (ChIP-chip) experiments from public repositories. RESULTS: In response to IFN-gamma, STAT1 bound proximally to regions of the genome that exhibit regulated transcriptional activity. This finding was consistent between different tiling microarray platforms, and between different measures of transcriptional activity, including differential binding of RNA polymerase II, and differential mRNA transcription. Re-analysis of tiling microarray data from a recent study of IFN-gamma-induced STAT1 ChIP-chip and mRNA expression revealed that STAT1 binding is tightly associated with localized mRNA transcription in response to IFN-gamma. Close relationships were also apparent between STAT1 binding, STAT2 binding, and mRNA transcription in response to IFN-alpha. Furthermore, we found that sites of STAT1 binding within the Encyclopedia of DNA Elements (ENCODE) region are precisely correlated with sites of either enhanced or diminished binding by the RNA polymerase II complex. CONCLUSION: Together, our results indicate that STAT1 binds proximally to regions of the genome that exhibit regulated transcriptional activity. This finding establishes a generalized basis for the positioning of STAT1 binding sites within the genome, and supports a role for STAT1 in the direct recruitment of the RNA polymerase II complex to the promoters of IFN-gamma-responsive genes.
  • Item
    Rooting a phylogenetic tree with nonreversible substitution models
    Yap, VB ; Speed, T (BMC, 2005-01-04)
    BACKGROUND: We compared two methods of rooting a phylogenetic tree: the stationary and the nonstationary substitution processes. These methods do not require an outgroup. METHODS: Given a multiple alignment and an unrooted tree, the maximum likelihood estimates of branch lengths and substitution parameters for each associated rooted tree are found; rooted trees are compared using their likelihood values. Site variation in substitution rates is handled by assigning sites into several classes before the analysis. RESULTS: In three test datasets where the trees are small and the roots are assumed known, the nonstationary process gets the correct estimate significantly more often, and fits data much better, than the stationary process. Both processes give biologically plausible root placements in a set of nine primate mitochondrial DNA sequences. CONCLUSIONS: The nonstationary process is simple to use and is much better than the stationary process at inferring the root. It could be useful for situations where an outgroup is unavailable.
  • Item
    Drug and Cell Type-Specific Regulation of Genes with Different Classes of Estrogen Receptor β-Selective Agonists
    Paruthiyil, S ; Cvoro, A ; Zhao, X ; Wu, Z ; Sui, Y ; Staub, RE ; Baggett, S ; Herber, CB ; Griffin, C ; Tagliaferri, M ; Harris, HA ; Cohen, I ; Bjeldanes, LF ; Speed, TP ; Schaufele, F ; Leitman, DC ; Laudet, V (PUBLIC LIBRARY SCIENCE, 2009-07-17)
    Estrogens produce biological effects by interacting with two estrogen receptors, ERalpha and ERbeta. Drugs that selectively target ERalpha or ERbeta might be safer for conditions that have been traditionally treated with non-selective estrogens. Several synthetic and natural ERbeta-selective compounds have been identified. One class of ERbeta-selective agonists is represented by ERB-041 (WAY-202041), which binds to ERbeta much more strongly than to ERalpha. A second class of ERbeta-selective agonists, derived from plants, includes MF101, nyasol and liquiritigenin, which bind similarly to both ERs but only activate transcription with ERbeta. Diarylpropionitrile represents a third class of ERbeta-selective compounds because its selectivity is due to a combination of greater binding to ERbeta and transcriptional activity. However, it is unclear if these three classes of ERbeta-selective compounds produce similar biological activities. The goals of these studies were to determine the relative ERbeta selectivity and pattern of gene expression of these three classes of ERbeta-selective compounds compared to estradiol (E(2)), which is a non-selective ER agonist. U2OS cells stably transfected with ERalpha or ERbeta were treated with E(2) or the ERbeta-selective compounds for 6 h. Microarray data demonstrated that ERB-041, MF101 and liquiritigenin were the most ERbeta-selective agonists compared to estradiol, followed by nyasol and then diarylpropionitrile. FRET analysis showed that all compounds induced a similar conformation of ERbeta, which is consistent with the finding that most genes regulated by the ERbeta-selective compounds were similar to each other and E(2). However, there were some classes of genes differentially regulated by the ERbeta agonists and E(2). Two ERbeta-selective compounds, MF101 and liquiritigenin, had cell type-specific effects as they regulated different genes in HeLa, Caco-2 and Ishikawa cell lines expressing ERbeta. Our gene profiling studies demonstrate that while most of the genes were commonly regulated by ERbeta-selective agonists and E(2), there were some genes regulated that were distinct from each other and E(2), suggesting that different ERbeta-selective agonists might produce distinct biological and clinical effects.
  • Item
    Analysis of gene expression during neurite outgrowth and regeneration
    Szpara, ML ; Vranizan, K ; Tai, YC ; Goodman, CS ; Speed, TP ; Ngai, J (BMC, 2007-11-23)
    BACKGROUND: The ability of a neuron to regenerate functional connections after injury is influenced both by its intrinsic state and by extrinsic cues in its surroundings. Investigations of the transcriptional changes undergone by neurons during in vivo models of injury and regeneration have revealed many transcripts associated with these processes. Because of the complex milieu of interactions in vivo, these results include not only expression changes directly related to regenerative outgrowth but also unrelated responses to surrounding cells and signals. In vitro models of neurite outgrowth provide a means to study the intrinsic transcriptional patterns of neurite outgrowth in the absence of extensive extrinsic cues from nearby cells and tissues. RESULTS: We have undertaken a genome-wide study of transcriptional activity in embryonic superior cervical ganglia (SCG) and dorsal root ganglia (DRG) during a time course of neurite outgrowth in vitro. Gene expression observed in these models likely includes both developmental gene expression patterns and regenerative responses to axotomy, which occurs as the result of tissue dissection. Comparison across both models revealed many genes with similar gene expression patterns during neurite outgrowth. These patterns were minimally affected by exposure to the potent inhibitory cue Semaphorin3A, indicating that this extrinsic cue does not exert major effects at the level of nuclear transcription. We also compared our data to several published studies of DRG and SCG gene expression in animal models of regeneration, and found the expression of a large number of genes in common between neurite outgrowth in vitro and regeneration in vivo. CONCLUSION: Many gene expression changes undergone by SCG and DRG during in vitro outgrowth are shared between these two tissue types and in common with in vivo regeneration models. This suggests that the genes identified in this in vitro study may represent new candidates worthy of further study for potential roles in the therapeutic regrowth of neuronal connections.
  • Item
    FIRMA: a method for detection of alternative splicing from exon array data
    Purdom, E ; Simpson, KM ; Robinson, MD ; Conboy, JG ; Lapuk, AV ; Speed, TP (OXFORD UNIV PRESS, 2008-08-01)
    MOTIVATION: Analyses of EST data show that alternative splicing is much more widespread than once thought. The advent of exon and tiling microarrays means that researchers now have the capacity to experimentally measure alternative splicing on a genome-wide level. New methods are needed to analyze the data from these arrays. RESULTS: We present a method, finding isoforms using robust multichip analysis (FIRMA), for detecting differential alternative splicing in exon array data. FIRMA has been developed for Affymetrix exon arrays, but could in principle be extended to other exon arrays, tiling arrays or splice junction arrays. We have evaluated the method using simulated data, and have also applied it to two datasets: a panel of 11 human tissues and a set of 10 pairs of matched normal and tumor colon tissue. FIRMA is able to detect exons in several genes confirmed by reverse transcriptase PCR. AVAILABILITY: R code implementing our methods is contributed to the package aroma.affymetrix.