School of Mathematics and Statistics - Research Publications

  • Item
    Differential splicing using whole-transcript microarrays
    Robinson, MD ; Speed, TP (BMC, 2009-05-22)
    BACKGROUND: The latest generation of Affymetrix microarrays is designed to interrogate expression over the entire length of every locus, thus giving the opportunity to study alternative splicing genome-wide. The Exon 1.0 ST (sense target) platform, with versions for Human, Mouse and Rat, is designed primarily to probe every known or predicted exon. The smaller Gene 1.0 ST array is designed as an expression microarray but still interrogates expression with probes along the full length of each well-characterized transcript. We explore the possibility of using the Gene 1.0 ST platform to identify differential splicing events. RESULTS: We propose a strategy to score differential splicing by using the auxiliary information from fitting the statistical model, RMA (robust multichip analysis). RMA partitions the probe-level data into probe effects and expression levels, operating robustly so that if a small number of probes behave differently from the rest, they are downweighted in the fitting step. We argue that adjacent poorly fitting probes for a given sample can be evidence of differential splicing and have designed a statistic to search for this behaviour. Using a public tissue panel dataset, we show many examples of tissue-specific alternative splicing. Furthermore, we show that evidence for putative alternative splicing has a strong correspondence between the Gene 1.0 ST and Exon 1.0 ST platforms. CONCLUSION: We propose a new approach, FIRMAGene, to search for differentially spliced genes using the Gene 1.0 ST platform. Such an analysis complements the search for differential expression. We validate the method by illustrating several known examples and we note some of the challenges in interpreting the probe-level data. Software implementing our methods is freely available as an R package.
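The partition of probe-level data into probe effects and expression levels described in the abstract above can be illustrated with a small sketch. It uses median polish, a common robust way to fit an additive two-way model, on a toy probes-by-arrays matrix of log intensities. This is an assumption-laden illustration only: the function name `median_polish`, the toy data and the iteration settings are hypothetical, and the block implements neither the FIRMAGene statistic nor the authors' R package.

```python
from statistics import median


def median_polish(y, n_iter=10, tol=1e-8):
    """Fit y[i][j] ~ overall + row_eff[i] + col_eff[j] + resid[i][j] by median polish.

    Rows play the role of probes and columns the role of arrays (samples).
    Hypothetical helper for illustration; not taken from any published package.
    """
    n_rows, n_cols = len(y), len(y[0])
    resid = [list(row) for row in y]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(n_iter):
        change = 0.0
        # Sweep row medians out of the residuals into the row (probe) effects.
        for i in range(n_rows):
            m = median(resid[i])
            row_eff[i] += m
            resid[i] = [v - m for v in resid[i]]
            change += abs(m)
        # Re-centre the column effects, moving their median into the overall term.
        m = median(col_eff)
        overall += m
        col_eff = [c - m for c in col_eff]
        # Sweep column medians out of the residuals into the column (array) effects.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += m
            for i in range(n_rows):
                resid[i][j] -= m
            change += abs(m)
        # Re-centre the row effects in the same way.
        m = median(row_eff)
        overall += m
        row_eff = [r - m for r in row_eff]
        if change < tol:
            break
    return overall, row_eff, col_eff, resid


# Toy data: 4 probes x 5 arrays, exactly additive except for one aberrant probe/array cell.
probe = [0.0, 0.5, -0.5, 1.0]
level = [8.0, 9.0, 10.0, 11.0, 12.0]
y = [[p + e for e in level] for p in probe]
y[2][3] -= 3.0  # one probe behaves differently in one array

overall, row_eff, col_eff, resid = median_polish(y)
# Locate the cell with the largest absolute residual.
worst = max(
    ((i, j) for i in range(len(y)) for j in range(len(y[0]))),
    key=lambda ij: abs(resid[ij[0]][ij[1]]),
)
print("largest residual at probe/array:", worst)
```

Because medians rather than means are swept out, a single aberrant cell has little influence on the fitted effects and shows up in the residuals instead, which is the kind of auxiliary information the abstract refers to.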
  • Item
    Drug and Cell Type-Specific Regulation of Genes with Different Classes of Estrogen Receptor β-Selective Agonists
    Paruthiyil, S ; Cvoro, A ; Zhao, X ; Wu, Z ; Sui, Y ; Staub, RE ; Baggett, S ; Herber, CB ; Griffin, C ; Tagliaferri, M ; Harris, HA ; Cohen, I ; Bjeldanes, LF ; Speed, TP ; Schaufele, F ; Leitman, DC ; Laudet, V (PUBLIC LIBRARY SCIENCE, 2009-07-17)
    Estrogens produce biological effects by interacting with two estrogen receptors, ERalpha and ERbeta. Drugs that selectively target ERalpha or ERbeta might be safer for conditions that have been traditionally treated with non-selective estrogens. Several synthetic and natural ERbeta-selective compounds have been identified. One class of ERbeta-selective agonists is represented by ERB-041 (WAY-202041), which binds to ERbeta much more strongly than to ERalpha. A second class of ERbeta-selective agonists, derived from plants, includes MF101, nyasol and liquiritigenin, which bind similarly to both ERs but only activate transcription with ERbeta. Diarylpropionitrile represents a third class of ERbeta-selective compounds because its selectivity is due to a combination of greater binding to ERbeta and transcriptional activity. However, it is unclear if these three classes of ERbeta-selective compounds produce similar biological activities. The goals of these studies were to determine the relative ERbeta selectivity and pattern of gene expression of these three classes of ERbeta-selective compounds compared to estradiol (E(2)), which is a non-selective ER agonist. U2OS cells stably transfected with ERalpha or ERbeta were treated with E(2) or the ERbeta-selective compounds for 6 h. Microarray data demonstrated that ERB-041, MF101 and liquiritigenin were the most ERbeta-selective agonists compared to estradiol, followed by nyasol and then diarylpropionitrile. FRET analysis showed that all compounds induced a similar conformation of ERbeta, which is consistent with the finding that most genes regulated by the ERbeta-selective compounds were similar to each other and E(2). However, there were some classes of genes differentially regulated by the ERbeta agonists and E(2). Two ERbeta-selective compounds, MF101 and liquiritigenin, had cell type-specific effects as they regulated different genes in HeLa, Caco-2 and Ishikawa cell lines expressing ERbeta.
    Our gene profiling studies demonstrate that while most of the genes were commonly regulated by ERbeta-selective agonists and E(2), there were some genes regulated that were distinct from each other and E(2), suggesting that different ERbeta-selective agonists might produce distinct biological and clinical effects.
  • Item
    A single-array preprocessing method for estimating full-resolution raw copy numbers from all Affymetrix genotyping arrays including GenomeWideSNP 5 & 6
    Bengtsson, H ; Wirapati, P ; Speed, TP (OXFORD UNIV PRESS, 2009-09-01)
    MOTIVATION: High-resolution copy-number (CN) analysis has in recent years gained much attention, not only for the purpose of identifying CN aberrations associated with a certain phenotype, but also for identifying CN polymorphisms. In order for such studies to be successful and cost effective, the statistical methods have to be optimized. We propose a single-array preprocessing method, CRMA v2, for estimating full-resolution total CNs. It is applicable to all Affymetrix genotyping arrays, including the recent ones that also contain non-polymorphic probes. A reference signal is only needed at the last step when calculating relative CNs. RESULTS: As with our method for earlier generations of arrays, this one controls for allelic crosstalk, probe affinities and PCR fragment-length effects. It also corrects for probe sequence effects and for the co-hybridization of fragments digested by multiple enzymes that takes place on the latest chips. We compare our method with Affymetrix's CN5 method and the dChip method by assessing how well they differentiate between various CN states at the full resolution and at various amounts of smoothing. Although CRMA v2 is a single-array method, we observe that it performs as well as or better than alternative methods that use data from all arrays for their preprocessing. This shows that it is possible to do online analysis in large-scale projects where additional arrays are introduced over time.
  • Item
    A single-sample method for normalizing and combining full-resolution copy numbers from multiple platforms, labs and analysis methods
    Bengtsson, H ; Ray, A ; Spellman, P ; Speed, TP (OXFORD UNIV PRESS, 2009-04-01)
    MOTIVATION: The rapid expansion of whole-genome copy number (CN) studies brings a demand for increased precision and resolution of CN estimates. Recent studies have obtained CN estimates from more than one platform for the same set of samples, and it is natural to want to combine the different estimates in order to meet this demand. Estimates from different platforms show different degrees of attenuation of the true CN changes. Similar differences can be observed in CNs from the same platform run in different labs, or in the same lab with different analytical methods. This is why it is not straightforward to combine CN estimates from different sources (platforms, labs and analysis methods). RESULTS: We propose a single-sample multi-source normalization that brings full-resolution CN estimates to the same scale across sources. The normalized CNs are such that for any underlying CN level, their mean level is the same regardless of the source, which makes them better suited for being combined across sources; for example, existing segmentation methods may then be used to identify aberrant regions. We use microarray-based CN estimates from 'The Cancer Genome Atlas' (TCGA) project to illustrate and validate the method. We show that the normalized and combined data better separate two CN states at a given resolution. We conclude that it is possible to combine CNs from multiple sources such that the resolution becomes effectively larger, and when multiple platforms are combined, they also enhance the genome coverage by complementing each other in different regions. AVAILABILITY: A bounded-memory implementation is available in aroma.cn.
  • Item
    Analysis of the platypus genome suggests a transposon origin for mammalian imprinting
    Pask, AJ ; Papenfuss, AT ; Ager, EI ; McColl, KA ; Speed, TP ; Renfree, MB (BIOMED CENTRAL LTD, 2009)
    BACKGROUND: Genomic imprinting is an epigenetic phenomenon that results in monoallelic gene expression. Many hypotheses have been advanced to explain why genomic imprinting evolved in mammals, but few have examined how it arose. The host defence hypothesis suggests that imprinting evolved from existing mechanisms within the cell that act to silence foreign DNA elements that insert into the genome. However, the changes to the mammalian genome that accompanied the evolution of imprinting have been hard to define due to the absence of large-scale genomic resources across all extant classes. The recent release of the platypus genome has provided the first opportunity to perform comparisons between prototherian (monotreme; which appear to lack imprinting) and therian (marsupial and eutherian; which have imprinting) mammals. RESULTS: We compared the distribution of repeat elements known to attract epigenetic silencing across the entire genome from monotremes and therian mammals, particularly focusing on the orthologous imprinted regions. There is a significant accumulation of certain repeat elements within imprinted regions of therian mammals compared to the platypus. CONCLUSIONS: Our analyses show that the platypus has significantly fewer repeats of certain classes in the regions of the genome that have become imprinted in therian mammals. The accumulation of repeats, especially long terminal repeats and DNA elements, in therian imprinted genes and gene clusters is coincident with, and may have been a driving force in, the development of mammalian genomic imprinting. These data provide strong support for the host defence hypothesis.