School of Mathematics and Statistics - Research Publications

  • Item
    International network of cancer genome projects
    Hudson, TJ ; Anderson, W ; Aretz, A ; Barker, AD ; Bell, C ; Bernabe, RR ; Bhan, MK ; Calvo, F ; Eerola, I ; Gerhard, DS ; Guttmacher, A ; Guyer, M ; Hemsley, FM ; Jennings, JL ; Kerr, D ; Klatt, P ; Kolar, P ; Kusuda, J ; Lane, DP ; Laplace, F ; Lu, Y ; Nettekoven, G ; Ozenberger, B ; Peterson, J ; Rao, TS ; Remacle, J ; Schafer, AJ ; Shibata, T ; Stratton, MR ; Vockley, JG ; Watanabe, K ; Yang, H ; Yuen, MMF ; Knoppers, M ; Bobrow, M ; Cambon-Thomsen, A ; Dressler, LG ; Dyke, SOM ; Joly, Y ; Kato, K ; Kennedy, KL ; Nicolas, P ; Parker, MJ ; Rial-Sebbag, E ; Romeo-Casabona, CM ; Shaw, KM ; Wallace, S ; Wiesner, GL ; Zeps, N ; Lichter, P ; Biankin, AV ; Chabannon, C ; Chin, L ; Clement, B ; de Alava, E ; Degos, F ; Ferguson, ML ; Geary, P ; Hayes, DN ; Johns, AL ; Nakagawa, H ; Penny, R ; Piris, MA ; Sarin, R ; Scarpa, A ; van de Vijver, M ; Futreal, PA ; Aburatani, H ; Bayes, M ; Bowtell, DDL ; Campbell, PJ ; Estivill, X ; Grimmond, SM ; Gut, I ; Hirst, M ; Lopez-Otin, C ; Majumder, P ; Marra, M ; Ning, Z ; Puente, XS ; Ruan, Y ; Stunnenberg, HG ; Swerdlow, H ; Velculescu, VE ; Wilson, RK ; Xue, HH ; Yang, L ; Spellman, PT ; Bader, GD ; Boutros, PC ; Flicek, P ; Getz, G ; Guigo, R ; Guo, G ; Haussler, D ; Heath, S ; Hubbard, TJ ; Jiang, T ; Jones, SM ; Li, Q ; Lopez-Bigas, N ; Luo, R ; Pearson, JV ; Quesada, V ; Raphael, BJ ; Sander, C ; Speed, TP ; Stuart, JM ; Teague, JW ; Totoki, Y ; Tsunoda, T ; Valencia, A ; Wheeler, DA ; Wu, H ; Zhao, S ; Zhou, G ; Stein, LD ; Lathrop, M ; Ouellette, BFF ; Thomas, G ; Yoshida, T ; Axton, M ; Gunter, C ; McPherson, JD ; Miller, LJ ; Kasprzyk, A ; Zhang, J ; Haider, SA ; Wang, J ; Yung, CK ; Cross, A ; Liang, Y ; Gnaneshan, S ; Guberman, J ; Hsu, J ; Chalmers, DRC ; Hasel, KW ; Kaan, TSH ; Knoppers, BM ; Lowrance, WW ; Masui, T ; Rodriguez, LL ; Vergely, C ; Cloonan, N ; Defazio, A ; Eshleman, JR ; Etemadmoghadam, D ; Gardiner, BA ; Kench, JG ; Sutherland, RL ; Tempero, MA ; Waddell, NJ ; Wilson, PJ ; Gallinger, S ; Tsao, 
M-S ; Shaw, PA ; Petersen, GM ; Mukhopadhyay, D ; DePinho, RA ; Thayer, S ; Muthuswamy, L ; Shazand, K ; Beck, T ; Sam, M ; Timms, L ; Ballin, V ; Ji, J ; Zhang, X ; Chen, F ; Hu, X ; Yang, Q ; Tian, G ; Zhang, L ; Xing, X ; Li, X ; Zhu, Z ; Yu, Y ; Yu, J ; Tost, J ; Brennan, P ; Holcatova, I ; Zaridze, D ; Brazma, A ; Egevad, L ; Prokhortchouk, E ; Banks, RE ; Uhlen, M ; Viksna, J ; Ponten, F ; Skryabin, K ; Birney, E ; Borg, A ; Borresen-Dale, A-L ; Caldas, C ; Foekens, JA ; Martin, S ; Reis-Filho, JS ; Richardson, AL ; Sotiriou, C ; van't Veer, L ; Birnbaum, D ; Blanche, H ; Boucher, P ; Boyault, S ; Masson-Jacquemier, JD ; Pauporte, I ; Pivot, X ; Vincent-Salomon, A ; Tabone, E ; Theillet, C ; Treilleux, I ; Bioulac-Sage, P ; Decaens, T ; Franco, D ; Gut, M ; Samuel, D ; Zucman-Rossi, J ; Eils, R ; Brors, B ; Korbel, JO ; Korshunov, A ; Landgraf, P ; Lehrach, H ; Pfister, S ; Radlwimmer, B ; Reifenberger, G ; Taylor, MD ; von Kalle, C ; Majumder, PP ; Pederzoli, P ; Lawlor, RT ; Delledonne, M ; Bardelli, A ; Gress, T ; Klimstra, D ; Zamboni, G ; Nakamura, Y ; Miyano, S ; Fujimoto, A ; Campo, E ; de Sanjose, S ; Montserrat, E ; Gonzalez-Diaz, M ; Jares, P ; Himmelbaue, H ; Bea, S ; Aparicio, S ; Easton, DF ; Collins, FS ; Compton, CC ; Lander, ES ; Burke, W ; Green, AR ; Hamilton, SR ; Kallioniemi, OP ; Ley, TJ ; Liu, ET ; Wainwright, BJ (NATURE PORTFOLIO, 2010-04-15)
    The International Cancer Genome Consortium (ICGC) was launched to coordinate large-scale cancer genome studies in tumours from 50 different cancer types and/or subtypes that are of clinical and societal importance across the globe. Systematic studies of more than 25,000 cancer genomes at the genomic, epigenomic and transcriptomic levels will reveal the repertoire of oncogenic mutations, uncover traces of the mutagenic influences, define clinically relevant subtypes for prognosis and therapeutic management, and enable the development of new cancer therapies.
  • Item
    Copy Number Variation in Patients with Disorders of Sex Development Due to 46,XY Gonadal Dysgenesis
    White, S ; Ohnesorg, T ; Notini, A ; Roeszler, K ; Hewitt, J ; Daggag, H ; Smith, C ; Turbitt, E ; Gustin, S ; van den Bergen, J ; Miles, D ; Western, P ; Arboleda, V ; Schumacher, V ; Gordon, L ; Bell, K ; Bengtsson, H ; Speed, T ; Hutson, J ; Warne, G ; Harley, V ; Koopman, P ; Vilain, E ; Sinclair, A ; Orban, L (PUBLIC LIBRARY SCIENCE, 2011-03-07)
    Disorders of sex development (DSD), ranging in severity from mild genital abnormalities to complete sex reversal, represent a major concern for patients and their families. DSD are often due to disruption of the genetic programs that regulate gonad development. Although some genes have been identified in these developmental pathways, the causative mutations have not been identified in more than 50% of 46,XY DSD cases. We used the Affymetrix Genome-Wide Human SNP Array 6.0 to analyse copy number variation in 23 individuals with unexplained 46,XY DSD due to gonadal dysgenesis (GD). Here we describe three discrete changes in copy number that are the likely cause of the GD. Firstly, we identified a large duplication on the X chromosome that included DAX1 (NR0B1). Secondly, we identified a rearrangement that appears to affect a novel gonad-specific regulatory region in a known testis gene, SOX9. Surprisingly, this patient lacked any signs of campomelic dysplasia, suggesting that the deletion affected expression of SOX9 only in the gonad. Functional analysis of potential SRY binding sites within this deleted region identified five putative enhancers, suggesting that sequences additional to the known SRY-binding TES enhancer influence human testis-specific SOX9 expression. Thirdly, we identified a small deletion immediately downstream of GATA4, supporting a role for GATA4 in gonad development in humans. These CNV analyses give new insights into the pathways involved in human gonad development and dysfunction, and suggest that rearrangements of non-coding sequences disturbing gene regulation may account for a significant proportion of DSD cases.
  • Item
    Estrogenic Plant Extracts Reverse Weight Gain and Fat Accumulation without Causing Mammary Gland or Uterine Proliferation
    Saunier, EF ; Vivar, OI ; Rubenstein, A ; Zhao, X ; Olshansky, M ; Baggett, S ; Staub, RE ; Tagliaferri, M ; Cohen, I ; Speed, TP ; Baxter, JD ; Leitman, DC ; Laudet, V (PUBLIC LIBRARY SCIENCE, 2011-12-07)
    Long-term estrogen deficiency increases the risk of obesity, diabetes and metabolic syndrome in postmenopausal women. Menopausal hormone therapy containing estrogens might prevent these conditions, but its prolonged use increases the risk of breast cancer, as well as endometrial cancer if used without progestins. Animal studies indicate that beneficial effects of estrogens in adipose tissue and adverse effects on mammary gland and uterus are mediated by estrogen receptor alpha (ERα). One strategy to improve the safety of estrogens to prevent/treat obesity, diabetes and metabolic syndrome is to develop estrogens that act as agonists in adipose tissue, but not in mammary gland and uterus. We considered plant extracts, which have been the source of many pharmaceuticals, as a source of tissue selective estrogens. Extracts from two plants, Glycyrrhiza uralensis (RG) and Pueraria montana var. lobata (RP) bound to ERα, activated ERα responsive reporters, and reversed weight gain and fat accumulation comparable to estradiol in ovariectomized obese mice maintained on a high fat diet. Unlike estradiol, RG and RP did not induce proliferative effects on mammary gland and uterus. Gene expression profiling demonstrated that RG and RP induced estradiol-like regulation of genes in abdominal fat, but not in mammary gland and uterus. The compounds in extracts from RG and RP might constitute a new class of tissue selective estrogens to reverse weight gain, fat accumulation and metabolic syndrome in postmenopausal women.
  • Item
    Identification of rare DNA variants in mitochondrial disorders with improved array-based sequencing
    Wang, W ; Shen, P ; Thiyagarajan, S ; Lin, S ; Palm, C ; Horvath, R ; Klopstock, T ; Cutler, D ; Pique, L ; Schrijver, I ; Davis, RW ; Mindrinos, M ; Speed, TP ; Scharfe, C (OXFORD UNIV PRESS, 2011-01)
    A common goal in the discovery of rare functional DNA variants via medical resequencing is to incur a relatively lower proportion of false positive base-calls. We developed a novel statistical method for resequencing arrays (SRMA, sequence robust multi-array analysis) to increase the accuracy of detecting rare variants and reduce the costs in subsequent sequence verifications required in medical applications. SRMA includes single and multi-array analysis and accounts for technical variables as well as the possibility of both low- and high-frequency genomic variation. The confidence of each base-call was ranked using two quality measures. In comparison to Sanger capillary sequencing, we achieved a false discovery rate of 2% (false positive rate 1.2 × 10⁻⁵, false negative rate 5%), which is similar to automated second-generation sequencing technologies. Applied to the analysis of 39 nuclear candidate genes in disorders of mitochondrial DNA (mtDNA) maintenance, we confirmed mutations in the DNA polymerase gamma POLG in positive control cases, and identified novel rare variants in previously undiagnosed cases in the mitochondrial topoisomerase TOP1MT, the mismatch repair enzyme MUTYH, and the apurinic-apyrimidinic endonuclease APEX2. Some patients carried rare heterozygous variants in several functionally interacting genes, which could indicate synergistic genetic effects in these clinically similar disorders.
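    The three error rates quoted in this abstract are easy to confuse, so a minimal sketch of how they relate to confusion-matrix counts may help. The function name and all counts below are hypothetical and chosen only for illustration; they are not the study's data.

```python
def error_rates(tp, fp, tn, fn):
    """Return (FDR, FPR, FNR) from confusion-matrix counts of base-calls.

    tp: variant calls confirmed by the reference method
    fp: variant calls not confirmed
    tn: reference-base calls that are correct
    fn: true variants that were not called
    """
    fdr = fp / (fp + tp)  # share of reported variants that are wrong
    fpr = fp / (fp + tn)  # share of non-variant positions called as variant
    fnr = fn / (fn + tp)  # share of true variants that were missed
    return fdr, fpr, fnr


# Hypothetical counts, for illustration only.
fdr, fpr, fnr = error_rates(tp=95, fp=2, tn=99_998, fn=5)
print(f"FDR={fdr:.3f}  FPR={fpr:.1e}  FNR={fnr:.2f}")
```

    Because variants are rare relative to the number of positions sequenced, the false positive rate can be very small while the false discovery rate remains at the percent level.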
  • Item
    Genome-Wide Analysis of Glucocorticoid Receptor Binding Regions in Adipocytes Reveal Gene Network Involved in Triglyceride Homeostasis
    Yu, C-Y ; Mayba, O ; Lee, JV ; Tran, J ; Harris, C ; Speed, TP ; Wang, J-C ; Pazin, MJ (PUBLIC LIBRARY SCIENCE, 2010-12-20)
    Glucocorticoids play important roles in the regulation of distinct aspects of adipocyte biology. Excess glucocorticoids in adipocytes are associated with metabolic disorders, including central obesity, insulin resistance and dyslipidemia. To understand the mechanisms underlying the glucocorticoid action in adipocytes, we used chromatin immunoprecipitation sequencing to isolate genome-wide glucocorticoid receptor (GR) binding regions (GBRs) in 3T3-L1 adipocytes. Furthermore, gene expression analyses were used to identify genes that were regulated by glucocorticoids. Overall, 274 glucocorticoid-regulated genes contain or lie near a GBR. We found that many GBRs were located in or near genes involved in triglyceride (TG) synthesis (Scd-1, 2, 3, GPAT3, GPAT4, Agpat2, Lpin1), lipolysis (Lipe, Mgll), lipid transport (Cd36, Lrp-1, Vldlr, Slc27a2) and storage (S3-12). Gene expression analysis showed that, except for Scd-3, the other 13 genes were induced in mouse inguinal fat upon 4-day glucocorticoid treatment. Reporter gene assays showed that, except for Agpat2, the other 12 glucocorticoid-regulated genes contain at least one GBR that can mediate hormone response. In agreement with the fact that glucocorticoids activated genes in both TG biosynthetic and lipolytic pathways, we confirmed that 4-day glucocorticoid treatment increased TG synthesis and lipolysis concomitantly in inguinal fat. Notably, we found that 9 of these 12 genes were induced in transgenic mice that have constantly elevated plasma glucocorticoid levels. These results suggested that a similar mechanism was used to regulate TG homeostasis during chronic glucocorticoid treatment. In summary, our studies have identified molecular components in a glucocorticoid-controlled gene network involved in the regulation of TG homeostasis in adipocytes. Understanding the regulation of this gene network should provide important insight for future therapeutic developments for metabolic diseases.
  • Item
    Conserved Role of unc-79 in Ethanol Responses in Lightweight Mutant Mice
    Speca, DJ ; Chihara, D ; Ashique, AM ; Bowers, MS ; Pierce-Shimomura, JT ; Lee, J ; Rabbee, N ; Speed, TP ; Gularte, RJ ; Chitwood, J ; Medrano, JF ; Liao, M ; Sonner, JM ; Eger, EI ; Peterson, AS ; McIntire, SL ; Beier, DR (PUBLIC LIBRARY SCIENCE, 2010-08)
    The mechanisms by which ethanol and inhaled anesthetics influence the nervous system are poorly understood. Here we describe the positional cloning and characterization of a new mouse mutation isolated in an N-ethyl-N-nitrosourea (ENU) forward mutagenesis screen for animals with enhanced locomotor activity. This allele, Lightweight (Lwt), disrupts the homolog of the Caenorhabditis elegans (C. elegans) unc-79 gene. While Lwt/Lwt homozygotes are perinatal lethal, Lightweight heterozygotes are dramatically hypersensitive to acute ethanol exposure. Experiments in C. elegans demonstrate a conserved hypersensitivity to ethanol in unc-79 mutants and extend this observation to the related unc-80 mutant and nca-1;nca-2 double mutants. Lightweight heterozygotes also exhibit an altered response to the anesthetic isoflurane, reminiscent of unc-79 invertebrate mutant phenotypes. Consistent with our initial mapping results, Lightweight heterozygotes are mildly hyperactive when exposed to a novel environment and are smaller than wild-type animals. In addition, Lightweight heterozygotes exhibit increased food consumption yet have a leaner body composition. Interestingly, Lightweight heterozygotes voluntarily consume more ethanol than wild-type littermates. The acute hypersensitivity to and increased voluntary consumption of ethanol observed in Lightweight heterozygous mice in combination with the observed hypersensitivity to ethanol in C. elegans unc-79, unc-80, and nca-1;nca-2 double mutants suggests a novel conserved pathway that might influence alcohol-related behaviors in humans.
  • Item
    Summarizing and correcting the GC content bias in high-throughput sequencing
    Benjamini, Y ; Speed, TP (OXFORD UNIV PRESS, 2012-05)
    GC content bias describes the dependence between fragment count (read coverage) and GC content found in Illumina sequencing data. This bias can dominate the signal of interest for analyses that focus on measuring fragment abundance within a genome, such as copy number estimation (DNA-seq). The bias is not consistent between samples, and there is no consensus as to the best methods to remove it in a single sample. We analyze regularities in the GC bias patterns, and find a compact description for this unimodal curve family. It is the GC content of the full DNA fragment, not only the sequenced read, that most influences fragment count. This GC effect is unimodal: both GC-rich fragments and AT-rich fragments are underrepresented in the sequencing results. This empirical evidence strengthens the hypothesis that PCR is the most important cause of the GC bias. We propose a model that produces predictions at the base pair level, allowing strand-specific GC-effect correction regardless of the downstream smoothing or binning. These GC modeling considerations can inform other high-throughput sequencing analyses such as ChIP-seq and RNA-seq.
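    To make the idea of a GC effect concrete, the sketch below computes the GC fraction of whole fragments and rescales counts by the mean count of their GC bin. This is a deliberately simple binned stratification, not the base-pair-level model the abstract describes; the function names, the bin count, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def gc_fraction(seq):
    """GC fraction of a fragment, ignoring non-ACGT symbols; None if no ACGT."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return None
    return (seq.count("G") + seq.count("C")) / acgt


def gc_bin(gc, n_bins):
    """Map a GC fraction in [0, 1] to a bin index in [0, n_bins - 1]."""
    return min(int(gc * n_bins), n_bins - 1)


def mean_count_by_gc_bin(fragments, counts, n_bins=10):
    """Mean fragment count observed in each GC bin."""
    totals, sizes = defaultdict(float), defaultdict(int)
    for seq, count in zip(fragments, counts):
        gc = gc_fraction(seq)
        if gc is None:
            continue
        b = gc_bin(gc, n_bins)
        totals[b] += count
        sizes[b] += 1
    return {b: totals[b] / sizes[b] for b in totals}


def correct_counts(fragments, counts, n_bins=10):
    """Rescale each count by (overall mean) / (mean of its GC bin).

    Counts whose GC fraction is undefined, or whose bin mean is zero,
    are returned unchanged.
    """
    rates = mean_count_by_gc_bin(fragments, counts, n_bins)
    overall = sum(counts) / len(counts)
    corrected = []
    for seq, count in zip(fragments, counts):
        gc = gc_fraction(seq)
        rate = rates.get(gc_bin(gc, n_bins)) if gc is not None else None
        corrected.append(count * overall / rate if rate else float(count))
    return corrected


# Toy data, for illustration only.
fragments = ["AAAA", "AATT", "GGCC", "GCGC"]
counts = [2, 4, 10, 30]
print(correct_counts(fragments, counts))
```

    Note that the GC fraction here is taken over the whole fragment sequence rather than a sequenced read, in line with the abstract's observation that the full fragment's GC content matters most.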
  • Item
    Unifying Gene Expression Measures from Multiple Platforms Using Factor Analysis
    Wang, XV ; Verhaak, RGW ; Purdom, E ; Spellman, PT ; Speed, TP ; Aerts, S (PUBLIC LIBRARY SCIENCE, 2011-03-11)
    In the Cancer Genome Atlas (TCGA) project, gene expression of the same set of samples is measured multiple times on different microarray platforms. There are two main advantages to combining these measurements. First, we have the opportunity to obtain a more precise and accurate estimate of expression levels than using the individual platforms alone. Second, the combined measure simplifies downstream analysis by eliminating the need to work with three sets of expression measures and to consolidate results from the three platforms. We propose to use factor analysis (FA) to obtain a unified gene expression measure (UE) from multiple platforms. The UE is a weighted average of the three platforms, and is shown to perform well in terms of accuracy and precision. In addition, the FA model produces parameter estimates that allow the assessment of the model fit. The R code is provided in File S2. Gene-level FA measurements for the TCGA data sets are available from http://tcga-data.nci.nih.gov/docs/publications/unified_expression/.
  • Item
    TumorBoost: Normalization of allele-specific tumor copy numbers from a single pair of tumor-normal genotyping microarrays
    Bengtsson, H ; Neuvial, P ; Speed, TP (BMC, 2010-05-12)
    BACKGROUND: High-throughput genotyping microarrays assess both total DNA copy number and allelic composition, which makes them a tool of choice for copy number studies in cancer, including total copy number and loss of heterozygosity (LOH) analyses. Even after state of the art preprocessing methods, allelic signal estimates from genotyping arrays still suffer from systematic effects that make them difficult to use effectively for such downstream analyses. RESULTS: We propose a method, TumorBoost, for normalizing allelic estimates of one tumor sample based on estimates from a single matched normal. The method applies to any paired tumor-normal estimates from any microarray-based technology, combined with any preprocessing method. We demonstrate that it increases the signal-to-noise ratio of allelic signals, making it significantly easier to detect allelic imbalances. CONCLUSIONS: TumorBoost increases the power to detect somatic copy-number events (including copy-neutral LOH) in the tumor from allelic signals of Affymetrix or Illumina origin. We also conclude that high-precision allelic estimates can be obtained from a single pair of tumor-normal hybridizations, if TumorBoost is combined with single-array preprocessing methods such as (allele-specific) CRMA v2 for Affymetrix or BeadStudio's (proprietary) XY-normalization method for Illumina. A bounded-memory implementation is available in the open-source and cross-platform R package aroma.cn, which is part of the Aroma Project (http://www.aroma-project.org/).
  • Item
    Identification of Candidate Growth Promoting Genes in Ovarian Cancer through Integrated Copy Number and Expression Analysis
    Ramakrishna, M ; Williams, LH ; Boyle, SE ; Bearfoot, JL ; Sridhar, A ; Speed, TP ; Gorringe, KL ; Campbell, IG ; Tan, P (PUBLIC LIBRARY SCIENCE, 2010-04-08)
    Ovarian cancer is a disease characterised by complex genomic rearrangements, but the majority of the genes that are the target of these alterations remain unidentified. Cataloguing these target genes will provide useful insights into the disease etiology and may provide an opportunity to develop novel diagnostic and therapeutic interventions. High resolution genome wide copy number and matching expression data from 68 primary epithelial ovarian carcinomas of various histotypes were integrated to identify genes in regions of most frequent amplification with the strongest correlation between expression and copy number. Regions on chromosomes 3, 7, 8, and 20 were most frequently increased in copy number (>40% of samples). Within these regions, 703/1370 (51%) unique gene expression probesets were differentially expressed when samples with gain were compared to samples without gain. 30% of these differentially expressed probesets also showed a strong positive correlation (r ≥ 0.6) between expression and copy number. We also identified 21 regions of high amplitude copy number gain, in which 32 known protein coding genes showed a strong positive correlation between expression and copy number. Overall, our data validate previously known ovarian cancer genes, such as ERBB2, and also identify novel potential drivers such as MYNN, PUF60 and TPX2.
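    The filtering step described in this abstract, keeping probesets whose expression correlates strongly with copy number, can be sketched as follows. Pearson correlation is assumed here because the abstract reports r; the function names, gene labels, and values are hypothetical, and only the 0.6 threshold comes from the abstract.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences.

    Returns 0.0 when either sequence is constant, treating the
    undefined case as uncorrelated for filtering purposes.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def strongly_correlated(copy_number, expression, threshold=0.6):
    """Names of probesets whose expression has r >= threshold with copy number."""
    return {
        name
        for name, values in expression.items()
        if pearson_r(copy_number, values) >= threshold
    }


# Hypothetical per-sample values, for illustration only.
copy_number = [2, 2, 3, 4, 5]
expression = {
    "GENE_A": [1.0, 1.1, 1.6, 2.1, 2.4],  # rises with copy number
    "GENE_B": [2.0, 1.0, 2.0, 1.0, 2.0],  # no clear trend
}
print(strongly_correlated(copy_number, expression))
```

    In this toy setup a single copy-number vector is shared by all probesets; in practice each probeset would be paired with the copy number of its own genomic region.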