School of Mathematics and Statistics - Research Publications

  • Item
    Divergent molecular networks program functionally distinct CD8+ skin-resident memory T cells
    Park, SL ; Christo, SN ; Wells, AC ; Gandolfo, LC ; Zaid, A ; Alexandre, YO ; Burn, TN ; Schroeder, J ; Collins, N ; Han, S-J ; Guillaume, SM ; Evrard, M ; Castellucci, C ; Davies, B ; Osman, M ; Obers, A ; McDonald, KM ; Wang, H ; Mueller, SN ; Kannourakis, G ; Berzins, SP ; Mielke, LA ; Carbone, FR ; Kallies, A ; Speed, TP ; Belkaid, Y ; Mackay, LK (AMER ASSOC ADVANCEMENT SCIENCE, 2023-12-01)
    Skin-resident CD8+ T cells include distinct interferon-γ-producing [tissue-resident memory T type 1 (TRM1)] and interleukin-17 (IL-17)-producing (TRM17) subsets that differentially contribute to immune responses. However, whether these populations use common mechanisms to establish tissue residence is unknown. In this work, we show that TRM1 and TRM17 cells navigate divergent trajectories to acquire tissue residency in the skin. TRM1 cells depend on a T-bet-Hobit-IL-15 axis, whereas TRM17 cells develop independently of these factors. Instead, c-Maf commands a tissue-resident program in TRM17 cells parallel to that induced by Hobit in TRM1 cells, with an ICOS-c-Maf-IL-7 axis pivotal to TRM17 cell commitment. Accordingly, by targeting this pathway, skin TRM17 cells can be ablated without compromising their TRM1 counterparts. Thus, skin-resident T cells rely on distinct molecular circuitries, which can be exploited to strategically modulate local immunity.
  • Item
    Runx3 drives a CD8+ T cell tissue residency program that is absent in CD4+ T cells
    Fonseca, R ; Burn, TN ; Gandolfo, LC ; Devi, S ; Park, SL ; Obers, A ; Evrard, M ; Christo, SN ; Buquicchio, FA ; Lareau, CA ; McDonald, KM ; Sandford, SK ; Zamudio, NM ; Zanluqui, NG ; Zaid, A ; Speed, TP ; Satpathy, AT ; Mueller, SN ; Carbone, FR ; Mackay, LK (NATURE PORTFOLIO, 2022-08)
    Tissue-resident memory T cells (TRM cells) provide rapid and superior control of localized infections. While the transcription factor Runx3 is a critical regulator of CD8+ T cell tissue residency, its expression is repressed in CD4+ T cells. Here, we show that, as a direct consequence of this Runx3-deficiency, CD4+ TRM cells lacked the transforming growth factor (TGF)-β-responsive transcriptional network that underpins the tissue residency of epithelial CD8+ TRM cells. While CD4+ TRM cell formation required Runx1, this, along with the modest expression of Runx3 in CD4+ TRM cells, was insufficient to engage the TGF-β-driven residency program. Ectopic expression of Runx3 in CD4+ T cells incited this TGF-β-transcriptional network to promote prolonged survival, decreased tissue egress, a microanatomical redistribution towards epithelial layers and enhanced effector functionality. Thus, our results reveal distinct programming of tissue residency in CD8+ and CD4+ TRM cell subsets that is attributable to divergent Runx3 activity.
  • Item
    Stem cell plasticity, acetylation of H3K14, and de novo gene activation rely on KAT7
    Kueh, AJ ; Bergamasco, MI ; Quaglieri, A ; Phipson, B ; Li-Wai-Suen, CSN ; Lönnstedt, IM ; Hu, Y ; Feng, Z-P ; Woodruff, C ; May, RE ; Wilcox, S ; Garnham, AL ; Snyder, MP ; Smyth, GK ; Speed, TP ; Thomas, T ; Voss, AK (Elsevier BV, 2023-01-31)
    In the conventional model of transcriptional activation, transcription factors bind to response elements and recruit co-factors, including histone acetyltransferases. Contrary to this model, we show that the histone acetyltransferase KAT7 (HBO1/MYST2) is required genome wide for histone H3 lysine 14 acetylation (H3K14ac). Examining neural stem cells, we find that KAT7 and H3K14ac are present not only at transcribed genes but also at inactive genes, intergenic regions, and in heterochromatin. KAT7 and H3K14ac were not required for the continued transcription of genes that were actively transcribed at the time of loss of KAT7 but indispensable for the activation of repressed genes. The absence of KAT7 abrogates neural stem cell plasticity, diverse differentiation pathways, and cerebral cortex development. Re-expression of KAT7 restored stem cell developmental potential. Overexpression of KAT7 enhanced neuron and oligodendrocyte differentiation. Our data suggest that KAT7 prepares chromatin for transcriptional activation and is a prerequisite for gene activation.
  • Item
    Spatial analysis with SPIAT and spaSim to characterize and simulate tissue microenvironments
    Feng, Y ; Yang, T ; Zhu, J ; Li, M ; Doyle, M ; Ozcoban, V ; Bass, GT ; Pizzolla, A ; Cain, L ; Weng, S ; Pasam, A ; Kocovski, N ; Huang, Y-K ; Keam, SP ; Speed, TP ; Neeson, PJ ; Pearson, RB ; Sandhu, S ; Goode, DL ; Trigos, AS (NATURE PORTFOLIO, 2023-05-15)
    Spatial proteomics technologies have revealed an underappreciated link between the location of cells in tissue microenvironments and the underlying biology and clinical features, but there is significant lag in the development of downstream analysis methods and benchmarking tools. Here we present SPIAT (spatial image analysis of tissues), a spatial-platform agnostic toolkit with a suite of spatial analysis algorithms, and spaSim (spatial simulator), a simulator of tissue spatial data. SPIAT includes multiple colocalization, neighborhood and spatial heterogeneity metrics to characterize the spatial patterns of cells. Ten spatial metrics of SPIAT are benchmarked using simulated data generated with spaSim. We show how SPIAT can uncover cancer immune subtypes correlated with prognosis in cancer and characterize cell dysfunction in diabetes. Our results suggest SPIAT and spaSim as useful tools for quantifying spatial patterns, identifying and validating correlates of clinical outcomes and supporting method development.
  • Item
    Estimation of tumor cell total mRNA expression in 15 cancer types predicts disease progression
    Cao, S ; Wang, JR ; Ji, S ; Yang, P ; Dai, Y ; Guo, S ; Montierth, MD ; Shen, JP ; Zhao, X ; Chen, J ; Lee, JJ ; Guerrero, PA ; Spetsieris, N ; Engedal, N ; Taavitsainen, S ; Yu, K ; Livingstone, J ; Bhandari, V ; Hubert, SM ; Daw, NC ; Futreal, PA ; Efstathiou, E ; Lim, B ; Viale, A ; Zhang, J ; Nykter, M ; Czerniak, BA ; Brown, PH ; Swanton, C ; Msaouel, P ; Maitra, A ; Kopetz, S ; Campbell, P ; Speed, TP ; Boutros, PC ; Zhu, H ; Urbanucci, A ; Demeulemeester, J ; Van Loo, P ; Wang, W (NATURE PORTFOLIO, 2022-11)
    Single-cell RNA sequencing studies have suggested that total mRNA content correlates with tumor phenotypes. Technical and analytical challenges, however, have so far impeded at-scale pan-cancer examination of total mRNA content. Here we present a method to quantify tumor-specific total mRNA expression (TmS) from bulk sequencing data, taking into account tumor transcript proportion, purity and ploidy, which are estimated through transcriptomic/genomic deconvolution. We estimate and validate TmS in 6,590 patient tumors across 15 cancer types, identifying significant inter-tumor variability. Across cancers, high TmS is associated with increased risk of disease progression and death. TmS is influenced by cancer-specific patterns of gene alteration and intra-tumor genetic heterogeneity as well as by pan-cancer trends in metabolic dysregulation. Taken together, our results indicate that measuring cell-type-specific total mRNA expression in tumor cells predicts tumor phenotypes and clinical outcomes.
  • Item
    Removing unwanted variation from large-scale RNA sequencing data with PRPS
    Molania, R ; Foroutan, M ; Gagnon-Bartsch, JA ; Gandolfo, LC ; Jain, A ; Sinha, A ; Olshansky, G ; Dobrovic, A ; Papenfuss, AT ; Speed, TP (NATURE PORTFOLIO, 2023-01)
    Accurate identification and effective removal of unwanted variation is essential to derive meaningful biological results from RNA sequencing (RNA-seq) data, especially when the data come from large and complex studies. Using RNA-seq data from The Cancer Genome Atlas (TCGA), we examined several sources of unwanted variation and demonstrate here how these can significantly compromise various downstream analyses, including cancer subtype identification, association between gene expression and survival outcomes and gene co-expression analysis. We propose a strategy, called pseudo-replicates of pseudo-samples (PRPS), for deploying our recently developed normalization method, called removing unwanted variation III (RUV-III), to remove the variation caused by library size, tumor purity and batch effects in TCGA RNA-seq data. We illustrate the value of our approach by comparing it to the standard TCGA normalizations on several TCGA RNA-seq datasets. RUV-III with PRPS can be used to integrate and normalize other large transcriptomic datasets coming from multiple laboratories or platforms.
  • Item
    RUV-III-NB: normalization of single cell RNA-seq data
    Salim, A ; Molania, R ; Wang, J ; De Livera, A ; Thijssen, R ; Speed, TP (OXFORD UNIV PRESS, 2022-09-09)
    Normalization of single cell RNA-seq data remains a challenging task. The performance of different methods can vary greatly between datasets when unwanted factors and biology are associated. Most normalization methods also only remove the effects of unwanted variation for the cell embedding but not from gene-level data typically used for differential expression (DE) analysis to identify marker genes. We propose RUV-III-NB, a method that can be used to remove unwanted variation from both the cell embedding and gene-level counts. Using pseudo-replicates, RUV-III-NB explicitly takes into account potential association with biology when removing unwanted variation. The method can be used for both UMI and read counts and returns adjusted counts that can be used for downstream analyses such as clustering, DE and pseudotime analyses. Using published datasets with different technological platforms, kinds of biology and levels of association between biology and unwanted variation, we show that RUV-III-NB manages to remove library size and batch effects, strengthen biological signals, improve DE analyses, and lead to results exhibiting greater concordance with independent datasets of the same kind. The performance of RUV-III-NB is consistent and is not sensitive to the number of factors assumed to contribute to the unwanted variation.
  • Item
    A hierarchical approach to removal of unwanted variation for large-scale metabolomics data
    Kim, T ; Tang, O ; Vernon, ST ; Kott, KA ; Koay, YC ; Park, J ; James, DE ; Grieve, SM ; Speed, TP ; Yang, P ; Figtree, GA ; O'Sullivan, JF ; Yang, JYH (NATURE PORTFOLIO, 2021-08-17)
    Liquid chromatography-mass spectrometry-based metabolomics studies are increasingly applied to large population cohorts, which run for several weeks or even years in data acquisition. This inevitably introduces unwanted intra- and inter-batch variations over time that can overshadow true biological signals and thus hinder potential biological discoveries. To date, normalisation approaches have struggled to mitigate the variability introduced by technical factors whilst preserving biological variance, especially for protracted acquisitions. Here, we propose a study design framework with an arrangement for embedding biological sample replicates to quantify variance within and between batches and a workflow that uses these replicates to remove unwanted variation in a hierarchical manner (hRUV). We use this design to produce a dataset of more than 1000 human plasma samples run over an extended period of time. We demonstrate significant improvement of hRUV over existing methods in preserving biological signals whilst removing unwanted variation for large scale metabolomics studies. Our tools not only provide a strategy for large scale data normalisation, but also provide guidance on the design strategy for large omics studies.
  • Item
    Strategies to enable large-scale proteomics for reproducible research
    Poulos, RC ; Hains, PG ; Shah, R ; Lucas, N ; Xavier, D ; Manda, SS ; Anees, A ; Koh, JMS ; Mahboob, S ; Wittman, M ; Williams, SG ; Sykes, EK ; Hecker, M ; Dausmann, M ; Wouters, MA ; Ashman, K ; Yang, J ; Wild, PJ ; deFazio, A ; Balleine, RL ; Tully, B ; Aebersold, R ; Speed, TP ; Liu, Y ; Reddel, RR ; Robinson, PJ ; Zhong, Q (NATURE PORTFOLIO, 2020-07-30)
    Reproducible research is the bedrock of experimental science. To enable the deployment of large-scale proteomics, we assess the reproducibility of mass spectrometry (MS) over time and across instruments and develop computational methods for improving quantitative accuracy. We perform 1560 data independent acquisition (DIA)-MS runs of eight samples containing known proportions of ovarian and prostate cancer tissue and yeast, or control HEK293T cells. Replicates are run on six mass spectrometers operating continuously with varying maintenance schedules over four months, interspersed with ~5000 other runs. We utilise negative controls and replicates to remove unwanted variation and enhance biological signal, outperforming existing methods. We also design a method for reducing missing values. Integrating these computational modules into a pipeline (ProNorM), we mitigate variation among instruments over time and accurately predict tissue proportions. We demonstrate how to improve the quantitative analysis of large-scale DIA-MS data, providing a pathway toward clinical proteomics.
  • Item
    Controlling technical variation amongst 6693 patient microarrays of the randomized MINDACT trial
    Jacob, L ; Witteveen, A ; Beumer, I ; Delahaye, L ; Wehkamp, D ; van den Akker, J ; Snel, M ; Chan, B ; Floore, A ; Bakx, N ; Brink, G ; Poncet, C ; Bogaerts, J ; Delorenzi, M ; Piccart, M ; Rutgers, E ; Cardoso, F ; Speed, T ; van't Veer, L ; Glas, A (NATURE PUBLISHING GROUP, 2020-07-27)
    Gene expression data obtained in large studies hold great promise for discovering disease signatures or subtypes through data analysis. They are also prone to technical variation, whose removal is essential to avoid spurious discoveries. Because this variation is not always known and can be confounded with biological signals, its removal is a challenging task. Here we provide a step-wise procedure and comprehensive analysis of the MINDACT microarray dataset. The MINDACT trial enrolled 6693 breast cancer patients and prospectively validated the gene expression signature MammaPrint for outcome prediction. The study also yielded a full-transcriptome microarray for each tumor. We show for the first time in such a large dataset how technical variation can be removed while retaining expected biological signals. Because of its unprecedented size, we hope the resulting adjusted dataset will be an invaluable tool to discover or test gene expression signatures and to advance our understanding of breast cancer.