School of Mathematics and Statistics - Research Publications

  • Item
    Sincast: a computational framework to predict cell identities in single-cell transcriptomes using bulk atlases as references
    Deng, Y ; Choi, J ; Cao, K-AL (OXFORD UNIV PRESS, 2022-03-31)
    Characterizing the molecular identity of a cell is an essential step in single-cell RNA sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single-cell reference atlases. However, many challenges remain, including correcting for inherent batch effects between reference and query data and insufficient phenotype data from the reference. One solution is to project single-cell data onto established bulk reference atlases to leverage their rich phenotype information. Sincast is a computational framework to query scRNA-seq data by projection onto bulk reference atlases. Prior to projection, single-cell data are transformed to be directly comparable to bulk data, either with pseudo-bulk aggregation or graph-based imputation to address sparse single-cell expression profiles. Sincast avoids batch effect correction, and cell identity is predicted along a continuum to highlight new cell states not found in the reference atlas. In several case study scenarios, we show that Sincast projects single cells into the correct biological niches in the expression space of the bulk reference atlas. We demonstrate the effectiveness of our imputation approach that was specifically developed for querying scRNA-seq data based on bulk reference atlases. We show that Sincast is an efficient and powerful tool for single-cell profiling that will facilitate downstream analysis of scRNA-seq data.
  • Item
    Alterations in the Gut Fungal Community in a Mouse Model of Huntington's Disease
    Kong, G ; Cao, K-AL ; Hannan, AJ ; Shapiro, RS (AMER SOC MICROBIOLOGY, 2022-04-01)
    Huntington's disease (HD) is a neurodegenerative disorder caused by a trinucleotide expansion in the HTT gene, which is expressed throughout the brain and body, including the gut epithelium and enteric nervous system. Afflicted individuals suffer from progressive impairments in motor, psychiatric, and cognitive faculties, as well as peripheral deficits, including the alteration of the gut microbiome. However, studies characterizing the gut microbiome in HD have focused entirely on the bacterial component, while the fungal community (mycobiome) has been overlooked. The gut mycobiome has gained recognition for its role in host homeostasis and maintenance of the gut epithelial barrier. We aimed to characterize the gut mycobiome profile in HD using fecal samples collected from the R6/1 transgenic mouse model (and wild-type littermate controls) from 4 to 12 weeks of age, corresponding to presymptomatic through to early disease stages. Shotgun sequencing was performed on fecal DNA samples, followed by metagenomic analyses. The HD gut mycobiome beta diversity was significantly different from that of wild-type littermates at 12 weeks of age, while no genotype differences were observed at the earlier time points. Similarly, greater alpha diversity was observed in the HD mice by 12 weeks of age. Key taxa, including Malassezia restricta, Yarrowia lipolytica, and Aspergillus species, were identified as having a negative association with HD. Furthermore, integration of the bacterial and fungal data sets at 12 weeks of age identified negative correlations between the HD-associated fungal species and Lactobacillus reuteri. These findings provide new insights into gut microbiome alterations in HD and may help identify novel therapeutic targets. IMPORTANCE Huntington's disease (HD) is a fatal neurodegenerative disorder affecting both the mind and body. We have recently discovered that gut bacteria are disrupted in HD. The present study provides the first evidence of an altered gut fungal community (mycobiome) in HD. The genomes of many thousands of gut microbes were sequenced and used to assess "metagenomics," in particular the different types of fungal species in the HD versus control gut, in a mouse model. At an early disease stage, before the onset of symptoms, the overall gut mycobiome structure (array of fungi) in HD mice was distinct from that of their wild-type littermates. Alterations of multiple key fungi species were identified as being associated with the onset of disease symptoms, some of which showed strong correlations with the gut bacterial community. This study highlights the potential role of gut fungi in HD and may facilitate the development of novel therapeutic approaches.
  • Item
    Host Traits and Phylogeny Contribute to Shaping Coral-Bacterial Symbioses
    Ricci, F ; Tandon, K ; Black, JR ; Cao, K-AL ; Blackall, LL ; Verbruggen, H ; Raina, J-B (AMER SOC MICROBIOLOGY, 2022-03-07)
    The success of tropical scleractinian corals depends on their ability to establish symbioses with microbial partners. Host phylogeny and traits are known to shape the coral microbiome, but to what extent they affect its composition remains unclear. Here, by using 12 coral species representing the complex and robust clades, we explored the influence of host phylogeny, skeletal architecture, and reproductive mode on the microbiome composition, and further investigated the structure of the tissue and skeleton bacterial communities. Our results show that host phylogeny and traits explained 14% of the tissue and 13% of the skeletal microbiome composition, providing evidence that these predictors contributed to shaping the holobiont in terms of presence and relative abundance of bacterial symbionts. Based on our data, we conclude that host phylogeny affects the presence of specific microbial lineages, reproductive mode predictably influences the microbiome composition, and skeletal architecture works like a filter that affects bacterial relative abundance. We show that the β-diversity of coral tissue and skeleton microbiomes differed, but we found that a large overlapping fraction of bacterial sequences were recovered from both anatomical compartments, supporting the hypothesis that the skeleton can function as a microbial reservoir. Additionally, our analysis of the microbiome structure shows that 99.6% of tissue and 99.7% of skeletal amplicon sequence variants (ASVs) were not consistently present in at least 30% of the samples, suggesting that the coral tissue and skeleton are dominated by rare bacteria. Together, these results provide novel insights into the processes driving coral-bacterial symbioses, along with an improved understanding of the scleractinian microbiome.
  • Item
    A field guide to cultivating computational biology
    Way, GP ; Greene, CS ; Carninci, P ; Carvalho, BS ; de Hoon, M ; Finley, S ; Gosline, SJC ; Le Cao, K-A ; Lee, JSH ; Marchionni, L ; Robine, N ; Sindi, SS ; Theis, FJ ; Yang, JYH ; Carpenter, AE ; Fertig, EJ (PUBLIC LIBRARY SCIENCE, 2021-10-01)
    Evolving in sync with the computation revolution over the past 30 years, computational biology has emerged as a mature scientific field. While the field has made major contributions toward improving scientific knowledge and human health, individual computational biology practitioners at various institutions often languish in career development. As optimistic biologists passionate about the future of our field, we propose solutions for both eager and reluctant individual scientists, institutions, publishers, funding agencies, and educators to fully embrace computational biology. We believe that in order to pave the way for the next generation of discoveries, we need to improve recognition for computational biologists and better align pathways of career success with pathways of scientific progress. With 10 outlined steps, we call on all adjacent fields to move away from the traditional individual, single-discipline investigator research model and embrace multidisciplinary, data-driven, team science.
  • Item
    Gene-environment-gut interactions in Huntington's disease mice are associated with environmental modulation of the gut microbiome
    Gubert, C ; Love, CJ ; Kodikara, S ; Liew, JJM ; Renoir, T ; Cao, K-AL ; Hannan, AJ (CELL PRESS, 2022-01-21)
    Gut dysbiosis in Huntington's disease (HD) has recently been reported using microbiome profiling in R6/1 HD mice and replicated in clinical HD. In HD mice, environmental enrichment (EE) and exercise (EX) were shown to have therapeutic impacts on the brain and associated symptoms. We hypothesize that these housing interventions modulate the gut microbiome, configuring one of the mechanisms that mediate their therapeutic effects observed in HD. We exposed R6/1 mice to a protocol of either EE or EX, relative to standard-housed control conditions, before the onset of gut dysbiosis and motor deficits. We characterized gut structure and function, and profiled the gut microbiome using 16S rRNA sequencing. Multivariate analysis identified specific orders, namely Bacteroidales, Lachnospirales and Oscillospirales, as the main bacterial signatures that discriminate between housing conditions. Our findings suggest a promising role for the gut microbiome in mediating the effects of EE and EX exposures, and possibly other environmental interventions, in HD mice.
  • Item
    Interpretation of network-based integration from multi-omics longitudinal data.
    Bodein, A ; Scott-Boyer, M-P ; Perin, O ; Lê Cao, K-A ; Droit, A (Oxford University Press (OUP), 2022-03-21)
    Multi-omics integration is key to fully understand complex biological processes in a holistic manner. Furthermore, multi-omics combined with new longitudinal experimental design can reveal dynamic relationships between omics layers and identify key players or interactions in system development or complex phenotypes. However, integration methods have to address various experimental designs and do not guarantee interpretable biological results. The new challenge of multi-omics integration is to solve interpretation and unlock the hidden knowledge within the multi-omics data. In this paper, we go beyond integration and propose a generic approach to face the interpretation problem. From multi-omics longitudinal data, this approach builds and explores hybrid multi-omics networks composed of both inferred and known relationships within and between omics layers. With smart node labelling and propagation analysis, this approach predicts regulation mechanisms and multi-omics functional modules. We applied the method on 3 case studies with various multi-omics designs and identified new multi-layer interactions involved in key biological functions that could not be revealed with single omics analysis. Moreover, we highlighted interplay in the kinetics that could help identify novel biological mechanisms. This method is available as an R package netOmics to readily suit any application.
  • Item
    Predicting qualitative phenotypes from microarray data - the Eadgene pig data set.
    Robert-Granié, C ; Lê Cao, K-A ; Sancristobal, M (Springer Science and Business Media LLC, 2009-07-16)
    BACKGROUND: The aim of this work was to study the performances of 2 predictive statistical tools on a data set that was given to all participants of the Eadgene-SABRE Post Analyses Working Group, namely the Pig data set of Hazard et al. (2008). The data consisted of 3686 gene expressions measured on 24 animals partitioned in 2 genotypes and 2 treatments. The objective was to find biomarkers that characterized the genotypes and the treatments in the whole set of genes. METHODS: We first considered the Random Forest approach that enables the selection of predictive variables. We then compared the classical Partial Least Squares regression (PLS) with a novel approach called sparse PLS, a variant of PLS that adapts lasso penalization and allows for the selection of a subset of variables. RESULTS: All methods performed well on this data set. The sparse PLS outperformed the PLS in terms of prediction performance and improved the interpretability of the results. CONCLUSION: We recommend the use of machine learning methods such as Random Forest and multivariate methods such as sparse PLS for prediction purposes. Both approaches are well adapted to transcriptomic data where the number of features is much greater than the number of individuals.
  • Item
    Genetic Variants in ERAP1 and ERAP2 Associated With Immune-Mediated Diseases Influence Protein Expression and the Isoform Profile
    Hanson, AL ; Cuddihy, T ; Haynes, K ; Loo, D ; Morton, CJ ; Oppermann, U ; Leo, P ; Thomas, GP ; Kim-Anh, LC ; Kenna, TJ ; Brown, MA (WILEY, 2018-02-01)
    OBJECTIVE: Endoplasmic reticulum aminopeptidase 1 (ERAP-1) and ERAP-2, encoded on chromosome 5q15, trim endogenous peptides for HLA-mediated presentation to the immune system. Polymorphisms in ERAP1 and/or ERAP2 are strongly associated with several immune-mediated diseases with specific HLA backgrounds, implicating altered peptide handling and presentation as prerequisites for autoreactivity against an arthritogenic peptide. Given the thorough characterization of disease risk-associated polymorphisms that alter ERAP activity, this study aimed instead to interrogate the expression effect of chromosome 5q15 polymorphisms to determine their effect on ERAP isoform and protein expression. METHODS: RNA sequencing and genotyping across chromosome 5q15 were performed to detect genetic variants in ERAP1 and ERAP2 associated with altered total gene and isoform-specific expression. The functional implication of a putative messenger RNA splice-altering variant on ERAP-1 protein levels was validated using mass spectrometry. RESULTS: Polymorphisms associated with ankylosing spondylitis (AS) significantly influenced the transcript and protein expression of ERAP-1 and ERAP-2. Disease risk-associated polymorphisms in and around both genes were also associated with increased gene expression. Furthermore, key risk-associated ERAP1 variants were associated with altered transcript splicing, leading to allele-dependent alternate expression of 2 distinct isoforms and significant differences in the type of ERAP-1 protein produced. CONCLUSION: In accordance with studies demonstrating that polymorphisms that increase aminopeptidase activity predispose to immune disease, the increased risk also attributed to increased expression of ERAP1 and ERAP2 supports the notion of using aminopeptidase inhibition to treat AS and other ERAP-associated conditions.
  • Item
    Human Hepatocellular Carcinomas With a Periportal Phenotype Have the Lowest Potential for Early Recurrence After Curative Resection
    Desert, R ; Rohart, F ; Canal, F ; Sicard, M ; Desille, M ; Renaud, S ; Turlin, B ; Bellaud, P ; Perret, C ; Clement, B ; Le Cao, K-A ; Musso, O (WILEY, 2017-11-01)
    Hepatocellular carcinomas (HCCs) exhibit a diversity of molecular phenotypes, raising major challenges in clinical management. HCCs detected by surveillance programs at an early stage are candidates for potentially curative therapies (local ablation, resection, or transplantation). In the long term, transplantation provides the lowest recurrence rates. Treatment allocation is based on tumor number, size, vascular invasion, performance status, functional liver reserve, and the prediction of early (<2 years) recurrence, which reflects the intrinsic aggressiveness of the tumor. Well-differentiated, potentially low-aggressiveness tumors form the heterogeneous molecular class of nonproliferative HCCs, characterized by an approximate 50% β-catenin mutation rate. To define the clinical, pathological, and molecular features and the outcome of nonproliferative HCCs, we constructed a 1,133-HCC transcriptomic metadata set and validated findings in a publicly available 210-HCC RNA sequencing set. We show that nonproliferative HCCs preserve the zonation program that distributes metabolic functions along the portocentral axis in normal liver. More precisely, we identified two well-differentiated, nonproliferation subclasses, namely periportal-type (wild-type β-catenin) and perivenous-type (mutant β-catenin), which expressed negatively correlated gene networks. The new periportal-type subclass represented 29% of all HCCs; expressed a hepatocyte nuclear factor 4A-driven gene network, which was down-regulated in mouse hepatocyte nuclear factor 4A knockout mice; were early-stage tumors by Barcelona Clinic Liver Cancer, Cancer of the Liver Italian Program, and tumor-node-metastasis staging systems; had no macrovascular invasion; and showed the lowest metastasis-specific gene expression levels and TP53 mutation rates. Also, we identified an eight-gene periportal-type HCC signature, which was independently associated with the highest 2-year recurrence-free survival by multivariate analyses in two independent cohorts of 247 and 210 patients. CONCLUSION: Well-differentiated HCCs display mutually exclusive periportal or perivenous zonation programs. Among all HCCs, periportal-type tumors have the lowest intrinsic potential for early recurrence after curative resection. (Hepatology 2017;66:1502-1518).
  • Item
    Multiparameter analysis of naevi and primary melanomas identifies a subset of naevi with elevated markers of transformation.
    Fox, C ; Lambie, D ; Wilmott, JS ; Pinder, A ; Pavey, S ; Lê Cao, K-A ; Akalin, T ; Karaarslan, IK ; Ozdemir, F ; Scolyer, RA ; Yamada, M ; Soyer, HP ; Schaider, H ; Gabrielli, B (Wiley, 2016-07)
    Here we have carried out a multiparameter analysis using a panel of 28 immunohistochemical markers to identify markers of transformation from benign and dysplastic naevus to primary melanoma in three separate cohorts totalling 279 lesions. We have identified a set of eight markers that distinguish naevi from melanoma. None of the markers or parameters assessed differentiated benign from dysplastic naevi. Indeed, the naevi clustered tightly in terms of their immunostaining patterns whereas primary melanomas showed more diverse staining patterns. A small subset of histopathologically benign lesions had elevated levels of multiple markers associated with melanoma, suggesting that these represent naevi with an increased potential for transformation to melanoma.