School of BioSciences - Research Publications

  • Item
    Turing pattern design principles and their robustness
    Vittadello, ST ; Leyshon, T ; Schnoerr, D ; Stumpf, MPH (ROYAL SOC, 2021-12-27)
    Turing patterns have morphed from mathematical curiosities into highly desirable targets for synthetic biology. For a long time, their biological significance was sometimes disputed but there is now ample evidence for their involvement in processes ranging from skin pigmentation to digit and limb formation. While their role in developmental biology is now firmly established, their synthetic design has so far proved challenging. Here, we review recent large-scale mathematical analyses that have attempted to narrow down potential design principles. We consider different aspects of robustness of these models and outline why this perspective will be helpful in the search for synthetic Turing-patterning systems. We conclude by considering robustness in the context of developmental modelling more generally. This article is part of the theme issue 'Recent progress and open frontiers in Turing's theory of morphogenesis'.
  • Item
    Pathway dynamics can delineate the sources of transcriptional noise in gene expression
    Ham, L ; Jackson, M ; Stumpf, MPH (eLIFE SCIENCES PUBL LTD, 2021-10-12)
    Single-cell expression profiling opens up new vistas on cellular processes. Extensive cell-to-cell variability at the transcriptomic and proteomic level has been one of the stand-out observations. Because most experimental analyses are destructive we only have access to snapshot data of cellular states. This loss of temporal information presents significant challenges for inferring dynamics, as well as causes of cell-to-cell variability. In particular, we typically cannot separate dynamic variability from within cells ('intrinsic noise') from variability across the population ('extrinsic noise'). Here, we make this non-identifiability mathematically precise, allowing us to identify new experimental set-ups that can assist in resolving this non-identifiability. We show that multiple generic reporters from the same biochemical pathways (e.g. mRNA and protein) can infer magnitudes of intrinsic and extrinsic transcriptional noise, identifying sources of heterogeneity. Stochastic simulations support our theory, and demonstrate that 'pathway-reporters' compare favourably to the well-known, but often difficult to implement, dual-reporter method.
  • Item
    Model comparison via simplicial complexes and persistent homology
    Vittadello, ST ; Stumpf, MPH (ROYAL SOC, 2021-10-13)
    In many scientific and technological contexts, we have only a poor understanding of the structure and details of appropriate mathematical models. We often, therefore, need to compare different models. With available data we can use formal statistical model selection to compare and contrast the ability of different mathematical models to describe such data. There is, however, a lack of rigorous methods to compare different models a priori. Here, we develop and illustrate two such approaches that allow us to compare model structures in a systematic way by representing models as simplicial complexes. Using well-developed concepts from simplicial algebraic topology, we define a distance between models based on their simplicial representations. Employing persistent homology with a flat filtration provides for alternative representations of the models as persistence intervals, which represent model structure, from which the model distances are also obtained. We then expand on this measure of model distance to study the concept of model equivalence to determine the conceptual similarity of models. We apply our methodology for model comparison to demonstrate an equivalence between a positional-information model and a Turing-pattern model from developmental biology, constituting a novel observation for two classes of models that were previously regarded as unrelated.
  • Item
    Non-equilibrium statistical physics, transitory epigenetic landscapes, and cell fate decision dynamics
    Guillemin, A ; Stumpf, MPH (AMER INST MATHEMATICAL SCIENCES-AIMS, 2020-01-01)
    Statistical physics provides a useful perspective for the analysis of many complex systems; it allows us to relate microscopic fluctuations to macroscopic observations. Developmental biology, but also cell biology more generally, are examples where apparently robust behaviour emerges from highly complex and stochastic sub-cellular processes. Here we attempt to make connections between different theoretical perspectives to gain qualitative insights into the types of cell-fate decision making processes that are at the heart of stem cell and developmental biology. We discuss both dynamical systems as well as statistical mechanics perspectives on the classical Waddington or epigenetic landscape. We find that non-equilibrium approaches are required to overcome some of the shortcomings of classical equilibrium statistical thermodynamics or statistical mechanics in order to shed light on biological processes, which, almost by definition, are typically far from equilibrium.
  • Item
    Gene Regulatory Network Inference
    Babtie, AC ; Stumpf, MPH ; Thorne, T (Elsevier, 2020-01-01)
  • Item
    Protein degradation rate is the dominant mechanism accounting for the differences in protein abundance of basal p53 in a human breast and colorectal cancer cell line
    Lakatos, E ; Salehi-Reyhani, A ; Barclay, M ; Stumpf, MPH ; Klug, DR ; Deb, S (PUBLIC LIBRARY SCIENCE, 2017-05-10)
    We determine p53 protein abundances and cell to cell variation in two human cancer cell lines with single cell resolution, and show that the fractional width of the distributions is the same in both cases despite a large difference in average protein copy number. We developed a computational framework to identify dominant mechanisms controlling the variation of protein abundance in a simple model of gene expression from the summary statistics of single cell steady state protein expression distributions. Our results, based on single cell data analysed in a Bayesian framework, lend strong support to a model in which variation in the basal p53 protein abundance may be best explained by variations in the rate of p53 protein degradation. This is supported by measurements of the relative average levels of mRNA which are very similar despite large variation in the level of protein.
  • Item
    Inferring extrinsic noise from single-cell gene expression data using approximate Bayesian computation
    Lenive, O ; Kirk, PDW ; Stumpf, MPH (BIOMED CENTRAL LTD, 2016-08-22)
    BACKGROUND: Gene expression is known to be an intrinsically stochastic process which can involve single-digit numbers of mRNA molecules in a cell at any given time. The modelling of such processes calls for the use of exact stochastic simulation methods, most notably the Gillespie algorithm. However, this stochasticity, also termed "intrinsic noise", does not account for all the variability between genetically identical cells growing in a homogeneous environment. Despite substantial experimental efforts, determining appropriate model parameters continues to be a challenge. Methods based on approximate Bayesian computation can be used to obtain posterior parameter distributions given the observed data. However, such inference procedures require large numbers of simulations of the model and exact stochastic simulation is computationally costly. In this work we focus on the specific case of trying to infer model parameters describing reaction rates and extrinsic noise on the basis of measurements of molecule numbers in individual cells at a given time point. RESULTS: To make the problem computationally tractable we develop an exact, model-specific, stochastic simulation algorithm for the commonly used two-state model of gene expression. This algorithm relies on certain assumptions and favourable properties of the model to forgo the simulation of the whole temporal trajectory of protein numbers in the system, instead returning only the number of protein and mRNA molecules present in the system at a specified time point. The computational gain is proportional to the number of protein molecules created in the system and becomes significant for systems involving hundreds or thousands of protein molecules. CONCLUSIONS: We employ this simulation algorithm with approximate Bayesian computation to jointly infer the model's rate and noise parameters from published gene expression data. Our analysis indicates that for most genes the extrinsic contributions to noise will be small to moderate but certainly are non-negligible.
  • Item
    The extent and importance of intragenic recombination
    de Silva, E ; Kelley, LA ; Stumpf, MPH (Springer Science and Business Media LLC, 2004-11)
    We have studied the recombination rate behaviour of a set of 140 genes which were investigated for their potential importance in inflammatory disease. Each gene was extensively sequenced in 24 individuals of African descent and 23 individuals of European descent, and the recombination process was studied separately in the two population samples. The results obtained from the two populations were highly correlated, suggesting that demographic bias does not affect our population genetic estimation procedure. We found evidence that levels of recombination correlate with levels of nucleotide diversity. High marker density allowed us to study recombination rate variation on a very fine spatial scale. We found that about 40 per cent of genes showed evidence of uniform recombination, while approximately 12 per cent of genes carried distinct signatures of recombination hotspots. On studying the locations of these hotspots, we found that they are not always confined to introns but can also stretch across exons. An investigation of the protein products of these genes suggested that recombination hotspots can sometimes separate exons belonging to different protein domains; however, this occurs much less frequently than might be expected based on evolutionary studies into the origins of recombination. This suggests that evolutionary analysis of the recombination process is greatly aided by considering nucleotide sequences and protein products jointly.
  • Item
    Evolution of pathogenicity and sexual reproduction in eight Candida genomes
    Butler, G ; Rasmussen, MD ; Lin, MF ; Santos, MAS ; Sakthikumar, S ; Munro, CA ; Rheinbay, E ; Grabherr, M ; Forche, A ; Reedy, JL ; Agrafioti, I ; Arnaud, MB ; Bates, S ; Brown, AJP ; Brunke, S ; Costanzo, MC ; Fitzpatrick, DA ; de Groot, PWJ ; Harris, D ; Hoyer, LL ; Hube, B ; Klis, FM ; Kodira, C ; Lennard, N ; Logue, ME ; Martin, R ; Neiman, AM ; Nikolaou, E ; Quail, MA ; Quinn, J ; Santos, MC ; Schmitzberger, FF ; Sherlock, G ; Shah, P ; Silverstein, KAT ; Skrzypek, MS ; Soll, D ; Staggs, R ; Stansfield, I ; Stumpf, MPH ; Sudbery, PE ; Srikantha, T ; Zeng, Q ; Berman, J ; Berriman, M ; Heitman, J ; Gow, NAR ; Lorenz, MC ; Birren, BW ; Kellis, M ; Cuomo, CA (NATURE PUBLISHING GROUP, 2009-06-04)
    Candida species are the most common cause of opportunistic fungal infection worldwide. Here we report the genome sequences of six Candida species and compare these and related pathogens and non-pathogens. There are significant expansions of cell wall, secreted and transporter gene families in pathogenic species, suggesting adaptations associated with virulence. Large genomic tracts are homozygous in three diploid species, possibly resulting from recent recombination events. Surprisingly, key components of the mating and meiosis pathways are missing from several species. These include major differences at the mating-type loci (MTL); Lodderomyces elongisporus lacks MTL, and components of the a1/2 cell identity determinant were lost in other species, raising questions about how mating and cell types are controlled. Analysis of the CUG leucine-to-serine genetic-code change reveals that 99% of ancestral CUG codons were erased and new ones arose elsewhere. Lastly, we revise the Candida albicans gene catalogue, identifying many new genes.
  • Item
    What the papers say: text mining for genomics and systems biology
    Harmston, N ; Filsell, W ; Stumpf, MPH (Springer Science and Business Media LLC, 2010-10)
    Keeping up with the rapidly growing literature has become virtually impossible for most scientists. This can have dire consequences. First, we may waste research time and resources on reinventing the wheel simply because we can no longer maintain a reliable grasp on the published literature. Second, and perhaps more detrimental, judicious (or serendipitous) combination of knowledge from different scientific disciplines, which would require following disparate and distinct research literatures, is rapidly becoming impossible for even the most ardent readers of research publications. Text mining - the automated extraction of information from (electronically) published sources - could potentially fulfil an important role - but only if we know how to harness its strengths and overcome its weaknesses. As we do not expect that the rate at which scientific results are published will decrease, text mining tools are now becoming essential in order to cope with, and derive maximum benefit from, this information explosion. In genomics, this is particularly pressing as more and more rare disease-causing variants are found and need to be understood. Not being conversant with this technology may put scientists and biomedical regulators at a severe disadvantage. In this review, we introduce the basic concepts underlying modern text mining and its applications in genomics and systems biology. We hope that this review will serve three purposes: (i) to provide a timely and useful overview of the current status of this field, including a survey of present challenges; (ii) to enable researchers to decide how and when to apply text mining tools in their own research; and (iii) to highlight how the research communities in genomics and systems biology can help to make text mining from biomedical abstracts and texts more straightforward.