School of Mathematics and Statistics - Research Publications
Infectious disease pandemic planning and response: Incorporating decision analysis
(PUBLIC LIBRARY SCIENCE, 2020-01-01)
Freya Shearer and co-authors discuss the use of decision analysis in planning for infectious disease pandemics.
qtQDA: quantile transformed quadratic discriminant analysis for high-dimensional RNA-seq data
(PEERJ INC, 2019-12-18)
Classification on the basis of gene expression data derived from RNA-seq promises to become an important part of modern medicine. We propose a new classification method based on a model where the data is marginally negative binomial but dependent, thereby incorporating the dependence known to be present between measurements from different genes. The method, called qtQDA, works by first performing a quantile transformation (qt) then applying Gaussian quadratic discriminant analysis (QDA) using regularized covariance matrix estimates. We show that qtQDA has excellent performance when applied to real data sets and has advantages over some existing approaches. An R package implementing the method is also available at https://github.com/goknurginer/qtQDA.
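As a rough illustration of the quantile-transformation step only, the sketch below maps a negative binomial count to a standard normal score by passing it through the negative binomial CDF and then the inverse Gaussian CDF. The function names, the mean/size parameterisation (`mu`, `size`) and the mid-distribution correction used to keep the probability strictly inside (0, 1) are assumptions made for this example; they are not taken from the qtQDA package, and the subsequent QDA step with regularized covariance estimates is not shown.

```python
import math
from statistics import NormalDist


def nb_pmf(k, mu, size):
    """Negative binomial pmf with mean `mu` > 0 and dispersion parameter `size` > 0.

    Hypothetical helper for this sketch; parameterised so that
    Var(X) = mu + mu**2 / size.
    """
    p = size / (size + mu)  # "success" probability in the standard form
    log_pmf = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def quantile_transform(count, mu, size):
    """Map a negative binomial count to a standard normal score.

    Assumption: a mid-distribution CDF, P(X < count) + 0.5 * P(X = count),
    is used so the value passed to the inverse normal CDF is never 0 or 1.
    """
    below = sum(nb_pmf(k, mu, size) for k in range(count))
    u = below + 0.5 * nb_pmf(count, mu, size)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # numerical guard only
    return NormalDist().inv_cdf(u)


if __name__ == "__main__":
    # Illustrative parameters, not estimates from any real data set.
    for count in (0, 5, 10, 40):
        print(count, round(quantile_transform(count, mu=10.0, size=2.0), 3))
```

After this transformation, each gene's values are approximately Gaussian under the assumed marginal model, which is what makes a Gaussian discriminant rule applicable downstream.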
Vireo: Bayesian demultiplexing of pooled single-cell RNA-seq data without genotype reference
Multiplexed single-cell RNA-seq analysis of multiple samples using pooling is a promising experimental design, offering increased throughput while helping to overcome batch variation. To reconstruct the sample identity of each cell, genetic variants that segregate between the samples in the pool have been proposed as natural barcodes for cell demultiplexing. Existing demultiplexing strategies rely on the availability of complete genotype data from the pooled samples, which limits the applicability of such methods, in particular when genetic variation is not the primary object of study. To address this, we here present Vireo, a computationally efficient Bayesian model to demultiplex single-cell data from pooled experimental designs. Uniquely, our model can be applied in settings where only partial or no genotype information is available. Using pools based on synthetic mixtures and results on real data, we demonstrate the robustness of Vireo and illustrate the utility of multiplexed experimental designs for common expression analyses.
Temporal development of the oral microbiome and prediction of early childhood caries
(NATURE PUBLISHING GROUP, 2019-12-24)
Human microbiomes are predicted to assemble in a reproducible and ordered manner, yet there is limited knowledge on the development of the complex bacterial communities that constitute the oral microbiome. The oral microbiome plays major roles in many oral diseases including early childhood caries (ECC), which afflicts up to 70% of children in some countries. Saliva contains oral bacteria that are indicative of the whole oral microbiome and may have the ability to reflect the dysbiosis in supragingival plaque communities that initiates the clinical manifestations of ECC. The aim of this study was to determine the assembly of the oral microbiome during the first four years of life and compare it with the clinical development of ECC. The oral microbiomes of 134 children enrolled in a birth cohort study were determined at six ages between two months and four years-of-age and their mothers' oral microbiomes were determined at a single time point. We identified and quantified 356 operational taxonomic units (OTUs) of bacteria in saliva by sequencing the V4 region of the bacterial 16S rRNA genes. Bacterial alpha diversity increased from a mean of 31 OTUs in the saliva of infants at 1.9 months-of-age to 84 OTUs at 39 months-of-age. The oral microbiome showed a distinct shift in composition as the children matured. The microbiome data were compared with the clinical development of ECC in the cohort at 39, 48, and 60 months-of-age as determined by ICDAS-II assessment. Streptococcus mutans was the most discriminatory oral bacterial species between health and current disease, with an increased abundance in disease. Overall, our study demonstrates an ordered temporal development of the oral microbiome, describes a limited core oral microbiome and indicates that saliva testing of infants may help predict ECC risk.
Mathematical modelling indicates that lower activity of the haemostatic system in neonates is primarily due to lower prothrombin concentration
(NATURE PUBLISHING GROUP, 2019-03-08)
Haemostasis is governed by a highly complex system of interacting proteins. Due to the central role of thrombin, thrombin generation and specifically the thrombin generation curve (TGC) is commonly used as an indicator of haemostatic activity. Functional characteristics of the haemostatic system in neonates and children are significantly different from those in adults; at the same time, plasma levels of haemostatic proteins vary considerably with age. However, relating one to the other has been difficult, both due to significant inter-individual differences for individuals of similar age and the complexity of the biochemical reactions underlying haemostasis. Mathematical modelling has been very successful at representing the biochemistry of blood clotting. In this study we address the challenge of large inter-individual variability by parameterising the Hockin-Mann model with data from individual patients, across different age groups from neonates to adults. Calculating TGCs for each patient of a specific age group provides us with insight into the variability of haemostatic activity across that age group. From our model we observe that two commonly used metrics for haemostatic activity are significantly lower in neonates than in older patients. Because both metrics are strongly determined by prothrombin, and prothrombin levels are considerably lower in neonates, we conclude that decreased haemostatic activity in neonates is due to lower prothrombin availability.
Differential co-expression-based detection of conditional relationships in transcriptional data: comparative analysis and application to breast cancer
BACKGROUND: Elucidation of regulatory networks, including identification of regulatory mechanisms specific to a given biological context, is a key aim in systems biology. This has motivated the move from co-expression to differential co-expression analysis and numerous methods have been developed subsequently to address this task; however, evaluation of methods and interpretation of the resulting networks has been hindered by the lack of known context-specific regulatory interactions. RESULTS: In this study, we develop a simulator based on dynamical systems modelling capable of simulating differential co-expression patterns. With the simulator and an evaluation framework, we benchmark and characterise the performance of inference methods. Defining three different levels of "true" networks for each simulation, we show that accurate inference of causation is difficult for all methods, compared to inference of associations. We show that a z-score-based method has the best general performance. Further, analysis of simulation parameters reveals five network and simulation properties that explain the performance of methods. The evaluation framework and inference methods used in this study are available in the dcanr R/Bioconductor package. CONCLUSIONS: Our analysis of networks inferred from simulated data shows that hub nodes are more likely to be differentially regulated targets than transcription factors. Based on this observation, we propose an interpretation of the inferred differential network that can reconstruct a putative causal network.
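The abstract does not spell out the z-score-based method, so the following is only a minimal sketch of one common way to score differential co-expression for a single gene pair: Fisher-transform the correlation in each condition and divide the difference by its approximate standard error. The function names, the use of Pearson correlation and the clipping constant are assumptions for this example and are not taken from the dcanr package.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def diff_coexpression_z(x1, y1, x2, y2):
    """z-score for the change in correlation of one gene pair between two conditions.

    (x1, y1) are the two genes' expression values in condition 1, (x2, y2) in
    condition 2. Each condition needs more than three samples.
    """
    limit = 1.0 - 1e-12  # keep atanh finite for perfectly correlated pairs

    def fisher_z(r):
        return math.atanh(max(-limit, min(limit, r)))

    z1 = fisher_z(pearson(x1, y1))
    z2 = fisher_z(pearson(x2, y2))
    standard_error = math.sqrt(1.0 / (len(x1) - 3) + 1.0 / (len(x2) - 3))
    return (z1 - z2) / standard_error


if __name__ == "__main__":
    # Toy values for illustration only: the pair is positively correlated in
    # condition 1 and negatively correlated in condition 2.
    gene_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    gene_b_cond1 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
    gene_b_cond2 = [6.0, 5.2, 3.9, 3.1, 2.2, 0.8]
    print(diff_coexpression_z(gene_a, gene_b_cond1, gene_a, gene_b_cond2))
```

A large absolute z-score suggests that the association between the two genes differs between conditions; as the abstract stresses, this indicates a change in association rather than a causal regulatory link.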
Identification of cancer sex-disparity in the functional integrity of p53 and its X chromosome network
(NATURE PUBLISHING GROUP, 2019-11-26)
The disproportionately high prevalence of male cancer is poorly understood. We tested for sex-disparity in the functional integrity of the major tumor suppressor p53 in sporadic cancers. Our bioinformatics analyses expose three novel levels of p53 impact on sex-disparity in 12 non-reproductive cancer types. First, TP53 mutation is more frequent in these cancers among US males than females, with poorest survival correlating with its mutation. Second, numerous X-linked genes are associated with p53, including vital genomic regulators. Males are at unique risk from alterations of their single copies of these genes. High expression of X-linked negative regulators of p53 in wild-type TP53 cancers corresponds with reduced survival. Third, females exhibit an exceptional incidence of non-expressed mutations among p53-associated X-linked genes. Our data indicate that high frequencies of TP53 mutations and an inability to shield against deregulated X-linked genes that engage in p53 networks contribute to poor survival in males.
Multiple interaction nodes define the postreplication repair response to UV-induced DNA damage that is defective in melanomas and correlated with UV signature mutation load
Ultraviolet radiation-induced DNA mutations are a primary environmental driver of melanoma. The reason for the very high level of unrepaired DNA lesions leading to these mutations is still poorly understood. The primary DNA repair mechanism for UV-induced lesions, that is, the nucleotide excision repair pathway, appears intact in most melanomas. We have previously reported a postreplication repair mechanism that is commonly defective in melanoma cell lines. Here we have used a genome-wide approach to identify the components of this postreplication repair mechanism. We have used differential transcript polysome loading to identify transcripts that are associated with UV response, and then functionally assessed these to identify novel components of this repair and cell cycle checkpoint network. We have identified multiple interaction nodes, including global genomic nucleotide excision repair and homologous recombination repair, and the previously unexpected MASTL pathway, as components of the response. Finally, we have used bioinformatics to assess the contribution of dysregulated expression of these pathways to the UV signature mutation load of a large melanoma cohort. We show that dysregulation of the pathway, especially of the DNA damage repair components, is a significant contributor to UV mutation load, and that dysregulation of the MASTL pathway appears to be a significant contributor to high UV signature mutation load.
Insights into malaria susceptibility using genome-wide data on 17,000 individuals from Africa, Asia and Oceania
(NATURE PUBLISHING GROUP, 2019-12-16)
The human genetic factors that affect resistance to infectious disease are poorly understood. Here we report a genome-wide association study in 17,000 severe malaria cases and population controls from 11 countries, informed by sequencing of family trios and by direct typing of candidate loci in an additional 15,000 samples. We identify five replicable associations with genome-wide levels of evidence including a newly implicated variant on chromosome 6. Jointly, these variants account for around one-tenth of the heritability of severe malaria, which we estimate as ~23% using genome-wide genotypes. We interrogate available functional data and discover an erythroid-specific transcription start site underlying the known association in ATP2B4, but are unable to identify a likely causal mechanism at the chromosome 6 locus. Previously reported HLA associations do not replicate in these samples. This large dataset will provide a foundation for further research on the genetic determinants of malaria resistance in diverse populations.
Mobile-surface bubbles and droplets coalesce faster but bounce stronger
(AMER ASSOC ADVANCEMENT SCIENCE, 2019-10-01)
Enhancing the hydrodynamic interfacial mobility of bubbles and droplets in multiphase systems is expected to reduce the characteristic coalescence times and thereby affect the stability of gas or liquid emulsions that are of wide industrial and biological importance. However, by comparing the controlled collision of bubbles or water droplets with mobile or immobile liquid interfaces, in a pure fluorocarbon liquid, we demonstrate that collisions involving mobile surfaces result in a significantly stronger series of rebounds before the rapid coalescence event. The stronger rebound is explained by the lower viscous dissipation during collisions involving mobile surfaces. We present direct numerical simulations to confirm that the observed rebound is enhanced with increased surface mobility. These observations require a reassessment of the role of surface mobility for controlling the dynamic stability of gas or liquid emulsion systems relevant to a wide range of processes, from microfluidics and pharmaceuticals to food and crude oil processing.
Genome-wide association study of eosinophilic granulomatosis with polyangiitis reveals genomic loci stratified by ANCA status
(NATURE PUBLISHING GROUP, 2019-11-12)
Eosinophilic granulomatosis with polyangiitis (EGPA) is a rare inflammatory disease of unknown cause. Thirty percent of patients have anti-neutrophil cytoplasmic antibodies (ANCA) specific for myeloperoxidase (MPO). Here, we describe a genome-wide association study in 676 EGPA cases and 6809 controls that identifies four EGPA-associated loci through conventional case-control analysis, and four additional associations through a conditional false discovery rate approach. Many variants are also associated with asthma and six are associated with eosinophil count in the general population. Through Mendelian randomisation, we show that a primary tendency to eosinophilia contributes to EGPA susceptibility. Stratification by ANCA reveals that EGPA comprises two genetically and clinically distinct syndromes. MPO+ ANCA EGPA is an eosinophilic autoimmune disease sharing certain clinical features and an HLA-DQ association with MPO+ ANCA-associated vasculitis, while ANCA-negative EGPA may instead have a mucosal/barrier dysfunction origin. Four candidate genes are targets of therapies in development, supporting their exploration in EGPA.
On the range of lattice models in high dimensions
(SPRINGER HEIDELBERG, 2020-04-01)
We investigate the scaling limit of the range (the set of visited vertices) for a general class of critical lattice models, starting from a single initial particle at the origin. Conditions are given on the random sets and an associated "ancestral relation" under which, conditional on long-term survival, the rescaled ranges converge weakly to the range of super-Brownian motion as random sets. These hypotheses also give precise asymptotics for the limiting behaviour of the probability of exiting a large ball, that is, for the extrinsic one-arm probability. We show that these conditions are satisfied by the voter model in dimensions d ≥ 2, sufficiently spread out critical oriented percolation and critical contact processes in dimensions d > 4, and sufficiently spread out critical lattice trees in dimensions d > 8. The latter result proves Conjecture 1.6 of van der Hofstad et al. (Ann Probab 45:278-376, 2017) and also has important consequences for the behaviour of random walks on lattice trees in high dimensions.