Application of Bayesian networks to a longitudinal asthma study
Author: Walker, Michael Luke
Document Type: Masters Research thesis
Access Status: Open Access
© 2016 Michael Luke Walker
Asthma is a highly prevalent and often serious condition causing significant illness and sometimes death. It consumes between 1% and 3% of the medical budget in most countries and imposes a disease burden on society comparable to schizophrenia or cirrhosis of the liver. Its causes are as yet unknown, but a significant number of risk factors have been identified, as diverse as viral infections during infancy, blood antibody titres, mode of birth and number of siblings. In recent years there has been increasing recognition of the role played by the microbiome in human health, with a growing understanding that our relationship with the microbes that colonise the different parts of the human body is symbiotic. Disruptions in the microbiome have been implicated in diseases such as obesity, autism and auto-immune diseases, as well as asthma. At the same time there has been an increasing awareness in asthma research that its multi-faceted and multi-factorial nature requires more sophistication than statistical association and regression. In this spirit we employ Bayesian networks, whose properties render them suitable for representing time-directed or even causal relationships, to gain insight into the nature of asthma. We begin with an example of the simplest Bayesian network, a linear classifier, with which we predict outcomes in the fifth year of life from the statistical distribution of variables from the first two years of life. (The qualification “linear” refers to the neglect of correlation and interaction among the predictive variables.) While classifiers have long been used for prognosis and diagnosis, we use them to identify useful asthma subtypes, called endotypes. Different endotypes often require different treatments and management programs, and are driven by different biological factors.
These different factors provide different predictors, and a predictor which separates one endotype from the healthy may not do so for a different endotype. We use this to mathematically construct an indicator of when a given predictor is exclusively predictive of a given endotype. Our so-called “exclusivity index” is quantitatively precise, unlike a significance threshold. The Cohort Asthma Study, whose longitudinal data we analyse, includes the relative abundances of genera present in the nasopharyngeal microbiome. In an apparent diversion, we use Q-Q plots to indicate relationships between the infant microbiome and fifth-year wheeze and atopy status. Interestingly, the relative abundance of Streptococcus under certain circumstances was found to be highly predictive of one of the endotypes we identified in the preceding chapter. Finally, we address the problem of mapping out the complicated interactions among multiple variables. Our model is an adaptation of ARTIVA, a package originally designed for inferring gene-interaction networks. The adaptation was non-trivial, requiring us to augment the discrete data values in order to make them compatible with the underlying mathematics of ARTIVA’s algorithm. Guided by questions from the asthma literature and the posterior probabilities output by ARTIVA, we arrived at networks of the interactions between atopy, wheeze and infection, and could see the difference in the development of immunity-related variables between those who went on to exhibit wheeze in the fifth year of life and those who did not. Our model yielded networks indicating that sensitivity to viral infection is an effect and not a cause of atopy and wheeze.
Keywords: Bayesian networks; machine learning; asthma; statistics
- Pathology - Theses