School of Mathematics and Statistics
http://hdl.handle.net/11343/293
2019-03-19T09:39:51Z
Spline techniques for incomplete and complex data
Huang, Wei
http://hdl.handle.net/11343/221330
2019-03-09T23:22:07Z
2018-01-01T00:00:00Z
We consider incomplete data problems in two different complex data contexts: group testing data and functional data. In the group testing data context, we consider estimating the conditional prevalence of a disease from data pooled according to the group testing mechanism. Consistent estimators have been proposed in the literature, but they rely on the data being available for all individuals. In infectious disease studies where group testing is frequently applied, the covariate is often missing for some individuals. There, unless the covariate is missing completely at random, applying the existing techniques to the complete cases without adjusting for missingness does not generally provide consistent estimators, and finding appropriate modifications is challenging. We develop a consistent adjusted spline estimator, derive its theoretical properties, and show how to adapt local polynomial and likelihood estimators to the missing data problem. We illustrate the numerical performance of our methods on simulated and real examples.
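As background to the group testing mechanism, the following is a minimal Python sketch of the classical unconditional prevalence estimate from pooled tests. It is not the adjusted spline estimator developed in the thesis. It assumes equal pool sizes, independent individuals and a perfect test, and the function name `prevalence_from_pools` is hypothetical. Under those assumptions a pool of size k is negative with probability (1 - p)^k, so p can be recovered from the observed proportion of positive pools.

```python
def prevalence_from_pools(n_positive, n_pools, pool_size):
    """Estimate individual prevalence p from pooled test results.

    Assumes every pool has `pool_size` independent individuals and the
    test is perfect, so P(pool negative) = (1 - p) ** pool_size.
    """
    if n_pools <= 0 or pool_size <= 0:
        raise ValueError("n_pools and pool_size must be positive")
    if not 0 <= n_positive <= n_pools:
        raise ValueError("n_positive must lie between 0 and n_pools")
    proportion_negative = 1.0 - n_positive / n_pools
    # Invert (1 - p) ** pool_size = proportion_negative for p.
    return 1.0 - proportion_negative ** (1.0 / pool_size)


# Hypothetical data: 19 of 100 pools of size 2 test positive.
estimate = prevalence_from_pools(19, 100, 2)
```

The estimate ignores covariates entirely, which is why it says nothing about conditional prevalence or about missing covariate values.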
In the functional data context, we consider the problem of estimating the covariance function of functional data which are only observed on a subset of their domain, for example in the form of fragments observed on a small interval. Typically in this setting, no curve is observed on the entire domain, so that the empirical covariance function or smooth versions of it can be computed only on a subset of its domain, which typically consists of a diagonal band. We show that estimating the covariance function consistently outside that subset is possible and introduce conditions under which the covariance function is identifiable on its entire domain from the incomplete data. We propose to estimate the covariance on the observed subdomain first and then extrapolate it to the entire domain by a tensor product series approximation.
While implementing our covariance estimation approach for incomplete functional data, we found that the final extrapolated estimator was sensitive to the covariance estimator on the observed subdomain and that some smoothing over the subdomain was sometimes needed. However, smoothing over those subdomains is not straightforward since the subdomains are irregularly shaped and can even have interior gaps, where conventional smoothing methods are usually not applicable. We propose a tensor product spline technique adapted to irregularly shaped domains with interior gaps. We give a thorough account of how to construct the estimator in practice. We also review various smoothing methods for irregularly shaped domains in the literature, and investigate and compare the finite sample properties of these techniques.
© 2018 Dr. Wei Huang
2018-01-01T00:00:00Z
Mechanistic and statistical models of skin disease transmission
Lydeamore, Michael J.
http://hdl.handle.net/11343/221232
2019-03-09T23:17:13Z
2018-01-01T00:00:00Z
At any one time, more than 160 million children worldwide are infected with skin sores. In remote Aboriginal Australian communities, prevalence is as high as 40%. Skin sores infected with Group A Streptococcus (GAS) can lead to a number of acute and chronic health conditions. One of the primary risk factors for GAS infection is scabies, an infestation by a small mite which causes a break in the skin layer, potentially allowing skin sore infection to take hold. This biological connection is reaffirmed by the observation that mass treatment for scabies in these remote communities has been associated with a reduction in the prevalence of skin sore infection, despite skin sores not being directly targeted. In the most extreme case, it has been hypothesised that the eradication of scabies in remote communities may lead to an eradication of skin sore related infection. Mass drug administration is the go-to solution for tackling the high prevalence of disease in these remote settings, but despite more than 20 years of implementation, sustained reductions in prevalence have not been achieved.
My aim in this thesis is to develop and analyse both mechanistic and statistical models of skin sores and scabies, considering the dynamics of each disease in isolation and coupled together. These models build a framework on which control strategies can be tested, with the aim to develop strategies that will lead to sustained prevalence reductions.
Following a biological introduction and technical information (Chapters 1 and 2), a mechanistic model for scabies infection is introduced. This model includes the dynamics of the life-cycle of the scabies mite, incorporating two methods of treatment for the infection. Mass drug administration strategies are also modelled. The optimal interval between successive mass drug administration (MDA) doses is calculated to be approximately two weeks. The analysis shows that an optimally timed two-dose, 100% effective, 100% coverage MDA is highly unlikely to lead to the eradication of scabies. In fact, four optimally timed successive doses are required for a probability of eradication greater than 1/1000. Next, an annually recurring MDA program is considered, in which some number of optimally timed doses is administered, and repeated annually. It is shown that increasing the number of administered doses always increases the probability of eradication. Importantly, moving from a two dose to a three dose annual strategy significantly increases the probability of eradication of scabies infection.
In order to parameterise a dynamic transmission model for skin sores, at least two key quantities must be estimated: the force of infection, and the infectious period. The study in Chapter 4 estimates the age of first infection, whose mean is the inverse of the force of infection when that force is constant, using clinic presentation data of children from birth up to five years of age. Three survival models are considered: the Kaplan-Meier estimator, the Cox proportional hazards model, and the parametric exponential mixture model. The mean age of first infection is estimated to be ten months for skin sores, and nine months for scabies. The work in Chapter 5 estimates both the force of infection and the infectious period by utilising a linearised infectious disease model. The data considered in this chapter consist of longitudinal observations of individuals across three studies. The methodology is verified using simulation estimation, and each dataset is tested to ensure it carries sufficient information for use with the estimation method. The estimates for the force of infection vary by an order of magnitude between settings. Estimates of the infectious period are relatively constant at 12 to 20 days.
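The link between the force of infection and the mean age of first infection can be illustrated with a minimal Python sketch. It assumes a single constant force of infection (a plain exponential model with right censoring), which is simpler than the exponential mixture model named above. The function name and the data are hypothetical. Under this assumption the maximum likelihood estimate of the force of infection is the number of observed first infections divided by the total time at risk, and the mean age of first infection is its reciprocal.

```python
def exponential_force_of_infection(ages, infected):
    """MLE of a constant force of infection with right censoring.

    ages[i] is the age at first infection if infected[i] is True,
    otherwise the age at which child i was last seen uninfected.
    """
    if len(ages) != len(infected):
        raise ValueError("ages and infected must have the same length")
    events = sum(1 for flag in infected if flag)
    time_at_risk = sum(ages)
    if events == 0 or time_at_risk <= 0:
        raise ValueError("need at least one infection and positive time at risk")
    return events / time_at_risk


# Hypothetical ages in months; False marks a censored observation.
ages = [4.0, 9.0, 12.0, 15.0, 20.0]
infected = [True, True, False, True, False]

force = exponential_force_of_infection(ages, infected)  # per month
mean_age_first_infection = 1.0 / force                  # in months
```

With these made-up numbers there are 3 infections over 60 child-months at risk, giving a force of infection of 0.05 per month and a mean age of first infection of 20 months.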
Chapter 6 consists of a dynamic model for skin sores transmission coupled with models for scabies transmission. Three different scabies models are considered. The first assumes that the dynamics of scabies are at equilibrium. In this case, analytical expressions for key epidemiological quantities can be derived, and values for the scabies prevalence below which skin sores will be eradicated can be calculated. Next, two dynamic models of scabies are considered. The first of these is the scabies model introduced in Chapter 3, which includes the full life-cycle of the scabies mite and treatment mechanisms. The second, termed the SITS model, consists of just three compartments. The SITS model approximates the complex life-cycle of the scabies mite using two compartments. The differences in dynamics between these two scabies models are analysed, and the impact on the prevalence of skin sores of an MDA which directly targets only scabies is considered. The comparison shows that, relative to the full model, the SITS model overestimates the impact of the MDA on skin sores prevalence in the time period immediately following the MDA, but also predicts an earlier time of return to pre-MDA endemic infection prevalence. The SITS model also estimates a higher probability of eradication of skin sores compared to the full model. These two results demonstrate that caution should be taken when approximating the life-cycle of the scabies mite to consider the potential impact of MDA which targets only scabies.
Finally, Chapter 7 summarises the work presented in my thesis, discusses the limitations of the work and explores potential future directions for this research problem.
© 2018 Dr. Michael John Lydeamore
2018-01-01T00:00:00Z
An investigation of Australian rainfall using extreme value theory
Saunders, Kate
http://hdl.handle.net/11343/220318
2019-03-08T15:57:58Z
2018-01-01T00:00:00Z
In this thesis, we use extreme value theory to fit statistical models to observations of Australian daily rainfall extremes. We build upon the existing literature by challenging the basic assumptions of these models when applied in a climate setting. The types of applications we consider range from univariate extreme value theory, to spatial extremes with dependence, and finally to an investigation of extremal dependence. The combined application content provides an in-depth investigation into our understanding of Australian rainfall extremes and the risks posed by these extreme events.
We consider how large scale climate drivers, such as El Niño Southern Oscillation (ENSO), can influence the distribution of rainfall extremes. Using observations of daily rainfall from station data, we quantify the magnitude and spatial influence of ENSO on the distribution of seasonal maximum daily rainfall. We contrast these results obtained from an at-station analysis, with those from a simple spatial model, ultimately producing maps of the region of ENSO influence.
We then consider an application where we use max-stable processes to model rainfall extremes in continuous space with dependence. We fit a max-stable process to the annual maximum daily rainfall in South East Queensland and simulate the extreme precipitation field. We quantify the severity of an historical flash flood in this region, showing that the probability of this event was significantly higher given the phase of ENSO.
Finally, we examine variation in the dependence behaviour of daily rainfall extremes. For Australia, a single dependence structure for spatial models of rainfall extremes is unrealistic. This is due to the country's size, variations in climate and complexity of topography. To help account for these variations, we present a regionalisation of Australia. In this regionalisation, locations are grouped according to similar dependence of rainfall extremes.
The overarching goal of this thesis is to improve our understanding of the risks posed by extreme rainfall events in Australia. We achieve this by utilising spatial, statistical models. However, we acknowledge that using extreme value theory for modelling real world applications in practice has challenges. We highlight these practical considerations in our applications, so that other researchers may be aware of the advantages of this kind of modelling, as well as some of its practical limitations.
© 2018 Dr. Kate Saunders
2018-01-01T00:00:00Z
Selected problems in enumerative combinatorics: permutation classes, random walks and planar maps
Elvey Price, Andrew
http://hdl.handle.net/11343/219277
2018-12-15T23:48:05Z
2018-01-01T00:00:00Z
In this thesis we consider a number of enumerative combinatorial problems. We solve the problems of enumerating Eulerian orientations counted by edges and quartic Eulerian orientations counted by vertices. We also find and prove an algebraic relationship between the counting functions for permutations sortable by a double ended queue (deque) and permutations sortable by two stacks in parallel (2sip). In each of these cases, our proof of the result uses an elaborate system of functional equations which is much more complicated than the result itself, leaving the door open for a more direct, combinatorial proof.
We find polynomial time algorithms for generating the counting sequence of deque-sortable permutations and the cogrowth sequence of some groups, including the lamplighter group $L$ and the Brin-Navas group $B$. For permutations sortable by two stacks in series and for the cogrowth sequence of Thompson's group $F$, we find exponential time algorithms which are significantly more efficient than the algorithms that previously existed in the literature. In each case an empirical analysis of the produced terms of the sequence leads to a prediction regarding its asymptotic form. In particular, this method leads us to conjecture that the growth rate of deque-sortable permutations is equal to that of 2sip-sortable permutations, a conjecture which we reduce to three conjectures of Albert and Bousquet-Mélou about quarter plane walks. The analysis of the cogrowth sequence of Thompson's group $F$ leads us to conjecture that $F$ is not amenable.
We also study the enumeration of $1324$-avoiding permutations, a notoriously difficult problem in the field of pattern avoiding permutations. Using a structural decomposition of these permutations, we improve the lower and upper bounds on the growth rate to $10.271$ and $13.5$ respectively.
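To make the object being counted concrete, here is a brute-force Python sketch that counts $1324$-avoiding permutations of small length. It is an illustration of the definition only, not the structural decomposition used in the thesis, and the function names are hypothetical. A permutation contains $1324$ if it has four entries, read left to right, whose relative order is smallest, third smallest, second smallest, largest.

```python
from itertools import combinations, permutations


def avoids_1324(perm):
    """Return True if no four entries of perm form the pattern 1324."""
    for i, j, k, l in combinations(range(len(perm)), 4):
        a, b, c, d = perm[i], perm[j], perm[k], perm[l]
        # Order-isomorphic to 1, 3, 2, 4 means a < c < b < d.
        if a < c < b < d:
            return False
    return True


def count_1324_avoiders(n):
    """Count 1324-avoiding permutations of length n by exhaustive search."""
    return sum(1 for perm in permutations(range(n)) if avoids_1324(perm))


counts = [count_1324_avoiders(n) for n in range(1, 7)]
# counts == [1, 2, 6, 23, 103, 513]
```

Exhaustive search examines all $n!$ permutations, so it is only feasible for very small $n$ and gives no useful information about the growth rate.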
Next we investigate the concept of combinatorial Stieltjes moment sequences. We prove that the counting sequence of returns in any undirected locally-finite graph is a Stieltjes moment sequence. As a special case, this implies that any cogrowth sequence is a Stieltjes moment sequence. Based on empirical evidence, we conjecture that the counting sequence for $1324$-avoiding permutations is a Stieltjes moment sequence, which would imply an improved lower bound of $10.302$ on its growth rate.
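A sequence $(a_n)$ is a Stieltjes moment sequence exactly when the Hankel matrices $(a_{i+j})$ and $(a_{i+j+1})$ are positive semidefinite at every size. The Python sketch below, with hypothetical function names, checks a stronger finite condition on a prefix of a sequence: that all leading principal minors of both matrices up to a given size are strictly positive. Passing this check is consistent with the property but does not prove it, and a zero minor makes the check fail without disproving the property. The Catalan numbers are used as a known example.

```python
from fractions import Fraction
from math import comb


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def hankel_minors(seq, shift, size):
    """Determinants of the k x k matrices (seq[i + j + shift]) for k = 1..size."""
    return [
        determinant([[seq[i + j + shift] for j in range(k)] for i in range(k)])
        for k in range(1, size + 1)
    ]


def passes_stieltjes_minor_check(seq, size):
    """Check strict positivity of Hankel minors; needs len(seq) >= 2 * size."""
    if len(seq) < 2 * size:
        raise ValueError("sequence too short for the requested size")
    return all(d > 0 for shift in (0, 1) for d in hankel_minors(seq, shift, size))


catalan = [comb(2 * n, n) // (n + 1) for n in range(8)]
catalan_ok = passes_stieltjes_minor_check(catalan, 4)
linear_ok = passes_stieltjes_minor_check([1, 2, 3, 4], 2)
```

For the Catalan numbers every one of these minors equals 1, so the check passes, whereas the prefix 1, 2, 3, 4 fails because its 2 by 2 Hankel determinant is negative.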
We then describe a general class of counting sequences of augmented perfect matchings, which we prove to be Stieltjes moment sequences. In fact, we prove the stronger result that these sequences are Hankel totally-positive as sequences of polynomials. As a special case, we show that the Ward polynomials are Hankel totally-positive.
In the final chapter we generalise an identity of Duminil-Copin and Smirnov for the $O(n)$ loop model on the hexagonal lattice to the off-critical case. In the $n=0$ case, which corresponds to the enumeration of self-avoiding walks, we use our identity to prove a relationship between the half-plane surface critical exponents $\gamma_{1}$ and $\gamma_{11}$ and the exponent characterising the winding angle distribution of self-avoiding walks in the half-plane.
© 2018 Dr Andrew Elvey Price
2018-01-01T00:00:00Z