dc.contributor.author: Kwok, Chun Fung
dc.date.accessioned: 2019-10-15T03:15:12Z
dc.date.available: 2019-10-15T03:15:12Z
dc.date.issued: 2019
dc.identifier.uri: http://hdl.handle.net/11343/228925
dc.description: © 2019 Chun Fung Kwok
dc.description.abstract: This thesis examines three problems in statistics: the missing data problem in the context of extracting trends from time series data, the combinatorial model selection problem in regression analysis, and the structure learning problem in graphical modelling / system identification. The goal of the first problem is to study how uncertainty in the missing data affects trend extraction. This work derives an analytical bound that characterises the error of the estimated trend in terms of the error of the imputation. The bound holds for any imputation method and for various trend-extraction methods, including a large subclass of linear filters and the Seasonal-Trend decomposition based on Loess (STL). The second problem concerns the combinatorial complexity that arises from best-subset selection in regression analysis. Given $p$ variables, a model can be formed by taking a subset of the variables, so the total number of models is $2^p$. This work shows that if a hierarchical structure can be established on the model space, then the proposed algorithm, Gibbs Stochastic Search (GSS), can recover the true model with probability one in the limit and with high probability in finite samples. The core idea is that when a hierarchical structure exists, every evaluation of a wrong model gives information about the correct model. By aggregating this information, one may recover the correct model without exhausting the model space. As an extension, parallelisation of the algorithm is also considered. The third problem is about inferring from data the systemic relationship between a set of variables. This work proposes a flexible class of multivariate distributions in the form of a directed acyclic graphical model, which uses a graph and models each node conditional on the rest using a Generalised Linear Model (GLM). It shows that while the number of possible graphs is $\Omega(2^{\binom{p}{2}})$, a hierarchical structure exists and the GSS algorithm applies. Hence, a systemic relationship may be recovered from the data. Other applications, such as imputing missing data and simulating data with complex covariance structure, are also investigated.
dc.rights: Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works.
dc.subject: trend extraction
dc.subject: STL
dc.subject: Loess
dc.subject: linear filters
dc.subject: time series analysis
dc.subject: missing data analysis
dc.subject: combinatorial model selection
dc.subject: stochastic search
dc.subject: generalised linear model
dc.subject: GLM
dc.subject: best subset model selection
dc.subject: Gibbs sampler
dc.subject: Markov chain Monte Carlo
dc.subject: structure learning
dc.subject: graphical models
dc.title: Missing data analysis, combinatorial model selection and structure learning
dc.type: PhD thesis
melbourne.affiliation.department: School of Mathematics and Statistics
melbourne.affiliation.faculty: Science
melbourne.thesis.supervisorname: Guoqi Qian
melbourne.contributor.author: Kwok, Chun Fung
melbourne.thesis.supervisorothername: Yuriy Kuleshov
melbourne.tes.fieldofresearch1: 010405 Statistical Theory
melbourne.tes.fieldofresearch2: 010401 Applied Statistics
melbourne.tes.confirmed: true
melbourne.accessrights: Open Access