dc.contributor.author Kwok, Chun Fung
dc.date.accessioned 2019-10-15T03:15:12Z
dc.date.available 2019-10-15T03:15:12Z
dc.date.issued 2019
dc.identifier.uri http://hdl.handle.net/11343/228925
dc.description © 2019 Chun Fung Kwok
dc.description.abstract This thesis examines three problems in statistics: the missing data problem in the context of extracting trends from time series data, the combinatorial model selection problem in regression analysis, and the structure learning problem in graphical modelling / system identification. The goal of the first problem is to study how uncertainty in the missing data affects trend extraction. This work derives an analytical bound that characterises the error of the estimated trend in terms of the error of the imputation. The bound holds for any imputation method and for various trend-extraction methods, including a large subclass of linear filters and the Seasonal-Trend decomposition based on Loess (STL). The second problem is to tackle the combinatorial complexity that arises from best-subset selection in regression analysis. Given $p$ variables, a model can be formed by taking a subset of the variables, so the total number of models is $2^p$. This work shows that if a hierarchical structure can be established on the model space, then the proposed algorithm, Gibbs Stochastic Search (GSS), can recover the true model with probability one in the limit and with high probability in finite samples. The core idea is that when a hierarchical structure exists, every evaluation of a wrong model gives information about the correct model. By aggregating this information, one may recover the correct model without exhausting the model space. As an extension, parallelisation of the algorithm is also considered. The third problem is about inferring from data the systemic relationship between a set of variables. This work proposes a flexible class of multivariate distributions in the form of a directed acyclic graphical model, which uses a graph and models each node conditional on the rest using a Generalised Linear Model (GLM). It shows that while the number of possible graphs is $\Omega(2^{p \choose 2})$, a hierarchical structure exists and the GSS algorithm applies. Hence, a systemic relationship may be recovered from the data. Other applications, such as imputing missing data and simulating data with complex covariance structure, are also investigated.
dc.rights Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works.
dc.subject trend extraction
dc.subject STL
dc.subject Loess
dc.subject linear filters
dc.subject time series analysis
dc.subject missing data analysis
dc.subject combinatorial model selection
dc.subject stochastic search
dc.subject generalised linear model
dc.subject GLM
dc.subject best subset model selection
dc.subject Gibbs sampler
dc.subject Markov chain Monte Carlo
dc.subject structure learning
dc.subject graphical models
dc.title Missing data analysis, combinatorial model selection and structure learning
dc.type PhD thesis
melbourne.affiliation.department School of Mathematics and Statistics
melbourne.affiliation.faculty Science
melbourne.thesis.supervisorname Guoqi Qian
melbourne.contributor.author Kwok, Chun Fung
melbourne.thesis.supervisorothername Yuriy Kuleshov
melbourne.tes.fieldofresearch1 010405 Statistical Theory
melbourne.tes.fieldofresearch2 010401 Applied Statistics
melbourne.tes.confirmed true
melbourne.accessrights Open Access