Minerva Access is the University's Institutional Repository. It aims to collect, preserve, and showcase the intellectual output of staff and students of the University of Melbourne for a global audience.

    Evaluation of multiple imputation methods for dealing with missing longitudinal data

    Author
    De Silva, Anurika Priyanjali
    Date
    2018
    Affiliation
    Melbourne School of Population and Global Health
    Document Type
    PhD thesis
    Access Status
    This item is embargoed and will be available on 2021-02-13. This item is currently available to University of Melbourne staff and students only, login required.
    URI
    http://hdl.handle.net/11343/220625
    Description

    © 2018 Dr. Anurika Priyanjali De Silva

    Abstract
    Background: Missing data is a common problem in epidemiological studies and is especially prominent in longitudinal cohorts, as these studies require the participation of respondents at multiple waves. The statistical literature contains extensive research on handling missing data at a single time point, with multiple imputation (MI) being a widely used approach. However, there is limited guidance on using MI in complex longitudinal settings. My PhD focuses on the evaluation of MI methods in three specific settings commonly encountered in practice: 1) a time-dependent exposure with a non-linear trajectory over time, 2) a longitudinal categorical exposure with restrictions on transitions over time, and 3) a time-dependent outcome variable in a longitudinal study with sampling weights.

    Methods: I evaluated three MI methods currently available in the Stata statistical software: multivariate normal imputation (MVNI), fully conditional specification (FCS), and the two-fold fully conditional specification (two-fold FCS) algorithm. When handling missing longitudinal data, MVNI and FCS treat repeated measurements of the same variable as distinct variables, and they face convergence issues when there are many time points and/or many incomplete variables. The two-fold FCS algorithm was introduced to overcome these limitations, as it uses only information from the current and adjacent time points for imputation. In each scenario, various versions of the MI methods were evaluated using comprehensive simulation studies based on the Longitudinal Study of Australian Children (LSAC). The performance of these methods was evaluated for varying percentages of missing data, where data were either missing completely at random or missing at random. The methods were also compared using case studies from LSAC.

    Results: MVNI and FCS performed adequately when handling the incomplete time-dependent exposure with a non-linear trajectory, demonstrating the importance of including as much information as possible in the imputation model. If faced with convergence problems, the two-fold FCS algorithm may be used as long as the time window is large enough to capture the non-linear trajectory. Predictive mean matching within the FCS framework performed best for imputing an incomplete categorical variable with restrictions over time, while all other implementations of FCS faced convergence problems. MVNI followed by rounding, which transforms non-integer imputed values back into the original categories, resulted in biased estimates. Accounting for the restrictions within the imputation procedure was found to be important. All implementations of FCS incorporating sampling weights faced convergence issues. Meanwhile, MVNI performed well when the design variables used to generate the sampling weights were included in the imputation model. If these variables are unknown, information from the sampling weights can be incorporated by including either the sampling weights or the design stratum indicator as a fixed effect in the imputation model.

    Conclusions: While the existing MI methods, MVNI and FCS, can be used to handle incomplete longitudinal data, their performance varies depending on the scenario, suggesting that MI should be customised to the given setting. This research provides guidance on how to do so in specific scenarios, contributing to the literature on missing data methodology. Researchers are encouraged to be aware of these developments to ensure that missing data are handled effectively.
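
    The difference between standard MVNI/FCS and the two-fold FCS algorithm, as described in the Methods, comes down to which waves contribute predictors when a given wave is imputed. The following is a minimal illustrative sketch in Python, not the thesis's Stata implementation: the function names, the zero-based wave indices, and the `width` parameter are all assumptions introduced here for illustration only.

```python
def standard_predictors(n_waves):
    """Standard MVNI/FCS: repeated measures are treated as distinct
    variables, so every wave can contribute predictors."""
    return list(range(n_waves))


def two_fold_window(t, n_waves, width=1):
    """Two-fold FCS (sketch): when imputing wave t, use only waves within
    `width` of t. Wave t itself is included because other variables
    measured at the same wave are used as predictors.

    `width` is a hypothetical name for the time-window size.
    """
    if not 0 <= t < n_waves:
        raise ValueError("t must be a valid wave index")
    if width < 0:
        raise ValueError("width must be non-negative")
    first = max(0, t - width)           # clip at the first wave
    last = min(n_waves - 1, t + width)  # clip at the last wave
    return list(range(first, last + 1))


if __name__ == "__main__":
    n_waves = 5
    for t in range(n_waves):
        print(t, two_fold_window(t, n_waves, width=1))
```

    A wider window brings the predictor set closer to that of standard FCS, which is consistent with the finding that the window must be large enough to capture a non-linear trajectory.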
    Keywords
    multiple imputation; missing longitudinal data; multivariate normal imputation; fully conditional specification; sampling weights; restricted categorical variables; non-linear trajectories
