Paediatrics (RCH) - Theses

  • Item
    An exploration of multiple imputation strategies for handling missing data in composite scores with incomplete items
    Apajee, Jemishabye (2016)
    Missing data are common in medical research. One area where missing data can arise is in composite scores (or scale scores) when one or more of the items that form the scale is incomplete. A method that is becoming increasingly popular for handling missing data is multiple imputation (MI). In the context of missing data in scale scores, MI can be applied at either the item level or the scale level. Various strategies have been proposed in the literature for imputing missing data at the scale level and the item level. Yet there is little comparison of these strategies in longitudinal settings and not much guidance is available about how to best implement these strategies. The challenge with using the available strategies in longitudinal studies is that one may want to impute missing data in several scales, each of which comprises a large number of items that have been measured at several waves, leading to large imputation models which may result in convergence problems. It is therefore important to evaluate the performance of these strategies in longitudinal settings to provide proper guidance for users of MI. In this thesis, I used a simulation study and a real example from the Longitudinal Study of Australian Children (LSAC) to compare the performance of the four MI strategies that are available for handling missing data in composite scores within a longitudinal setting. These strategies are: scale-level imputation using scale scores as auxiliary variables; the “standard” item-level imputation, which uses other items as auxiliary variables; item-level imputation using scale scores as auxiliary variables; and item-level imputation using principal components, derived from other items, as auxiliary variables. I also compared the effect of implementing these strategies using two MI approaches, multivariate normal imputation (MVNI) and fully conditional specification (FCS). 
While the literature recommends item-level imputation over scale-level imputation, the research in this thesis demonstrates that when implemented using FCS, item-level imputation, with items from other scales as auxiliary variables, could produce biased parameter estimates. This research also provides support for using scale scores or principal components as auxiliary variables in item-level imputation models when the “standard” item-level imputation strategy cannot be used due to convergence problems.
  • Item
    An investigation of multiple imputation for missing data in a longitudinal study of mental health and behaviour in young people
    Rodwell, Laura (2015)
    Longitudinal studies involve the repeated follow-up of individuals over a period of time. In epidemiological research, longitudinal studies are used to observe changes in health and behaviour, and to investigate associations between risk factors and outcomes measured at a later time point. The current study uses data from the Victorian Adolescent Health Cohort Study (VAHCS), a longitudinal study of young people recruited in adolescence and followed into adulthood. During the adolescent phase, the VAHCS collected data on participants’ mental health and behavioural problems, including substance use and antisocial behaviour. During the young adult phase, data were also collected on key social-role transitions, including the completion of education, leaving the family home, becoming financially independent, forming a committed relationship, and having children. The epidemiological research presented in this thesis focused on the transition from school into the workforce, and examined the extent to which adolescent mental health and behavioural problems were associated with being not in employment, education, or training (NEET) in young adulthood. When analysing data from longitudinal studies, it is common to face the problem of missing data. To handle the missing data in the VAHCS, I used the method of multiple imputation. Multiple imputation is a powerful tool for the analysis of incomplete data, but it requires a range of decisions on issues that arise when building the imputation model. One such decision concerns how to impute different variable types. When outlining the proposed imputation model for the epidemiological analysis, I was unsure about which methods would be most appropriate for the imputation of two types of variable. The first type was a limited-range variable, which was restricted at both ends of its range. The second type of variable was a semi-continuous variable that was to be categorised for analysis.
Semi-continuous variables have a large proportion of zeros and a continuous range of values otherwise. An example of a semi-continuous variable is weekly units of alcohol consumed. In the methodological research component of this thesis, simulation experiments were used to compare available methods for imputing missing values in these variables, with the goal of identifying the most appropriate method. A key finding from both simulation studies was that methods that required values to be rounded after imputation (either to the limits of the range or to discrete categories) performed poorly, producing estimates with the greatest bias. The results from the epidemiological analysis, with multiple imputation used to handle the missing data in the VAHCS, identified that around 8 per cent of males were NEET at the age of 21 years, decreasing to around 6 per cent at 24 years. The percentage of females who were NEET actually increased slightly from just below 9 per cent at the age of 21 years to 10 per cent at 24 years of age. The risk of being NEET in young adulthood was higher for participants who reported repeated antisocial behaviours and frequent cannabis use, and who had a longer duration of mental health disorder in adolescence.