School of Mathematics and Statistics - Research Publications

  • Item
    Central subspaces review: methods and applications
    Rodrigues, SA ; Huggins, R ; Liquet, B (Institute of Mathematical Statistics, 2022-01-01)
  • Item
    A model for analyzing clustered occurrence data
    Hwang, W-H ; Huggins, R ; Stoklosa, J (WILEY, 2022-06)
    Spatial or temporal clustering commonly arises in various biological and ecological applications, for example, species or communities may cluster in groups. In this paper, we develop a new clustered occurrence data model where presence-absence data are modeled under a multivariate negative binomial framework. We account for spatial or temporal clustering by introducing a community parameter in the model that controls the strength of dependence between observations thereby enhancing the estimation of the mean and dispersion parameters. We provide conditions to show the existence of maximum likelihood estimates when cluster sizes are homogeneous and equal to 2 or 3 and consider a composite likelihood approach that allows for additional robustness and flexibility in fitting for clustered occurrence data. The proposed method is evaluated in a simulation study and demonstrated using forest plot data from the Center for Tropical Forest Science. Finally, we present several examples using multiple visit occupancy data to illustrate the difference between the proposed model and those of N-mixture models.
  • Item
    Estimating negative binomial parameters from occurrence data with detection times
    Hwang, W-H ; Huggins, R ; Stoklosa, J (WILEY, 2016-11)
    The negative binomial distribution is a common model for the analysis of count data in biology and ecology. In many applications, we may not observe the complete frequency count in a quadrat but only that a species occurred in the quadrat. If only occurrence data are available, then the two parameters of the negative binomial distribution, the aggregation index and the mean, are not identifiable. This can be overcome by data augmentation or through modeling the dependence between quadrat occupancies. Here, we propose to record the (first) detection time while collecting occurrence data in a quadrat. We show that under what we call proportionate sampling, where the time to survey a region is proportional to the area of the region, both negative binomial parameters are estimable. When the mean parameter is larger than two, our proposed approach is more efficient than the data augmentation method developed by Solow and Smith (Am. Nat. 176, 96-98), and in general is cheaper to conduct. We also investigate the effect of misidentification when collecting negative binomially distributed data, and conclude that, in general, the effect can be simply adjusted for provided that the mean and variance of misidentification probabilities are known. The results are demonstrated in a simulation study and illustrated in several real examples.
  • Item
    Nonparametric Estimation of the Number of Drug Users in Hong Kong Using Repeated Multiple Lists
    Huggins, RM ; Yip, PSF ; Stoklosa, J (WILEY, 2016-03)
    Summary: We update a previous approach to the estimation of the size of an open population when there are multiple lists at each time point. Our motivation is 35 years of longitudinal data on the detection of drug users by the Central Registry of Drug Abuse in Hong Kong. We develop a two‐stage smoothing spline approach. This gives a flexible and easily implemented alternative to the previous method, which was based on kernel smoothing. The new method retains the property of reducing the variability of the individual estimates at each time point. We evaluate the new method by means of a simulation study that includes an examination of the effects of variable selection. The new method is then applied to data collected by the Central Registry of Drug Abuse. The parameter estimates obtained are compared with the well‐known Jolly–Seber estimates based on single capture methods.
  • Item
    The score test for the two-sample occupancy model
    Karavarsamis, N ; Guillera-Arroita, G ; Huggins, RM ; Morgan, BJT (WILEY, 2020-04)
    Summary: The score test statistic from the observed information is easy to compute numerically. Its large sample distribution under the null hypothesis is well known and is equivalent to that of the score test based on the expected information, the likelihood‐ratio test and the Wald test. However, several authors have noted that under the alternative hypothesis this no longer holds and in particular the score statistic from the observed information can take negative values. We extend the anthology on the score test to a problem of interest in ecology when studying species occurrence. This is the comparison of two zero‐inflated binomial random variables from two independent samples under imperfect detection. An analysis of eigenvalues associated with the score test in this setting assists in understanding why using the observed information matrix in the score test can be problematic. We demonstrate through a combination of simulations and theoretical analysis that the power of the score test calculated under the observed information decreases as the populations being compared become more dissimilar. In particular, the score test based on the observed information is inconsistent. Finally, we propose a modified rule that rejects the null hypothesis when the score statistic computed using the observed information is negative or is larger than the usual chi‐square cut‐off. In simulations in our setting this has power that is comparable to the Wald and likelihood‐ratio tests and consistency is largely restored. Our new test is easy to use and inference is possible. Supplementary material for this article is available online as per journal instructions.
  • Item
    Improved Methodology for Assessment of mRNA Levels in Blood of Patients with FMR1 Related Disorders
    Godler, DE ; Loesch, DZ ; Huggins, R ; Gordon, L ; Slater, HR ; Gehling, F ; Burgess, T ; Choo, KHA (BIOMED CENTRAL LTD, 2009)
    BACKGROUND: Elevated levels of FMR1 mRNA in blood have been implicated in RNA toxicity associated with a number of clinical conditions. Due to the extensive inter-sample variation in the time lapse between the blood collection and RNA extraction in clinical practice, the resulting variation in mRNA quality significantly confounds mRNA analysis by real-time PCR. METHODS: Here, we developed an improved method to normalize for mRNA degradation in a sample set with large variation in rRNA quality, without sample omission. Initially, RNA samples were artificially degraded, and analyzed using capillary electrophoresis and the real-time PCR standard curve method, with the aim of defining the best predictors of total RNA and mRNA degradation. RESULTS: We found that: (i) the 28S:18S ratio and RNA quality indicator (RQI) were good predictors of severe total RNA degradation; however, the greatest changes in the quantity of different mRNAs (FMR1, DNMT1, GUS, B2M and GAPDH) occurred during the early to moderate stages of degradation; (ii) chromatographic features for the 18S, 28S and the inter-peak region were the most reliable predictors of total RNA degradation; however, their use for target gene normalization was inferior to internal control genes, of which GUS was the most appropriate. Using GUS for normalization, we examined in whole blood the relationship between the FMR1 mRNA and CGG expansion in a non-coding portion of this gene, in a sample set (n = 30) with large variation in rRNA quality. By combining FMR1 3' and 5' mRNA analyses, the confounding impact of mRNA degradation on the correlation between FMR1 expression and CGG size was minimized, and the biological significance increased from p = 0.046 for the 5' FMR1 assay to p = 0.018 for the combined FMR1 3' and 5' mRNA analysis. CONCLUSION: Our observations demonstrate that, through the use of an appropriate internal control and the direct analysis of multiple sites of target mRNA, samples that do not conform to the conventional rRNA criteria can still be utilized to obtain biologically/clinically relevant data. Although this strategy clearly has application for improved assessment of FMR1 mRNA toxicity in blood, it may also have more general implications for gene expression studies in fresh and archival tissues.
  • Item
    The cost of breast cancer recurrences.
    Hurley, SF ; Huggins, RM ; Snyder, RD ; Bishop, JF (Springer Science and Business Media LLC, 1992-03)
    Information about the costs of recurrent breast cancer is potentially important for targeting cost containment strategies and analysing the cost-effectiveness of breast cancer control programmes. We estimated these costs by abstracting health service and consumable usage data from the medical histories of 128 patients, and valuing each of the resources used. Resource usage and costs were summarised by regarding the recurrence as a series of episodes which were categorised into five anatomical site-based groups according to the following hierarchy: visceral, central nervous system (CNS), bone, local and other. Hospital visits and investigations comprised 78% of total costs for all episodes combined, and there were significant differences between the site-based groups in the frequency of hospital visits and most investigations. Total costs were most accurately described by separate linear regression models for each group, with the natural logarithm of the cost of the episode as the dependent variable, and predictor variables including the duration of the episode, duration squared, duration cubed and a variable indicating whether the episode was fatal. Visceral and CNS episodes were associated with higher costs than the other groups and were more likely to be shorter and fatal. A fatal recurrence of duration 15.7 months (the median for our sample) was predicted to cost $10,575 (Aus$ 1988; or 4,877 pounds). Reduction of the substantial costs of recurrent breast cancer is likely to be a sizable economic benefit of adjuvant systemic therapy and mammographic screening. We did not identify any major opportunities for cost containment during the management of recurrences.
  • Item
    A nonparametric estimation of the infection curve
    Lin, H ; Yip, PSF ; Huggins, RM (SCIENCE PRESS, 2011-09)
    Predicting the future course of an epidemic depends on being able to estimate the current numbers of infected individuals. However, while back-projection techniques allow reliable estimation of the numbers of infected individuals in the more distant past, they are less reliable in the recent past. We propose two new nonparametric methods to estimate the unobserved numbers of infected individuals in the recent past in an epidemic. The proposed methods are noniterative, easily computed and asymptotically normal with simple variance formulas. Simulations show that the proposed methods are much more robust and accurate than the existing back-projection method, especially for the recent past, which is our primary interest. We apply the proposed methods to the 2003 Severe Acute Respiratory Syndrome (SARS) epidemic in Hong Kong.
  • Item
    A chain multinomial model for estimating the real-time fatality rate of a disease, with an application to severe acute respiratory syndrome
    Yip, PSF ; Lau, EHY ; Lam, KF ; Huggins, RM (OXFORD UNIV PRESS INC, 2005-04-01)
    It is well known that statistics using cumulative data are insensitive to changes. World Health Organization (WHO) estimates of fatality rates are of the above type, which may not be able to reflect the latest changes in fatality due to treatment or government policy in a timely fashion. Here, the authors propose an estimate of a real-time fatality rate based on a chain multinomial model with a kernel function. It is more accurate than the WHO estimate in describing fatality, especially earlier in the course of an epidemic. The estimator provides useful information for public health policy makers for understanding the severity of the disease or evaluating the effects of treatments or policies within a shorter time period, which is critical in disease control during an outbreak. Simulation results showed that the performance of the proposed estimator is superior to that of the WHO estimator in terms of its sensitivity to changes and its timeliness in reflecting the severity of the disease.
  • Item
    Small population size and extremely low levels of genetic diversity in island populations of the platypus, Ornithorhynchus anatinus
    Furlan, E ; Stoklosa, J ; Griffiths, J ; Gust, N ; Ellis, R ; Huggins, RM ; Weeks, AR (WILEY, 2012-04)
    Genetic diversity generally underpins population resilience and persistence. Reductions in population size and absence of gene flow can lead to reductions in genetic diversity, reproductive fitness, and a limited ability to adapt to environmental change, increasing the risk of extinction. Island populations are typically small and isolated, and as a result, inbreeding and reduced genetic diversity elevate their extinction risk. Two island populations of the platypus, Ornithorhynchus anatinus, exist: a naturally occurring population on King Island in Bass Strait and a recently introduced population on Kangaroo Island off the coast of South Australia. Here we assessed the genetic diversity within these two island populations and contrasted these patterns with genetic diversity estimates in areas from which the populations are likely to have been founded. On Kangaroo Island, we also modeled live capture data to determine estimates of population size. Levels of genetic diversity in King Island platypuses are perilously low, with eight of 13 microsatellite loci fixed, likely reflecting their small population size and prolonged isolation. Estimates of heterozygosity detected by microsatellites (H(E) = 0.032) are among the lowest levels of genetic diversity recorded by this method in a naturally outbreeding vertebrate population. In contrast, estimates of genetic diversity on Kangaroo Island are somewhat higher. However, estimates of small population size and the limited founders combined with genetic isolation are likely to lead to further losses of genetic diversity through time for the Kangaroo Island platypus population. Implications for the future of these and similarly isolated or genetically depauperate populations are discussed.