Business Administration - Research Publications

  • Item
    bamlss: A Lego Toolbox for Flexible Bayesian Regression (and Beyond)
    Umlauf, N ; Klein, N ; Simon, T ; Zeileis, A (JOURNAL STATISTICAL SOFTWARE, 2021-11)
  • Item
    Statistical model building: Background "knowledge" based on inappropriate preselection causes misspecification
    Hafermann, L ; Becher, H ; Herrmann, C ; Klein, N ; Heinze, G ; Rauch, G (BMC, 2021-09-29)
    BACKGROUND: Statistical model building requires selection of variables for a model depending on the model's aim. In descriptive and explanatory models, a common recommendation often met in the literature is to include all variables in the model which are assumed or known to be associated with the outcome independent of their identification with data driven selection procedures. An open question is, how reliable this assumed "background knowledge" truly is. In fact, "known" predictors might be findings from preceding studies which may also have employed inappropriate model building strategies. METHODS: We conducted a simulation study assessing the influence of treating variables as "known predictors" in model building when in fact this knowledge resulting from preceding studies might be insufficient. Within randomly generated preceding study data sets, model building with variable selection was conducted. A variable was subsequently considered as a "known" predictor if a predefined number of preceding studies identified it as relevant. RESULTS: Even if several preceding studies identified a variable as a "true" predictor, this classification is often false positive. Moreover, variables not identified might still be truly predictive. This especially holds true if the preceding studies employed inappropriate selection methods such as univariable selection. CONCLUSIONS: The source of "background knowledge" should be evaluated with care. Knowledge generated on preceding studies can cause misspecification.
  • Item
    Editorial "Joint modeling of longitudinal and time-to-event data and beyond"
    Suarez, CC ; Klein, N ; Kneib, T ; Molenberghs, G ; Rizopoulos, D (WILEY, 2017-11)
  • Item
    Studying the relationship between a woman's reproductive lifespan and age at menarche using a Bayesian multivariate structured additive distributional regression model
    Duarte, E ; de Sousa, B ; Cadarso-Suarez, C ; Klein, N ; Kneib, T ; Rodrigues, V (WILEY, 2017-11)
    Studies addressing breast cancer risk factors have been looking at trends relative to age at menarche and menopause. These studies point to a downward trend of age at menarche and an upward trend for age at menopause, meaning an increase of a woman's reproductive lifespan cycle. In addition to studying the effect of the year of birth on the expectation of age at menarche and a woman's reproductive lifespan, it is important to understand how a woman's cohort affects the correlation between these two variables. Since the behavior of age at menarche and menopause may vary with the geographic location of a woman's residence, the spatial effect of the municipality where a woman resides needs to be considered. Thus, a Bayesian multivariate structured additive distributional regression model is proposed in order to analyze how a woman's municipality and year of birth affect her age at menarche, her lifespan cycle, and the correlation of the two. The data consists of 212,517 postmenopausal women, born between 1920 and 1965, who attended the breast cancer screening program in the central region of Portugal.
  • Item
    Boosting joint models for longitudinal and time-to-event data
    Waldmann, E ; Taylor-Robinson, D ; Klein, N ; Kneib, T ; Pressler, T ; Schmid, M ; Mayr, A (WILEY, 2017-11)
    Joint models for longitudinal and time-to-event data have gained a lot of attention in the last few years as they are a helpful technique in clinical studies where longitudinal outcomes are recorded alongside event times. Those two processes are often linked and the two outcomes should thus be modeled jointly in order to prevent the potential bias introduced by independent modeling. Commonly, joint models are estimated in likelihood-based expectation maximization or Bayesian approaches using frameworks where variable selection is problematic and that do not immediately work for high-dimensional data. In this paper, we propose a boosting algorithm tackling these challenges by being able to simultaneously estimate predictors for joint models and automatically select the most influential variables even in high-dimensional data situations. We analyze the performance of the new algorithm in a simulation study and apply it to the Danish cystic fibrosis registry that collects longitudinal lung function data on patients with cystic fibrosis together with data regarding the onset of pulmonary infections. This is the first approach to combine state-of-the-art algorithms from the field of machine learning with the model class of joint models, providing a fully data-driven mechanism to select variables and predictor effects in a unified framework of boosting joint models.
  • Item
    Mixed binary-continuous copula regression models with application to adverse birth outcomes
    Klein, N ; Kneib, T ; Marra, G ; Radice, R ; Rokicki, S ; McGovern, ME (Wiley, 2019-02-10)
    Bivariate copula regression allows for the flexible combination of two arbitrary, continuous marginal distributions with regression effects being placed on potentially all parameters of the resulting bivariate joint response distribution. Motivated by the risk factors for adverse birth outcomes, many of which are dichotomous, we consider mixed binary-continuous responses that extend the bivariate continuous framework to the situation where one response variable is discrete (more precisely, binary) whereas the other response remains continuous. Utilizing the latent continuous representation of binary regression models, we implement a penalized likelihood-based approach for the resulting class of copula regression models and employ it in the context of modeling gestational age and the presence/absence of low birth weight. The analysis demonstrates the advantage of the flexible specification of regression impacts including nonlinear effects of continuous covariates and spatial effects. Our results imply that racial and spatial inequalities in the risk factors for infant mortality are even greater than previously suggested.
  • Item
    Bayesian Effect Selection in Structured Additive Distributional Regression Models
    Klein, N ; Carlan, M ; Kneib, T ; Lang, S ; Wagner, H (INT SOC BAYESIAN ANALYSIS, 2021-06)
  • Item
    Bayesian variable selection for non-Gaussian responses: a marginally calibrated copula approach
    Klein, N ; Smith, MS (WILEY, 2021-09)
    We propose a new highly flexible and tractable Bayesian approach to undertake variable selection in non-Gaussian regression models. It uses a copula decomposition for the joint distribution of observations on the dependent variable. This allows the marginal distribution of the dependent variable to be calibrated accurately using a nonparametric or other estimator. The family of copulas employed are "implicit copulas" that are constructed from existing hierarchical Bayesian models widely used for variable selection, and we establish some of their properties. Even though the copulas are high dimensional, they can be estimated efficiently and quickly using Markov chain Monte Carlo. A simulation study shows that when the responses are non-Gaussian, the approach selects variables more accurately than contemporary benchmarks. A real data example in the Web Appendix illustrates that accounting for even mild deviations from normality can lead to a substantial increase in accuracy. To illustrate the full potential of our approach, we extend it to spatial variable selection for fMRI. Using real data, we show our method allows for voxel-specific marginal calibration of the magnetic resonance signal at over 6000 voxels, leading to an increase in the quality of the activation maps.
  • Item
    Systematic review of education and practical guidance on regression modeling for medical researchers who lack a strong statistical background: Study protocol
    Bach, P ; Wallisch, C ; Klein, N ; Hafermann, L ; Sauerbrei, W ; Steyerberg, EW ; Heinze, G ; Rauch, G ; Bender, R (PUBLIC LIBRARY SCIENCE, 2020-12-21)
    In the last decades, statistical methodology has developed rapidly, in particular in the field of regression modeling. Multivariable regression models are applied in almost all medical research projects. Therefore, the potential impact of statistical misconceptions within this field can be enormous. Indeed, the current theoretical statistical knowledge is not always adequately transferred to the current practice in medical statistics. Some medical journals have identified this problem and published isolated statistical articles and even whole series thereof. In this systematic review, we aim to assess the current level of education on regression modeling that is provided to medical researchers via series of statistical articles published in medical journals. The present manuscript is a protocol for a systematic review that aims to assess which aspects of regression modeling are covered by statistical series published in medical journals that intend to train and guide applied medical researchers with limited statistical knowledge. Statistical paper series cannot easily be summarized and identified by common keywords in an electronic search engine like Scopus. We therefore identified series by a systematic request to statistical experts who are part of or related to the STRATOS Initiative (STRengthening Analytical Thinking for Observational Studies). Within each identified article, two raters will independently check the content of the articles with respect to a predefined list of key aspects related to regression modeling. The content analysis of the topic-relevant articles will be performed using a predefined report form to assess the content as objectively as possible. Any disputes will be resolved by a third reviewer. Summary analyses will identify potential methodological gaps and misconceptions that may have an important impact on the quality of analyses in medical research. This review will thus provide a basis for future guidance papers and tutorials in the field of regression modeling which will enable medical researchers 1) to interpret publications in a correct way, 2) to perform basic statistical analyses in a correct way and 3) to identify situations when the help of a statistical expert is required.
  • Item
    Quality and resource efficiency in hospital service provision: A geoadditive stochastic frontier analysis of stroke quality of care in Germany
    Pross, C ; Strumann, C ; Geissler, A ; Herwartz, H ; Klein, N ; Arrieta, A (PUBLIC LIBRARY SCIENCE, 2018-09-06)
    We specify a Bayesian, geoadditive Stochastic Frontier Analysis (SFA) model to assess hospital performance along the dimensions of resources and quality of stroke care in German hospitals. With 1,100 annual observations and data from 2006 to 2013 and risk-adjusted patient volume as output, we introduce a production function that captures quality, resource inputs, hospital inefficiency determinants and spatial patterns of inefficiencies. With high relevance for hospital management and health system regulators, we identify performance improvement mechanisms by considering marginal effects for the average hospital. Specialization and certification can substantially reduce mortality. Regional and hospital-level concentration can improve quality and resource efficiency. Finally, our results demonstrate a trade-off between quality improvement and resource reduction and substantial regional variation in efficiency.