  • Poster presentation
  • Open Access

Simulation study - handling missing covariates in the context of external validation

  • 1Email author,
  • 2,
  • 1 and
  • 1
Trials 2011, 12(Suppl 1):A62

https://doi.org/10.1186/1745-6215-12-S1-A62

  • Published:

Keywords

  • Simulation Study
  • External Validation
  • Random Selection
  • Bootstrap Sample
  • Prognostic Model

Objectives

Before a predictive or prognostic model, often developed using data from clinical trials, can be introduced into general practice, it needs to be externally validated to ensure that it performs satisfactorily in data sets that are fully independent of the development data. Various methods exist to handle covariates with missing data, and many of these are regularly employed in the analysis of data from a single trial. We propose that several of these strategies may be adapted to handle covariates for which every entry is missing in the validation data set.

Methods

A simulation study was undertaken to test the suitability of our five proposed strategies: (1) random selection with replacement, (2) hot-deck imputation, (3) single imputation via estimation, (4) random selection with replacement multiple times, (5) using only covariates common to both the development and validation data sets. Survival times were simulated from the Cox-exponential model, with a binary censoring indicator variable. Up to two binary, two continuous and two categorical covariates were simulated via binomial, log-normal and multinomial distributions respectively. To assess how the methods perform in general, three statistics were calculated across 1000 bootstrap samples: (i) estimated regression coefficients from the model fit to the validation set, (ii) associated standard deviations, (iii) mean square errors of the parameter estimates from the development and validation sets.
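The data-generating step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the usual inverse-transform form for Cox-exponential survival times, T = -log(U) / (lambda * exp(beta'x)), and the function name, parameter values, covariate probabilities and the independent exponential censoring mechanism are all hypothetical choices made for the example.

```python
import math
import random


def simulate_patient(rng, betas, baseline_hazard=0.1, censor_rate=0.05):
    """Simulate one patient: covariates, observed time and event indicator.

    All numeric settings here are illustrative assumptions, not values
    taken from the study.
    """
    # One binary covariate (a binomial draw with n = 1).
    x_bin = 1 if rng.random() < 0.5 else 0
    # One continuous covariate from a log-normal distribution.
    x_cont = rng.lognormvariate(0.0, 0.5)
    # One three-level categorical covariate (a multinomial draw with n = 1).
    x_cat = rng.choices([0, 1, 2], weights=[0.5, 0.3, 0.2])[0]

    # Dummy-code the categorical covariate with level 0 as the reference.
    covariates = [x_bin, x_cont, int(x_cat == 1), int(x_cat == 2)]
    linear_predictor = sum(b * x for b, x in zip(betas, covariates))

    # Cox model with exponential baseline hazard, by inverse transform.
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
    event_time = -math.log(u) / (baseline_hazard * math.exp(linear_predictor))

    # Assumed censoring mechanism: independent exponential censoring times.
    censor_time = rng.expovariate(censor_rate)

    return {
        "covariates": covariates,
        "time": min(event_time, censor_time),
        "event": 1 if event_time <= censor_time else 0,  # 1 = event, 0 = censored
    }


rng = random.Random(2011)
development_set = [simulate_patient(rng, betas=[0.5, 0.3, -0.2, 0.4]) for _ in range(200)]
```

A validation set could be generated the same way and then have one covariate column removed entirely, to mimic the missing-covariate setting the strategies are meant to address.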

Results

Preliminary results suggest that random selection with replacement multiple times was the most consistent method: the mean difference between the actual regression coefficients from the development set and those estimated from the validation set was only 0.02, compared with 0.10 for random selection with replacement, 0.09 for imputation via estimation and 0.05 for hot-deck imputation. Standard deviations were fairly constant across methods (1) to (4). Results for method (5) are to follow, together with mean square errors for all five methods.

Conclusion

Random selection with replacement multiple times may offer a solution to externally validating a predictive or prognostic model when at least one covariate is missing from the validation data set. The simulation study described is an over-simplification of reality, so it leads to more favourable results than can be expected in everyday applications. It also does not consider associations between variables. Further work is required to determine how the methods perform in alternative settings and in real life.
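One possible reading of random selection with replacement, applied multiple times, is sketched below. The abstract does not spell out the procedure, so this assumes that values for the wholly missing covariate are drawn with replacement from that covariate's observed values in the development set, and that a statistic of interest is averaged over the repeated draws. The function names and the `statistic` callback are hypothetical.

```python
import random
from statistics import mean


def impute_by_random_selection(dev_values, n_validation, rng):
    """Fill a wholly missing covariate by sampling development values with replacement."""
    return rng.choices(dev_values, k=n_validation)


def repeat_imputation(dev_values, n_validation, statistic, n_repeats=20, seed=0):
    """Repeat the random selection several times and average a statistic.

    `statistic` takes one imputed column and returns a number. In a real
    analysis it would refit the model on the validation data and return,
    for example, a regression coefficient; here it is left to the caller.
    """
    rng = random.Random(seed)
    results = [
        statistic(impute_by_random_selection(dev_values, n_validation, rng))
        for _ in range(n_repeats)
    ]
    return mean(results)


# Toy usage: a binary covariate observed in development, absent in validation.
dev_binary = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]
average_proportion = repeat_imputation(
    dev_binary, n_validation=50, statistic=lambda column: sum(column) / len(column)
)
```

Setting `n_repeats=1` corresponds to a single random selection with replacement under the same assumptions.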

Declarations

Acknowledgements

This programme (RP-PG-0606-1062) receives financial support from the National Institute for Health Research (NIHR) Programme Grants for Applied Research funding scheme.

Authors’ Affiliations

(1)
Department of Biostatistics, University of Liverpool, Liverpool, L69 3GS, UK
(2)
Clinical and Molecular Pharmacology, University of Liverpool, Liverpool, L69 3GS, UK

Copyright

© Bonnett et al; licensee BioMed Central Ltd. 2011

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
