Simulation study - handling missing covariates in the context of external validation

Bonnett, Laura J; Marson, Tony G; Williamson, Paula R; Tudur-Smith, Catrin

doi:10.1186/1745-6215-12-S1-A62

Volume 12 Supplement 1

Clinical Trials Methodology Conference 2011

Poster presentation
Open access
Published: 13 December 2011

Trials volume 12, Article number: A62 (2011)

Objectives

Before a predictive or prognostic model, often developed using data from clinical trials, can be introduced into general practice, it needs to be externally validated to ensure that it performs satisfactorily in data sets that are fully independent of the development data. Various methods exist to handle covariates with missing data, and many of these are regularly employed in the analysis of data from a single trial. We propose that several of these strategies may be adapted for external validation when a covariate is missing for every subject in the validation data set.

Methods

A simulation study was undertaken to test the suitability of our five proposed strategies: (1) random selection with replacement, (2) hot deck imputation, (3) single imputation via estimation, (4) random selection with replacement multiple times, (5) using only covariates common to both development and validation data sets. Survival times were simulated via the Cox-exponential distribution with a binary censoring indicator variable. Up to two binary, two continuous and two categorical covariates were simulated via binomial, log-normal and multinomial distributions respectively. To assess how the methods perform in general, three statistics were calculated across 1000 bootstrap samples: (i) estimated regression coefficients from the model fitted to the validation set, (ii) associated standard deviations, (iii) mean square errors of the parameter estimates from the development and validation sets.
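
The data-generating step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: it assumes "Cox-exponential" means a Cox model with a constant (exponential) baseline hazard, sampled by inverse transform as T = -log(U) / (lambda * exp(linear predictor)), and it assumes administrative censoring at a fixed time. The function names, coefficient values, baseline hazard, censoring time and covariate distribution parameters are all hypothetical.

```python
import math
import random


def simulate_subject(rng, betas, baseline_hazard=0.1, censor_time=10.0):
    """Simulate one subject: covariates, observed time and event indicator."""
    x_binary = 1 if rng.random() < 0.5 else 0                         # binomial(1, 0.5)
    x_continuous = rng.lognormvariate(0.0, 0.5)                       # log-normal
    x_category = rng.choices([0, 1, 2], weights=[0.5, 0.3, 0.2])[0]   # multinomial, 3 levels

    # Linear predictor; the categorical covariate enters as two dummy variables.
    lp = (betas["binary"] * x_binary
          + betas["continuous"] * x_continuous
          + betas["cat1"] * (x_category == 1)
          + betas["cat2"] * (x_category == 2))

    # Inverse-transform sampling from an exponential with rate lambda * exp(lp).
    u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is always defined
    event_time = -math.log(u) / (baseline_hazard * math.exp(lp))

    # Assumed censoring mechanism: follow-up stops at censor_time.
    observed_time = min(event_time, censor_time)
    event = 1 if event_time <= censor_time else 0
    return {"binary": x_binary, "continuous": x_continuous,
            "category": x_category, "time": observed_time, "event": event}


def simulate_cohort(n, betas, seed=0):
    """Simulate n independent subjects with a reproducible seed."""
    rng = random.Random(seed)
    return [simulate_subject(rng, betas) for _ in range(n)]


# Hypothetical "true" coefficients for the development model.
example_betas = {"binary": 0.5, "continuous": -0.3, "cat1": 0.2, "cat2": 0.7}
cohort = simulate_cohort(200, example_betas, seed=1)
```

A development set and a validation set could be produced by calling `simulate_cohort` twice with different seeds and then dropping one covariate from the validation rows to mimic a covariate that is entirely missing.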

Results

Preliminary results suggest that random selection with replacement multiple times was the most consistent method: the mean difference between the actual regression coefficients from the development set and those estimated from the validation set was only 0.02, compared with 0.10 for random selection with replacement, 0.09 for single imputation via estimation and 0.05 for hot deck imputation. Standard deviations were fairly constant across methods (1) to (4). Results for method (5), together with mean square errors for all five methods, are to follow.
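
For readers who want to see how such summaries might be computed, the sketch below takes a list of coefficient vectors estimated in repeated samples and reports, per coefficient, the mean estimate, its standard deviation, the difference from the development value and the mean square error. The function name and the toy numbers are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def summarise_estimates(development_coefs, replicate_estimates):
    """Summarise replicate estimates against the development coefficients.

    development_coefs: list of p coefficients from the development model.
    replicate_estimates: list of replicates, each a list of p estimates.
    """
    summary = []
    for j, target in enumerate(development_coefs):
        draws = [replicate[j] for replicate in replicate_estimates]
        average = mean(draws)
        summary.append({
            "mean_estimate": average,
            "sd": stdev(draws),                               # spread across replicates
            "difference": average - target,                   # validation minus development
            "mse": mean((d - target) ** 2 for d in draws),    # mean square error
        })
    return summary


# Toy input: two coefficients, two replicates (hypothetical values).
toy_summary = summarise_estimates([0.5, -1.0], [[0.4, -1.1], [0.6, -0.9]])
```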

Conclusion

Random selection with replacement multiple times may offer a solution to externally validating a predictive or prognostic model when at least one covariate is missing from the validation data set. However, the simulation study described oversimplifies reality and is therefore likely to give more favourable results than can be expected in everyday applications; in particular, it does not consider associations between variables. Further work is required to determine how the methods perform in alternative settings and with real data.
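
One possible reading of random selection with replacement multiple times is sketched below: values for the covariate that is absent from the validation set are drawn at random, with replacement, from the values observed in the development set; this is repeated several times and a statistic of interest is averaged over the completed data sets. The abstract does not spell out the procedure, so the sampling source, the pooling by simple averaging, the number of repeats and all names here are assumptions. In practice the statistic would be a refitted model coefficient rather than the simple mean used for illustration.

```python
import random
from statistics import mean


def impute_by_random_selection(validation_rows, development_values, name, rng):
    """Return copies of the validation rows with covariate `name` filled in
    by sampling with replacement from the development-set values."""
    return [dict(row, **{name: rng.choice(development_values)})
            for row in validation_rows]


def repeat_and_average(validation_rows, development_values, name,
                       statistic, n_repeats=20, seed=0):
    """Impute n_repeats times and average the statistic over the repeats."""
    rng = random.Random(seed)
    results = [
        statistic(impute_by_random_selection(
            validation_rows, development_values, name, rng))
        for _ in range(n_repeats)
    ]
    return mean(results)


# Hypothetical data: "treated" was recorded in development but not validation.
validation = [{"age": 30}, {"age": 40}, {"age": 55}]
development_treated = [0, 1, 1, 0, 1]

completed = impute_by_random_selection(
    validation, development_treated, "treated", random.Random(42))
pooled = repeat_and_average(
    validation, development_treated, "treated",
    statistic=lambda rows: mean(r["treated"] for r in rows))
```

Setting `n_repeats=1` corresponds to strategy (1), a single random selection with replacement.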

Acknowledgements

This programme (RP-PG-0606-1062) receives financial support from the National Institute for Health Research (NIHR) Programme Grants for Applied Research funding scheme.

Author information

Authors and Affiliations

Department of Biostatistics, University of Liverpool, Liverpool, L69 3GS, UK
Laura J Bonnett, Paula R Williamson & Catrin Tudur-Smith
Clinical and Molecular Pharmacology, University of Liverpool, Liverpool, L69 3GS, UK
Tony G Marson

Corresponding author

Correspondence to Laura J Bonnett.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Bonnett, L.J., Marson, T.G., Williamson, P.R. et al. Simulation study - handling missing covariates in the context of external validation. Trials 12 (Suppl 1), A62 (2011). https://doi.org/10.1186/1745-6215-12-S1-A62
