Repo for the paper "Compatibility of Missing Data Handling Methods across the Clinical Prediction Model Pipeline", which is currently in preparation.
This paper investigates how the choice of missing data imputation method affects the estimated predictive performance of a clinical prediction model (CPM) across its development, validation and deployment. It aims to determine which combinations of imputation methods are compatible across the prediction model pipeline.
The repo contains the code and results from the simulation study and the empirical study described in the paper, organised as follows:
This contains the R scripts used in the simulation study, together with the R script used to analyse the NCORR data for the empirical study described in the paper. Much of the simulation code was run on the Computational Shared Facility (CSF) at the University of Manchester.
This contains two .RDS files: the summarised results of the simulation study and the summarised NCORR results.
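The .RDS files can be loaded into R with the base function `readRDS()`. The following is a minimal sketch only: the file name, the stand-in data frame and its column names are hypothetical placeholders, since the actual file names and the structure of the stored objects are not described here.

```r
# Hypothetical file name: substitute the path to the actual .RDS file in this repo.
results_path <- file.path(tempdir(), "simulation_results_summary.RDS")

# Stand-in object so the sketch runs on its own. The real files hold the
# summarised simulation and NCORR results, whose structure may differ.
placeholder <- data.frame(id = 1:3, value = c(0.1, 0.2, 0.3))
saveRDS(placeholder, results_path)

# readRDS() restores the single R object stored in the file.
sim_results <- readRDS(results_path)

# Inspect the structure of the loaded object before working with it.
str(sim_results)
```

To use this with the repo's own files, drop the `saveRDS()` step and point `results_path` at the relevant .RDS file.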