Gesundheitswesen 2017; 79(08/09): 656-804
DOI: 10.1055/s-0037-1605796
Talks
Georg Thieme Verlag KG Stuttgart · New York

Accuracy estimation after model selection using bootstrapping: An application to clinical data

J Schöpe (1), A Bekhit (1), G Wagenpfeil (1), A Schneider (2), S Wagenpfeil (1)

1   Institute for Medical Biometry, Epidemiology and Medical Informatics, Saarland University, Campus Homburg, Homburg
2   Institute for General Practice, University Medical Center Klinikum rechts der Isar, TUM, Munich

Publication Date: 01 September 2017 (online)

Background:

Predictive modeling in conjunction with model selection is commonly used in clinical research. However, the uncertainty arising from model selection is not incorporated in the classical theory of statistical inference. To account for model selection in statistical inference, Efron [1] recently developed a formula that approximates the standard errors of smoothed estimators derived from bagging [2]. The primary aim of this study was therefore to implement and evaluate Efron's approach in R with an application to clinical data.
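
For orientation, the smoothed (bagged) estimator and the standard error approximation from [1] can be written as below. The notation is an assumption taken from common usage for that reference; the abstract itself does not define these symbols.

```latex
\tilde{s} = \frac{1}{B}\sum_{b=1}^{B} t^{*}_{b},
\qquad
\widehat{\operatorname{cov}}_{j} = \frac{1}{B}\sum_{b=1}^{B}
  \bigl(Y^{*}_{bj} - \bar{Y}^{*}_{\cdot j}\bigr)\bigl(t^{*}_{b} - \tilde{s}\bigr),
\qquad
\widetilde{\operatorname{sd}}_{B} = \Bigl(\sum_{j=1}^{n} \widehat{\operatorname{cov}}_{j}^{\,2}\Bigr)^{1/2}
```

Here B is the number of bootstrap replicates, n the number of observations, t*_b the estimate obtained after model selection on the b-th bootstrap sample, Y*_bj the number of times observation j appears in that sample, and \bar{Y}*_{.j} the average of Y*_bj over the B replicates.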

Methods:

Clinical data were obtained from a previously published study [3], which was designed to develop clinical prediction rules for diagnosing asthma in patients suspected of suffering from obstructive respiratory disease. Smoothed estimators were computed from non-parametric bootstrap replicates of stepwise-fitted binomial logistic regression models, and their standard errors, approximated using Efron's approach, were compared with results obtained from the delta method. Additionally, practical properties in extreme-case problems were assessed using simulations.
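
The bootstrap smoothing step and the standard error approximation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the study used R and stepwise logistic regression, whereas this sketch uses Python's standard library and a toy selection rule (`select_then_estimate`, a hypothetical stand-in that chooses between mean and median). The function names, the default number of replicates, and the seed are all assumptions made for the example.

```python
import math
import random
import statistics


def select_then_estimate(sample):
    """Toy stand-in for 'model selection, then estimation' (assumption):
    use the median if the sample contains an apparent outlier, else the mean."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)
    has_outlier = any(abs(x - mean) > 2.5 * sd for x in sample)
    return statistics.median(sample) if has_outlier else mean


def bagged_estimate_with_se(data, estimator, n_boot=2000, seed=1):
    """Return the smoothed (bagged) estimate and its approximate standard
    error, computed from the covariance between resampling counts and
    bootstrap estimates (the formula attributed to Efron in [1])."""
    rng = random.Random(seed)
    n = len(data)
    counts = []     # counts[b][j]: times observation j appears in replicate b
    estimates = []  # estimates[b]: estimator applied to replicate b
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        c = [0] * n
        for i in idx:
            c[i] += 1
        counts.append(c)
        estimates.append(estimator([data[i] for i in idx]))

    smoothed = statistics.fmean(estimates)
    mean_counts = [sum(c[j] for c in counts) / n_boot for j in range(n)]
    # Covariance between the count of observation j and the bootstrap estimate.
    cov = [
        sum((c[j] - mean_counts[j]) * (t - smoothed)
            for c, t in zip(counts, estimates)) / n_boot
        for j in range(n)
    ]
    se = math.sqrt(sum(cj * cj for cj in cov))
    return smoothed, se


if __name__ == "__main__":
    rng = random.Random(0)
    data = [rng.gauss(0.0, 1.0) for _ in range(30)]
    estimate, se = bagged_estimate_with_se(data, select_then_estimate)
    print(f"smoothed estimate = {estimate:.3f}, approximate SE = {se:.3f}")
```

When the estimator is the plain sample mean, the approximation should come out close to the plug-in standard error of the mean, which gives a simple sanity check. With a finite number of replicates the raw formula is biased slightly upward; the sketch omits any bias correction.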

Findings and Discussion:

Findings and possible implications for penalized regression, especially L1-norm regularization, will be discussed.

References:

[1] Efron B. Estimation and accuracy after model selection. Journal of the American Statistical Association. 2014;109: 991-1007.

[2] Breiman L. Bagging predictors. Machine Learning. 1996;24: 123-140.

[3] Schneider A, Wagenpfeil G, Jörres RA, Wagenpfeil S. Influence of the practice setting on diagnostic prediction rules using FENO measurement in combination with clinical signs and symptoms of asthma. BMJ Open. 2015;5: e009676. doi: 10.1136/bmjopen-2015-009676.