Methods Inf Med 2015; 54(06): 505-514
DOI: 10.3414/ME14-01-0113
Original Articles
Schattauer GmbH

Analysis of Clinical Cohort Data Using Nested Case-control and Case-cohort Sampling Designs

A Powerful and Economical Tool
K. Ohneberg (1, 2), M. Wolkewitz (1, 2), J. Beyersmann (3), M. Palomar-Martinez (4, 5), P. Olaechea-Astigarraga (6), F. Alvarez-Lerma (7), M. Schumacher (1)

1   Institute for Medical Biometry and Statistics, Medical Center University of Freiburg, Freiburg, Germany
2   Freiburg Center for Data Analysis and Modelling, University of Freiburg, Freiburg, Germany
3   Institute of Statistics, Ulm University, Ulm, Germany
4   Hospital Universitari Arnau de Vilanova, Lleida, Spain
5   Universitat Autónoma de Barcelona, Barcelona, Spain
6   Service of Intensive Care Medicine, Hospital de Galdakao-Usansolo, Bizkaia, Spain
7   Service of Intensive Care Medicine, Parc de Salut Mar, Barcelona, Spain

Publication History

received: 06 November 2014

accepted: 13 April 2015

Publication Date:
23 January 2018 (online)

Summary

Background: Sampling from a large cohort to obtain a subsample sufficient for statistical analysis is a frequently used way of handling large data sets in epidemiological studies with limited resources for exposure measurement. In clinical studies, however, when interest lies in the influence of a potential risk factor, cohort studies in which all individuals enter the analysis are often the first choice.

Objectives: Our aim is to close the gap between epidemiological and clinical studies with respect to design and power considerations. Schoenfeld’s formula for the number of events required in a Cox proportional hazards model is fundamental to this. Our objective is to compare the power of a full cohort analysis with that of a nested case-control design and a case-cohort design.
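As an illustration of the formula referred to above, the following minimal sketch computes the number of events that Schoenfeld's formula requires for a two-sided test of a binary covariate in a Cox model. The function name, the parameter names, and the example values (hazard ratio 2, half of the patients exposed, 5% significance level, 80% power) are assumptions chosen for illustration and are not taken from the article.

```python
from math import ceil, log
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, exposed_fraction, alpha=0.05, power=0.80):
    """Events required to detect `hazard_ratio` for a binary covariate.

    Schoenfeld's formula: d = (z_{1-alpha/2} + z_{power})^2 / (p (1 - p) (log HR)^2),
    where p is the fraction of patients with the covariate present.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = std_normal.inv_cdf(power)
    p = exposed_fraction
    events = (z_alpha + z_power) ** 2 / (p * (1 - p) * log(hazard_ratio) ** 2)
    return ceil(events)  # round up to a whole number of events


# Illustrative values only (not from the article)
print(schoenfeld_events(hazard_ratio=2.0, exposed_fraction=0.5))  # → 66
```

The requirement depends on the number of events rather than the number of patients, which is why a low event rate forces a large cohort.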

Methods: We compare formulas for power for sampling designs and cohort studies. In our data example we simultaneously apply a nested case-control design with a varying number of controls matched to each case, a case-cohort design with varying subcohort size, a random subsample and a full cohort analysis. For each design we calculate the standard error for estimated regression coefficients and the mean number of distinct persons for whom covariate information is required.
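To make the nested case-control design concrete, the sketch below draws, for every case, a fixed number of controls from the persons still at risk at the case's event time, and then counts the distinct persons for whom covariate information would be needed. It is a toy illustration under stated assumptions: the simulated cohort, the rates, the seed, and all function and variable names are hypothetical and do not reproduce the article's data example or software.

```python
import random


def nested_case_control(cohort, m, rng):
    """Sample m controls per case from the risk set at the case's event time.

    cohort: list of (person_id, observed_time, had_event) tuples.
    Returns a list of (case_id, [control_ids]) sampled risk sets.
    """
    sampled_sets = []
    for case_id, case_time, had_event in cohort:
        if not had_event:
            continue
        # Risk set: everyone still under observation at the case's event time
        risk_set = [pid for pid, t, _ in cohort if t >= case_time and pid != case_id]
        controls = rng.sample(risk_set, min(m, len(risk_set)))
        sampled_sets.append((case_id, controls))
    return sampled_sets


def distinct_persons(sampled_sets):
    """Number of distinct persons whose covariates must be collected."""
    ids = set()
    for case_id, controls in sampled_sets:
        ids.add(case_id)
        ids.update(controls)
    return len(ids)


# Hypothetical toy cohort with a low event rate (assumed rates, not study data)
rng = random.Random(1)
cohort = []
for pid in range(1000):
    event_time = rng.expovariate(0.05)
    censor_time = rng.expovariate(1.0)
    cohort.append((pid, min(event_time, censor_time), event_time <= censor_time))

sampled = nested_case_control(cohort, m=2, rng=rng)
n_cases = len(sampled)
print(n_cases, distinct_persons(sampled), len(cohort))
```

A person can serve as a control for several cases and can later become a case, so the number of distinct persons is at most the number of cases times (m + 1) and is often smaller.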

Results: The formulas for the power of a nested case-control design and of a case-cohort design are directly connected to the power of a cohort study through the well-known Schoenfeld formula. The loss in precision of parameter estimates is relatively small compared to the saving in resources.
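One way to picture this connection is the sketch below. It computes the approximate power of a full cohort analysis from the number of events via Schoenfeld's formula. For a nested case-control design with m controls per case, it scales the number of events by m/(m + 1), the commonly cited asymptotic relative efficiency under the null hypothesis. Whether the article uses exactly this factor is an assumption, the case-cohort design is not covered, and all names and example values are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def cohort_power(events, hazard_ratio, exposed_fraction, alpha=0.05):
    """Approximate power of a full cohort analysis (Schoenfeld), binary covariate.

    Uses the usual normal approximation and ignores the negligible opposite tail.
    """
    std_normal = NormalDist()
    p = exposed_fraction
    signal = sqrt(events * p * (1 - p)) * abs(log(hazard_ratio))
    return std_normal.cdf(signal - std_normal.inv_cdf(1 - alpha / 2))


def ncc_power(events, controls_per_case, hazard_ratio, exposed_fraction, alpha=0.05):
    """Approximate power of a nested case-control design.

    ASSUMPTION: the efficiency relative to the full cohort is m / (m + 1).
    """
    m = controls_per_case
    effective_events = events * m / (m + 1)
    return cohort_power(effective_events, hazard_ratio, exposed_fraction, alpha)


# Illustrative values only (not from the article)
print(round(cohort_power(66, 2.0, 0.5), 3))
for m in (1, 2, 4):
    print(m, round(ncc_power(66, m, 2.0, 0.5), 3))
```

Under this assumed factor, each additional control per case recovers a smaller share of the remaining power, which is the trade-off between precision and resources described above.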

Conclusions: Nested case-control and case-cohort studies, but not random subsamples, offer an attractive alternative for analyzing clinical studies when the event rate is low. Power calculations can be conducted straightforwardly to quantify the loss of power against the savings in the number of patients when a sampling design is used instead of analyzing the full cohort.