Summary
Objectives:
In oncological studies, the hazard rate can be used to differentiate subgroups of
the study population according to their patterns of survival risk over time. Nonparametric
curve estimation has been suggested as an exploratory means of revealing such patterns.
The choice of the type of smoothing parameter is critical for performance in
practice. In this paper, we study data-adaptive smoothing.
Methods:
A decade ago, the nearest-neighbor bandwidth was introduced for censored data in
survival analysis. It is specified by one parameter, namely the number of nearest
neighbors. Bandwidth selection in this setting has rarely been investigated, although
the heuristic advantages over the frequently studied fixed bandwidth are quite obvious.
The asymptotic relationship between the fixed and the nearest-neighbor bandwidth
can be used to generate novel approaches.
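To illustrate the nearest-neighbor bandwidth described above, the following minimal Python sketch evaluates a kernel hazard estimate whose bandwidth at time t is the distance from t to its k-th nearest uncensored observation. It is not the double-smoothing selection algorithm. The function names, the Epanechnikov kernel, the use of uncensored observations only for the neighbor distance, and the assumption of untied observation times are all illustrative choices rather than details taken from the study.

```python
import numpy as np


def epanechnikov(u):
    """Epanechnikov kernel with support [-1, 1] (an illustrative choice)."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)


def nn_bandwidth(t, times, events, k):
    """Distance from t to its k-th nearest uncensored observation.

    Assumption: neighbors are counted among uncensored observations only.
    """
    times = np.asarray(times, dtype=float)
    uncensored = times[np.asarray(events, dtype=bool)]
    if not 1 <= k <= uncensored.size:
        raise ValueError("k must lie between 1 and the number of events")
    return np.sort(np.abs(uncensored - t))[k - 1]


def kernel_hazard(t, times, events, bandwidth):
    """Kernel-smoothed Nelson-Aalen increments at time t (no ties assumed)."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=float)
    order = np.argsort(times, kind="stable")
    times, events = times[order], events[order]
    n = times.size
    at_risk = n - np.arange(n)      # n, n-1, ..., 1 subjects still at risk
    increments = events / at_risk   # zero for censored observations
    weights = epanechnikov((t - times) / bandwidth)
    return float(np.sum(weights * increments) / bandwidth)


def nn_hazard(t, times, events, k):
    """Hazard estimate at t using the k-nearest-neighbor bandwidth."""
    return kernel_hazard(t, times, events, nn_bandwidth(t, times, events, k))


if __name__ == "__main__":
    # Hypothetical toy data: follow-up times and event indicators (1 = event).
    toy_times = [1.0, 2.0, 3.0, 4.0, 5.0]
    toy_events = [1, 0, 1, 1, 0]
    for k in (2, 3):
        print(k, nn_hazard(2.5, toy_times, toy_events, k))
```

In this sketch the bandwidth widens automatically where events are sparse and narrows where they are dense, which is the heuristic advantage over a fixed bandwidth mentioned above. The single parameter k plays the role that the bandwidth value plays in fixed-bandwidth smoothing.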
Results:
We develop a new selection algorithm, termed double-smoothing, for the nearest-neighbor
bandwidth in hazard rate estimation. Our approach uses a finite-sample approximation
of the asymptotic relationship between the fixed and the nearest-neighbor bandwidth.
In doing so, we identify the nearest-neighbor bandwidth as an additional smoothing
step and achieve further data adaptation after fixed bandwidth smoothing. We illustrate
the application of the new algorithm in a clinical study and compare the outcome
with the traditional fixed bandwidth result to demonstrate the practical performance
of the technique.
Conclusion:
The double-smoothing approach enlarges the methodological repertoire for selecting
smoothing parameters in nonparametric hazard rate estimation. The slight increase
in computational effort is rewarded with a substantial gain in estimation stability,
which demonstrates the benefit of the technique for biostatistical applications.
Keywords
Disease-free survival - nonparametric statistics - statistical distributions - statistical
data interpretation