Assessing risk prediction models in case-control studies using semiparametric and nonparametric methods.


The predictiveness curve is a graphical tool that characterizes the population distribution of Risk(Y) = P(D = 1 | Y), where D denotes a binary outcome such as occurrence of an event within a specified time period and Y denotes predictors. A wider distribution of Risk(Y) indicates better performance of a risk model, in the sense that treatment recommendations are easier to make for more subjects: decisions are more straightforward when a subject's risk is deemed to be high or low. Methods have been developed to estimate predictiveness curves from cohort studies. However, early phase studies to evaluate novel risk prediction markers typically employ case-control designs. Here, we present semiparametric and nonparametric methods for evaluating a continuous risk prediction marker that accommodate case-control data. Small-sample properties are investigated through simulation studies. The semiparametric methods are substantially more efficient than their nonparametric counterparts under a correctly specified model. We generalize them to settings where multiple prediction markers are involved. Applications to prostate cancer risk prediction markers illustrate methods for comparing the risk prediction capacities of markers and for evaluating the increment in performance gained by adding a marker to a baseline risk model. To assess calibration of the risk model, we also propose a modified Hosmer-Lemeshow test for case-control study data, a natural complement to this graphical tool.
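
As a rough illustration of how a predictiveness curve might be estimated from case-control data, the following Python sketch fits a one-marker logistic model, shifts its intercept to a population scale, and reweights cases and controls to recover the population distribution of risk. It is not the estimator developed in the paper. It assumes that the disease prevalence `rho` is known from an external source and that a logistic model for Risk(Y) is adequate; the function names, marker values, and prevalence are all hypothetical.

```python
import math

def fit_logistic(y, d, iters=50):
    """Newton-Raphson fit of logit P(D=1 | Y) = b0 + b1 * Y in the sampled data."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, di in zip(y, d):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * yi)))
            w = p * (1.0 - p)
            g0 += di - p              # score for the intercept
            g1 += (di - p) * yi       # score for the slope
            h00 += w                  # entries of the information matrix
            h01 += w * yi
            h11 += w * yi * yi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < 1e-10:
            break
    return b0, b1

def predictiveness_curve(cases, controls, rho):
    """Return sorted (v, risk) pairs: risk is the v-th population quantile of Risk(Y).

    rho is the assumed population prevalence P(D=1); it cannot be
    estimated from the case-control sample itself.
    """
    n_case, n_ctrl = len(cases), len(controls)
    y = list(cases) + list(controls)
    d = [1] * n_case + [0] * n_ctrl
    b0, b1 = fit_logistic(y, d)
    # Case-control sampling distorts only the intercept of a logistic model:
    # replace the sample odds of being a case with the population odds.
    b0_pop = b0 - math.log(n_case / n_ctrl) + math.log(rho / (1.0 - rho))
    points = []
    for yi, di in zip(y, d):
        risk = 1.0 / (1.0 + math.exp(-(b0_pop + b1 * yi)))
        # Weight each subject so that cases carry total mass rho
        # and controls carry total mass 1 - rho.
        weight = rho / n_case if di else (1.0 - rho) / n_ctrl
        points.append((risk, weight))
    points.sort()
    curve, cum = [], 0.0
    for risk, weight in points:
        cum += weight
        curve.append((cum, risk))
    return curve

# Hypothetical marker values and prevalence, for illustration only.
cases = [1.2, 0.8, 2.1, 1.5, 0.3, 1.9, 2.5, 1.1]
controls = [0.1, -0.5, 0.9, 0.4, -1.2, 1.3, 0.0, 0.6]
curve = predictiveness_curve(cases, controls, rho=0.1)
```

Plotting the second element of each pair against the first gives the curve. A steeper curve corresponds to a wider distribution of Risk(Y), which the abstract describes as better performance of the risk model.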

  • Huang Y
  • Pepe MS
Appears In
Stat Med, 2010, 29 (13)