Combining several screening tests: optimality of the risk score.
The development of biomarkers for cancer screening is an active area of research. While several biomarkers exist, none is sufficiently sensitive and specific on its own for population screening. Successful screening programs will therefore likely require combinations of multiple markers. We consider how to combine multiple disease markers for optimal performance of a screening program. We show that the risk score, defined as the probability of disease given data on multiple markers, is the optimal function in the sense that its receiver operating characteristic (ROC) curve is maximized at every point: at any given false positive rate, no other function of the markers achieves a higher true positive rate. The argument draws on the Neyman-Pearson lemma. This contrasts with the corresponding optimality result of classic decision theory, which is set in a Bayesian framework and is based on minimizing an expected loss function associated with decision errors. Ours is an optimality result defined from a strictly frequentist point of view and does not rely on the notion of associating costs with misclassifications. The implication for data analysis is that binary regression methods can be used to yield appropriate relative weightings of different biomarkers, at least in large samples. We propose some modifications to standard binary regression methods for application to the disease screening problem. We present a flexible, biologically motivated simulation model for cancer biomarkers and evaluate our methods by applying them to it. An application to real data concerning two ovarian cancer biomarkers is also presented. Our results are equally relevant to the more general medical diagnostic testing problem, where results of multiple tests or predictors are combined to yield a composite diagnostic test. Moreover, our methods justify the development of clinical prediction scores based on binary regression.
- McIntosh MW
- Pepe MS
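
The optimality claim in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation model or their proposed regression modifications. It assumes a hypothetical setting with two Gaussian markers, a shared covariance matrix, and a known prevalence, so that the true risk score can be written down directly from Bayes' rule. All parameter values and the helper name tpr_at_fpr are assumptions chosen for illustration. The sketch compares the true positive rate of the risk score with that of each marker alone at a few fixed false positive rates.

```python
import numpy as np

# Hypothetical setup (not from the paper): two markers, Gaussian in both
# groups with a shared covariance, and an assumed disease prevalence.
rng = np.random.default_rng(0)
n = 50_000
prevalence = 0.1
mu0 = np.array([0.0, 0.0])            # marker means, non-diseased
mu1 = np.array([1.0, 0.8])            # marker means, diseased
cov = np.array([[1.0, 0.0],
                [0.0, 1.0]])          # shared covariance (independent markers)

disease = rng.random(n) < prevalence
markers = rng.multivariate_normal(mu0, cov, size=n)
markers[disease] += mu1 - mu0         # shift the diseased subjects

# Log likelihood ratio for equal-covariance Gaussians:
#   (mu1 - mu0)^T S^{-1} y - 0.5 * (mu1^T S^{-1} mu1 - mu0^T S^{-1} mu0)
weights = np.linalg.solve(cov, mu1 - mu0)
const = -0.5 * (mu1 @ np.linalg.solve(cov, mu1) - mu0 @ np.linalg.solve(cov, mu0))
log_lr = markers @ weights + const

# Risk score P(D = 1 | markers) via Bayes' rule; it is a monotone
# increasing function of the likelihood ratio.
log_prior_odds = np.log(prevalence / (1.0 - prevalence))
risk = 1.0 / (1.0 + np.exp(-(log_prior_odds + log_lr)))


def tpr_at_fpr(score, is_diseased, fpr):
    """Empirical true positive rate when the threshold is set so that
    roughly a fraction `fpr` of non-diseased subjects test positive."""
    threshold = np.quantile(score[~is_diseased], 1.0 - fpr)
    return float(np.mean(score[is_diseased] > threshold))


results = {}
for fpr in (0.05, 0.10, 0.20):
    results[fpr] = {
        "risk score": tpr_at_fpr(risk, disease, fpr),
        "marker 1 alone": tpr_at_fpr(markers[:, 0], disease, fpr),
        "marker 2 alone": tpr_at_fpr(markers[:, 1], disease, fpr),
    }
    print(fpr, results[fpr])
```

In this assumed setting the log likelihood ratio is linear in the markers, which is the case where a logistic regression fitted to large samples would be expected to recover the same relative weightings. With real data the risk score is unknown and would have to be estimated, for example by binary regression as the abstract suggests.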