Identifying optimal biomarker combinations for treatment selection through randomized controlled trials.


Biomarkers associated with treatment-effect heterogeneity can be used to make treatment recommendations that optimize individual clinical outcomes. To accomplish this, statistical methods are needed to generate marker-based treatment-selection rules that most effectively reduce the population burden due to disease and treatment. Compared with the standard approach of deriving treatment-selection rules from a risk model, a more robust approach is to directly minimize an unbiased estimate of total disease and treatment burden over a pre-specified class of rules. This amounts to minimizing a weighted sum of 0-1 loss functions, which is computationally challenging because the 0-1 loss is nonsmooth. Huang and Fong, among others, proposed a method that uses the Ramp loss to approximate the 0-1 loss and solves the minimization problem through repeated constrained optimizations. The algorithm was shown to perform comparably to or better than competing estimators in various settings. Our aim in this article is to extend the algorithm to allow for variable selection in the presence of a large number of candidate markers.
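To illustrate how a Ramp loss can stand in for the 0-1 loss, the following Python sketch compares the two on a few margin values. The particular ramp used here (a clipped linear function with a hypothetical width parameter `s`) and the helper names are assumptions made for illustration; the abstract does not give the exact parameterization used by Huang and Fong.

```python
import numpy as np

def zero_one_loss(u):
    # 0-1 loss on a margin u: 1 when the margin is non-positive, else 0.
    return (np.asarray(u, dtype=float) <= 0).astype(float)

def ramp_loss(u, s=1.0):
    # Assumed ramp: 1 for u <= 0, 0 for u >= s, linear in between.
    # It is continuous, unlike the 0-1 loss, and approaches it as s -> 0.
    return np.clip(1.0 - np.asarray(u, dtype=float) / s, 0.0, 1.0)

def weighted_loss(weights, margins, loss):
    # Weighted sum of a per-subject loss (weights are illustrative only).
    return float(np.sum(np.asarray(weights, dtype=float) * loss(margins)))

margins = np.array([-1.0, 0.5, 2.0])
weights = np.array([1.0, 2.0, 3.0])
print(zero_one_loss(margins))                       # per-subject 0-1 losses
print(ramp_loss(margins))                           # per-subject ramp losses
print(weighted_loss(weights, margins, ramp_loss))   # weighted ramp objective
```

Shrinking `s` makes the ramp track the 0-1 loss more closely, at the cost of a steeper and harder-to-optimize surface.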

We develop an alternative method for deriving marker combinations that minimize the weighted sum of Ramp losses of Huang and Fong, based on data from randomized trials. The new algorithm estimates treatment-selection rules by repeatedly minimizing a smooth, differentiable objective function. By adding an L1 penalty, we extend the method to allow for feature selection and develop a coordinate descent algorithm to build the treatment-selection rule.
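As a rough illustration of how an L1 penalty combined with coordinate descent yields variable selection, the sketch below applies the idea to a penalized least-squares objective. This quadratic objective is a stand-in chosen for simplicity, not the smooth objective of the proposed method, and the function names and the penalty parameter `lam` are hypothetical.

```python
import numpy as np

def soft_threshold(z, lam):
    # Shrinks z toward zero by lam and sets it to exactly zero when |z| <= lam.
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def lasso_coordinate_descent(X, y, lam, n_iter=100):
    # Minimizes (1 / (2n)) * ||y - X b||^2 + lam * ||b||_1
    # by cycling through the coefficients one at a time.
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Partial residual with the j-th marker's contribution removed.
            r_j = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ r_j / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta

X = np.eye(2)
y = np.array([2.0, 0.1])
print(lasso_coordinate_descent(X, y, lam=0.5))
```

The soft-thresholding step is what sets weakly supported coefficients to exactly zero, which is how the penalty drops markers from the fitted rule.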

Through extensive simulation studies, we compared the performance of the proposed estimator with four existing approaches: (1) a logistic regression risk-modeling approach and three "direct optimizing" approaches, namely (2) the estimator of Huang and Fong, (3) the weighted support vector machine, and (4) weighted logistic regression. The proposed estimator performs comparably to that of Huang and Fong, and comparably to or better than the other estimators. When the number of markers is large, allowing for variable selection with the proposed estimator further improves treatment-selection performance. The proposed estimator also selects variables relevant to treatment selection more effectively than L1-penalized logistic regression and weighted logistic regression. We illustrate the application of the proposed methods using host-genetics data from an HIV vaccine trial.

The proposed estimator is appealing considering its effectiveness and conceptual simplicity. It has significant potential to contribute to the selection and combination of biomarkers for treatment selection in clinical practice.

  • Huang Y
Appears In
Clin Trials, 2015, 12 (4)