Boosting with missing predictors.


Biostatistics. 2010 Apr; 11(2).

Boosting is an important tool in classification methodology. It combines the performance of many weak classifiers to produce a powerful committee, and its validity can be explained by additive modeling and maximum likelihood. The method has very general applications, especially for high-dimensional predictors. For example, it can be applied to distinguish cancer samples from healthy control samples by using antibody microarray data. Microarray data are often high-dimensional, and many such data sets are incomplete. One natural idea is to impute a missing variable based on the observed predictors. However, computing imputations for high-dimensional predictors with missing data may be rather tedious. In this paper, we propose 2 conditional mean imputation methods. They can be applied even when a complete-case subset does not exist. Simulation results indicate that the proposed methods are superior to naive methods. We apply the methods to a pancreatic cancer study in which serum protein microarrays are used for classification.
