Targeting Prospective Customers: Robustness of Machine-Learning Methods to Typical Data Challenges

Journal Article
The authors investigate how firms can use the results of field experiments to optimize the targeting of promotions when prospecting for new customers. They evaluate seven widely used machine-learning methods using two large-scale field experiments. The first field experiment generates a common pool of training data for all seven methods. A second field experiment then validates the seven optimized policies, one from each method, together with uniform benchmark policies. The findings not only compare the performance of the targeting methods but also demonstrate how well the methods address common data challenges. The results reveal that when the training data are ideal, model-driven methods perform better than distance-driven methods and classification methods. However, this performance advantage vanishes in the presence of challenges that affect the quality of the training data, including the extent to which the training data capture details of the implementation setting. The challenges studied are covariate shift, concept shift, information loss through aggregation, and imbalanced data. Intuitively, the model-driven methods make better use of the information available in the training data, but their performance is more sensitive to deterioration in the quality of this information. The classification methods tested perform relatively poorly. The authors explain this poor performance in their setting and describe how it could be improved.

Associate Professor of Decision Sciences