Journal Article
Biomedical research may uncover insights into how the treatments of a disease interact with patient covariates. The authors show how to use such insights to improve the efficiency of adaptive clinical trials for precision medicine by extending ideas from optimal Bayesian learning. They present a model for response-adaptive multiarm clinical trials that leverages knowledge about the predictive-prognostic covariate structure to accelerate the learning of personalized treatment strategies that obtain the best expected outcomes for post-trial patients. Their base model is a contextual linear bandit for best-arm identification in which outcomes may be observed with delay. They characterize the optimal policy for sequentially allocating treatments to in-trial patients and, because that policy is hard to compute, propose several computable heuristics based on Bayesian one-step look-ahead techniques. They prove that several of the proposed heuristics are asymptotically optimal in learning treatment strategies. Numerical results from two case studies motivated by sepsis management show that the heuristics can significantly improve the efficiency with which a clinical trial learns a treatment strategy for precision medicine. The authors provide two extensions: one allows for rewards from the outcomes of in-trial patients (addressing the exploration-exploitation tradeoff), and the other infers covariate structure using Lasso when biomedical insights on that structure are lacking. The proposed trial design is of interest to funders, designers, and managers of clinical trials. It may also apply to other contextual bandit problems in settings where insights about covariate-treatment interactions are available.
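To make the base model concrete, the following is a minimal sketch of a contextual linear bandit with a Gaussian posterior per treatment arm and a post-trial recommendation rule. It is not the authors' method: the class `ArmPosterior`, the functions `recommend` and `simulate`, the coefficients in `true_theta`, the known noise variance, and the round-robin allocation are all assumptions made for illustration. In particular, the round-robin rule is a non-adaptive placeholder; the paper's look-ahead heuristics, delayed outcomes, and covariate-structure knowledge are not reproduced here.

```python
import numpy as np


class ArmPosterior:
    """Gaussian posterior over one arm's coefficients (noise variance assumed known)."""

    def __init__(self, dim, prior_var=10.0, noise_var=1.0):
        # Prior: theta ~ N(0, prior_var * I), stored in precision form.
        self.precision = np.eye(dim) / prior_var
        self.b = np.zeros(dim)  # precision-weighted mean
        self.noise_var = noise_var

    def update(self, x, y):
        """Conjugate Bayesian linear-regression update for one (covariates, outcome) pair."""
        x = np.asarray(x, dtype=float)
        self.precision += np.outer(x, x) / self.noise_var
        self.b += x * y / self.noise_var

    @property
    def mean(self):
        """Posterior mean of the coefficient vector."""
        return np.linalg.solve(self.precision, self.b)


def recommend(arms, x):
    """Post-trial rule: pick the arm with the highest posterior mean outcome at x."""
    x = np.asarray(x, dtype=float)
    return int(np.argmax([x @ arm.mean for arm in arms]))


def simulate(n_patients=400, seed=0):
    """Run a toy two-arm trial with hypothetical coefficients."""
    rng = np.random.default_rng(seed)
    # Hypothetical truth: arm 0 is better for large x[1], arm 1 for small x[1].
    true_theta = np.array([[0.0, 1.0], [0.5, -1.0]])
    arms = [ArmPosterior(dim=2) for _ in true_theta]
    for t in range(n_patients):
        x = np.array([1.0, rng.uniform(-1.0, 1.0)])  # intercept + one covariate
        a = t % len(arms)  # round-robin placeholder, NOT an adaptive policy
        y = true_theta[a] @ x + rng.normal()
        arms[a].update(x, y)
    return arms


if __name__ == "__main__":
    arms = simulate()
    for covariate in (-0.9, 0.9):
        print(covariate, recommend(arms, [1.0, covariate]))
```

A response-adaptive design would replace the round-robin line with a rule that uses the current posteriors to choose the next arm, which is where one-step look-ahead heuristics of the kind the abstract describes would enter.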
Faculty
Professor of Technology and Operations Management
Associate Professor of Decision Sciences