Statistical Learning and Contextual Stochastic Optimization: Separate or Integrate?


Event details

Date and time 21.09.2021, 17:00-19:00
Place and room
Speaker Nathan Kallus, Cornell University
Event Language English
Category Conferences - Seminars
Seminar organized by the Management of Technology & Entrepreneurship Institute


Contextual stochastic optimization (CSO) conditions on predictive observations available at decision time, which can reduce uncertainty and boost performance, but it also requires that we learn a potentially complex predictive relationship between those observations and the uncertain cost variables. While off-the-shelf ML methods can often be used to learn this relationship, their training loss ignores the downstream optimization task. Alternatively, we can train the predictive model in an end-to-end fashion to directly optimize the downstream costs of the decision policy it would induce. In this talk I will tackle the question: which is better? Should we separate or integrate the learning and optimization tasks?
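To make the contrast concrete, here is a minimal toy sketch (not from the talk, and with all names and parameters invented for illustration): a binary act/don't-act decision with uncertain cost, where the "separated" model is fit to a prediction (MSE) loss and the "integrated" model is chosen to minimize the downstream decision cost directly, with a crude grid search standing in for end-to-end gradient training.

```python
import numpy as np

# Toy CSO: context x, uncertain cost y, decision z in {0, 1}.
# Acting (z = 1) incurs the uncertain cost y; not acting costs a fixed 0.5.
rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(-1, 1, n)
y = 1.0 * (x > 0) + 0.1 * rng.standard_normal(n)  # cost depends on context

# --- Separated: fit E[y|x] by least squares, then optimize downstream. ---
A = np.column_stack([x, np.ones(n)])
a, b = np.linalg.lstsq(A, y, rcond=None)[0]       # linear model yhat = a*x + b

def decision_cost(theta):
    """Average downstream cost of the policy induced by a linear model."""
    pred = theta[0] * x + theta[1]
    z = (pred < 0.5).astype(float)                # act only if predicted cost beats 0.5
    return np.mean(z * y + (1 - z) * 0.5)

# --- Integrated (end-to-end): pick the model to minimize decision cost. ---
# A grid search stands in for gradient-based end-to-end training here.
grid = [(ga, gb) for ga in np.linspace(-2, 2, 41)
                 for gb in np.linspace(-1, 1, 41)]
theta_e2e = min(grid, key=decision_cost)

cost_separated = decision_cost((a, b))
cost_integrated = decision_cost(theta_e2e)
```

In this easy example both routes land on essentially the same policy; the talk's point is precisely about when and why their performance diverges.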

In the first part of this talk I will focus on contextual linear optimization (CLO), where the cost function is bilinear in the decision and uncertain variables and the only relevant aspect of the predictive relationship is the conditional expectation (a.k.a. the regression function). Surprisingly, I will show that the naive separated approach actually achieves regret convergence rates that are significantly faster than any end-to-end method that directly optimizes downstream decision performance. I show this by leveraging the fact that specific problem instances do not have arbitrarily bad near-dual-degeneracy and by developing appropriate upper and lower bounds. This is good news for practice: predictive models are easy and fast to train using existing ML tools, simple to interpret and reuse as predictions, and, as shown, lead to decisions that perform very well.
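The separated approach for CLO can be sketched in two lines of logic: regress the cost vector on the context, then plug the prediction into an off-the-shelf LP solver. The sketch below is an illustrative assumption, not the talk's implementation; the simplex feasible set and linear ground truth are stand-ins.

```python
import numpy as np
from scipy.optimize import linprog

# Contextual linear optimization: min_{z in Z} E[y | x]^T z, Z a polytope.
rng = np.random.default_rng(1)
n, p, d = 500, 3, 4                  # samples, context dim, decision dim
X = rng.standard_normal((n, p))
B = rng.standard_normal((p, d))      # true linear relationship E[y|x] = x @ B
Y = X @ B + 0.1 * rng.standard_normal((n, d))

# Step 1 (learning): ordinary least squares, one regression per cost coordinate.
B_hat = np.linalg.lstsq(X, Y, rcond=None)[0]

# Step 2 (optimization): at a new context x0, plug the predicted cost vector
# into an LP over the probability simplex {z >= 0, sum(z) = 1}.
x0 = rng.standard_normal(p)
y_hat = x0 @ B_hat
res = linprog(c=y_hat, A_eq=np.ones((1, d)), b_eq=[1.0],
              bounds=[(0, None)] * d)
z_star = res.x
```

Note the division of labor: step 1 never sees the feasible set, and step 2 never sees the data, which is exactly the separation the talk examines.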

In the second part of this talk I will focus on the more general CSO problem, where we must consider the whole conditional probability model. As this object can be much higher dimensional than any decision policy, it is better to integrate the tasks and directly learn a policy. We adapt random forests to this integrated task by choosing tree splits that directly optimize downstream decisions rather than prediction accuracy. We solve this seemingly intractable problem by developing approximate splitting criteria that use optimization perturbation analysis to eschew burdensome re-optimization for every candidate split, so that our method scales to large problems. We prove that our approximations are consistent and that our method is asymptotically optimal, and we empirically validate its superior performance.
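A toy sketch of the idea behind decision-optimizing splits, using a newsvendor problem where the optimal in-leaf decision is a quantile of demand rather than its mean. This is the naive version that re-solves the problem in each candidate child; the talk's method replaces this exhaustive re-optimization with approximate, perturbation-based criteria. All costs and distributions here are invented for illustration.

```python
import numpy as np

# Newsvendor: decision q, cost = b*(d - q)^+ + h*(q - d)^+, so the optimal
# in-leaf decision is the b/(b+h) quantile of demand in that leaf.
b, h = 4.0, 1.0                      # backorder and holding costs (illustrative)
tau = b / (b + h)                    # critical quantile

def leaf_cost(d):
    """Total downstream cost of a leaf under its optimal decision."""
    q = np.quantile(d, tau)
    return np.sum(b * np.maximum(d - q, 0) + h * np.maximum(q - d, 0))

def best_split(x, d):
    """Threshold on x minimizing total downstream newsvendor cost,
    re-optimizing the decision in each child (the expensive exact version)."""
    order = np.argsort(x)
    xs, ds = x[order], d[order]
    best_cost, best_thr = np.inf, None
    for i in range(1, len(xs)):
        cost = leaf_cost(ds[:i]) + leaf_cost(ds[i:])
        if cost < best_cost:
            best_cost, best_thr = cost, (xs[i - 1] + xs[i]) / 2
    return best_cost, best_thr

rng = np.random.default_rng(2)
x = rng.uniform(0, 1, 400)
d = np.where(x < 0.5, rng.normal(10, 1, 400), rng.normal(20, 1, 400))
cost, threshold = best_split(x, d)   # should split near the regime change at 0.5
```

The O(n) re-optimizations per feature in `best_split` are what the talk's approximate splitting criteria avoid, which is what makes forests at scale feasible.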

This talk is based on the following papers:
Fast Rates for Contextual Linear Optimization, with Y. Hu and X. Mao.
Stochastic Optimization Forests, with X. Mao.