Risk Minimization from Adaptively Collected Data: Guarantees for Policy Learning

Event details

Date 17.12.2021
Hour 15:15 - 17:00
Speaker Antoine Chambaz, Université de Paris
Location Online
Category Conferences - Seminars
Event Language English

Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when the data are collected adaptively, as in the setting of a contextual bandit algorithm. In this setting, and focusing on policy learning, I will present a generic importance-sampling-weighted ERM algorithm and its regret guarantees, which close an open gap in the existing literature whenever exploration decays to zero. An empirical investigation validates the theory.

This is joint work with Aurélien Bibaut (Netflix), Nathan Kallus (Cornell University and Netflix), Maria Dimakopoulou (Netflix) and Mark J. van der Laan (UC Berkeley).

Practical information

  • Informed public
  • Free

Organizer

  • Mats Stensrud

Contact

  • Maroussia Schaffner
