Instance-Optimal Algorithms for Pure Exploration in Reinforcement Learning
Event details
| Date | 01.12.2025 › 02.12.2025 |
| Hour | 11:15 › 12:00 |
| Speaker | Cyrille Kone, PhD, University of Lille |
| Location | |
| Category | Conferences - Seminars |
| Event Language | English |
Abstract
In online reinforcement learning, pure exploration aims to identify an optimal policy after a learning phase with minimal sample complexity, in contrast to regret minimization, which focuses on performance during learning. We study instance-dependent lower bounds for this problem, which take the form of a two-player zero-sum game between an explorer choosing a behavior policy and nature selecting an alternative MDP. We propose a computationally efficient algorithm based on posterior sampling that matches these lower bounds in the small-error regime, bypassing the hardness of computing best responses. We further discuss extensions to multi-agent reinforcement learning, where the goal is to identify strategic equilibria such as Nash equilibria in unknown environments.
Biography
Cyrille Kone is a PhD candidate in Computer Science at the University of Lille and Inria within the Scool team, supervised by Prof. Emilie Kaufmann and Prof. Laura Richert. His research focuses on the theoretical foundations of sequential decision-making, with emphasis on pure exploration in bandits and reinforcement learning, instance-optimal algorithm design, and multi-objective optimization. His work has been published at top machine learning venues including NeurIPS, ICML, and AISTATS. He will defend his PhD in December 2025.
Practical information
- General public
- Free
Organizer
- Prof. Maryam Kamgarpour