Instance-Optimal Algorithms for Pure Exploration in Reinforcement Learning

Event details

Date	02.12.2025
Hour	11:00 - 12:00
Speaker	Cyrille Kone, PhD, University of Lille
Location	ME C2 405
Category	Conferences - Seminars
Event Language	English

Abstract
In online reinforcement learning, pure exploration aims to identify an optimal policy after a learning phase with minimal sample complexity, in contrast to regret minimization, which focuses on performance during learning. We study instance-dependent lower bounds for this problem, which take the form of a two-player zero-sum game between an explorer choosing a behavior policy and nature selecting an alternative MDP. We propose a computationally efficient algorithm based on posterior sampling that matches this lower bound in the small-error regime, bypassing the hardness of computing best responses. We further discuss extensions to multi-agent reinforcement learning, where the goal is to identify strategic equilibria such as Nash equilibria in unknown environments.

Biography
Cyrille Kone is a PhD candidate in Computer Science at the University of Lille and Inria within the Scool team, supervised by Prof. Emilie Kaufmann and Prof. Laura Richert. His research focuses on the theoretical foundations of sequential decision-making, with emphasis on pure exploration in bandits and reinforcement learning, instance-optimal algorithm design, and multi-objective optimization. His work has been published at top machine learning venues including NeurIPS, ICML, and AISTATS. He will defend his PhD in December 2025.

Practical information

General public
Free

Organizer

Prof. Maryam Kamgarpour

Contact

[email protected]
