Understanding the Behaviour of Inverse Reinforcement Learning Agents

Event details

Date 29.08.2018
Hour 11:15 – 13:15
Speaker Teresa Yeo
Location
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Boi Faltings
Thesis advisor: Prof. Volkan Cevher
Thesis co-advisor: Prof. Pierre Dillenbourg
Co-examiner: Dr. Mathieu Salzmann

Abstract
Programming autonomous agents in a Markov decision
process setting typically requires designing a reward function.
This is a challenging problem in many areas that do not have
a well-defined score, such as control, locomotion and navigation
tasks, among many others. In inverse reinforcement learning
(IRL), the agent learns this function from expert demonstrations.
Numerous IRL methods have been developed, each with its
own strengths and weaknesses. However, a less studied area is
understanding such a model's behavior. We would like models
that not only perform well but are also explainable, as explainability
is essential for establishing trust in a system and for debugging.
Our goal is to explain why an IRL agent behaves a
certain way by identifying which of the expert's trajectories was
most responsible for that behavior. As the method used has close
connections to generating adversarial attacks, we also discuss
how this can be applied to IRL.
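
As a rough illustration of the attribution question above (not the speaker's method), the sketch below measures how much each expert trajectory influences a toy IRL agent's behaviour by leave-one-out retraining; influence functions, as in Koh and Liang [ICML17], approximate this quantity without refitting. The feature data, the reward-fitting rule, and the `behaviour` function are hypothetical placeholders chosen only to keep the example self-contained.

```python
# Minimal, illustrative sketch: leave-one-out attribution for a
# linear-reward IRL agent, r(s) = w . phi(s). All data and models
# below are hypothetical stand-ins, not the method from the talk.
import numpy as np

rng = np.random.default_rng(0)

# Toy "expert demonstrations": each trajectory is summarised by its
# discounted feature-expectation vector phi(tau).
n_trajectories, n_features = 5, 3
features = rng.normal(size=(n_trajectories, n_features))

def fit_reward_weights(feats):
    """Crude stand-in for IRL: use the (normalised) mean expert
    feature expectation as the reward weight vector w."""
    w = feats.mean(axis=0)
    return w / np.linalg.norm(w)

def behaviour(w):
    """Stand-in for the behaviour induced by reward weights w,
    here the scores the agent assigns to a few candidate options."""
    candidate_options = np.eye(n_features)   # hypothetical option features
    return candidate_options @ w             # score per option

w_full = fit_reward_weights(features)
b_full = behaviour(w_full)

# Leave-one-out attribution: how much does the agent's behaviour change
# when a single expert trajectory is removed from training?
for i in range(n_trajectories):
    feats_loo = np.delete(features, i, axis=0)
    b_loo = behaviour(fit_reward_weights(feats_loo))
    print(f"trajectory {i}: behaviour change = {np.linalg.norm(b_full - b_loo):.3f}")
```

The trajectory with the largest behaviour change is the one "most responsible" in this leave-one-out sense; influence functions replace the explicit retraining loop with a closed-form approximation around the trained parameters.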

Background papers
Apprenticeship Learning via Inverse Reinforcement Learning, by Pieter Abbeel and Andrew Ng [ICML04]
Model-free Imitation Learning with Policy Optimization, by Jonathan Ho, Jayesh Gupta and Stefano Ermon [ICML16]
Understanding Black-box Predictions via Influence Functions, by Pang Wei Koh and Percy Liang [ICML17]

Practical information

  • General public
  • Free
