Generalization and Reasoning in Language Models


Event details

Date 20.08.2024
Hour 10:15 - 12:15
Speaker Diba Hashemi
Category Conferences - Seminars
Event Language English
EDIC candidacy exam
Exam president: Prof. Emmanuel Abbé
Thesis advisor: Prof. Martin Jaggi
Co-examiner: Prof. Michael Gastpar

Language models have shown promising ability across a variety of natural language processing tasks. However, they still have weaknesses that leave them far from optimal. In this report, we study some of these weaknesses: bias when generalizing on reasoning tasks, sub-optimal retrieval of information from the middle of the input context, and difficulty processing long sequences when trained on shorter ones. We try to understand the causes of these flaws and propose solutions to address them.

Practical information

  • Informed public
  • Free


