Generalization and Reasoning in Language Models
Event details
Date | 20.08.2024 |
Hour | 10:15 › 12:15 |
Speaker | Diba Hashemi |
Location | |
Category | Conferences - Seminars |
Event Language | English |
EDIC candidacy exam
Exam president: Prof. Emmanuel Abbé
Thesis advisor: Prof. Martin Jaggi
Co-examiner: Prof. Michael Gastpar
Abstract
Language models have shown promising ability to perform various tasks in natural language processing. However, they still have weaknesses that leave them far from optimal. In this report, we study some of these weaknesses, such as bias when generalizing on reasoning tasks, sub-optimal retrieval of information from the middle of the input context, and difficulty processing long sequences after being trained on shorter ones. We try to understand the reasons behind these flaws and propose solutions to resolve them.
Background papers
Practical information
- Informed public
- Free