Generalization and Reasoning in Language Models

Event details

Date	20.08.2024
Hour	10:15 - 12:15
Speaker	Diba Hashemi
Location	INF 119
Category	Conferences - Seminars
Event Language	English

EDIC candidacy exam
Exam president: Prof. Emmanuel Abbé
Thesis advisor: Prof. Martin Jaggi
Co-examiner: Prof. Michael Gastpar

Abstract
Language models have shown promising ability on a variety of natural language processing tasks. However, they still have weaknesses that leave them far from optimal. In this report, we study some of these weaknesses: bias when generalizing on reasoning tasks, sub-optimal retrieval of information from the middle of the input context, and difficulty processing sequences longer than those seen during training. We try to understand the causes of these flaws and propose solutions to resolve them.

Background papers

1. Generalization on the Unseen, Logic Reasoning and Degree Curriculum

2. Lost in the Middle: How Language Models Use Long Contexts

3. LM-Infinite: Zero-Shot Extreme Length Generalization for Large Language Models

Practical information

Informed public
Free

Contact

edic@epfl.ch
