Generalization and Reasoning in Language Models

Event details

Date 20.08.2024
Hour 10:15-12:15
Speaker Diba Hashemi
Location
Category Conferences - Seminars
Event Language English
EDIC candidacy exam
Exam president: Prof. Emmanuel Abbé
Thesis advisor: Prof. Martin Jaggi
Co-examiner: Prof. Michael Gastpar

Abstract
Language models have shown a promising ability to perform a variety of natural language processing tasks. However, they still have weaknesses that leave them far from optimal. In this report, we study some of these weaknesses: bias when generalizing on reasoning tasks, sub-optimal retrieval of information from the middle of the input context, and difficulty processing long sequences when trained on shorter ones. We try to understand the causes of these flaws and propose solutions to address them.

Background papers

Practical information

  • Informed public
  • Free

Tags

EDIC
