Methods for efficient LLM pre-training at scale

Event details

Date 12.02.2026
Hour 14:00 - 16:00
Speaker Alejandro Hernandez Cano
Location
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Antoine Bosselut
Thesis advisor: Prof. Martin Jaggi
Co-examiner: Prof. Emmanuel Abbé

Abstract
Adoption of and demand for ever-stronger language
foundation models have been steadily increasing over the last
decade. Obtaining them requires starting from a strong base
model, so pre-training remains an essential stage of the training
pipeline, especially as it consumes the majority of computational
resources. Investigating methods for efficient training at scale
is therefore crucial for the field. In this work, we review three
papers that highlight the importance of transformer architecture
components for efficient training, and propose future work to
push this line of research further.

Selected papers

Practical information

  • General public
  • Free
