Feedback-based alignment of large language models

Event details
Date: 07.09.2023
Hour: 10:00 – 12:00
Speaker: Beatriz Borges Ribeiro
Category: Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Tanja Käser
Thesis advisor: Prof. Antoine Bosselut
Co-examiner: Prof. Robert West
Abstract
Feedback has become an increasingly popular avenue for aligning Large Language Models (LLMs) with human values. Among the many possible formulations of feedback, Natural Language Feedback (NLF) stands out as the richest and most diverse in the information it can convey. However, what makes NLF effective remains an open question, as current approaches are typically hand-designed and arbitrary. My work will first survey the most impactful models of feedback for human learning from the pedagogy literature. With this grounding, I will then propose novel approaches and systems for feedback-based model alignment that leverage finer feedback granularities than are currently used.
Background papers
- Proximal Policy Optimization Algorithms ( https://arxiv.org/abs/1707.06347 )
- Training language models to follow instructions with human feedback ( https://arxiv.org/abs/2203.02155 )
- Fine-Grained Human Feedback Gives Better Rewards for Language Model Training ( https://arxiv.org/abs/2306.01693 )
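To make the feedback-granularity contrast raised in the abstract concrete, here is a minimal, hypothetical Python sketch: it compares a single sequence-level reward, as used in standard RLHF pipelines (e.g. the instruction-following paper above), with per-segment rewards in the spirit of the fine-grained feedback paper. All function names and the scoring heuristics are illustrative assumptions, standing in for learned reward models; this is not the speaker's method.

```python
# Hypothetical sketch: holistic vs. fine-grained reward signals for alignment.
from typing import List

def holistic_reward(response: str) -> float:
    """Sequence-level feedback: one scalar for the entire response,
    as in standard RLHF (placeholder heuristic for a learned reward model)."""
    return 1.0 if "because" in response else 0.0

def fine_grained_rewards(segments: List[str]) -> List[float]:
    """Segment-level feedback: one scalar per sentence/segment,
    in the spirit of fine-grained human feedback (placeholder heuristics
    standing in for per-aspect reward models such as relevance or factuality)."""
    return [0.5 * ("because" in seg) + 0.5 * (len(seg.split()) < 20)
            for seg in segments]

response = ("The sky appears blue because air scatters short wavelengths. "
            "This effect is known as Rayleigh scattering.")
segments = [s.strip() + "." for s in response.split(".") if s.strip()]

print("holistic reward:", holistic_reward(response))            # one signal per response
print("fine-grained rewards:", fine_grained_rewards(segments))  # one signal per segment
```

The finer-grained variant attributes credit to specific parts of the output rather than to the response as a whole, which is the kind of denser training signal the abstract refers to as "finer feedback granularities".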
Practical information
- General public
- Free