Decoding Strategies for Large Language Models

Event details
Date | 10.08.2023 |
Hour | 14:00 - 16:00 |
Speaker | Saibo Geng |
Location | |
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. Antoine Bosselut
Thesis advisor: Prof. Robert West
Co-examiner: Prof. Viktor Kuncak
Abstract
Large Language Models (LLMs) have significantly advanced the field of artificial intelligence, achieving state-of-the-art results across a diverse range of tasks. Central to their success is the decoding algorithm, which converts the model’s probability distribution into the generated output text. This report first reviews three notable decoding algorithms: constrained decoding, diverse beam search, and a recent approach for handling multi-step reasoning tasks with LLMs. We then propose a novel area of investigation: designing a decoding algorithm specifically to enhance the multi-step reasoning capabilities of LLMs.
Background papers
- GenIE: Generative Information Extraction: https://arxiv.org/abs/2112.08340
- Diverse Beam Search: Decoding Diverse Solutions from Neural Sequence Models: https://arxiv.org/abs/1610.02424
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models: https://arxiv.org/abs/2305.10601
Practical information
- General public
- Free
Contact
- edic@epfl.ch