Test-time scaling and search in deep networks

Event details
Date | 05.08.2025 |
Hour | 14:00 › 16:00 |
Speaker | Zhitong Gao |
Location | |
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. Antoine Bosselut
Thesis advisor: Prof. Amir Zamir
Co-examiner: Prof. Martin Schrimpf
Abstract
General-purpose models have achieved substantial performance gains through scaling, not only in training but also at inference time. In particular, large language models (LLMs) benefit significantly from increased test-time computation via reasoning and search, which enables strong performance on complex tasks like scientific or mathematical problem solving. In contrast, visual reasoning remains relatively underexplored, partly because traditional vision models have shown limited benefit from additional test-time computation. Recent studies have shown that generative models, such as diffusion models, can leverage search-based inference-time computation to improve scalability and performance. However, how to effectively design test-time scaling strategies for broader vision and multimodal models, and how to build flexible models that fully exploit such scaling, remains an open question.
Selected papers
Practical information
- General public
- Free
Contact
- edic@epfl.ch