Multi-Task Scene Representations

Event details

Date 24.08.2021
Hour 13:00–15:00
Speaker Roman Bachmann
Category Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Nicolas Boumal
Thesis advisor: Prof. Amir Zamir
Co-examiner: Prof. Mackenzie Mathis

Abstract
The current supervised and self-supervised representation learning literature focuses heavily on using large-scale classification datasets to train a network to produce image-level features for transfer learning. This raises two questions: does training on classification tasks and datasets really produce the best representations for learning diverse downstream tasks, and why do we transfer from independent image-level features rather than from scene-level representations that aggregate information over time and space? Indeed, there is evidence that no single pre-training task is the best choice for all other visual downstream tasks. We propose to learn scene-level representations by merging the image-level representations of multiple diverse tasks over the spatial and temporal dimensions, with the goal of creating powerful visual priors for downstream learning. Such multi-task priors should improve coverage of the space of features that are useful for visual tasks. Furthermore, scene representations can enable global and out-of-sight reasoning.
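For illustration, below is a minimal PyTorch sketch of the kind of aggregation the abstract describes: image-level features from several task-specific encoders are merged per frame, then pooled over time into one scene-level representation. All module names, dimensions, and the mean-pooling aggregator are illustrative assumptions, not the method presented at the exam.

import torch
import torch.nn as nn


class SceneRepresentation(nn.Module):
    """Merges per-image features from several task encoders, then pools over time."""

    def __init__(self, task_encoders: nn.ModuleDict, feat_dim: int, scene_dim: int):
        super().__init__()
        self.task_encoders = task_encoders  # e.g. depth, surface normals, semantics
        self.merge = nn.Linear(len(task_encoders) * feat_dim, scene_dim)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t = frames.shape[:2]
        images = frames.flatten(0, 1)  # treat every frame as an independent image
        # Image-level features from each task-specific encoder.
        per_task = [enc(images) for enc in self.task_encoders.values()]
        merged = self.merge(torch.cat(per_task, dim=-1))  # fuse tasks per image
        # Temporal aggregation: mean pooling as a simple permutation-invariant
        # stand-in for any learned spatio-temporal aggregator.
        return merged.view(b, t, -1).mean(dim=1)  # (batch, scene_dim)


# Usage: toy encoders standing in for pre-trained single-task networks.
def toy_encoder() -> nn.Module:
    return nn.Sequential(nn.Flatten(), nn.LazyLinear(128))

model = SceneRepresentation(
    nn.ModuleDict({
        "depth": toy_encoder(),
        "normals": toy_encoder(),
        "semantics": toy_encoder(),
    }),
    feat_dim=128,
    scene_dim=256,
)
scene = model(torch.randn(2, 4, 3, 64, 64))  # two clips of four frames each
print(scene.shape)  # torch.Size([2, 256])

Mean pooling over frames is only one possible choice here; any permutation-invariant or recurrent aggregator could take its place.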

Background papers
1) Big Transfer (BiT): General Visual Representation Learning. Kolesnikov et al., 2019. https://arxiv.org/abs/1912.11370
2) Neural Scene Representation and Rendering. Eslami et al., 2018. https://storage.googleapis.com/deepmind-media/papers/Neural_Scene_Representation_and_Rendering_preprint.pdf
3) On the Theory of Transfer Learning: The Importance of Task Diversity. Tripuraneni et al., 2020. https://arxiv.org/abs/2006.11650

Practical information

  • General public
  • Free

Contact

  • edic@epfl.ch

Tags

EDIC candidacy exam
