Symbolic representation learning

Event details
Date: 11.08.2023
Hour: 13:00 – 15:00
Speaker: Mohammad Hossein Amani
Category: Conferences - Seminars
EDIC candidacy exam
Exam president: Prof. Antoine Bosselut
Thesis advisor: Prof. Robert West
Co-examiner: Prof. Tanja Käser
Abstract
This proposal explores the use of discrete latent representations in sequential neural models to enable systematic generalization under limited labeled data.
We study self-supervised learning of discrete symbolic-like representations using variational autoencoders.
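As a minimal sketch of this first step, the vector-quantization bottleneck of the first background paper (VQ-VAE) can be written as follows; the codebook size, latent dimension, and straight-through gradient trick shown here are illustrative assumptions, not the exact setup of the proposal:

```python
import torch
import torch.nn as nn

class VectorQuantizer(nn.Module):
    """Maps continuous encoder outputs to the nearest entry of a learned codebook."""

    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z_e):
        # z_e: (batch, seq, dim) continuous latents from the encoder.
        flat = z_e.reshape(-1, z_e.size(-1))               # (batch*seq, dim)
        dists = torch.cdist(flat, self.codebook.weight)    # distances to every code
        codes = dists.argmin(dim=-1).view(z_e.shape[:-1])  # discrete symbol ids
        z_q = self.codebook(codes)                         # quantized latents
        # Straight-through estimator: copy gradients past the non-differentiable argmin.
        z_q = z_e + (z_q - z_e).detach()
        return z_q, codes
```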
We investigate non-parametric statistical methods to model the probability distribution over sequences of latent representations; this prior serves as a natural information bottleneck that encodes prior knowledge for data-efficient learning.
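As an illustrative stand-in for such a sequence prior (not the non-parametric variational information bottleneck of the second background paper), one can fit a smoothed bigram model over the discrete code sequences produced by a quantizer like the one above:

```python
from collections import Counter

class BigramCodePrior:
    """Count-based (non-parametric) prior over sequences of discrete codes.

    An illustrative sketch only; the background paper's method differs.
    """

    def __init__(self, alpha=0.1, num_codes=512):
        self.alpha, self.num_codes = alpha, num_codes
        self.counts, self.context_totals = Counter(), Counter()

    def fit(self, code_sequences):
        for seq in code_sequences:
            for prev, nxt in zip(seq, seq[1:]):
                self.counts[(prev, nxt)] += 1
                self.context_totals[prev] += 1

    def prob(self, prev, nxt):
        # Add-alpha smoothed conditional probability p(next | prev).
        return (self.counts[(prev, nxt)] + self.alpha) / (
            self.context_totals[prev] + self.alpha * self.num_codes)

prior = BigramCodePrior()
prior.fit([[3, 7, 7, 1], [3, 7, 1, 1]])
print(prior.prob(3, 7))  # high: code 7 follows code 3 in both sequences
```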
Finally, to evaluate our approach, we analyze different notions of systematic generalization and propose using formal languages as customizable benchmarks for compositionality. Specifically, we focus on finite-state transducers for generating compositional sequence-to-sequence datasets with adjustable levels of task complexity and compositionality.
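The sketch below illustrates the idea with a hypothetical two-state transducer that emits aligned input/output pairs; the dictionary encoding is an assumption made for this example, and the benchmark paper instead samples many such transducers with controllable properties:

```python
import random

def sample_fst_pair(transitions, start=0, length=6, rng=random):
    """Sample one (input, output) string pair from a finite-state transducer.

    `transitions[state][symbol] = (next_state, emitted)` is a hypothetical
    encoding chosen for this sketch.
    """
    state, inp, out = start, [], []
    for _ in range(length):
        symbol = rng.choice(sorted(transitions[state]))
        next_state, emitted = transitions[state][symbol]
        inp.append(symbol)
        out.append(emitted)
        state = next_state
    return " ".join(inp), " ".join(out)

# Toy two-state FST: state 0 copies the symbol, state 1 doubles it.
toy_fst = {
    0: {"a": (1, "a"), "b": (1, "b")},
    1: {"a": (0, "a a"), "b": (0, "b b")},
}
print(sample_fst_pair(toy_fst))
```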
The key challenges addressed are training discrete representations, modeling sequence priors for VAEs, and benchmarking compositional generalization in neural models.
Background papers
- Neural Discrete Representation Learning - https://arxiv.org/abs/1711.00937
- A VAE for transformers with non-parametric variational information bottleneck - https://openreview.net/forum?id=6QkjC_cs03X
- Benchmarking Compositionality with Formal Languages - https://aclanthology.org/2022.coling-1.525.pdf
Practical information
- General public
- Free
Contact
- edic@epfl.ch