BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:IC Colloquium: From Sparse Modeling to Sparse Communication
DTSTART:20221003T161500
DTEND:20221003T173000
DTSTAMP:20260406T194710Z
UID:b5f9b82b63c96471d31011811efb210dd1fc389d6019e05834962ff8
CATEGORIES:Conferences - Seminars
DESCRIPTION:By: André Martins - Técnico Lisboa\n\nAbs
 tract\nNeural networks and other machine learning models compute continuou
 s representations\, while humans communicate mostly through discrete symbo
 ls. Reconciling these two forms of communication is desirable for generati
 ng human-readable interpretations or learning discrete latent variable mod
 els\, while maintaining end-to-end differentiability.\n\nIn the first part
  of the talk\, I will describe how sparse modeling techniques can be exten
 ded and adapted for facilitating sparse communication in neural models. Th
 e building block is a family of sparse transformations called alpha-entmax
 \, a drop-in replacement for softmax\, which contains sparsemax as a parti
 cular case. Entmax transformations are differentiable and (unlike softmax)
  they can return sparse probability distributions\, useful for building
  interpretable attention mechanisms. Variants of these sparse
  transformations hav
 e been applied with success to machine translation\, natural language infe
 rence\, visual question answering\, and other tasks.\n\nIn the second part
 \, I will introduce mixed random variables\, which are in-between the disc
 rete and continuous worlds. We build rigorous theoretical foundations for 
 these hybrids\, via a new “direct sum” base measure defined on the fac
 e lattice of the probability simplex. From this measure\, we introduce new
  entropy and Kullback-Leibler divergence functions that subsume the discre
 te and differential cases and have interpretations in terms of code optima
 lity. Our framework suggests two strategies for representing and sampling 
 mixed random variables: an extrinsic one (“sample-and-project”) and an
  intrinsic one (based on face stratification).\n\nIn the third part\, I
  will show how sparse transformations can also be used to design new
  loss functi
 ons\, replacing the cross-entropy loss. To this end\, I will introduce the
  family of Fenchel-Young losses\, revealing connections between generalize
 d entropy regularizers and separation margin. I will illustrate with appli
 cations in natural language generation\, morphology\, and machine translat
 ion.\n\nThis work was funded by the DeepSPIN ERC project (https://deep-spi
 n.github.io).\n\nBio\nAndré Martins (PhD 2012\, Carnegie Mellon Universit
 y and University of Lisbon) is an Associate Professor at Instituto Superio
 r Técnico\, University of Lisbon\, a researcher at Instituto de
  Telecomunicações\, and the VP of AI Research at Unbabel. His research\,
  funded by an ERC Starting Grant (DeepSPIN) and other grants (the P2020
  project Unbabel4EU and the CMU-Portugal project MAIA)\, includes machine
  translation\, quality estimation\, and structure and interpretability in
  deep learning systems for NLP. H
 is work has received best paper awards at ACL 2009 (long paper) and ACL 20
 19 (system demonstration paper). He co-founded and co-organizes the Lisbon
  Machine Learning School (LxMLS)\, and he is a Fellow of the ELLIS
  society.
LOCATION:BC 420 https://plan.epfl.ch/?room==BC%20420 https://epfl.zoom.us/
 j/63576950757?pwd=V3I3MW0rQXBvZDNCc2NsdnIwbEJhQT09
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
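
As a concrete illustration of the "drop-in replacement for softmax" idea in
the abstract, the sketch below implements sparsemax, the alpha=2 special
case of alpha-entmax (Martins & Astudillo, 2016), in NumPy. It is a minimal
reading aid, not code from the DeepSPIN project; the function name is ours.

import numpy as np

def sparsemax(z):
    """Project the logits z onto the probability simplex.

    Unlike softmax, the output can contain exact zeros, which is what
    makes entmax-style attention weights sparse and interpretable.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]            # logits in decreasing order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum    # entries kept in the support
    k_z = k[support][-1]                   # number of nonzero outputs
    tau = (cumsum[k_z - 1] - 1) / k_z      # threshold shared by the support
    return np.maximum(z - tau, 0.0)

print(sparsemax([1.0, 0.8, -1.0]))         # [0.6 0.4 0. ]

Softmax on the same logits would put nonzero mass on all three entries;
sparsemax zeroes out the weakest one while staying differentiable almost
everywhere, so it can still be trained end to end.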
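
The third part of the abstract replaces cross-entropy with Fenchel-Young
losses. Under the same assumptions as above (our names, NumPy), the member
of that family induced by the negative Tsallis-2 entropy recovers the
sparsemax loss: its gradient with respect to z is sparsemax(z) - y, just as
cross-entropy over softmax yields softmax(z) - y.

def fenchel_young_sparsemax_loss(z, y):
    """Fenchel-Young loss for Omega(p) = 0.5 * (||p||^2 - 1).

    L(z, y) = Omega*(z) + Omega(y) - <z, y>; it is nonnegative and
    equals zero exactly when sparsemax(z) == y.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    p = sparsemax(z)
    omega_conj = z @ p - 0.5 * (p @ p) + 0.5   # convex conjugate Omega*(z)
    omega_y = 0.5 * (y @ y - 1.0)              # Omega(y); 0 for one-hot y
    return omega_conj + omega_y - z @ y

print(fenchel_young_sparsemax_loss([1.0, 0.8, -1.0], [1.0, 0.0, 0.0]))  # ~0.16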
