BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:Using structured knowledge for solving multilingual NLP tasks
DTSTART:20210722T130000
DTEND:20210722T150000
DTSTAMP:20260409T002518Z
UID:875a857203b8cc6f03aeccebb3f687f70516822ccced1b9ad8a95ed8
CATEGORIES:Conferences - Seminars
DESCRIPTION:Marija Sakota\nEDIC candidacy exam\nexam president: Prof. Tanj
 a Käser\nthesis advisor: Prof. Robert West\nco-examiner: Prof. Boi Faltin
 gs\n\nAbstract\nIn recent years\, large language models pre-trained in an u
 nsupervised manner have become increasingly popular. However\, most of the
 m focus solely on English or are monolingual models in other high-resourc
 e languages. Generating text in low-resource languages remains a challengi
 ng task. An alternative to monolingual models are multilingual models\, wh
 ich can handle multiple languages in a single model and\, possibly\, explo
 it knowledge from high-resource languages to improve performance on low-re
 source ones. Although pre-training can enable a model to learn some fact
 s and relations\, it still fails to rigidly enforce their usage. A soluti
 on is to incorporate structured knowledge into these models through\, fo
 r example\, the use of knowledge graphs. Integrating text and knowledge g
 raph information turns out to be a non-trivial problem\, mostly because o
 f the structural difference between them. In this proposal\, first\, a ne
 w method for pre-training complete sequence-to-sequence models by denoisi
 ng text in multiple languages is introduced. Then\, a method for includin
 g information from knowledge graphs through a new encoder is presented. N
 ext\, a new form of extreme summarization task for scientific articles an
 d a method to solve it are showcased. Finally\, possible research directi
 ons for multilingual models that use structured data are discussed.\n\nBa
 ckground papers\n1) Multilingual Denoising Pre-training for Neural Machin
 e Translation\, Yinhan Liu\, Jiatao Gu\, Naman Goyal\, Xian Li\, Sergey E
 dunov\, Marjan Ghazvininejad\, Mike Lewis\, and Luke Zettlemoyer: https:/
 /www.aclweb.org/anthology/2020.tacl-1.47.pdf\n2) Text Generation from Kno
 wledge Graphs with Graph Transformers\, Rik Koncel-Kedziorski\, Dhanush B
 ekal\, Yi Luan\, Mirella Lapata\, and Hannaneh Hajishirzi: https://www.ac
 lweb.org/anthology/N19-1238.pdf\n3) TLDR: Extreme Summarization of Scient
 ific Documents\, Isabel Cachola\, Kyle Lo\, Arman Cohan\, and Daniel S. W
 eld: https://www.aclweb.org/anthology/2020.findings-emnlp.428.pdf
LOCATION:
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
