BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:IC Colloquium: Learning Models over Relational Databases
DTSTART:20191007T161500
DTEND:20191007T173000
DTSTAMP:20260406T235858Z
UID:664ec50a7a668033c371acde1435fb3b8c270d3e26284c4a430b6f64
CATEGORIES:Conferences - Seminars
DESCRIPTION:By: Dan Olteanu - University of Oxford\n\nA
 bstract:\nIn this talk\, I will make the case for a first-principles appro
 ach to machine learning over relational databases that exploits recent dev
 elopments in database systems and theory. The input to learning classific
 at
 ion and regression models is defined by feature extraction queries over re
 lational databases. The mainstream approach to learning over relational da
 ta is to materialize the training dataset\, export it out of the database\
 , and then learn over it using statistical software packages. These three 
 steps are expensive and unnecessary. Instead\, one can cast the machine le
 arning problem as a database problem by decomposing the learning task into
  a batch of aggregates over the feature extraction query and by computing 
 this batch over the input database. Ongoing results show that the performa
 nce of this approach benefits tremendously from structural properties of t
 he relational data and of the feature extraction query\; such properties m
 ay be algebraic (semi-ring)\, combinatorial (hypertree width)\, or statist
 ical (sampling). It also benefits from factorized query evaluation and que
 ry compilation. For a variety of models\, including factorization machines
 \, decision trees\, and support vector machines\, this approach may come w
 ith lower computational complexity than the materialization of the trainin
 g dataset used by the mainstream approach. This translates to several orde
 rs-of-magnitude speed-up over state-of-the-art systems such as TensorFlow\
 , R\, Scikit-learn\, and mlpack.\n\nThis work is part of the FDB project
  (https://fdbresearch.github.io) and based on collaboration with Maximilia
 n Schleich (Oxford)\, Jakub Zavodny (Oxford)\, Milos Nikolic (Edinburgh)\,
  Mahmoud Abo-Khamis\, Ryan Curtin\, Hung Q. Ngo (RelationalAI)\, Ben Mosel
 ey (CMU)\, and XuanLong Nguyen (Michigan).\n\nBio:\nDan Olteanu is Profes
 sor of Computer Science at the University of Oxford and Computer Scientist
  at RelationalAI. He received his PhD from the University of Munich in 200
 5. He spends his time understanding hard computational challenges around d
 ata processing and designing simple and scalable solutions to these c
 hallenges. He has published over 70 papers in the areas of database system
 s\, AI\, and theoretical computer science\, contributing to XML query proc
 essing\, incomplete information and probabilistic databases\, factorised d
 atabases\, scalable and incremental in-database optimisation\, and the com
 mercial systems LogicBlox and RelationalAI. He co-authored the book « Pro
 babilistic Databases » (2011). He has served as associate editor for PVLD
 B and IEEE TKDE\, as track chair for IEEE ICDE’15\, group leader for ACM
  SIGMOD’15\, vice chair for ACM SIGMOD’17\, and co-chair for AMW’18\
 , and he is currently serving as associate editor for ACM TODS and the SIG
 MOD Record Database Principles column. He is the recipient of an ERC Conso
 lidator grant (2016)\, an Oxford Outstanding Teaching award (2009)\, and t
 he ICDT 2019 best paper award.
LOCATION:BC 420 https://plan.epfl.ch/?room==BC%20420
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
