BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:Compile-Time Code Generation of Embedded Data-Intensive Query Lang
 uages
DTSTART:20171006T110000
DTEND:20171006T120000
DTSTAMP:20260407T110325Z
UID:cfcba882372438aae7884b8508be212c33a885412156bcd61f62d01d
CATEGORIES:Conferences - Seminars
DESCRIPTION:Prof. Leonidas Fegaras\nMany emerging Big-Data program
 ming environments\, such as Spark and Flink\, provide powerful APIs\, insp
 ired by functional programming\, that consist of a small number of higher-
 order operations. However\, because of the complexity involved in developi
 ng and fine-tuning data analysis applications using the provided APIs\, ma
 ny programmers prefer to use declarative languages\, such as Hive and Spar
 k SQL\, to code their distributed applications. Unfortunately\, current da
 ta analysis query languages\, which are typically based on the relationa
 l model\, cannot effectively capture the rich data types and computations 
 required for complex data analysis applications. Furthermore\, these que
 ry languages are not well-integrated with the host programming language\, 
 as they are based on an incompatible data model\, and are checked for corr
 ectness at run-time\, which results in a significantly longer program deve
 lopment time. In this talk\, I will introduce a new query language for dat
 a-intensive scalable computing\, called DIQL\, that is deeply embedded in 
 Scala\, and a query optimization framework that optimizes and translates D
 IQL queries to byte code at compile-time. DIQL supports nested collections
  and hierarchical data and allows query nesting at any place in a query. W
 ith DIQL\, programmers can express complex data analysis tasks\, such as P
 ageRank and matrix factorization\, using SQL-like syntax exclusively. I wi
 ll also present an algebra for data-intensive scalable computing based on 
 monoid homomorphisms that consists of a small set of operations that captu
 re most features supported by current domain-specific languages for data-c
 entric distributed computing. The DIQL query optimizer\, which is based on
  the monoid algebra\, can find any possible join in a query\, including jo
 ins hidden across deeply nested queries\, thus unnesting any form of query
  nesting.
LOCATION:BC 420 https://plan.epfl.ch/?room==BC%20420
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
