Adaptive Database Systems for Efficient Analytical Query Processing

Event details

Date	17.06.2019
Hour	10:00–12:00
Speaker	Viktor Sanca
Location	BC 229
Category	Conferences - Seminars

EDIC candidacy exam
Exam president: Prof. Christoph Koch
Thesis advisor: Prof. Anastasia Ailamaki
Co-examiner: Prof. Karl Aberer

Abstract
Data integration is one of the most important problems in data analysis, since combining multiple data sources can reveal previously unknown insights. To analyze the data, users transform heterogeneous data into a compatible format and load it into the DBMS query engine, which means processing data that will never be used. Additionally, modern data analytics applications require minimal time to insight in the presence of ad-hoc workloads. Given the conflicting requirements of flexibility and performance, analytical query processing needs to adapt efficiently to evolving workload and data characteristics.

In this proposal, we examine an analytical query processing engine that enables fast queries over raw, heterogeneous data [1]. We describe the abstractions and mechanisms it introduces to adapt efficiently to queries over a variety of data formats, and compare its implementation to specialized systems. To address query cardinality estimation errors caused by simplifying assumptions, we examine a data-driven, workload-guided method for adjusting cardinality estimates [2]. Constructing a multitude of fast access methods speeds up queries in the presence of unknown workloads, but incurs a significant storage cost. We present a methodology that aims to reduce the cost of index structures by automatically tuning them to the data distribution, and compare its performance to that of traditional index structures [3].

Finally, inspired by these previous approaches, we conclude with our research proposal for analytical query processing that adapts to evolving data and workload characteristics.

Background papers
[1] Fast Queries over Heterogeneous Data Through Engine Customization, by M. Karpathiotakis, I. Alagiannis, and A. Ailamaki, Proc. VLDB Endow., vol. 9, no. 12, pp. 972–983, Aug. 2016.
[2] LEO - DB2’s LEarning Optimizer, by M. Stillger, G. M. Lohman, V. Markl, and M. Kandil, in Proceedings of the 27th International Conference on Very Large Data Bases, San Francisco, CA, USA, 2001, pp. 19–28.
[3] The Case for Learned Index Structures, by T. Kraska, A. Beutel, E. H. Chi, J. Dean, and N. Polyzotis, in Proceedings of the 2018 International Conference on Management of Data, New York, NY, USA, 2018, pp. 489–504.

Practical information

General public
Free

Contact

edic@epfl.ch
