Towards Domain-Specific Data Management

Event details

Date	07.06.2012
Hour	15:00 - 16:00
Speaker	Prof. Yanif Ahmad, Johns Hopkins University, Baltimore, USA
Location	BC 01
Category	Conferences - Seminars

Abstract:
To overcome scaling challenges, today's computing applications are increasingly exploiting domain-specific properties in their data, computation, and infrastructure models. In data management, monolithic relational database management systems cannot serve the needs of ever more diverse datasets, and their specialization has led to many new classes of data management systems. In programming languages, embedded domain-specific languages have emerged as a popular technique to express specialized primitives, operators, and optimizing toolchains. We argue that data management systems should expose facilities for domain-specific data models, where we can leverage known mathematical properties and representations of our data. Combining mathematical modeling with declarative querying facilitates rich exploratory access to data, and compact data representations with tunable approximation.
In this talk, I will first present Pulse, a query processor for continuous-time databases based on temporal polynomial models. Pulse uses piecewise polynomials to provide a compact, approximate representation of the input dataset, and processes queries by solving simultaneous equation systems rather than by set-at-a-time record processing. Pulse achieves significant performance improvements by processing polynomials directly, prior to discretization, and by exploiting user-defined precision bounds to reduce computation overheads. Beyond polynomial models, I will also discuss our ongoing work on processing queries over statistical models, specifically probabilistic graphical models, applying joint incremental query processing and inference techniques in the BLOG (Bayesian Logic) programming language.
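To make the idea of answering a query by solving equations more concrete, here is a minimal Python sketch. It is not Pulse's implementation: the `Segment` class, the function names, and the restriction to polynomials of degree at most two are all assumptions made for illustration. A filter such as "x(t) > threshold" is answered by finding the roots of x(t) - threshold on each segment and returning time intervals, instead of testing discrete records one by one.

```python
import math
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical model piece: x(t) = a*t^2 + b*t + c for t in [t0, t1]."""
    t0: float
    t1: float
    a: float
    b: float
    c: float

    def value(self, t: float) -> float:
        return self.a * t * t + self.b * t + self.c


def roots_in_segment(seg: Segment, threshold: float) -> list:
    """Solve x(t) = threshold and keep roots strictly inside the segment."""
    a, b, c = seg.a, seg.b, seg.c - threshold
    if a == 0:
        # Linear (or constant) case: at most one crossing.
        roots = [] if b == 0 else [-c / b]
    else:
        disc = b * b - 4 * a * c
        if disc < 0:
            roots = []
        else:
            s = math.sqrt(disc)
            roots = [(-b - s) / (2 * a), (-b + s) / (2 * a)]
    return sorted(r for r in set(roots) if seg.t0 < r < seg.t1)


def intervals_above(segments: list, threshold: float) -> list:
    """Return the time intervals on which the modeled value exceeds threshold."""
    result = []
    for seg in segments:
        # The sign of x(t) - threshold is constant between consecutive cuts.
        cuts = [seg.t0] + roots_in_segment(seg, threshold) + [seg.t1]
        for lo, hi in zip(cuts, cuts[1:]):
            if seg.value((lo + hi) / 2) > threshold:
                # Merge with the previous interval when they touch exactly.
                if result and result[-1][1] == lo:
                    result[-1] = (result[-1][0], hi)
                else:
                    result.append((lo, hi))
    return result


# A signal that rises as x(t) = t on [0, 4] and falls as x(t) = 8 - t on [4, 8].
model = [Segment(0.0, 4.0, 0.0, 1.0, 0.0), Segment(4.0, 8.0, 0.0, -1.0, 8.0)]
print(intervals_above(model, 2.0))
```

The cost of this sketch depends on the number of segments rather than on the number of samples the segments would expand to, which is the intuition behind processing the model before discretization. A real system would also need to handle higher-degree polynomials, joins between models, and the precision bounds mentioned above.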

Bio:
Yanif Ahmad is an Assistant Professor in the Department of Computer Science at the Johns Hopkins University. His research goals are to enable insightful monitoring of large streaming datasets, and to facilitate easier declarative construction of scalable systems in novel computing applications. In addition to the talk topics, Yanif's ongoing work includes exploring joint database and compiler-style optimizations for large-scale data processing and analytics, and protein data management for a petabyte-scale molecular and drug design dataset in collaboration with the Johns Hopkins Medical School. He received his PhD from Brown University, and has been a postdoctoral associate with the Database Group at Cornell University.

Practical information

General public
Free

Organizer

SuRI 2012

Contact

Simone Muller
