Efficient out-of-memory analytics on heterogeneous hardware through workload-aware storage interfaces.
Event details
Date | 24.06.2022 |
Hour | 14:00 - 16:00 |
Speaker | Hamish Nicholson |
Location | |
Category | Conferences - Seminars |
EDIC candidacy exam
Exam president: Prof. Edouard Bugnion
Thesis advisor: Prof. Anastasia Ailamaki
Co-examiner: Prof. Anne-Marie Kermarrec
Abstract
Modern servers increasingly feature heterogeneous
storage (e.g., HDD, SSD, DRAM and emerging memory technologies).
Storage systems, which have historically been designed
for one or two classes of storage devices, must now make
data placement decisions on multiple device classes. Good data
placement decisions can lower storage costs and improve system
response times. Most data placement decisions follow one of
two approaches. The system can passively monitor requests to
make caching decisions based on frequency and recency of use,
or a human database administrator can manually place data
into a specific tier. Passive monitoring discards useful
semantic information, which can reveal data reuse
and co-access patterns, while manual intervention does not
scale as the space of data-placement decisions grows.
To use heterogeneous modern storage more effectively,
data systems need to provide expressive interfaces that allow
such semantic information to propagate across layers.
Background papers
- Mosaic: a budget-conscious storage engine for relational database systems
- hStorage-DB: heterogeneity-aware data management to exploit the full capability of hybrid storage systems
- OctopusFS: A Distributed File System with Tiered Storage Management
Practical information
- General public
- Free