Hashing-based-Estimators for Kernel Density in High Dimensions

Event details
Date | 16.06.2017 |
Hour | 09:30 › 10:15 |
Speaker | Moses Charikar, Stanford |
Location | |
Category | Conferences - Seminars |
Given a set of points P and a (kernel) function K, the Kernel Density Estimate at a point x is defined as the average value of the kernel between the point x and every point y in P. We study the problem of designing a data structure that given a data set P and a kernel function, returns approximations to the kernel density of a query point in sublinear time. This problem and weighted versions arise as basic primitives in statistics (density estimation), machine learning (kernel methods) and scientific computing (physical simulations).
We introduce a class of unbiased estimators for kernel density using importance sampling, implemented through locality-sensitive hashing. We give general theorems bounding the variance of such estimators, that give rise to efficient data structures for estimating the kernel density in high dimensions for a variety of commonly used kernels. Our work is the first to provide data-structures with theoretical guarantees that improve upon simple random sampling in high dimensions.
Joint work with Paris Syminelakis.
Links
Practical information
- Expert
- Free
Organizer
- Jggi, Kapralov, Svensson
Contact
- Pauline Raffestin