Hashing-based-Estimators for Kernel Density in High Dimensions

Thumbnail

Event details

Date 16.06.2017
Hour 09:3010:15
Speaker Moses Charikar, Stanford
Location
Category Conferences - Seminars

Given a set of points P and a (kernel) function K, the Kernel Density Estimate at a point x is defined as the average value of the kernel between the point x and every point y in P. We study the problem of designing a data structure that given a data set P and a kernel function, returns approximations to the kernel density of a query point in sublinear time. This problem and weighted versions arise as basic primitives in statistics (density estimation), machine learning (kernel methods) and scientific computing (physical simulations).
We introduce a class of unbiased estimators for kernel density using importance sampling, implemented through locality-sensitive hashing. We give general theorems bounding the variance of such estimators, that give rise to efficient data structures for estimating the kernel density in high dimensions for a variety of commonly used kernels. Our work is the first to provide data-structures with theoretical guarantees that improve upon simple random sampling in high dimensions.
Joint work with Paris Syminelakis.

Links

Practical information

  • Expert
  • Free

Organizer

  • Jggi, Kapralov, Svensson

Contact

  • Pauline Raffestin

Event broadcasted in

Share