Maximum Entropy and Applications in Natural and Social Sciences

Event details

Date 27.04.2009
Hour 16:15
Speaker Dr. Miroslav Dudik, Carnegie Mellon University, USA
Location INM202
Category Conferences - Seminars
The maximum entropy approach (maxent), equivalent to maximum likelihood, is a widely used density-estimation technique. However, when trained on small datasets, maxent is likely to overfit, and when trained over large sample spaces, naive implementations of maxent are intractable. To prevent overfitting, we propose a relaxed version of maxent, which turns out to be equivalent to maximizing the L1-regularized log likelihood. We prove strong statistical guarantees for L1-regularized maxent and show how it can be generalized to estimation in the presence of sample-selection bias and to simultaneous estimation of multiple densities. To address the computational challenges, we propose an approach based on sampling and coordinate descent.

I will discuss two applications of maxent: modeling distributions of biological species and modeling cross-cultural negotiation, focusing mainly on the former. Regularized maxent fits species distribution modeling well and offers several advantages over previous techniques. In particular, it addresses the problem in a statistically sound manner and allows principled extensions to situations where the data-collection process is biased or where data on many related species are available. Throughout the talk I will demonstrate the benefits of our approach on large, real-world modeling problems.

Based on joint work with Rob Schapire, Steven Phillips, Geoff Gordon, Dave Blei and others.

Bio: Miroslav Dudik received his PhD in Computer Science from Princeton University in 2007. He is currently a postdoctoral researcher at Carnegie Mellon University. His interests are in theoretical and applied aspects of machine learning, both statistical and algorithmic. He focuses on small-sample density estimation and game-theoretic modeling.

M. Dudik's homepage

Practical information

  • General public
  • Free
