Improving uncertainty quantification in Bayesian cluster analysis

Event details

Date 11.04.2025
Hour 15:15 - 16:15
Speaker Cecilia Balocchi, University of Edinburgh
Location
Category Conferences - Seminars
Event Language English

The Bayesian approach to clustering is often appreciated for its ability to quantify uncertainty in the partition structure. However, summarizing the posterior distribution over the clustering structure can be challenging. Wade and Ghahramani (2018) proposed to summarize the posterior samples using a single optimal clustering estimate, which minimizes the posterior expected Variation of Information (VI).
In instances where the posterior distribution is multimodal, it can be beneficial to summarize the posterior samples using multiple clustering estimates, each corresponding to a different part of the space of partitions that receives substantial posterior mass.
In this work, we propose to find such clustering estimates by approximating the posterior distribution with respect to a Wasserstein distance based on VI. An interesting byproduct is that this problem can be seen as using the k-means algorithm to divide the posterior samples into different groups, each represented by one of the clustering estimates.
Using both synthetic and real datasets, we show that our proposal helps to improve the understanding of uncertainty, particularly when the data clusters are not well separated, or when the employed model is misspecified.
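
As a rough illustration of the idea in the abstract, the sketch below computes the VI between two partitions and then groups posterior samples around a few representative partitions. It is not the speaker's method: it assumes each posterior sample is a label vector, uses base-2 logarithms, and swaps in a simple k-medoids-style loop in which each representative is restricted to be one of the sampled partitions. The function names and the farthest-point initialisation are hypothetical choices made for this example.

```python
import numpy as np


def variation_of_information(a, b):
    """VI between two partitions given as label vectors of equal length."""
    # Relabel clusters as 0..K-1 so they can index a contingency table.
    _, ai = np.unique(np.asarray(a), return_inverse=True)
    _, bi = np.unique(np.asarray(b), return_inverse=True)
    ai, bi = ai.ravel(), bi.ravel()
    joint = np.zeros((ai.max() + 1, bi.max() + 1))
    np.add.at(joint, (ai, bi), 1.0)  # count co-memberships
    joint /= ai.size                 # joint membership probabilities

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log2(p))

    # VI(A, B) = 2 H(A, B) - H(A) - H(B); clip tiny negative rounding errors.
    vi = 2.0 * entropy(joint.ravel()) - entropy(joint.sum(axis=1)) - entropy(joint.sum(axis=0))
    return max(vi, 0.0)


def summarize_partitions(samples, k, n_iter=50):
    """Group sampled partitions around k representatives (k-medoids style).

    Assumption for this sketch: representatives are chosen among the samples.
    Returns (representatives, group weights, group assignment per sample).
    """
    samples = np.asarray(samples)
    m = samples.shape[0]

    # Pairwise VI distances between all sampled partitions.
    dist = np.zeros((m, m))
    for i in range(m):
        for j in range(i + 1, m):
            dist[i, j] = dist[j, i] = variation_of_information(samples[i], samples[j])

    # Deterministic farthest-point initialisation (a choice made for this sketch).
    medoids = [int(dist.sum(axis=1).argmin())]
    while len(medoids) < k:
        medoids.append(int(dist[:, medoids].min(axis=1).argmax()))
    medoids = np.array(medoids)

    for _ in range(n_iter):
        # Assignment step: each sample joins its closest representative.
        assign = dist[:, medoids].argmin(axis=1)
        # Update step: each group is represented by the member with the
        # smallest total VI to the other members of that group.
        new = medoids.copy()
        for g in range(k):
            members = np.flatnonzero(assign == g)
            if members.size:
                within = dist[np.ix_(members, members)].sum(axis=0)
                new[g] = members[within.argmin()]
        if np.array_equal(new, medoids):
            break
        medoids = new

    assign = dist[:, medoids].argmin(axis=1)
    weights = np.bincount(assign, minlength=k) / m
    return samples[medoids], weights, assign
```

Calling `summarize_partitions(samples, k=2)` on an array of sampled label vectors returns the representative partitions together with the fraction of samples assigned to each, which gives a rough indication of how much posterior mass each representative accounts for.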
 

Practical information

  • Informed public
  • Free

Organizer

  • Myrto Limnios

Contact

  • Maroussia Schaffner
