Improving uncertainty quantification in Bayesian cluster analysis

Event details
Date | 11.04.2025 |
Hour | 15:15 - 16:15 |
Speaker | Cecilia Balocchi, University of Edinburgh |
Location | |
Category | Conferences - Seminars |
Event Language | English |
The Bayesian approach to clustering is often appreciated for its ability to quantify uncertainty in the partition structure. However, summarizing the posterior distribution over the clustering structure can be challenging. Wade and Ghahramani (2018) proposed summarizing the posterior samples with a single optimal clustering estimate, chosen to minimize the posterior expected Variation of Information (VI).
In instances where the posterior distribution is multimodal, it can be beneficial to summarize the posterior samples using multiple clustering estimates, each corresponding to a different part of the space of partitions that receives substantial posterior mass.
In this work, we propose to find such clustering estimates by approximating the posterior distribution in the sense of a Wasserstein distance based on VI. An interesting byproduct is that this problem can be seen as using the k-means algorithm to divide the posterior samples into different groups, each represented by one of the clustering estimates.
Using both synthetic and real datasets, we show that our proposal helps to improve the understanding of uncertainty, particularly when the data clusters are not well separated, or when the employed model is misspecified.
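The grouping idea described in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's algorithm: it assumes the posterior samples are given as equal-length label lists, restricts each representative to be one of the sampled partitions (a k-medoids-style simplification of k-means), uses plain VI with base-2 logarithms as the distance rather than the Wasserstein formulation, and picks starting representatives by a farthest-point rule. The function names `variation_of_information` and `summarize_with_k_partitions` are hypothetical.

```python
import math
from collections import Counter


def variation_of_information(a, b):
    """VI between two partitions given as equal-length label sequences.

    Computed as H(A|B) + H(B|A); the base-2 logarithm is a convention
    chosen for this sketch.
    """
    if len(a) != len(b):
        raise ValueError("partitions must label the same items")
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    count_ab = Counter(zip(a, b))
    vi = 0.0
    for (label_a, label_b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[label_a] / n
        p_b = count_b[label_b] / n
        vi -= p_ab * (math.log2(p_ab / p_a) + math.log2(p_ab / p_b))
    return vi


def summarize_with_k_partitions(samples, k, n_iter=20):
    """Group sampled partitions around k representatives (hypothetical helper).

    Assumption: representatives are restricted to the sampled partitions,
    so this is a k-medoids-style simplification, not the method from the talk.
    """
    n = len(samples)
    dist = [[variation_of_information(s, t) for t in samples] for s in samples]

    def assign_to(centers):
        # Each sample joins the representative it is closest to in VI.
        return [min(range(k), key=lambda j: dist[i][centers[j]]) for i in range(n)]

    # Farthest-point initialisation, starting from the first sample.
    centers = [0]
    while len(centers) < k:
        centers.append(max(range(n), key=lambda i: min(dist[i][c] for c in centers)))

    for _ in range(n_iter):
        groups = assign_to(centers)
        new_centers = []
        for j in range(k):
            members = [i for i in range(n) if groups[i] == j]
            if not members:
                # Keep the old representative if its group became empty.
                new_centers.append(centers[j])
                continue
            # New representative: the member with the smallest total VI to its group.
            new_centers.append(min(members, key=lambda c: sum(dist[c][i] for i in members)))
        if new_centers == centers:
            break
        centers = new_centers

    return [samples[c] for c in centers], assign_to(centers)


# Toy posterior samples over four items with two competing structures.
samples = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 2],
]
estimates, groups = summarize_with_k_partitions(samples, k=2)
print(estimates, groups)
```

With `k=1` the same routine reduces to choosing the sampled partition with the smallest average VI to all samples, which is a search restricted to the samples and only approximates a single-estimate summary.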
Practical information
- Informed public
- Free
Organizer
- Myrto Limnios
Contact
- Maroussia Schaffner