High-dimensional model selection: transfer learning and projected computation
Event details
| Date | 30.04.2026 |
| Hour | 16:15 – 17:15 |
| Speaker | David Rossell, Universitat Pompeu Fabra (UPF) and Barcelona School of Economics (BSE) |
| Category | Conferences - Seminars |
| Event Language | English |
We define model selection as the structural learning problem of identifying the subset of truly zero parameters in a probability model of interest, such as canonical generalized linear models, generalized additive models, or graphical models. This is one of the most classical and fundamental problems in Statistics and Machine Learning. L0 penalization methods and their related Bayesian model selection counterparts have optimal mathematical properties for this task, yet much of the mainstream literature regards such methods as either unnecessary or impractical. This talk addresses these two objections and how to ameliorate the underlying issues: is it useful to pursue such methods, and can they be computed in practice?
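As a toy illustration of the L0-penalized model selection problem described above (not the speaker's method), the sketch below runs an exhaustive best-subset search under a BIC-style L0 criterion on simulated linear-regression data; all data and the penalty choice are hypothetical.

```python
# Toy sketch: L0-penalized model selection as exhaustive best-subset search.
# Simulated data; only the first two coefficients are truly nonzero.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 6
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + rng.normal(size=n)

def bic(S):
    """BIC-style L0 score: n*log(RSS/n) + |S|*log(n); lower is better."""
    if S:
        Xs = X[:, list(S)]
        beta_hat, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        rss = float(np.sum((y - Xs @ beta_hat) ** 2))
    else:
        rss = float(y @ y)
    return n * np.log(rss / n) + len(S) * np.log(n)

# Exhaustive search over all 2^p subsets (feasible only for small p);
# the penalty term len(S)*log(n) is what makes this an L0 criterion.
best = min((S for k in range(p + 1)
            for S in itertools.combinations(range(p), k)), key=bic)
print(best)
```

The exhaustive enumeration makes the combinatorial nature of the problem explicit: the search space has 2^p models, which is precisely why the computational objection discussed in the talk arises.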
We first discuss how these methods can be useful, not only in theory but also in practice, and how to improve their performance via data integration (also called transfer learning). For example, sparse model recovery methods enjoy excellent asymptotic properties when certain sparsity and signal-strength (betamin) conditions hold, but these assumptions often fail in application domains of interest. We show that data integration relaxes the mathematical conditions under which consistent model recovery is possible.
Regarding the second objection of computational impracticality, we review recent optimization and MCMC literature showing that, under somewhat strict sparsity assumptions, the computational cost scales linearly in the problem dimension (with high probability, asymptotically). A key practical issue is that such results assume one can quickly score each candidate model (at constant cost), whereas even in least squares the cost is at least quadratic in the model dimension and also grows with the sample size n. We propose a new class of projected model selection criteria that, after an initial pre-processing step, score models at constant cost, while enjoying the same asymptotic and practical performance as the costlier exact model scores.
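The role of pre-processing in removing the sample-size dependence can be illustrated with the classical least-squares sufficient-statistics identity (this is standard linear-model algebra, not the projected criteria proposed in the talk; the data are simulated):

```python
# Sketch: one O(n p^2) pass forms G = X'X, b = X'y and y'y; afterwards any
# candidate model S can be scored at a cost depending only on |S|, not n,
# via RSS(S) = y'y - b_S' G_SS^{-1} b_S.
import numpy as np

rng = np.random.default_rng(1)
n, p = 500, 10
X = rng.normal(size=(n, p))
y = X[:, 0] - X[:, 3] + rng.normal(size=n)

# One-off pre-processing: the only step that touches all n observations.
G, b, yty = X.T @ X, X.T @ y, float(y @ y)

def rss_from_stats(S):
    """RSS of model S from precomputed statistics; cost depends on |S| only."""
    S = list(S)
    return yty - b[S] @ np.linalg.solve(G[np.ix_(S, S)], b[S])

def rss_direct(S):
    """Reference: refit on the raw (n x |S|) design, cost grows with n."""
    Xs = X[:, list(S)]
    beta_hat, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    r = y - Xs @ beta_hat
    return float(r @ r)

S = [0, 3, 7]
print(rss_from_stats(S), rss_direct(S))  # the two agree
```

Note that even with the Gram matrix precomputed, solving the |S| x |S| system is still cubic in the model dimension; the constant-cost projected scores described in the abstract go further than this classical device.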
Practical information
- Informed public
- Free
Organizer
- Rajita Chandak
Contact
- Maroussia Schaffner