Gradient-Based Feature Learning under Structured Data

Event details

Date 20.10.2023
Hour 13:15 - 14:15
Speaker Alireza Mousavi-Hosseini (University of Toronto)
Location
Category Conferences - Seminars
Event Language English

Recent works have demonstrated that in high-dimensional settings, the sample complexity of gradient-based learning of single-index models, i.e., functions that depend on a one-dimensional projection of the input data, is governed by a property of the model called the information exponent. However, these results only consider isotropic data, while in practice the input often contains additional structure that can implicitly guide the algorithm. In this talk, we investigate the effect of a spiked covariance structure and reveal several interesting phenomena. First, we show that in the anisotropic setting, the commonly used spherical gradient dynamics may fail to recover the true direction; this can be alleviated by an appropriate weight normalization reminiscent of batch normalization. Further, we demonstrate that, depending on the spike-target alignment, the sample complexity can go through a three-stage phase transition. In particular, with a suitably large spike, the sample complexity of gradient-based training can be made independent of the information exponent while also outperforming lower bounds for rotationally invariant kernel methods.
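To make the setting concrete, the following is a minimal sketch (not the speaker's actual method) of the objects in the abstract: data with a spiked covariance, a single-index target, and spherical (projected and normalized) gradient descent on a single neuron. The spike strength, link function, dimensions, and learning rate are all illustrative choices, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 50          # ambient dimension (illustrative)
spike = 5.0     # spike strength lambda: Sigma = I + lambda * u u^T

u = np.zeros(d); u[0] = 1.0        # spike direction (hypothetical)
w_star = np.zeros(d); w_star[1] = 1.0  # target direction of the single-index model

def sample_x():
    # x = z + sqrt(lambda) * g * u with z ~ N(0, I), g ~ N(0, 1),
    # so Cov(x) = I + lambda * u u^T (a rank-one spike).
    return rng.standard_normal(d) + np.sqrt(spike) * rng.standard_normal() * u

def link(t):
    # illustrative polynomial link; the talk's results depend on the
    # Hermite expansion of the link via its information exponent
    return t ** 3

# Spherical gradient descent on a single neuron f_w(x) = link(<w, x>),
# squared loss, with the iterate kept on the unit sphere.
w = rng.standard_normal(d); w /= np.linalg.norm(w)
lr = 1e-3
for _ in range(20000):
    x = sample_x()
    y = link(x @ w_star)
    pred = link(x @ w)
    # d/dw of (pred - y)^2 with link(t) = t^3
    grad = 2.0 * (pred - y) * 3.0 * (x @ w) ** 2 * x
    grad = grad - (grad @ w) * w   # project onto the tangent space of the sphere
    w = w - lr * grad
    w = w / np.linalg.norm(w)      # renormalize back to the unit sphere

print("alignment |<w, w*>|:", abs(w @ w_star))
```

Here the spike direction `u` is orthogonal to the target `w_star`, the regime in which the abstract notes that spherical dynamics can be pulled off course; varying the angle between `u` and `w_star` is how one would probe the spike-target alignment phase transition numerically.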

Practical information

  • Informed public
  • Free

Organizer

  • Lénaïc Chizat
