IC Colloquium : Optimization for large-scale machine learning: big model and big data

Event details

Date 04.04.2016
Hour 10:15 - 11:30
Location
Category Conferences - Seminars
By : Anna Choromanska - New York University

IC Faculty candidate

Abstract :
The talk will focus on selected challenges in modern large-scale machine learning in two settings: (i) the big model (deep learning) setting and (ii) the big data setting. The first part of the talk focuses on the theoretical analysis of a challenging non-convex learning setting: deep learning with multilayer networks. Despite the success of convex methods, deep learning methods, where the objective is inherently highly non-convex, have enjoyed a resurgence of interest in the last few years. Deep networks achieve state-of-the-art performance on a number of problems in image recognition, speech recognition, natural language processing, and video recognition, but they are poorly understood from a theoretical perspective. Recent advances in deep learning theory will be presented. The connection between the highly non-convex loss function of a simple model of the fully-connected feed-forward neural network and the Hamiltonian of the spherical spin-glass model will be established. It will be shown that, under certain assumptions, (i) for large-size networks, most local minima are equivalent and yield similar performance on a test set, (ii) the probability of finding a “bad” local minimum, i.e. one with a high value of loss, is non-zero for small-size networks and decreases with network size, and (iii) struggling to find the global minimum on the training set (as opposed to one of the many good local ones) is not useful in practice and may lead to overfitting. The advances made by this research in the field of deep learning and applications will be discussed.
Modern machine learning approaches often use big models, like the deep learning models discussed in the first part of the talk, to process and learn from data. The recent widespread development of sensors, data-storage and data-acquisition devices has helped make big datasets commonplace. The second part of the talk focuses on the big data setting and addresses the problem of scaling learning algorithms to big data. The multi-class classification problem will be addressed, where the number of classes is extremely large, with the goal of obtaining training and test time complexity logarithmic in the number of classes. A reduction of this problem to a set of binary classification problems organized in a tree structure will be discussed. A top-down online approach for constructing logarithmic-depth trees, based on a new objective function, will be demonstrated. Under favorable conditions, the new approach leads to logarithmic-depth trees whose leaves have low label entropy. The discussed approach comes with theoretical guarantees that follow from convex analysis, even though the underlying problem is inherently non-convex. A general discussion concludes the talk.
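
For readers unfamiliar with the two quantities named at the end of the abstract, the following is a minimal illustrative sketch and is not taken from the talk. It assumes a balanced binary tree with one leaf per class and measures leaf purity as the Shannon entropy of the labels reaching a leaf. The function names `label_entropy` and `balanced_tree_depth` are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution.

    A leaf that receives examples of a single class has entropy 0;
    a leaf that mixes classes evenly has high entropy.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def balanced_tree_depth(num_classes):
    """Depth of a balanced binary tree with one leaf per class.

    Assumption for this sketch: the tree is perfectly balanced, so the
    depth is ceil(log2(num_classes)), computed here with integer
    arithmetic to avoid floating-point rounding.
    """
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return (num_classes - 1).bit_length()


if __name__ == "__main__":
    # One binary decision per level: depth grows with log2 of the class count.
    for k in (2, 1000, 1_000_000):
        print(k, "classes ->", balanced_tree_depth(k), "binary decisions")

    # A pure leaf versus an evenly mixed leaf.
    print(label_entropy(["cat", "cat", "cat", "cat"]))
    print(label_entropy(["cat", "dog", "cat", "dog"]))
```

Under these assumptions, predicting a label costs one binary decision per tree level, which is why a logarithmic-depth tree gives prediction time logarithmic in the number of classes.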

Bio :
Anna Choromanska is a Post-Doctoral Associate in the Computer Science Department at the Courant Institute of Mathematical Sciences, New York University. She works in the Computational and Biological Learning Lab, which is part of the Computational Intelligence, Learning, Vision, and Robotics Lab of Prof. Yann LeCun. She received her PhD from the Department of Electrical Engineering at Columbia University, where she held the Fu Foundation School of Engineering and Applied Science Presidential Fellowship and was advised by Prof. Tony Jebara. She completed her MSc with distinction in the Department of Electronics and Information Technology, Warsaw University of Technology, with a double specialization in Electronics and Computer Engineering and in Electronics and Informatics in Medicine. She has worked with various industrial institutions, including AT&T Research Laboratories, IBM T.J. Watson Research Center and Microsoft Research New York. Her research interests are in machine learning, optimization and statistics, with applications in biomedicine and neurobiology. She also holds a music degree from the Mieczyslaw Karlowicz Music School in Warsaw, Department of Piano Play. She is an avid salsa dancer performing with the Ache Performance Group. Her other hobbies are painting and photography.

Practical information

  • General public
  • Free
  • This event is internal

Contact

  • Host : Ruediger Urbanke
