BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:IC Colloquium : Optimization for large-scale machine learning: big
  model and big data
DTSTART:20160404T101500
DTEND:20160404T113000
DTSTAMP:20260407T125441Z
UID:e75eb060f8978c090449eb6964c61ff4082469bb515e19ce61a37c3a
CATEGORIES:Conferences - Seminars
DESCRIPTION:By : Anna Choromanska - New York University\nIC Faculty candid
 ate\nAbstract :\nThe talk will focus on selected challenges in modern 
 large-
 scale machine learning in two settings: i) big model (deep learning) setti
 ng and ii) big data setting. The first part of the talk focuses on the the
 oretical analysis of a challenging non-convex learning setting: deep 
 learnin
 g with multilayer networks. Despite the success of convex methods\, deep l
 earning methods\, where the objective is inherently highly non-convex\, ha
 ve enjoyed a resurgence of interest in the last few years. Deep networks a
 chieve state-of-the-art performance on a number of problems in image reco
 gnition\, speech recognition\, natural language processing\, and video rec
 ognition\, but they are poorly understood from a theoretical perspective
 . Recent advances in deep learning theory will be presented. The connectio
 n between the highly non-convex loss function of a simple model of the ful
 ly-connected feed-forward neural network and the Hamiltonian of the spheri
 cal spin-glass model will be established. It will be shown that under cert
 ain assumptions (i) for large-size networks\, most local minima are 
 equival
 ent and yield similar performance on a test set\, (ii) the probability of 
 finding a “bad” local minimum\, i.e. with a high value of loss\, is 
 non-
 zero for small-size networks and decreases with network size\, (iii) strug
 gling to find the global minimum on the training set (as opposed to one of
  the many good local ones) is not useful in practice and may lead to overf
 itting. The advances made by this research in the field of deep learning a
 nd applications will be discussed. Modern machine learning approaches ofte
 n use big models\, like deep learning models discussed in the first part o
 f the talk\, to process and learn from the data. The recent widespread dev
 elopment of sensors\, data-storage and data-acquisition devices has helped
  make big datasets commonplace. The second part of the talk focuses on a
  big data setting and addresses the problem of scaling learning algorithms
  to big data. The multi-class classification problem will be addressed\, w
 here the number of classes is extremely large\, with the goal of obtaining
  train and test time complexity logarithmic in the number of classes. A re
 duction of this problem to a set of binary classification problems organiz
 ed in a tree structure will be discussed. A top-down online tree construct
 ion approach for constructing logarithmic depth trees will be demonstrated
 \, which is based on a new objective function. Under favorable conditions\
 , the new approach leads to logarithmic depth trees that have leaves with 
 low label entropy. The discussed approach comes with theoretical 
 guarantees fo
 llowing from convex analysis\, though the underlying problem is inherently
  non-convex. A general discussion concludes the talk.\nBio :\nAnna 
 Choromanska
  is a Post-Doctoral Associate in the Computer Science Department at the 
 Couran
 t Institute of Mathematical Sciences\, New York University. She is working
  in the Computational and Biological Learning Lab\, which is part of the 
 Computational Intelligence\, Learning\, Vision\, and Robotics Lab of 
 Prof. 
 Yann LeCun. She graduated with her PhD from Columbia University\, Departme
 nt of Electrical Engineering\, where she held The Fu Foundation School 
 of Engineering and Applied Science Presidential Fellowship. She was
  advised by Prof. Tony Jebara. She completed her MSc with distinction in 
 the Department of Electronics and Information Technology\, Warsaw Universi
 ty of Technology with a double specialization\, Electronics and Computer 
 Eng
 ineering and Electronics and Informatics in Medicine. She has worked with
  various industrial institutions\, including AT&T Research Laboratories\, 
 IBM T.J. Watson Research Center and Microsoft Research New York. Her res
 earch interests are in machine learning\, optimization and statistics with
  applications in biomedicine and neurobiology. She also holds a music degr
 ee from Mieczyslaw Karlowicz Music School in Warsaw\, Department of Piano 
 Play. She is an avid salsa dancer performing with the Ache Performance Gro
 up. Her other hobbies are painting and photography.
LOCATION:BC 420 https://plan.epfl.ch/?room==BC%20420
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
