Optimization Landscape and Two-Layer Neural Networks

Modern machine learning often optimizes a nonconvex objective using simple algorithm such as gradient descent. One way of explaining the success of such simple algorithms is by analyzing the optimization landscape and show that all local minima are also globally optimal. However, even for two-layer neural networks, the optimization landscape is hard to analyze and have many different regimes, depending on the size of a student network compared to the teacher network or the size of training set. We will talk about some recent results on the mildly overparametrized setting.

Date

October 23, 2019

Speakers

Rong Ge, Institute for Advanced Study

Affiliation

Duke University; Member, School of Mathematics

School of Mathematics

Seminar on Theoretical Machine Learning