Towards Theoretical Understanding of Deep Learning

Organizers: Amit Daniely, Rong Ge, Tengyu Ma, Ohad Shamir


Deep learning has resulted in breakthroughs in many areas of artificial intelligence, including computer vision, speech, natural language processing, reinforcement learning, and robotics. However, theoretical understanding of deep learning remains scarce.

Why can simple algorithms (such as stochastic gradient descent) find high-quality solutions even though the objective function is non-convex? Why do the learned neural networks generalize to the test data even though the number of parameters is much larger than the number of examples? These fundamental questions about optimization and generalization have attracted a lot of recent attention, but we are still far from satisfying answers.

The new models and formulations in deep learning have also introduced new algorithmic challenges. For example, Generative Adversarial Nets (GANs) are very effective at generating images, but their training procedure is still very unstable. Could we design algorithms with convergence guarantees for GANs? Neural network models are often susceptible to adversarial examples. How can we find models that are robust to adversarial perturbations?


The workshop will serve as an introduction to recent developments in theoretical understanding of deep neural networks, including techniques, results, and research directions. We hope to bring together researchers in the theory community, foster research discussions between theory and practice, and eventually lead to interesting results that could impact deep learning practice.


Full talk abstracts

08:50-09:30 am | Sanjeev Arora | Understanding the "effective capacity" of deep nets via a compression approach | Slides
09:30-10:10 am | Aleksander Madry | Towards ML You Can Rely On | Slides
10:10-10:25 am | Break
10:25-11:15 am | Ohad Shamir | Is Depth Needed for Deep Learning? Circuit Complexity in Neural Networks | Slides
11:15-11:45 am | Tengyu Ma | Algorithmic Regularization in Over-parameterized Matrix Recovery and Neural Networks with Quadratic Activations | Slides
Lunch Break
1:00-1:50 pm | Nathan Srebro | Generalization and Implicit Regularization in Deep Learning
1:50-2:00 pm | Break
2:00-2:50 pm | Rong Ge | Do Deep Networks have Bad Local Minima? Brief survey on optimization landscape for neural networks | Slides
2:50-3:00 pm | Break
3:00-4:00 pm | Amit Daniely | On PAC learning and deep learning | Slides