Towards Theoretical Understanding of Deep Learning

Organizers: Amit Daniely (amit.daniely@mail.huji.ac.il), Rong Ge (rongge@cs.duke.edu), Tengyu Ma (tengyuma@cs.stanford.edu), Ohad Shamir (ohad.shamir@weizmann.ac.il)

Description

Deep learning has led to breakthroughs in many areas of artificial intelligence, including computer vision, speech recognition, natural language processing, reinforcement learning, and robotics. However, our theoretical understanding of deep learning remains limited.

Why can simple algorithms (such as stochastic gradient descent) find high-quality solutions even though the objective function is non-convex? Why do learned neural networks generalize to test data even though the number of parameters far exceeds the number of examples? These fundamental questions about optimization and generalization have attracted a lot of recent attention, but we are still far from satisfactory answers.
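The first puzzle above is easy to observe empirically. The following toy sketch (not from the workshop materials; all sizes and constants are illustrative) runs plain full-batch gradient descent on a small over-parameterized two-layer ReLU network, a non-convex objective, and checks that the training loss nonetheless decreases:

```python
import numpy as np

rng = np.random.default_rng(0)

# Over-parameterized toy problem: 20 examples, but a two-layer ReLU net
# with 10*50 + 50 = 550 trainable parameters.
n, d, h = 20, 10, 50
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

W = rng.standard_normal((d, h)) * 0.1  # first-layer weights
v = rng.standard_normal(h) * 0.1       # second-layer weights

def loss(W, v):
    pred = np.maximum(X @ W, 0.0) @ v  # ReLU network output
    return np.mean((pred - y) ** 2)

loss0 = loss(W, v)
lr = 0.02
for step in range(5000):
    H = X @ W
    A = np.maximum(H, 0.0)
    err = A @ v - y
    # Gradients of the mean squared error (full batch, for determinism).
    grad_v = 2 * A.T @ err / n
    grad_W = 2 * X.T @ ((err[:, None] * v[None, :]) * (H > 0)) / n
    v -= lr * grad_v
    W -= lr * grad_W

print(loss0, loss(W, v))  # the loss drops despite the non-convex objective
```

Explaining *why* such descent methods so consistently succeed on non-convex neural-network losses, and why the resulting interpolating solutions generalize, is exactly the kind of question the workshop addresses.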

The new models and formulations in deep learning have also introduced new algorithmic challenges. For example, Generative Adversarial Nets (GANs) are very effective at generating images, but their training procedure remains unstable. Can we design algorithms with convergence guarantees for GANs? Neural network models are often susceptible to adversarial examples. How can we find models that are robust to adversarial perturbations?

Goal

The workshop will serve as an introduction to recent developments in the theoretical understanding of deep neural networks, including techniques, results, and research directions. We hope to bring together researchers in the theory community, foster discussion between theory and practice, and ultimately produce results that impact deep learning practice.

Schedule

| Time | Speaker | Title | Slides |
| --- | --- | --- | --- |
| 08:50-09:30 am | Sanjeev Arora | Understanding the "effective capacity" of deep nets via a compression approach | Slides |
| 09:30-10:10 am | Aleksander Madry | Towards ML You Can Rely On | Slides |
| 10:10-10:25 am | Break | | |
| 10:25-11:15 am | Ohad Shamir | Is Depth Needed for Deep Learning? Circuit Complexity in Neural Networks | Slides |
| 11:15-11:45 am | Tengyu Ma | Algorithmic Regularization in Over-parameterized Matrix Recovery and Neural Networks with Quadratic Activations | Slides |
| | Lunch Break | | |
| 1:00-1:50 pm | Nathan Srebro | Generalization and Implicit Regularization in Deep Learning | |
| 1:50-2:00 pm | Break | | |
| 2:00-2:50 pm | Rong Ge | Do Deep Networks have Bad Local Minima? Brief survey on optimization landscape for neural networks | Slides |
| 2:50-3:00 pm | Break | | |
| 3:00-4:00 pm | Amit Daniely | On PAC learning and deep learning | Slides |