On the Implicit Bias of Dropout | Department of Mathematics

Seminar:

Applied Mathematics

Event time:

Tuesday, April 27, 2021 - 2:00pm

Location:

Zoom Meeting ID: 97670014308

Speaker:

Rene Vidal

Speaker affiliation:

Johns Hopkins University

Event description:

Abstract: Dropout is a simple yet effective regularization technique that has been applied to various machine learning tasks, including linear classification, matrix factorization and deep learning. However, the theoretical properties of dropout as a regularizer remain quite elusive. This talk will present a theoretical analysis of dropout for linear neural networks with a single hidden layer. We demonstrate that dropout is a stochastic gradient descent method for minimizing a certain regularized loss. We show that the regularizer induces solutions that are low-rank, in the sense of minimizing the number of neurons. We also show that the global optimum is balanced, in the sense that the product of the norms of the incoming and outgoing weight vectors is equal across all hidden nodes. Finally, we provide a complete characterization of the optimization landscape induced by dropout.
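
The claim that dropout minimizes a regularized loss can be checked numerically on a small example. The sketch below is illustrative only and is not taken from the talk: it assumes a squared loss, independent Bernoulli masks with retain probability theta, and 1/theta rescaling of the surviving hidden units. Under those assumptions, the dropout loss averaged over all masks equals the plain squared loss plus a penalty that couples each hidden node's incoming and outgoing weights. The names U, V, theta, dropout_expected_loss and regularized_loss are hypothetical and chosen for this example.

```python
import itertools
import numpy as np


def dropout_expected_loss(U, V, x, y, theta):
    """Exact average of the dropout squared loss over all 2^r masks.

    Assumed setup (not from the announcement): the network is x -> U @ (V.T @ x),
    each hidden unit is kept with probability theta and rescaled by 1/theta.
    """
    r = U.shape[1]
    h = V.T @ x  # hidden activations, shape (r,)
    total = 0.0
    for mask in itertools.product([0, 1], repeat=r):
        b = np.array(mask, dtype=float)
        # Probability of this particular mask under independent Bernoulli(theta).
        prob = np.prod(np.where(b == 1, theta, 1.0 - theta))
        pred = U @ (b * h) / theta
        total += prob * np.sum((y - pred) ** 2)
    return total


def regularized_loss(U, V, x, y, theta):
    """Squared loss plus the penalty induced by averaging over dropout masks."""
    h = V.T @ x
    fit = np.sum((y - U @ h) ** 2)
    # One term per hidden node i: ||u_i||^2 * (v_i . x)^2, scaled by (1 - theta) / theta.
    penalty = (1.0 - theta) / theta * np.sum(np.sum(U ** 2, axis=0) * h ** 2)
    return fit + penalty


rng = np.random.default_rng(0)
d_in, d_out, r, theta = 4, 3, 5, 0.7
U = rng.standard_normal((d_out, r))  # outgoing weights, one column per hidden node
V = rng.standard_normal((d_in, r))   # incoming weights, one column per hidden node
x = rng.standard_normal(d_in)
y = rng.standard_normal(d_out)

lhs = dropout_expected_loss(U, V, x, y, theta)
rhs = regularized_loss(U, V, x, y, theta)
print(lhs, rhs)  # the two values should agree up to floating-point error
```

Enumerating all masks keeps the comparison exact rather than Monte Carlo, which is only practical for a handful of hidden nodes. A single stochastic gradient step on one sampled mask is then an unbiased step on the regularized objective, which is the sense in which dropout can be read as stochastic gradient descent under these assumptions.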

Bio: Rene Vidal is the Herschel Seder Professor of Biomedical Engineering and the Inaugural Director of the Mathematical Institute for Data Science at The Johns Hopkins University. He has secondary appointments in Computer Science, Electrical and Computer Engineering, and Mechanical Engineering. He is also a faculty member in the Center for Imaging Science (CIS), the Institute for Computational Medicine (ICM) and the Laboratory for Computational Sensing and Robotics (LCSR). Vidal’s research focuses on the development of theory and algorithms for the analysis of complex high-dimensional datasets such as images, videos, time-series and biomedical data. His current major research focus is understanding the mathematical foundations of deep learning and its applications in computer vision and biomedical data science. His lab has pioneered the development of methods for dimensionality reduction and clustering, such as Generalized Principal Component Analysis and Sparse Subspace Clustering, and their applications to face recognition, object recognition, motion segmentation and action recognition. His lab creates new technologies for a variety of biomedical applications, including detection, classification and tracking of blood cells in holographic images, classification of embryonic cardio-myocytes in optical images, and assessment of surgical skill in surgical videos.

For more information, email tatianna.curtis@yale.edu.