On the Implicit Bias of Dropout

Seminar: Applied Mathematics
Event time: Tuesday, April 27, 2021 - 2:00pm
Location: Zoom (Meeting ID: 97670014308)
Speaker: Rene Vidal
Speaker affiliation: Johns Hopkins University
Event description: 

Abstract: Dropout is a simple yet effective regularization technique that has been applied to various machine learning tasks, including linear classification, matrix factorization and deep learning. However, the theoretical properties of dropout as a regularizer remain elusive. This talk will present a theoretical analysis of dropout for linear neural networks with a single hidden layer. We demonstrate that dropout is a stochastic gradient descent method for minimizing a certain regularized loss. We show that the regularizer induces solutions that are low-rank, in the sense of minimizing the number of neurons. We also show that the global optimum is balanced, in the sense that the product of the norms of the incoming and outgoing weight vectors is the same for all hidden nodes. Finally, we provide a complete characterization of the optimization landscape induced by dropout.
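
The abstract does not spell out the regularized loss. As a rough illustration only, the sketch below assumes the simpler matrix factorization setting (no input data matrix), a Bernoulli keep probability theta, and the usual 1/theta rescaling of kept units. Under those assumptions, the expected dropout loss equals the squared fitting error plus (1 - theta)/theta times the sum over hidden units of the squared column norms of U and V multiplied together. The function names are hypothetical and chosen for this example; the code checks the identity by enumerating every dropout mask rather than sampling.

```python
import itertools
import random


def masked_product(U, V, weights):
    """Return sum_i weights[i] * u_i v_i^T, with u_i, v_i the i-th columns of U, V."""
    d = len(weights)
    return [[sum(weights[i] * u_row[i] * v_row[i] for i in range(d))
             for v_row in V] for u_row in U]


def residual_sq(Y, P):
    """Squared Frobenius norm of Y - P."""
    return sum((y - p) ** 2 for y_row, p_row in zip(Y, P)
               for y, p in zip(y_row, p_row))


def expected_dropout_loss(Y, U, V, theta):
    """Exact expectation over all 2^d Bernoulli(theta) masks (assumed dropout model)."""
    d = len(U[0])
    total = 0.0
    for mask in itertools.product([0, 1], repeat=d):
        prob = 1.0
        for b in mask:
            prob *= theta if b else 1.0 - theta
        weights = [b / theta for b in mask]  # kept units are rescaled by 1/theta
        total += prob * residual_sq(Y, masked_product(U, V, weights))
    return total


def regularized_loss(Y, U, V, theta):
    """Fitting error plus the product-of-squared-norms penalty."""
    d = len(U[0])
    fit = residual_sq(Y, masked_product(U, V, [1.0] * d))
    penalty = sum(
        sum(row[i] ** 2 for row in U) * sum(row[i] ** 2 for row in V)
        for i in range(d)
    )
    return fit + (1.0 - theta) / theta * penalty


rng = random.Random(0)
n, m, d, theta = 4, 3, 5, 0.7  # illustrative sizes and keep probability
Y = [[rng.gauss(0, 1) for _ in range(m)] for _ in range(n)]
U = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
V = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]

lhs = expected_dropout_loss(Y, U, V, theta)
rhs = regularized_loss(Y, U, V, theta)
print(abs(lhs - rhs))  # difference should be at floating-point rounding level
```

Because the penalty depends on the product of the two column norms for each hidden unit, rescaling a column of U by a constant and the matching column of V by its inverse leaves it unchanged, which is consistent with the balance property described in the abstract being stated in terms of that product.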

Bio: Rene Vidal is the Herschel Seder Professor of Biomedical Engineering and the Inaugural Director of the Mathematical Institute for Data Science at The Johns Hopkins University. He has secondary appointments in Computer Science, Electrical and Computer Engineering, and Mechanical Engineering. He is also a faculty member in the Center for Imaging Science (CIS), the Institute for Computational Medicine (ICM) and the Laboratory for Computational Sensing and Robotics (LCSR). Vidal’s research focuses on the development of theory and algorithms for the analysis of complex high-dimensional datasets such as images, videos, time-series and biomedical data. His current major research focus is understanding the mathematical foundations of deep learning and its applications in computer vision and biomedical data science. His lab has pioneered the development of methods for dimensionality reduction and clustering, such as Generalized Principal Component Analysis and Sparse Subspace Clustering, and their applications to face recognition, object recognition, motion segmentation and action recognition. His lab creates new technologies for a variety of biomedical applications, including detection, classification and tracking of blood cells in holographic images, classification of embryonic cardio-myocytes in optical images, and assessment of surgical skill in surgical videos.

For more information, email tatianna.curtis@yale.edu.