Inductive biases for deep learning over sequential data: from connectivity to memory addressing

Seminar:

Applied Mathematics

Event time:

Monday, January 18, 2021 - 2:30pm

Location:

Zoom Meeting ID: 97670014308

Speaker:

Guillaume Lajoie

Speaker affiliation:

Mila / Université de Montréal

Event description:

ABSTRACT: In neural networks, a key hurdle for efficient learning involving sequential data is ensuring good signal propagation over long timescales, while simultaneously allowing systems to be expressive enough to implement complex computations. The brain has evolved to tackle this problem on different scales, and deriving architectural inductive biases based on these strategies can help design better AI systems.

In this talk, I will present two examples of such inductive biases for recurrent neural networks with and without self-attention. In the first, we propose a novel connectivity structure based on "hidden feed-forward" features, using an efficient parametrization of connectivity matrices based on the Schur decomposition. In the second, we present a formal analysis of how self-attention affects gradient propagation in recurrent networks, and prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies.

Contact tatianna.curtis@yale.edu for the password.