Reconstructing Flexible Proteins from Massive Microscopy Datasets

Applied Mathematics
Event time: 
Wednesday, February 7, 2024 - 3:00pm
LOM 214
Marc Aurele Gilles
Event description: 

Reconstructing flexible proteins is one of the most critical challenges in structural biology: observing the motion of biomolecules gives insight into their functions and mechanisms. Cryogenic electron microscopy (cryo-EM) stands out as an ideal technique for studying the dynamic conformational landscape (i.e., the range of motions) because it can capture a snapshot of the entire conformational ensemble. However, the reconstruction task poses notable mathematical and computational challenges: the datasets are massive (sometimes exceeding terabytes), high-dimensional, and substantially noisy.


After reviewing the basics of the cryo-EM reconstruction problem, I will present a framework for reconstructing the distribution of protein conformations in a dataset by representing it in a linear subspace. The initial step can be viewed as a linear algebra problem: how can one compute a basis for proteins (3D volumes) from only incomplete and noisy measurements (2D images)? I will propose a method based on a Nyström extension of a regularized estimator of the covariance of the volumes. In subsequent steps, I will use this low-dimensional basis to reconstruct individual volumes using standard statistical methods and to infer motions using elements of optimal control. I will conclude by discussing remaining challenges and open problems.
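
For readers unfamiliar with the Nyström idea mentioned above, the following is a minimal generic sketch, not the speaker's method. It assumes that a subset of columns of a positive semidefinite covariance matrix is already available, and it recovers an approximate leading eigenbasis from those columns alone. The regularized covariance estimator built from 2D images is not reproduced here; the function name `nystrom_basis`, its parameters, and the synthetic low-rank test matrix are all hypothetical choices made for illustration.

```python
import numpy as np

def nystrom_basis(cov_columns, landmark_idx, rank, eps=1e-10):
    """Approximate the leading eigenpairs of a PSD matrix C from a few columns.

    cov_columns  : (n, m) array holding the columns C[:, landmark_idx]
    landmark_idx : indices of the m sampled columns
    rank         : number of basis vectors to return
    (All names are illustrative assumptions, not taken from the talk.)
    """
    # Small m x m block of C restricted to the sampled rows and columns.
    W = cov_columns[landmark_idx, :]
    W = 0.5 * (W + W.T)  # symmetrize against round-off

    # Eigendecomposition of the small block; drop near-zero eigenvalues.
    evals, evecs = np.linalg.eigh(W)
    keep = evals > eps * evals.max()

    # F satisfies F @ F.T == C_cols @ pinv(W) @ C_cols.T (the Nystrom approximation).
    F = cov_columns @ (evecs[:, keep] / np.sqrt(evals[keep]))

    # Orthonormal eigenvectors and eigenvalues of the approximation.
    Q, s, _ = np.linalg.svd(F, full_matrices=False)
    return Q[:, :rank], s[:rank] ** 2

# Synthetic check: an exactly rank-5 covariance, sampled at 20 columns.
rng = np.random.default_rng(0)
n, r, m = 200, 5, 20
B = rng.standard_normal((n, r))
C = B @ B.T
idx = rng.choice(n, size=m, replace=False)

basis, eigvals = nystrom_basis(C[:, idx], idx, rank=r)
C_hat = (basis * eigvals) @ basis.T
rel_err = np.linalg.norm(C - C_hat) / np.linalg.norm(C)
print(f"relative error: {rel_err:.2e}")
```

In this toy setting the matrix is exactly low rank, so the approximation from 20 columns is accurate up to round-off. With real cryo-EM data the covariance is only approximately low rank and must itself be estimated from noisy 2D projections, which is the harder part the talk addresses.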