Covariance Alignment with Optimal Transport

Seminar:

Applied Mathematics

Event time:

Wednesday, April 10, 2024 - 3:00pm

Location:

LOM 214

Speaker:

George Stepaniants

Speaker affiliation:

MIT

Event description:

Dataset or feature alignment is a longstanding problem that appears in many areas, including computer vision, natural language translation, and biostatistics. Here we show how a novel type of alignment problem arises in the matching of “untargeted” biological data, where the concentrations of unlabeled biological molecules (features) are recorded over a collection of samples or patients. Partnering with biologists at the International Agency for Research on Cancer (IARC), we develop a practical and efficient tool for untargeted dataset alignment to be used in laboratory settings. Our approach aligns feature covariance matrices between datasets using the celebrated Gromov-Wasserstein (GW) algorithm from optimal transport. Motivated by the success of our approach, we investigate the statistical complexity of Gromov-Wasserstein for aligning empirical covariance matrices. Remarkably, we find that the GW algorithm achieves the same minimax-optimal rates for this problem as a (quasi) maximum likelihood estimator, proving that it is statistically competitive. These results offer a new, challenging setting for graph matching of Wishart (covariance) matrices and the first statistical rates of estimation for the Gromov-Wasserstein algorithm.