Spectral methods for unsupervised ensemble learning and latent variable models

Seminar: Applied Mathematics/Analysis Seminar
Event time: Tuesday, January 16, 2018 - 4:00pm to 5:00pm
Speaker: Ariel Jaffe
Speaker affiliation: Weizmann Institute
Event description:

With the availability of huge amounts of unlabeled data, unsupervised learning methods are gaining popularity and importance. We focus on “unsupervised ensemble learning”, where one obtains the predictions of multiple classifiers over a set of unlabeled instances. The classifiers may be human experts, as in crowdsourcing, or prediction algorithms developed by research groups worldwide. The challenge is to estimate the accuracies of the different classifiers and combine them into an accurate meta-learner. To tackle this problem, we show how it relates to latent variable models and derive simple estimates for the classifiers’ accuracies based on a spectral analysis of the observed data. On the experimental side, we apply our methods to a problem in computational biology, where for various classification tasks one combines the results of multiple algorithms for improved accuracy. In the second part of the talk, I will focus on extending the techniques developed for unsupervised ensemble learning to a specific family of linear latent variable models. For cases where the latent layer is binary, we derive an interesting relation between the model parameters and the relatively recent notion of tensor eigenvectors of the higher-order moments of the data. We apply our methods to overlapping clustering, a problem that has gained popularity due to its applicability in various domains such as gene expression analysis and text categorization.