[Colloquium] [Staff] Theory Seminars at Computer Science

Donna Brooms donna at cs.uchicago.edu
Wed Dec 26 11:30:15 CST 2012


THEORY SEMINAR!

~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 

Wednesday, January 9, 2013
3:00 p.m.

Ryerson 251



Yuval Rabani

The Hebrew University of Jerusalem

www.cs.huji.ac.il/~yrabani/



Title: Learning Mixtures of Arbitrary Distributions over Large Discrete Domains

Abstract: We give an algorithm for learning a mixture of unstructured distributions. This problem arises in various unsupervised learning scenarios, for example in learning topic models from a corpus of documents spanning several topics. We show how to learn the constituents (the topic distributions and the mixture weights) of a mixture of k (constant) arbitrary distributions over a large discrete domain [n] = {1, 2, ... , n}, using O(n polylog n) samples.

This task is information-theoretically impossible for k > 1 under the usual sampling process from a mixture distribution. However, there are situations (such as the above-mentioned topic model case) in which each sample point consists of several observations from the same mixture constituent. This number of observations, which we call the “sampling aperture”, is a crucial parameter of the problem. We show that efficient learning is possible exactly at the information- theoretically least-possible aperture of 2k − 1. (Independent work by others places certain restrictions on the model, which enables learning with smaller aperture, albeit using, in general, a significantly larger sample size.)

A sequence of tools contribute to the algorithm, such as concentration results for Random matrices, dimension reduction, moment estimations, and sensitivity analysis.

This is joint work with Leonard Schulman and Chaitanya Swamy.

Host: Prof. Subhash Khot

*Refreshments will be served prior to the talk at 2:30 in Ryerson 255*

 



-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.cs.uchicago.edu/pipermail/colloquium/attachments/20121226/77a704a4/attachment.htm 


More information about the Colloquium mailing list