[Colloquium] REMINDER - Tomorrow 10:30 - 11:30 am ~ Machine Learning Seminar ~ Madeleine Udell

Annie Simmons simmons3 at cs.uchicago.edu
Thu Oct 17 10:51:25 CDT 2019


University of Chicago and Toyota Technological Institute at Chicago
Machine Learning Seminar Series


Madeleine Udell
Assistant Professor, Operations Research & Information Engineering
Cornell

Friday, October 18, 10:30 – 11:30 am
JCL, RM 390


Title:
Big data is low rank

Abstract:
Matrices of low rank are pervasive in big data, appearing in recommender systems, movie preferences, topic models, medical records, and genomics. While there is a vast literature on how to exploit low rank structure in these datasets, less attention has been paid to explaining why low rank structure appears in the first place. In this talk, we explain the abundance of low rank matrices in big data by proving that certain latent variable models associated with piecewise analytic functions are of log-rank. Any large matrix from such a latent variable model can be approximated, up to a small error, by a low rank matrix. Armed with this theorem, we show how to use a low rank modeling framework to exploit low rank structure even for datasets that are not numeric, with applications in the social sciences, medicine, and automated machine learning.
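
For readers who want a feel for the claim before the talk, here is a minimal sketch (not the speaker's proof or code). It assumes a hypothetical latent variable model in which entry (i, j) equals exp(x_i * y_j), with latent values x_i, y_j drawn uniformly from [-1, 1]. The function, the sample sizes, and the helper names are illustrative assumptions. Truncating the Taylor series of exp gives explicit low rank factors, and the entrywise error depends on the rank rather than on the matrix size.

```python
import math
import random

def latent_matrix(xs, ys):
    """Entry (i, j) is f(x_i, y_j) with f(x, y) = exp(x * y), an analytic function."""
    return [[math.exp(x * y) for y in ys] for x in xs]

def taylor_factors(xs, ys, rank):
    """Factors U, V of width `rank` from truncating exp(x*y) = sum_k x^k y^k / k!."""
    U = [[x ** k / math.factorial(k) for k in range(rank)] for x in xs]
    V = [[y ** k for k in range(rank)] for y in ys]
    return U, V

def max_abs_error(A, U, V):
    """Largest entrywise gap between A and the product U V^T."""
    return max(
        abs(A[i][j] - sum(u * v for u, v in zip(U[i], V[j])))
        for i in range(len(A))
        for j in range(len(A[0]))
    )

# Hypothetical latent variables; the seed and sizes are arbitrary choices.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(200)]
ys = [rng.uniform(-1, 1) for _ in range(150)]
A = latent_matrix(xs, ys)

errors = {r: max_abs_error(A, *taylor_factors(xs, ys, r)) for r in (1, 2, 4, 8)}
for r, err in errors.items():
    print(f"rank {r}: max entrywise error {err:.2e}")
```

Because |x_i * y_j| <= 1 here, the Taylor remainder bounds the rank 8 error below 1e-4 for a 200 x 150 matrix, and the same bound holds however many rows and columns are sampled. This particular function is only a toy; the theorem in the abstract covers a broader class of piecewise analytic functions.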

Bio:
Madeleine Udell is Assistant Professor of Operations Research and Information Engineering and Richard and Sybil Smith Sesquicentennial Fellow at Cornell University. She studies optimization and machine learning for large scale data analysis and control, with applications in marketing, demographic modeling, medical informatics, engineering system design, and automated machine learning. Her research in optimization centers on detecting and exploiting novel structures in optimization problems, with a particular focus on convex and low rank problems. Her research in machine learning centers on methods for imputing missing data in large tabular data sets. Her work on generalized low rank models (GLRMs) extends principal components analysis (PCA) to embed tabular data sets with heterogeneous (numerical, Boolean, categorical, and ordinal) types into a low dimensional space, providing a coherent framework for compressing, denoising, and imputing missing entries. 
Host: Rebecca Willett





Annie Simmons
Project Assistant IV
Computer Science Department
John Crerar Library Building
5730 S. Ellis
Chicago, IL 60637 
773.834.2750
773.702.8487
simmons3 at cs.uchicago.edu

“The dream is free the hustle is sold separately”