[Colloquium] REMINDER: 1/25 Talks at TTIC: Ilya Razenshteyn, MIT

Mary Marre via Colloquium colloquium at mailman.cs.uchicago.edu
Tue Jan 24 14:14:57 CST 2017


When:     Wednesday, January 25th at 11:00 am

Where:    TTIC, 6045 S Kenwood Avenue, 5th Floor, Room 526

Who:       Ilya Razenshteyn, MIT: CSAIL


Title:        New Algorithms for High-Dimensional Data

Abstract: A popular approach in data analysis is to represent a dataset in
a high-dimensional feature vector space, and reduce a given task to a
geometric computational problem. As it turns out, the running times of most
of the classic geometric algorithms suffer from an exponential dependence
on the dimension and are typically not applicable to the high-dimensional
regime. This necessitates the development of new algorithmic approaches
that overcome this "curse of dimensionality". In this talk I will give an
overview of my work in this area. In particular:

* I will describe new algorithms for the high-dimensional Nearest Neighbor
Search (NNS) problem.  Our algorithms improve, for the first time, upon the
popular Locality-Sensitive Hashing (LSH) approach. The main insight behind
the new algorithms is the use of hash functions that are *data dependent*,
i.e., , that are carefully tailored to a given dataset. In contrast, prior
algorithms used generic, data-oblivious hash families.

* Next, I will outline how to make the core component of the above
algorithms---the optimal LSH for cosine distance---practical. This yields
an implementation of LSH which is competitive with the state of the art
heuristics for the NNS problem.

* Finally, I will show how to determine the limits of distance-preserving
data sketching, i.e., vector compression methods that approximately
preserve the distances between vectors. In particular, I will show the
first non-trivial lower bound for the sketching complexity of the
well-studied Earth Mover Distance.

The common theme that unifies these and other algorithms for
high-dimensional data is the use of "efficient data representations":
randomized hashing, sketching, dimensionality reduction, metric embeddings,
and others. One goal of the talk is to describe a holistic view of all of
these techniques.


Host: Yury Makarychev <yury at ttic.edu>




Mary C. Marre
Administrative Assistant
*Toyota Technological Institute*
*6045 S. Kenwood Avenue*
*Room 504*
*Chicago, IL  60637*
*p:(773) 834-1757*
*f: (773) 357-6970*
*mmarre at ttic.edu <mmarre at ttic.edu>*
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.cs.uchicago.edu/pipermail/colloquium/attachments/20170124/3342a6b8/attachment.html>


More information about the Colloquium mailing list