[Colloquium] ML Seminar: Daniel Hsu, UCSD

Tue Oct 28 12:17:13 CDT 2008

When:   Wednesday, October 29 @ 11:00am

Where:  TTI-C Conference Room: 1427 E. 60th St, 2nd Floor

Who:    Daniel Hsu, University of California, San Diego

Title:  Consistent sampling strategies for active learning

In many applications, labeled data typically comes at a higher cost than
unlabeled data (e.g., in time or effort).  An active learner is given unlabeled
data and must pay to view any label.  The hope is that significantly fewer
labeled examples are used than in the supervised (non-active) learning
model.

A typical strategy for active learning starts by querying a few
randomly-chosen points to get a very rough idea of the decision boundary,
and then queries points that are increasingly close to its current estimate
of the boundary.  Such selective sampling methods immediately bring to the
forefront the unique difficulty of active learning: sampling bias.

In this talk, I'll describe active learning strategies that properly manage
the sampling bias of selective sampling and contrast them with popular
heuristics that are provably inconsistent.

Based on joint work with Sanjoy Dasgupta and Claire Monteleoni.

Contact:  Shai Shalev-Shwartz, TTI-C, shai at tti-c.org, 834-6850
