TTI-C Talk by David Lewis on 1/29/03
Meridel Trimble
mtrimble at tti-c.org
Thu Jan 23 09:20:22 CST 2003
---------------------------------------------------------------
TOYOTA TECHNOLOGICAL INSTITUTE - TALK
---------------------------------------------------------------
Date: Wednesday, January 29th, 2003
Time: 2:30 p.m.
Place: Ryerson Hall 251
Speaker: David Lewis
www.daviddlewis.com
Title: Text Classification with Many Classes
-------------------------------------
Abstract:
I recently applied several learning algorithms (including
naive Bayes, boosting, and Winnow) to categorizing partially
textual records into 1 of 797 disjoint categories. In
addition to the usual pleasures of noisy data, complex
taxonomies, inconsistent manual classification, and
awkwardly structured prior knowledge, I'll suggest more
fundamental issues this work raises about classification
with a large number of categories. (Joint work with Tom
Pollak and Sheryl Romeo of the Urban Institute.)
I'll also briefly discuss two ongoing projects motivated by
the above results. One is a sparse nearest neighbor
classifier whose runtime scales sublinearly with the number of
classes. The second is a variant of probit regression that closely
approximates support vector machines while incorporating
prior knowledge in a Bayesian framework (joint work with David
Madigan and Alex Genkin of Rutgers University).
Biography: David D. Lewis is an independent consultant based
in Chicago, IL. He has previously held research positions
at AT&T Labs, Bell Labs, and the University of Chicago.
Lewis has published more than 40 papers and 5 patents, has
created several widely used test collections, and has been
extensively involved in designing US government evaluations
of language processing technology. He received his Ph.D. in
Computer Science from the University of Massachusetts at
Amherst, and his dissertation won the 1992 American Society
for Information Science Doctoral Forum Award.
--------------------------------------------------
*The talk will be followed by refreshments in Ryerson 255*
If you would like to meet with the speaker, please send e-mail to Meridel
Trimble: mtrimble at tti-c.org