[Colloquia] TTI-C Talk by David Lewis on 1/29/03

Meridel Trimble mtrimble at tti-c.org
Thu Jan 23 09:20:22 CST 2003


--------------------------------------------------------------- 
TOYOTA TECHNOLOGICAL INSTITUTE - TALK 
--------------------------------------------------------------- 

Date: Wednesday, January 29th, 2003 

Time: 2:30 p.m. 

Place: Ryerson Hall 251 

Speaker: David Lewis
www.daviddlewis.com

Title: Text Classification with Many Classes

------------------------------------- 
Abstract: 

I recently applied several learning algorithms (including 
naive Bayes, boosting, and Winnow) to categorizing partially 
textual records into 1 of 797 disjoint categories.  In 
addition to the usual pleasures of noisy data, complex 
taxonomies, inconsistent manual classification, and 
awkwardly structured prior knowledge, I'll suggest more 
fundamental issues this work raises about classification 
with a large number of categories.  (Joint work with Tom 
Pollak and Sheryl Romeo of the Urban Institute.) 

I'll also briefly discuss two ongoing projects motivated by 
the above results. One is a sparse nearest neighbor 
classifier with runtime that scales sublinearly with the number of 
classes. The second is a variant of probit regression that closely 
approximates support vector machines while incorporating 
prior knowledge in a Bayesian framework (joint work with David 
Madigan and Alex Genkin of Rutgers University). 

Biography: David D. Lewis is an independent consultant based 
in Chicago, IL.  He has previously held research positions 
at AT&T Labs, Bell Labs, and the University of Chicago. 
Lewis has published more than 40 papers and 5 patents, has 
created several widely used test collections, and has been 
extensively involved in designing US government evaluations 
of language processing technology.  He received his Ph.D. in 
Computer Science from the University of Massachusetts at 
Amherst, and his dissertation won the 1992 American Society 
for Information Science Doctoral Forum Award. 
-------------------------------------------------- 

*The talk will be followed by refreshments in Ryerson 255* 

If you would like to meet with the speaker, please send e-mail to Meridel 
Trimble: mtrimble at tti-c.org 


