[Colloquium] Guest Speaker @ TTI-C This Friday (3/10/06)

Katherine Cumming kcumming at tti-c.org
Tue Mar 7 07:43:08 CST 2006


****TTI-C Guest Speaker This Friday*****

Speaker:  Dean Foster
Organization:  Statistics, Wharton, University of Pennsylvania
Speaker's homepage:  http://gosset.wharton.upenn.edu/


Date:  Friday, March 10, 2006
Location:  TTI-C Conference Room
Time:  12:00 noon (Machine Learning Reading Group)

Title: "Feature selection: An auction model for finding good features"

Abstract: 
In machine learning, features are typically regularized rather than
selected.  In statistics, selection is more common, using a tool called
stepwise regression.  This talk will introduce a replacement for
stepwise regression that has controllable properties and is designed
for large data sets.  We have run it on bankruptcy data with millions
of observations (but only 100k features) and on other, smaller
datasets (e.g. NIPS) for which we generated millions of features.  In
both cases, we avoided overfitting automatically--in other words,
we didn't use a hold-out sample.  Further, we could dynamically change
which features to try based on whether or not the previous features
improved the fit.

In detail, we use an auction as a metaphor that allows one to exploit
multiple streams of predictors.  In this metaphor, the bidders in the
auctions suggest predictors.  If they provide good predictors, they
are rewarded and can suggest more predictors.  This success is
measured in bits, à la MDL.  By controlling the total bits available to
bidders, the auction avoids overfitting.  (Research done with Robert
A. Stine and Lyle H. Ungar.)
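
The auction metaphor above can be illustrated with a small sketch.  It
is not the speaker's algorithm: the bidding rule, the payoff, the
stagewise fit, and every name in it (run_auction, savings_in_bits,
initial_bits, bid_fraction, payoff_bits, min_bid) are assumptions made
for illustration.  Each bidder owns a stream of candidate predictors
and a budget in bits.  A bid is always spent, and a predictor is kept
only if it shortens the coding of the residuals by more bits than were
bid, in which case the bidder earns a payoff.

```python
import math


def rss(values):
    """Residual sum of squares."""
    return sum(v * v for v in values)


def savings_in_bits(rss_old, rss_new, n):
    """Bits saved coding n residuals under a Gaussian model (assumed rule)."""
    if rss_new >= rss_old:
        return 0.0
    if rss_new == 0.0:
        return math.inf
    return 0.5 * n * math.log2(rss_old / rss_new)


def run_auction(streams, y, initial_bits=8.0, bid_fraction=0.25,
                payoff_bits=4.0, min_bid=1.0):
    """streams maps a bidder name to a list of (feature_name, values) pairs."""
    wealth = {name: initial_bits for name in streams}
    queues = {name: list(feats) for name, feats in streams.items()}
    active = list(streams)          # a list keeps tie-breaking deterministic
    resid = list(y)
    accepted = []
    while active:
        bidder = max(active, key=lambda name: wealth[name])  # richest bids next
        if not queues[bidder] or wealth[bidder] < min_bid:
            active.remove(bidder)   # out of predictors or out of bits
            continue
        feat_name, x = queues[bidder].pop(0)
        bid = max(min_bid, bid_fraction * wealth[bidder])
        wealth[bidder] -= bid       # the bid is spent whether or not it wins
        # Fit the candidate to the current residuals (stagewise, no refit).
        sxx = sum(a * a for a in x)
        slope = sum(a * r for a, r in zip(x, resid)) / sxx if sxx else 0.0
        new_resid = [r - slope * a for a, r in zip(x, resid)]
        if savings_in_bits(rss(resid), rss(new_resid), len(y)) > bid:
            resid = new_resid
            accepted.append(feat_name)
            wealth[bidder] += payoff_bits   # reward a useful predictor
    return accepted, wealth
```

Because a bidder can only spend bits it holds, the total spent is at
most the initial budgets plus the payoffs earned by accepted
predictors.  That bound plays the role of the "total bits available to
bidders" in this sketch.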