[Colloquium] Reminder and CHANGE: TTI-C Talk Rich Caruana, November 4th @ 1:30pm

Katherine Cumming kcumming at tti-c.org
Fri Nov 4 08:34:25 CST 2005


 
TOYOTA TECHNOLOGICAL INSTITUTE TALK
 
Speaker:  Rich Caruana, Cornell
Speaker's homepage:  http://www.cs.cornell.edu/~caruana/
NOTE:  TIME HAS BEEN CHANGED TO 1:30 pm 
Time:  Friday, November 4th, 1:30pm
Location:  TTI-C Conference Room 
 
Title: Which Supervised Learning Method is Best For What? An Empirical
Comparison of Supervised Learning Methods.
 
Abstract:
 
Decision trees may be intelligible, but do they perform well enough that
you'd really want to use them? Have SVMs replaced neural nets, or are neural
nets still the best models for regression, and SVMs best for classification?
Boosting maximizes a margin similar to SVMs, but can boosting compete with
SVMs? And if it does compete, is it better to boost weak models, as theory
might suggest, or to boost stronger models? Bagging is much simpler than
boosting -- how well does bagging stack up against boosting? Breiman said
Random Forests are better than bagging and as good as boosting. Was he
right? And what about old friends like logistic regression, MBL, and naive
Bayes -- should they be put out to pasture, or do they still fill important
niches?
 
In this talk we compare the performance of the ten supervised learning
methods above on nine different criteria: Accuracy, F-score, Lift,
Precision/Recall Break-Even Point, Area under the ROC, Average Precision,
Squared Error, Cross-Entropy, and Probability Calibration. The results show
that no one learning method does it all, but some methods can be "repaired"
so that they do well across all performance metrics. In particular, we show
how to obtain the best probabilities from maximum margin methods such as
SVMs and boosting. We then describe an ensemble learning method that
combines select models from these ten learning methods to yield even better
performance. Finally, if time permits, we'll discuss how the nine
performance metrics relate to each other, and which of them you probably
should or shouldn't use.
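(For readers curious about the probability "repair" mentioned above: one
standard technique for turning max-margin scores into calibrated
probabilities is Platt scaling, i.e. fitting a sigmoid to the classifier's
decision values on held-out data. The sketch below uses scikit-learn's
CalibratedClassifierCV as an illustration of the idea; it is not the
speaker's own code, and the dataset is synthetic.)

```python
# Illustrative sketch: Platt scaling an SVM's scores into probabilities.
# Assumes scikit-learn is installed; data is synthetic, for demonstration only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import brier_score_loss

X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# A raw linear SVM outputs margins, not probabilities.
# Wrapping it with sigmoid (Platt) calibration fits p = 1/(1+exp(a*f(x)+b))
# via cross-validation on the training set.
svm = LinearSVC()
calibrated = CalibratedClassifierCV(svm, method="sigmoid", cv=5)
calibrated.fit(X_tr, y_tr)

probs = calibrated.predict_proba(X_te)[:, 1]
print("Brier score of calibrated SVM:", brier_score_loss(y_te, probs))
```

Isotonic regression (method="isotonic") is the other common choice; it is
more flexible but needs more calibration data.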
 
Bio:
 
Rich Caruana is an Assistant Professor of Computer Science at Cornell
University. He received his Ph.D. from CMU in 1997, where he worked with Tom
Mitchell and Herb Simon. Before joining the faculty at Cornell in 2001, he
was on the faculty in the Medical School at UCLA and at CMU's Center for
Automated Learning and Discovery (CALD). Rich's research is in machine
learning and data mining, and applications of these to medical decision
making, bioinformatics, and weather forecasting. He is best known for his
work in inductive transfer, semi-supervised learning, and optimizing
learning for different performance criteria. Rich likes to mix algorithm
development with applications work to ensure that the methods he develops
really work in practice.
 
-----------------------------------------------------------------
 
If you have questions, or would like to meet the speaker, please contact
Katherine at 773-834-1994 or kcumming at tti-c.org.   For information on future
TTI-C talks and events, please go to the TTI-C Events page:
http://www.tti-c.org/events.html.  TTI-C (1427 East 60th Street, Chicago, IL
60637)
 
 