[Colloquium] Michael Collins talk tomorrow (Fri, 11/14) at TTI
Meridel Trimble
mtrimble at tti-c.org
Thu Nov 13 08:57:13 CST 2003
TOYOTA TECHNOLOGICAL INSTITUTE TALK
Speaker: Michael Collins
MIT
Speakers homepage: http://www.ai.mit.edu/people/mcollins/
Time: 11:45am
Date: Friday, November 14th, 2003
Place: TTI-C (1427 East 60th Street, Second Floor - Press Building)
*FREE LUNCH FOLLOWING THE TALK*
Title: Large-Margin Methods for Natural Language Learning
Abstract: Sequential data is everywhere: obvious examples being text (the web,
or digital libraries), speech, and biological sequences. Algorithms which
recover structure underlying this data are becoming increasingly important.
This leads to an interesting class of learning problems: how to learn
functions which map strings to other discrete structures such as trees,
segmentations, or underlying state sequences?
In this talk I will present new algorithms for these problems, derived from
Freund and Schapire's voted perceptron algorithm for classification tasks.
Properties of the algorithm depend directly on a modified notion of "margin"
on training examples. I will describe how the algorithm can be used to rerank
N-best output from an existing probabilistic model, using essentially
arbitrary features of competing analyses; how the "kernel trick" applied to
discrete structures can lead to efficient learning with representations
tracking an exponential number of "sub-fragments" of a tree or tagged
sequence; and how the algorithm can be used for efficient discriminative
training of weighted automata.
A first motivation for the new algorithms concerns *representation*: in
comparison to hidden markov models, or probabilistic context-free grammars,
the methods are considerably more flexible in the features that can be used to
discriminate between competing structures. A second theme is *discriminative
training*: the parameter estimation methods make relatively weak assumptions
about the distribution underlying examples. During the talk I will present
experiments with the methods, showing their utility on a number of natural
language problems.
If you have questions, or would like to meet the speaker, please contact
Meridel at: 4-9873 or mtrimble at tti-c.org
For information on future TTI-C talks or events, please go to the TTI-C Events
page: http://www.tti-c.org/events.shtml
More information about the Colloquium
mailing list