[Colloquium] TTI-C Talk: Slav Petrov, UC Berkeley

Julia MacGlashan macglashan at tti-c.org
Tue Mar 17 09:30:28 CDT 2009


REMINDER

When:             Wednesday, March 18th @ 11:00am (lunch will be provided
after the talk)

Where:            6045 S Kenwood Ave, TTI-C Conference Room #526 (5th Floor)

Who:               Slav Petrov (University of California, Berkeley)

Title:                Coarse-to-Fine Methods for Natural Language Processing


State-of-the-art NLP models are anything but compact. Parsers have huge
grammars, machine translation systems have huge transfer tables, and so on
across a range of tasks. With such complexity come two challenges. First,
how can we learn highly complex models? Second, how can we efficiently infer
optimal structures within them?

Hierarchical coarse-to-fine methods address both questions.
Coarse-to-fine approaches exploit a sequence of models which introduce
complexity gradually. At the top of the sequence is a trivial model in which
learning and inference are both cheap.  Each subsequent model refines the
previous one, until a final, full-complexity model is reached. Because each
refinement introduces only limited complexity, both learning and inference
can be done in an incremental fashion. In this talk, I describe several
coarse-to-fine systems.

In the domain of syntactic parsing, complexity is in the grammar. I present
a latent-variable approach which begins with an X-bar grammar and learns to
iteratively refine grammar categories. For example, noun phrases might be
split into subcategories for subjects and objects, singular and plural, and
so on. This splitting process admits an efficient incremental inference
scheme which reduces parsing times by orders of magnitude. This approach
produces the best parsing accuracies across an array of languages, in a
fully language-general fashion.

In the domain of syntactic machine translation, complexity arises because
there are too many target language word types. To manage this complexity, we
translate into target language clusterings of increasing vocabulary size.
This approach gives dramatic speed-ups while actually increasing final
translation quality.

Contact:          Karen Livescu, TTI-C, klivescu at tti-c.org, 834-2549



