[Colloquium] TTIC Talk: Markus Dreyer, Johns Hopkins University

Julia MacGlashan macglashan at tti-c.org
Tue Jan 19 09:28:18 CST 2010


When:             *Thursday, Jan 21 @ 11:00am*

Where:           *TTI-C Conference Room #526*, 6045 S Kenwood Ave


Who:               *Markus Dreyer*, Johns Hopkins University


Title:             *Language Processing with Graphical Models over Strings*



Learning natural language is a complicated task. Various levels of
linguistic description -- orthographic, morphological, phonological,
syntactic, and others -- interact with each other. There are irregularities
and ambiguities, and while raw data are typically available in large
amounts, annotations or labels are often incomplete and coarse. To teach
computers language, we need to create computational models that are robust
enough to learn from large amounts of incomplete, sparsely annotated data but
expressive enough to
discover rich linguistic knowledge.

I will present graphical models over multiple strings, which are an example
of such a robust and expressive modeling technique. These are joint models
over multiple related strings that can represent words in various forms.
These models naturally combine machinery from computational linguistics
(finite-state transducers) with techniques from the
machine learning literature (graphical models, belief propagation) and open
up new ways of modeling transliterations, pronunciations, inflections and
other linguistic entities. For inflectional morphology in particular, I will
show how joint inference of multiple word forms can substantially reduce
the prediction error rate. I will also describe the use of latent variables to
infer missing annotation, to refine coarse labels, and to deal with the
irregularities of language.

Bio:

Markus Dreyer is a final-year PhD student in the Computer Science Department
at Johns Hopkins University, working with Jason Eisner. He is a member of
the Center for Language and Speech Processing and the Human Language
Technology Center of Excellence. His research focuses on the intersection of
computational linguistics and machine learning. Other research interests
include finite-state modeling, machine translation and parsing. He has been
the recipient of various fellowships, including the Wolman Fellowship and an
NSF research award. In his spare time, he enjoys long-distance running and
traveling.

Host:              Karen Livescu, klivescu at ttic.edu <gregory at ttic.edu>