[Colloquium] Talk #1 by Leonid Peshkin (Monday, July 14th) at TTI
Meridel Trimble
mtrimble at tti-c.org
Tue Jul 8 13:35:27 CDT 2003
--------------------------------------------------------------------------------
TOYOTA TECHNOLOGICAL INSTITUTE
--------------------------------------------------------------------------------
Date: Monday, July 14th, 2003
Time: 3:30 p.m.
Place: Toyota Technological Institute conference room (The Press Building -
1427 E. 60th St.)
Speaker: Leonid Peshkin
Harvard University
Title: "Dynamic Bayesian Nets for Language Modeling"
Abstract:
Statistical methods in NLP exclude linguistically plausible models due to the
prohibitive complexity of inference in such models. Dynamic Bayesian networks
(DBNs) offer an elegant way to integrate various aspects of language in one
model. Many existing algorithms developed for learning and inference in DBNs
are applicable to probabilistic language modeling. In particular, a recent leap
in approximate inference algorithms enables inference in rich probabilistic
models. To demonstrate the potential of DBNs for natural language processing,
we employ a DBN for information extraction and part-of-speech tagging tasks.
Our methods outperform previously published results on an established benchmark
domain.
This talk will give an overview of the following papers, available from
http://www.ai.mit.edu/~pesha/Public/papers.html
"Why Build Another Part-Of-Speech Tagger?" (In review)
http://www.ai.mit.edu/~pesha/Public/peshkin.pdf
- How simple a PoS tagger could we make?
- How could it be trained independently, then integrated into a system?
- Is the key to PoS tagging in the features or in the model after all?
- Do linguistic features really help?
"Bayesian Nets in Syntactic Categorization of Novel Words" NAACL-HLT 2003
http://www.ai.mit.edu/~pesha/Public/naacl03.pdf
- Our PoS tagger, trained on WSJ, fares well on novel data: the Brown corpus,
an email corpus, and even "Jabberwocky".
"Bayesian Information Extraction Network" IJCAI 2003
http://www.eecs.harvard.edu/~pesha/Public/BIEN.pdf
- We assemble a wealth of emerging linguistic instruments for shallow parsing,
syntactic and semantic tagging, morphological decomposition, named entity
recognition, etc., in order to incrementally build a robust information
extraction system.
*Refreshments will be served after the talk.
If you wish to meet with the speaker, please send e-mail to Meridel
(mtrimble at tti-c.org)