[Colloquium] TODAY! Seminar Announcement: Global and Local Approach of Part-of-Speech Tagging for Large Corpora

Ninfa Mayorga ninfa at ci.uchicago.edu
Fri Oct 12 08:45:21 CDT 2012


Computation Institute Presentation - Data Lunch Seminar (DLS)

Speaker: Shi Yu, Institute for Genomics and Systems Biology
Hosts: Tanu Malik, Kyle Chard
Date: October 12, 2012
Time: 12:00 PM - 1:00 PM
Location: University of Chicago, Searle 240A, 5735 S. Ellis Avenue

Global and Local Approach of Part-of-Speech Tagging for Large Corpora

Abstract:
We present Global-Local POS tagging, a framework for training generative stochastic part-of-speech (POS) models on large corpora. Global Taggers offer several advantages over their counterparts trained on small, curated corpora, including the ability to automatically extend and update their models to new text. Global Taggers also avoid a fundamental limitation of current models, whose performance relies heavily on curated text with manually assigned labels. We illustrate our approach by training several Global Taggers, implemented with generative stochastic models, on two large corpora using a high-performance computing architecture. We further demonstrate that Global Taggers can be improved by incorporating models trained on curated text, called Local Taggers, to achieve better tagging performance on specific topics.

Bio:
Shi Yu's main research interests are cloud computing, biomedical text mining, computational linguistics, parametric statistical learning methods, non-parametric machine learning methods, consensus learning and data integration, crowdsourcing, and game-based machine learning theory. He has been working in the cross-disciplinary areas of these topics. He obtained a bachelor's degree in Mechanical and Electrical Engineering at China Textile University, and a master's degree in Artificial Intelligence and a Ph.D. in Electrical Engineering (main area: Bioinformatics) at the University of Leuven. He is now a postdoctoral scholar at the Institute for Genomics and Systems Biology, University of Chicago. He has published one book and more than 10 papers in the areas of machine learning, text mining, computational linguistics, and bioinformatics.

Information: 

DLS website: https://sites.google.com/site/cidatalunchseminar/ 
DLS email:    dlseminar at ci.uchicago.edu 


Lunch will be provided