[Colloquium] [Talks at TTIC] 1/24 Research at TTIC: Kevin Gimpel, TTIC

Jerome Allen jallen at ttic.edu
Fri Jan 17 12:00:00 CST 2020


*When:*     Friday, January 24th.  *Refreshments at 12:00pm. Talk at
12:20pm.*

*Where:*    TTIC, 6045 S Kenwood Avenue, 5th Floor, Room 526

*Who:*      Kevin Gimpel, TTIC

*Title:*    Learning to do Structured Inference in Natural Language
Processing

*Abstract:*
Many tasks in natural language processing, computer vision, and
computational biology involve predicting structured outputs. Researchers
are increasingly applying deep representation learning to these problems,
but the *structured* component of these approaches is usually quite
simplistic. For example, neural machine translation systems use
unstructured training of local factors followed by beam search for
test-time inference. There have been several proposals for deep
energy-based structured modeling, but they pose difficulties for learning
and inference, preventing their widespread adoption. We focus in this talk
on structured prediction energy networks (SPENs; Belanger & McCallum 2016),
which use neural network architectures to define energy functions that can
capture arbitrary dependencies among parts of structured outputs.

Prior work with SPENs used gradient descent for inference, relaxing the
structured output to a set of continuous variables and then optimizing the
energy with respect to them. We replace this use of gradient descent with a
neural network trained to approximate structured argmax inference. This
"inference network" outputs continuous values that we treat as the output
structure. We develop large-margin training objectives to jointly train
deep energy functions and inference networks. The objectives resemble the
alternating optimization framework of generative adversarial networks
(GANs; Goodfellow et al. 2014): the inference network is analogous to the
generator and the energy function is analogous to the discriminator. We
present experimental results on several NLP tasks, including multi-label
classification, part-of-speech tagging, named entity recognition, and
machine translation. Inference networks achieve a better
speed/accuracy/search error trade-off than gradient descent, while also
being faster than exact inference at similar accuracy levels. This
increased efficiency allows us to experiment with deep, global energy
terms, which further improve results.
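
For readers new to the setup, the gradient descent inference used in prior
SPEN work can be sketched roughly as follows. This is only an illustration
under stated assumptions, not the models from the talk: the quadratic energy,
the parameters W and C, and all function names are hypothetical, and the
relaxation simply lets each label take a value in [0, 1]. It shows the
baseline procedure only; the inference networks described above replace this
loop with a single forward pass of a trained network.

```python
import numpy as np

# Hypothetical toy energy for multi-label prediction (an assumption for
# illustration): local label scores plus a pairwise label-interaction term.
def energy(x, y, W, C):
    return -y @ (W @ x) + y @ C @ y

def energy_grad_y(x, y, W, C):
    """Gradient of the toy energy with respect to the relaxed output y."""
    return -(W @ x) + (C + C.T) @ y

def gradient_descent_inference(x, W, C, steps=200, lr=0.1):
    """Minimize the energy over y in [0, 1]^L by projected gradient descent."""
    y = np.full(W.shape[0], 0.5)  # start from the uninformative midpoint
    for _ in range(steps):
        y = np.clip(y - lr * energy_grad_y(x, y, W, C), 0.0, 1.0)
    return y

rng = np.random.default_rng(0)
num_labels, num_features = 4, 3
W = rng.normal(size=(num_labels, num_features))
C = np.zeros((num_labels, num_labels))
C[0, 1] = 2.0  # assumed penalty when labels 0 and 1 are both switched on
x = rng.normal(size=num_features)

y_init = np.full(num_labels, 0.5)
y_relaxed = gradient_descent_inference(x, W, C)
y_discrete = (y_relaxed > 0.5).astype(int)  # round the relaxed output

print("relaxed output:", y_relaxed)
print("energy before:", energy(x, y_init, W, C))
print("energy after: ", energy(x, y_relaxed, W, C))
```

Each test input needs its own optimization loop here, which is the cost that
a trained inference network is meant to avoid.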
********************************************************************************************************

*Research at TTIC Seminar Series*

TTIC is hosting a weekly seminar series presenting the research currently
underway at the Institute. Every week a different TTIC faculty member will
present their research.  The lectures are intended both for students
seeking research topics and advisers, and for the general TTIC and
University of Chicago communities interested in hearing what their
colleagues are up to.

To receive announcements about the seminar series, please subscribe to the
mailing list: https://groups.google.com/a/ttic.edu/group/talks/subscribe

Speaker details can be found at: http://www.ttic.edu/tticseminar.php.

For additional questions, please contact Nathan Srebro at nati at ttic.edu.


*Jerome Allen*
Executive Assistant
*Toyota Technological Institute*
6045 S. Kenwood Avenue
Room 518
Chicago, IL  60637
p:(773) 702-2311
*jallen at ttic.edu*