[Colloquium] 1/24 Research at TTIC Talks: Kevin Gimpel, TTIC

Jerome Allen jallen at ttic.edu
Thu Jan 23 15:28:43 CST 2020


> *When:*    Friday, January 24th. Refreshments at 12:00pm. Talk at 12:20pm.
>
> *Where:*  TTIC, 6045 S Kenwood Avenue, 5th Floor, Room 526
>
> *Who:*     Kevin Gimpel, TTIC
>
> *Title:*   Learning to do Structured Inference in Natural Language
> Processing
>
> *Abstract:*
> Many tasks in natural language processing, computer vision, and
> computational biology involve predicting structured outputs. Researchers
> are increasingly applying deep representation learning to these problems,
> but the *structured* component of these approaches is usually quite
> simplistic. For example, neural machine translation systems use
> unstructured training of local factors followed by beam search for
> test-time inference. There have been several proposals for deep
> energy-based structured modeling, but they pose difficulties for learning
> and inference, preventing their widespread adoption. We focus in this talk
> on structured prediction energy networks (SPENs; Belanger & McCallum 2016),
> which use neural network architectures to define energy functions that can
> capture arbitrary dependencies among parts of structured outputs.
>
> Prior work with SPENs used gradient descent for inference, relaxing the
> structured output to a set of continuous variables and then optimizing the
> energy with respect to them. We replace this use of gradient descent with a
> neural network trained to approximate structured argmax inference. This
> "inference network" outputs continuous values that we treat as the output
> structure. We develop large-margin training objectives to jointly train
> deep energy functions and inference networks. The objectives resemble the
> alternating optimization framework of generative adversarial networks
> (GANs; Goodfellow et al. 2014): the inference network is analogous to the
> generator and the energy function is analogous to the discriminator. We
> present experimental results on several NLP tasks, including multi-label
> classification, part-of-speech tagging, named entity recognition, and
> machine translation. Inference networks achieve a better
> speed/accuracy/search error trade-off than gradient descent, while also
> being faster than exact inference at similar accuracy levels. This
> increased efficiency allows us to experiment with deep, global energy
> terms, which further improve results.
>
> ********************************************************************************************************
>
> *Research at TTIC Seminar Series*
>
> TTIC is hosting a weekly seminar series presenting the research currently
> underway at the Institute. Every week a different TTIC faculty member will
> present their research.  The lectures are intended both for students
> seeking research topics and advisers, and for the general TTIC and
> University of Chicago communities interested in hearing what their
> colleagues are up to.
>
> To receive announcements about the seminar series, please subscribe to the
> mailing list: https://groups.google.com/a/ttic.edu/group/talks/subscribe
>
> Speaker details can be found at: http://www.ttic.edu/tticseminar.php.
>
> For additional questions, please contact Nathan Srebro at nati at ttic.edu.
>
>
>
> *Jerome Allen*
> Executive Assistant
> *Toyota Technological Institute*
> 6045 S. Kenwood Avenue
> Room 518
> Chicago, IL  60637
> p:(773) 702-2311
> jallen at ttic.edu
>