[Colloquium] [Reminder]: 1/24 Research at TTIC Talks: Kevin Gimpel, TTIC

Fri Jan 24 11:00:00 CST 2020

> *When:*    Friday, January 24th. Refreshments at 12:00pm. Talk at 12:20pm.
>>
>> *Where:*  TTIC, 6045 S Kenwood Avenue, 5th Floor, Room 526
>>
>> *Who:*      Kevin Gimpel, TTIC
>>
>> *Title:*    Learning to do Structured Inference in Natural Language
>> Processing
>>
>> *Abstract:*
>> Many tasks in natural language processing, computer vision, and
>> computational biology involve predicting structured outputs. Researchers
>> are increasingly applying deep representation learning to these problems,
>> but the *structured* component of these approaches is usually quite
>> simplistic. For example, neural machine translation systems use
>> unstructured training of local factors followed by beam search for
>> test-time inference. There have been several proposals for deep
>> energy-based structured modeling, but they pose difficulties for learning
>> and inference, preventing their widespread adoption. We focus in this talk
>> on structured prediction energy networks (SPENs; Belanger & McCallum 2016),
>> which use neural network architectures to define energy functions that can
>> capture arbitrary dependencies among parts of structured outputs.
>>
>> Prior work with SPENs used gradient descent for inference, relaxing the
>> structured output to a set of continuous variables and then optimizing the
>> energy with respect to them. We replace this use of gradient descent with a
>> neural network trained to approximate structured argmax inference. This
>> "inference network" outputs continuous values that we treat as the output
>> structure. We develop large-margin training objectives to jointly train
>> deep energy functions and inference networks. The objectives resemble the
>> alternating optimization framework of generative adversarial networks
>> (GANs; Goodfellow et al. 2014): the inference network is analogous to the
>> generator and the energy function is analogous to the discriminator. We
>> present experimental results on several NLP tasks, including multi-label
>> classification, part-of-speech tagging, named entity recognition, and
>> machine translation. Inference networks achieve a better
>> speed/accuracy/search error trade-off than gradient descent, while also
>> being faster than exact inference at similar accuracy levels. This
>> increased efficiency allows us to experiment with deep, global energy
>> terms, which further improve results.
>>
>> ********************************************************************************************************
>>
>> *Research at TTIC Seminar Series*
>>
>> TTIC is hosting a weekly seminar series presenting the research currently
>> underway at the Institute. Every week a different TTIC faculty member will
>> present their research.  The lectures are intended both for students
>> seeking research topics and advisers, and for the general TTIC and
>> University of Chicago communities interested in hearing what their
>> colleagues are up to.
>>
>> To receive announcements about the seminar series, please subscribe to
>> the mailing list:
>> https://groups.google.com/a/ttic.edu/group/talks/subscribe
>>
>> Speaker details can be found at: http://www.ttic.edu/tticseminar.php.
>>
>> For additional questions, please contact Nathan Srebro at nati at ttic.edu
>> <mcallester at ttic.edu>.
>>
>>
>>
>> *Jerome Allen*
>> Executive Assistant
>> *Toyota Technological Institute*
>> 6045 S. Kenwood Avenue
>> Room 518
>> Chicago, IL  60637
>> p:(773) 702-2311
>> *jallen at ttic.edu*
>>
>