[Colloquium] 4/3 TTIC Colloquium: Alexander Rush, Harvard

Mon Mar 27 17:23:36 CDT 2017

When:     Monday, April 3rd at 11:00 a.m.

Where:    TTIC, 6045 S. Kenwood Avenue, 5th Floor, Room 526

Who:      Alexander Rush, Harvard

Title:       Structured Attention Networks

Abstract:
Recent deep learning systems for NLP and related fields have relied heavily
on the use of neural attention, which allows models to learn to focus on
selected regions of their input or memory. The use of neural attention has
proven to be a crucial component for advances in machine translation, image
captioning, question answering, summarization, end-to-end speech
recognition, and more. In this talk, I will give an overview of the current
uses of neural attention and memory, describe how the selection paradigm
has provided NLP researchers flexibility in designing neural models, and
demonstrate some fun applications of this approach from our group.

I will then argue that selection-based attention may be an unnecessarily
simplistic approach for NLP, and discuss our recent work on Structured
Attention Networks [Kim et al., 2017]. These models integrate structured
prediction as a hidden layer within deep neural networks to form a variant
of attention that enables soft-selection over combinatorial structures,
such as segmentations, labelings, and even parse trees. While this approach
is inspired by structured prediction methods in NLP, building structured
attention layers within a deep network is quite challenging, and I will
describe the interesting dynamic programming approach needed for exact
computation. Experiments test the approach on a range of NLP tasks
including translation, question answering, and natural language inference,
demonstrating improvements upon standard attention in performance and
interpretability. Time permitting, I will conclude by discussing recent
related work exploring other variants of neural memory for algorithmic
learning [Yang+Rush, 2017].

Bio: Alexander "Sasha" Rush is an Assistant Professor at the Harvard School of
Engineering and Applied Sciences, where he runs the HarvardNLP group.
Alexander received his PhD from MIT (2014) under the guidance of Michael
Collins and worked as a postdoc at Facebook Artificial Intelligence
Research (FAIR) under Yann LeCun. He is interested in machine learning and
deep learning methods for large-scale natural language processing and
understanding, including applications in neural machine translation
(http://opennmt.net), abstractive summarization, image-to-text prediction,
and long-form generation. His past work introduced novel combinatorial
methods for structured prediction with applications to syntactic parsing
and machine translation. His work has received three best paper/honorable
mention awards at major NLP conferences. His group web page is
http://nlp.seas.harvard.edu/, and he tweets at
http://twitter.com/harvardnlp.

Host: Kevin Gimpel <kgimpel at ttic.edu>

For more information on the colloquium series or to subscribe to the
mailing list, please see http://www.ttic.edu/colloquium.php

Mary C. Marre
Administrative Assistant
*Toyota Technological Institute*
*6045 S. Kenwood Avenue*
*Room 504*
*Chicago, IL  60637*
*p:(773) 834-1757*
*f: (773) 357-6970*
*mmarre at ttic.edu*