[Colloquium] Talk Reminder - TTIC Colloquium: Dr. Brian Kingsbury, IBM

Fri Mar 8 10:26:47 CST 2013

When:     Monday, March 11th @ 11 a.m.

Where:    TTIC, 6045 S. Kenwood Avenue, Room #526

Who:       Dr. Brian Kingsbury, IBM

Title:        Distributed Hessian-free Optimization for
               Deep Neural Network Acoustic Models

Abstract:

Neural network acoustic models have recently enjoyed a renaissance, with
multiple research groups finding that they outperform state-of-the-art
Gaussian mixture model (GMM) acoustic models on a wide variety of tasks.

Three architectural factors distinguish modern
neural network acoustic models from previous such models: (1) they are deep,
typically using five or more hidden layers; (2) they are wide, using
thousands of units per hidden layer; and (3) they classify audio features
into thousands of context-dependent HMM state targets.
Together, these factors mean that neural network acoustic models have a
large number of parameters, typically on the order of tens of
millions.  Additionally, these models may be trained using sequence-discriminative
criteria such as maximum mutual information or minimum Bayes risk.  The
result is that training such models using
standard stochastic gradient descent is a slow process, requiring weeks of
compute time.  In contrast, standard GMM acoustic models can
be trained in a few days, thanks to parallelization on large compute clusters.
In this talk, I will describe a distributed neural network
training algorithm, based on Hessian-free optimization (Martens, ICML 2010),
that can take advantage of large compute clusters to scale to
deep networks and large data sets.  Using examples from broadcast news
transcription, Switchboard transcription, and Babel transcription and keyword
search, I will show how Hessian-free optimization can reduce training times
and improve system performance.
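
As a rough check on the "tens of millions" of parameters mentioned above,
the count for a fully connected network can be worked out directly.  The
sketch below is illustrative only: the input dimension, hidden layer width,
and number of output states are assumed values chosen to match the
abstract's description, not figures from the talk.

```python
# Illustrative parameter count for a fully connected acoustic model.
# All sizes below are assumptions chosen to fit the abstract's description
# ("five or more hidden layers", "thousands of units per hidden layer",
# "thousands of context-dependent HMM state targets"); none come from the talk.

def count_parameters(layer_sizes):
    """Return weights plus biases for a fully connected feed-forward network."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total

input_dim = 360             # hypothetical: 9 stacked frames of 40 features
hidden_layers = [2048] * 5  # hypothetical: five hidden layers, 2048 units each
output_states = 9000        # hypothetical number of HMM state targets

layer_sizes = [input_dim] + hidden_layers + [output_states]
n_params = count_parameters(layer_sizes)
print(f"{n_params:,} parameters")  # prints "35,965,736 parameters"
```

With these assumed sizes, roughly half of the parameters sit in the final
layer that maps to the HMM state targets.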

Brian Kingsbury is a research staff member at the IBM T. J. Watson Research
Center.  He joined IBM Research in 1999 after completing his
PhD at the University of California, Berkeley.  He is co-PI and technical
lead for the LORELEI consortium, an IBM-led group working on
the IARPA Babel program.  Brian is currently an associate editor for IEEE
Transactions on Audio, Speech, and Language Processing.  He was a
member of the Speech and Language Technical Committee of the IEEE Signal
Processing Society from 2009 to 2011, and he served as a speech
area chair for the 2010, 2011, and 2012 ICASSP conferences.  He is co-organizing
an ICASSP 2013 special session on new types of deep
neural network learning for speech recognition and related applications and
an ICML 2013 workshop on deep learning for audio, speech, and language
processing.  His research interests include deep
neural network acoustic modeling, large-vocabulary speech transcription,
and keyword search in audio.

Host: Karen Livescu, klivescu at ttic.edu

-- 
Dawn Ellis
Administrative Assistant
773-834-1757
dellis at ttic.edu

TTIC
6045 S. Kenwood Ave.
Chicago, IL 60637