[Colloquium] [TTIC Talks] 5/1 Talks at TTIC: Shirley Wu, University of Texas at Austin

Alicia McClarin amcclarin at ttic.edu
Fri Apr 24 18:00:53 CDT 2020


*When:*      Friday, May 1st at 10:30am

*Where:*     Zoom Virtual Talk (see details below)

*Who:*        Shirley Wu, University of Texas at Austin

*Title:*      Normalization Methods: Auto-tuning Stepsize and Implicit
Regularization

*Abstract:* Neural network optimization with stochastic gradient descent
(SGD) requires many interesting techniques, including normalization methods
such as batch normalization (Ioffe and Szegedy, 2015), and adaptive
gradient methods such as ADAM (Kingma and Ba, 2014), to attain optimal
performance. While these methods are successful, their theoretical
understanding has only recently started to emerge. A significant challenge
in understanding these methods is the highly non-convex and non-linear
nature of neural networks.

In this talk, I will present an interesting connection between
normalization methods and adaptive gradient methods, and provide rigorous
justification for why these methods require less hyperparameter tuning.
Meanwhile, I will talk about convergence results for adaptive gradient
methods in general non-convex landscapes and two-layer over-parameterized
neural networks.  Beyond convergence, I will also show a new perspective on
the implicit regularization in these normalization algorithms.


*Bio:*         Xiaoxia (Shirley) Wu is a Ph.D. student at The University of
Texas at Austin, advised by Rachel Ward. Previously, she was a research
intern, mentored by Léon Bottou, at Facebook AI Research (FAIR) where she
worked on batch/weight normalization. She was also a visiting student at
the Simons Institute for the Theory of Computing (UC Berkeley) in Fall 2018
and Summer 2019, and at the Institute for Advanced Study (Princeton) in Fall
2019. Her
primary research interests lie in the area of optimization, including
stochastic and robust optimization. Her current research is on
understanding and improving the optimization methods for non-convex
landscapes (neural networks), such as adaptive gradient methods and
normalization methods. She was a recipient of the UT Austin Graduate School
Fellowship.

----------------------------------------------------------------------------------------------------------

Register in advance for this meeting:
https://uchicago.zoom.us/meeting/register/tJwsf-CqpjMvG9E2RX-3yRQsaF-T-4eCeGQO


After registering, you will receive a confirmation email containing
information about joining the meeting.

-- 
*Alicia McClarin*
*Toyota Technological Institute at Chicago*
*6045 S. Kenwood Ave., Office 504*
*Chicago, IL 60637*
*773-834-3321*
*www.ttic.edu* <http://www.ttic.edu/>

