[Theory] TODAY: [TTIC Talks] Talks at TTIC: 2/13 Jeremy Cohen, Carnegie Mellon
Brandie Jones
bjones at ttic.edu
Tue Feb 13 08:00:00 CST 2024
*When:* Tuesday, February 13th at *11AM CT*
*Where:* The talk will be given *live, in-person* at
TTIC, 6045 S. Kenwood Avenue
5th Floor, Room 530
*Virtually:* via Panopto Livestream:
<https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=edb9fb11-e93f-4e11-8e82-b109010ac95e>
*Who:* Jeremy Cohen, Carnegie Mellon
*Title:* The Dynamics of Gradient Descent in Deep Learning
*Abstract:* Since 1986, neural networks have been trained using variants
of gradient descent. Why does gradient descent work? The conventional
wisdom holds that gradient descent works because its learning rate is set
sufficiently small relative to the curvature of the objective function,
which is treated as fixed a priori. In this talk, we will see that this is
not the real reason gradient descent works. Instead, gradient descent
possesses an inbuilt mechanism for automatically regulating the curvature
along its own optimization trajectory. This mechanism, not previously known
to optimization theory, is responsible for the successful convergence of
gradient descent in neural network training.
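As background for the abstract: on a fixed quadratic with curvature (top
Hessian eigenvalue) lambda, gradient descent with learning rate eta is stable
only when lambda < 2/eta, which is where the conventional wisdom about small
learning rates comes from. The sketch below is a minimal, hypothetical JAX
example (not the speaker's code) that tracks this "sharpness" along a gradient
descent trajectory and compares it to the 2/eta threshold; the toy data,
architecture, and hyperparameters are illustrative assumptions.

import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (64, 4))      # toy inputs (illustrative assumption)
y = jnp.sin(X @ jnp.ones(4))             # toy targets (illustrative assumption)

def init_params(key):
    k1, k2 = jax.random.split(key)
    return {"W1": 0.5 * jax.random.normal(k1, (4, 16)),
            "W2": 0.5 * jax.random.normal(k2, (16,))}

def loss(params):
    h = jnp.tanh(X @ params["W1"])       # one hidden layer, tanh activation
    return jnp.mean((h @ params["W2"] - y) ** 2)

def sharpness(params, iters=50):
    # Estimate the leading (largest-magnitude) Hessian eigenvalue of the loss
    # by power iteration on Hessian-vector products; no explicit Hessian is formed.
    flat, unravel = ravel_pytree(params)
    flat_loss = lambda w: loss(unravel(w))
    hvp = lambda v: jax.grad(lambda w: jnp.vdot(jax.grad(flat_loss)(w), v))(flat)
    v = jnp.ones_like(flat) / jnp.sqrt(flat.size)
    for _ in range(iters):
        hv = hvp(v)
        v = hv / (jnp.linalg.norm(hv) + 1e-12)
    return float(jnp.vdot(v, hvp(v)))

lr = 0.05                                # illustrative learning rate
params = init_params(key)
grad_fn = jax.jit(jax.grad(loss))
for step in range(2001):
    if step % 400 == 0:
        # Conventional wisdom: stability requires sharpness < 2 / lr.
        print(f"step {step:4d}  loss {float(loss(params)):.4f}  "
              f"sharpness {sharpness(params):.2f}  threshold 2/lr {2 / lr:.1f}")
    g = grad_fn(params)
    params = jax.tree_util.tree_map(lambda p, gi: p - lr * gi, params, g)

Whether and how the measured sharpness settles relative to the 2/eta
threshold along the trajectory is exactly the kind of behavior the talk
addresses.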
*Bio:* Jeremy Cohen is a PhD student at Carnegie Mellon, advised by Zico
Kolter and Ameet Talwalkar. He is interested in making deep learning less
like alchemy and more like engineering.
*Hosts:* Zhiyuan Li <zhiyuanli at ttic.edu> & Nati Srebro <nati at ttic.edu>
--
*Brandie Jones*
*Executive Administrative Assistant*
Toyota Technological Institute
6045 S. Kenwood Avenue
Chicago, IL 60637
www.ttic.edu