[Theory] [TTIC Talks] 4/2 Talks at TTIC: Sadhika Malladi, Princeton

Brandie Jones via Theory theory at mailman.cs.uchicago.edu
Wed Mar 26 12:30:00 CDT 2025


*When:*        Wednesday, April 2nd at *11:30AM CT*


*Where:       *Talk will be given *live, in-person* at

                       TTIC, 6045 S. Kenwood Avenue

                       5th Floor, Room 530


*Virtually:*  via Panopto (livestream:
<https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=7191df18-f3a7-47d0-b403-b29e0106fa67>)


*Who: *         Sadhika Malladi, Princeton

*Title:*           Deep Learning Theory in the Age of Generative AI
*Abstract:  *Modern deep learning has achieved remarkable results, but the
design of training methodologies largely relies on guess-and-check
approaches. Thorough empirical studies of recent massive language models
(LMs) are prohibitively expensive, underscoring the need for theoretical
insights, but classical ML theory struggles to describe modern training
paradigms. I present a novel approach to developing prescriptive
theoretical results that can directly translate to improved training
methodologies for LMs. My research has yielded actionable improvements in
model training across the LM development pipeline. For example, my theory
motivates the design of MeZO, a fine-tuning algorithm that reduces memory
usage by up to 12x and halves the number of GPU-hours required. Throughout
the talk, to underscore the prescriptiveness of my theoretical insights, I
will demonstrate the success of these theory-motivated algorithms on novel
empirical settings published after the theory.

*Short Bio*:  Sadhika Malladi is a final-year PhD student in Computer
Science at Princeton University advised by Sanjeev Arora. Her research
advances deep learning theory to capture modern-day training settings,
yielding practical training improvements and meaningful insights into model
behavior. She has co-organized multiple workshops, including Mathematical
and Empirical Understanding of Foundation Models at ICLR 2024 and
Mathematics for Modern Machine Learning (M3L) at NeurIPS 2024. She was
named a 2025 Siebel Scholar.


*Host: Zhiyuan Li <zhiyuanli at ttic.edu>*

-- 
*Brandie Jones*
*Executive Administrative Assistant*
Toyota Technological Institute
6045 S. Kenwood Avenue
Chicago, IL  60637
www.ttic.edu