[Colloquium] 2/28: Data Science/Stats Candidate Talk - Mahdi Soltanolkotabi (USC)

Rob Mitchum rmitchum at uchicago.edu
Fri Feb 25 13:23:29 CST 2022


*Data Science Institute/Statistics Candidate Seminar*


*Mahdi Soltanolkotabi*
*Associate Professor*
*University of Southern California*

*Monday, February 28th*
*4:30 p.m. - 5:30 p.m.*
*In Person: John Crerar Library, Room 390*
*Remote: Live Stream <http://live.cs.uchicago.edu/mahdisoltanolkotabi/>
or Zoom
<https://uchicago.zoom.us/j/91609960635?pwd=L3VSTzBOb1ovOVgxdHAwa3NoTWo0UT09>
(details below)*


*Towards Stronger Foundations for AI and its Applications to the Sciences*

Despite wide empirical success, many of the most commonly used learning
approaches lack a clear mathematical foundation and often rely on poorly
understood heuristics. Even when theoretical guarantees do exist, they are
often too crude and/or pessimistic to explain these methods' success in
practical regimes of operation or to serve as a guiding principle for
practitioners. Furthermore, in many scenarios, such as those arising in
scientific applications, these methods require significant resources
(compute, data, etc.) to work reliably.

The first part of the talk takes a step towards building a stronger
theoretical foundation for such nonconvex learning. In particular, I will
focus on demystifying the generalization and feature learning capability of
modern overparameterized learning, where the parameters of the learning
model (e.g. a neural network) exceed the size of the training data. Our
result is based on an intriguing spectral bias phenomenon for gradient
descent that puts the iterations on a particular trajectory towards
solutions that are not only globally optimal but also generalize well.
Notably, this analysis overcomes a major theoretical bottleneck in the
existing literature and goes beyond the “lazy” training regime, which
requires unrealistic hyperparameter choices (e.g. very small step sizes,
large initialization, or wide models). In the second part of the talk, I
will discuss the challenges and opportunities of using AI for scientific
applications, and medical image reconstruction in particular. I will discuss
our work on designing new architectures that lead to state-of-the-art
performance and report on techniques to significantly reduce the data
required for training.
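As a toy illustration (not drawn from the talk itself, and much simpler
than the neural-network setting it studies), the idea that gradient descent
is implicitly biased towards well-generalizing solutions can be seen in an
overparameterized linear model: with more parameters than data points,
gradient descent started from zero converges to the minimum-norm solution
that interpolates the training data. All names and numbers below are
hypothetical, chosen only for the sketch.

```python
# Implicit bias of gradient descent in an overparameterized linear model:
# 2 training points, 3 parameters. Started from w = 0, gradient descent on
# squared loss converges to the minimum-norm interpolating solution,
# because every gradient lies in the row space of X.

X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]   # 2 samples, 3 features (overparameterized)
y = [1.0, 2.0]
w = [0.0, 0.0, 0.0]     # zero initialization
lr = 0.1                # step size; must be < 2 / (largest eigenvalue of X^T X)

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

for _ in range(5000):
    # gradient of 0.5 * sum_i (predict(w, x_i) - y_i)^2, i.e. X^T (X w - y)
    grad = [0.0, 0.0, 0.0]
    for xi, yi in zip(X, y):
        r = predict(w, xi) - yi
        for j in range(len(w)):
            grad[j] += r * xi[j]
    w = [wj - lr * gj for wj, gj in zip(w, grad)]

# w now interpolates the training data exactly, and among the infinitely
# many interpolating solutions it is the one of minimum Euclidean norm
# (here w = [0, 1, 1], computable in closed form as X^T (X X^T)^{-1} y).
```

Infinitely many parameter vectors fit these two points perfectly; which one
gradient descent selects, and why that choice generalizes, is the kind of
question the "implicit bias" line of work addresses.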

*Bio*: Mahdi Soltanolkotabi <https://viterbi-web.usc.edu/~soltanol/> is an
associate professor in the Ming Hsieh Department of Electrical and Computer
Engineering and Computer Science at the University of Southern California,
where he holds an Andrew and Erna Viterbi Early Career Chair. Prior to
joining USC, he completed his PhD in electrical engineering at Stanford in
2014. He was a postdoctoral researcher in the EECS department at UC
Berkeley during the 2014-2015 academic year. His research focuses on
developing the mathematical foundations of modern data science by
characterizing the behavior and pitfalls of contemporary nonconvex learning
and optimization algorithms, with applications in deep learning, large-scale
distributed training, federated learning, computational imaging, and AI for
scientific applications. Mahdi is the recipient of the Information Theory
Society Best Paper Award, a Packard Fellowship in Science and Engineering, a
Sloan Research Fellowship in mathematics, an NSF CAREER award, an Air Force
Office of Scientific Research Young Investigator award (AFOSR-YIP), the
Viterbi School of Engineering Junior Faculty Research Award, and faculty
research awards from Google and Amazon.

*Host*: Rebecca Willett

*Zoom Info:*
https://uchicago.zoom.us/j/91609960635?pwd=L3VSTzBOb1ovOVgxdHAwa3NoTWo0UT09
ID: 916 0996 0635
Password: ds2022


-- 
*Rob Mitchum*

*Associate Director of Communications for Data Science and Computing*
*University of Chicago*
*rmitchum at uchicago.edu*
*773-484-9890*