[Theory] 9/20 Talks at TTIC: Idan Attias, IDEAL Institute

Mary Marre via Theory theory at mailman.cs.uchicago.edu
Sun Sep 15 21:01:11 CDT 2024


*When*:    Friday, September 20, 2024 at *11:00 am CT*

*Where*:   Talk will be given *live, in person* at
               TTIC, 6045 S. Kenwood Avenue
               5th Floor, *Room 530*

*Virtually*: TBA

*Who*:      Idan Attias, IDEAL Institute


*Title:* Information Complexity of Stochastic Convex Optimization:
Applications to Generalization, Memorization and Privacy
*Abstract:* Despite intense study, the relationship between generalization
and memorization in machine learning has yet to be fully characterized.
Classically, ideal learning algorithms would primarily extract relevant
information from their training data, avoiding memorization of irrelevant
information. This intuition is supported by theoretical work demonstrating
the benefits of limited memorization for strong generalization. This
intuition, however, is challenged by the success of modern
overparameterized deep neural networks. These models often achieve high
test accuracy despite memorizing a significant fraction of their training
data.
Recent studies suggest that memorization plays a more complex role in
generalization than previously thought: memorization might even be
necessary for good generalization.

In this work, we investigate the interplay between memorization and
learning in the context of stochastic convex optimization (SCO). We define
memorization via the information a learning algorithm reveals about its
training data points. We then quantify this information using the framework
of conditional mutual information (CMI) proposed by Steinke and Zakynthinou
[SZ20]. Our main result is a precise characterization of the tradeoff
between the accuracy of a learning algorithm and its CMI, answering an open
question posed by Livni [Liv23]. We show that in the Lipschitz-bounded
setting and under strong convexity, every learner with excess error ε
has CMI bounded below by Ω(1/ε^2) and Ω(1/ε), respectively. We further
demonstrate the essential role of memorization in SCO by designing an
adversary capable of accurately identifying a significant fraction of
the training samples in specific SCO problems. Finally, we
enumerate several implications of our results, such as a limitation of
generalization bounds based on CMI and the incompressibility of samples in
SCO problems.
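
For context, here is a minimal sketch of the CMI quantity from [SZ20],
in one common notation (the talk's exact formalization may differ).
Draw a "supersample" Z~ of n i.i.d. pairs (Z~_{i,0}, Z~_{i,1}) from the
data distribution D, together with independent uniform selector bits
U in {0,1}^n that pick one point from each pair to form the training
set Z~_U. The CMI of a learning algorithm A is then

    CMI_D(A) = I( A(Z~_U) ; U | Z~ ),

the conditional mutual information between the algorithm's output and
the selection bits, given the supersample. Low CMI means the output
reveals little about which of the paired samples were actually used for
training; the lower bounds above state that any sufficiently accurate
learner in these SCO settings must leak substantial information.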

*Bio:* Idan Attias is a postdoctoral researcher at the IDEAL Institute,
hosted by Lev Reyzin (UIC) and Avrim Blum (TTIC). He obtained his Ph.D. in
Computer Science under the supervision of Aryeh Kontorovich (BGU) and
Yishay Mansour (TAU and Google Research). He also holds a B.Sc. in
Mathematics and Computer Science from TAU.

Idan's primary research interests lie in the foundations of machine
learning theory and data-driven sequential decision-making, with
intersections in game theory, optimization, statistics, private data
analysis, causal inference, and information theory. He has published
several papers in top machine learning and theoretical computer science
venues, including NeurIPS, ICML, COLT, AAAI, ALT, JMLR, TMLR, ITCS, and
Algorithmica. Idan's work has been recognized with multiple Oral and
Spotlight presentations, and he recently received the ICML 2024 Best Paper
Award.

*Host:* Avrim Blum



Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue, Rm 517*
*Chicago, IL  60637*
*773-834-1757*
*mmarre at ttic.edu <mmarre at ttic.edu>*