[Theory] TODAY: 9/20 Talks at TTIC: Idan Attias, IDEAL Institute (10:30am)

Mary Marre via Theory theory at mailman.cs.uchicago.edu
Fri Sep 20 09:30:00 CDT 2024


*Please note! Time changed from 11:00 am to 10:30 am CT*

*When*:    Friday, September 20, 2024 at *10:30 am CT*

*Where*:   Talk will be given *live, in person* at
               TTIC, 6045 S. Kenwood Avenue
               5th Floor, *Room 530*

*Virtually*:  *view talk here*
<https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=3623eef8-0273-43dd-8383-b1ef01038b31>

*Who*:      Idan Attias, IDEAL Institute


*Title:* Information Complexity of Stochastic Convex Optimization:
Applications to Generalization, Memorization and Privacy
*Abstract:* Despite intense study, the relationship between generalization
and memorization in machine learning has yet to be fully characterized.
Classically, an ideal learning algorithm would primarily extract relevant
information from its training data while avoiding memorization of
irrelevant information. This intuition is supported by theoretical work
demonstrating the benefits of limited memorization for strong
generalization. It is challenged, however, by the success of modern
overparameterized deep neural networks, which often achieve high test
accuracy despite memorizing a significant fraction of their training data.
Recent studies suggest that memorization plays a more complex role in
generalization than previously thought: memorization might even be
necessary for good generalization.

In this work, we investigate the interplay between memorization and
learning in the context of stochastic convex optimization (SCO). We define
memorization via the information a learning algorithm reveals about its
training data points, and we quantify this information using the framework
of conditional mutual information (CMI) proposed by Steinke and Zakynthinou
[SZ20]. Our main result is a precise characterization of the tradeoff
between the accuracy of a learning algorithm and its CMI, answering an open
question posed by Livni [Liv23]. We show that in the Lipschitz-bounded
setting and under strong convexity, every learner with excess error ε has
CMI bounded below by Ω(1/ε^2) and Ω(1/ε), respectively. We further
demonstrate the essential role of memorization in SCO learning problems by
designing an adversary capable of accurately identifying a significant
fraction of the training samples in specific SCO problems. Finally, we
enumerate several implications of our results, such as a limitation of
generalization bounds based on CMI and the incompressibility of samples in
SCO problems.
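
For reference, the CMI quantity at the center of the talk can be sketched
as follows; this is a standard paraphrase of the [SZ20] supersample
construction, with notation chosen here rather than taken from the talk:

    \tilde{Z} = (\tilde{Z}_{i,0}, \tilde{Z}_{i,1})_{i=1}^{n} \sim D^{n \times 2}   % supersample: n i.i.d. pairs of points from D
    U \sim \mathrm{Unif}(\{0,1\}^n), \quad S_U = (\tilde{Z}_{i,U_i})_{i=1}^{n}     % U selects one point per pair as the training set
    \mathrm{CMI}_D(A) = I\big(A(S_U);\, U \mid \tilde{Z}\big)                      % information A's output leaks about the selection

In this notation, the lower bounds above read CMI_D(A) = Ω(1/ε^2) in the
Lipschitz-bounded setting and CMI_D(A) = Ω(1/ε) under strong convexity.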

*Bio:* Idan Attias is a postdoctoral researcher at the IDEAL Institute,
hosted by Lev Reyzin (UIC) and Avrim Blum (TTIC). He obtained his Ph.D. in
Computer Science under the supervision of Aryeh Kontorovich (BGU) and
Yishay Mansour (TAU and Google Research). He also holds a B.Sc. in
Mathematics and Computer Science from TAU.

Idan's primary research interests lie in the foundations of machine
learning theory and data-driven sequential decision-making, with
intersections in game theory, optimization, statistics, private data
analysis, causal inference, and information theory. He has published
several papers in top machine learning and theoretical computer science
venues, including NeurIPS, ICML, COLT, AAAI, ALT, JMLR, TMLR, ITCS, and
Algorithmica. Idan's work has been recognized with multiple Oral and
Spotlight presentations, and he recently received the ICML 2024 Best Paper
Award.

*Host:* Avrim Blum



Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue, Rm 517*
*Chicago, IL  60637*
*773-834-1757*
*mmarre at ttic.edu*

