<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-size:small"><div class="gmail_default"><div class="gmail_default"><b><font color="#cc0000">Please Note!</font><font color="#ff0000"> </font>Time Changed from 11:00 am to <font face="verdana, sans-serif" style="background-color:rgb(255,255,0)">10:30 am CT </font></b></div><div class="gmail_default"><b><br></b></div><div class="gmail_default"><b>When</b>: <font style="font-family:arial,sans-serif;color:rgb(0,0,0)">Fri</font><span class="gmail_default" style="font-family:arial,sans-serif;color:rgb(0,0,0)">day, September <span class="gmail-il">20</span><span class="gmail_default">, </span>2024</span><font style="font-family:arial,sans-serif;color:rgb(0,0,0)"> at</font><b style="font-family:arial,sans-serif;color:rgb(0,0,0)"> <u>10:30</u></b><span style="color:rgb(80,0,80);font-family:arial,sans-serif"><b><u><font color="#000000"> am</font></u></b><b><u><font color="#000000"> CT</font></u><font color="#000000"> </font></b></span></div><div class="gmail_default"><div><b><br></b></div><div><b>Where</b>: <span class="gmail-il">Talk</span> will be given <b><font color="#0000ff">live, in-person</font></b> at<br> <span class="gmail-il">TTIC</span>, 6045 S. 
Kenwood Avenue<br> 5th Floor, <b><u><font color="#000000">Room 530</font></u></b></div><div><br><b>Virtually</b>: <a href="https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=3623eef8-0273-43dd-8383-b1ef01038b31" target="_blank"><b><font size="1"> view <span class="gmail-il">talk</span> here</font></b></a><br></div><div> </div><div><b>Who: </b> Idan Attias, IDEAL Institute</div><div><br></div></div><div class="gmail_default"><div style="border-top:none;border-right:none;border-left:none;border-bottom:2.25pt solid rgb(11,118,159);padding:0in 0in 1pt"></div><p class="MsoNormal" style="margin:0in 0in 8pt;font-size:11pt;text-align:justify;line-height:15.6933px;font-family:Aptos,sans-serif"><b style="font-family:arial,sans-serif;font-size:small"><br></b></p><p class="MsoNormal" style="margin:0in 0in 8pt;font-size:11pt;text-align:justify;line-height:15.6933px;font-family:Aptos,sans-serif"><b style="font-family:arial,sans-serif;font-size:small">Title:</b><span style="font-family:arial,sans-serif;font-size:small"> Information Complexity of Stochastic Convex Optimization: Applications to Generalization, Memorization and Privacy</span><br></p></div></div><div class="gmail_default"><font face="arial, sans-serif"><b>Abstract:</b> Despite intense study, the relationship between generalization and memorization in machine learning has yet to be fully characterized. Classically, ideal learning algorithms would primarily extract relevant information from their training data, avoiding memorization of irrelevant information. This intuition is supported by theoretical work demonstrating the benefits of limited memorization for strong generalization. This intuition, however, is challenged by the success of modern overparameterized deep neural networks. These models often achieve high test accuracy despite memorizing a significant fraction of their training data. 
Recent studies suggest that memorization plays a more complex role in generalization than previously thought: memorization might even be necessary for good generalization. </font><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif">In this work, we investigate the interplay between memorization and learning in the context of stochastic convex optimization (SCO). We define memorization via the information a learning algorithm reveals about its training data points. We then quantify this information using the framework of conditional mutual information (CMI) proposed by Steinke and Zakynthinou [SZ20]. Our main result is a precise characterization of the tradeoff between the accuracy of a learning algorithm and its CMI, answering an open question posed by Livni [Liv23]. We show that, in the Lipschitz-bounded setting and under strong convexity, every learner with excess error ε has CMI bounded below by Ω(1/ε<sup>2</sup>) and Ω(1/ε), respectively. We further demonstrate the essential role of memorization in learning problems in SCO by designing an adversary capable of accurately identifying a significant fraction of the training samples in specific SCO problems. Finally, we enumerate several implications of our results, such as a limitation of generalization bounds based on CMI and the incompressibility of samples in SCO problems.<br><br><b>Bio:</b> Idan Attias is a postdoctoral researcher at the IDEAL Institute, hosted by Lev Reyzin (UIC) and Avrim Blum (<span class="gmail-il">TTIC</span>). He obtained his Ph.D. in Computer Science under the supervision of Aryeh Kontorovich (BGU) and Yishay Mansour (TAU and Google Research). He also holds a B.Sc. in Mathematics and Computer Science from TAU. 
<br><br>Idan's primary research interests lie in the foundations of machine learning theory and data-driven sequential decision-making, with intersections in game theory, optimization, statistics, private data analysis, causal inference, and information theory. He has published several papers in top machine learning and theoretical computer science venues, including NeurIPS, ICML, COLT, AAAI, ALT, JMLR, TMLR, ITCS, and Algorithmica. Idan's work has been recognized with multiple Oral and Spotlight presentations, and he recently received the ICML 2024 Best Paper Award.</font></div><div><font face="arial, sans-serif"><br></font></div><div><font face="arial, sans-serif"><b>Host:</b> Avrim Blum</font></div><div><br></div><div><br></div></div></div><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. 
Kenwood Avenue, Rm 517</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">773-834-1757</font></i></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div><br></div></div>