<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-size:small"><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:black">When:</span></b><span style="color:black"> Monday, February 27, 2023 at<b> <u>11:30 am CT</u> </b></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"> </font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:rgb(80,0,80)">Where: </span></b><span style="color:black">Talk will be given <b><u>live, in-person</u></b></span><b><span style="color:rgb(80,0,80)"> </span></b><span style="color:rgb(80,0,80)">at</span><span style="color:rgb(80,0,80)"></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><span style="color:rgb(80,0,80)"> </span><span style="color:black"> TTIC, 6045 S. Kenwood Avenue</span><span style="color:rgb(80,0,80)"></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><span style="color:black"> 5th Floor, Room 530<b> </b></span><span style="color:rgb(80,0,80)"></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"> </font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:rgb(60,64,67);letter-spacing:0.15pt">Virtually:</span></b><span style="color:rgb(60,64,67);letter-spacing:0.15pt"> <i>via</i> Panopto </span><span style="color:rgb(80,0,80)">(<b><a href="https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=8759978c-a0e0-4a2e-9f47-afaf0179b889" target="_blank">livestream</a></b></span><span style="color:rgb(60,64,67);letter-spacing:0.15pt">)</span><span style="color:rgb(80,0,80)"></span></font></p><p class="MsoNormal" 
style="margin:0in;line-height:normal"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"> </font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:rgb(80,0,80)">Who: </span></b><span style="color:rgb(80,0,80)">  Naomi Saphra, NYU</span><span style="color:rgb(80,0,80)"></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"> </font></span></p><div class="MsoNormal" align="center" style="margin:0in 0in 8pt;text-align:center;line-height:13.91px"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"><hr size="2" width="100%" align="center"></font></span></div><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:black">Title:</span></b><span style="color:black"> Interpreting Training</span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"> </font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:black">Abstract:</span></b><span style="color:black"> Interpretability research in NLP often follows a predictable pattern—pick an indicator of structure or knowledge such as probe or challenge set accuracy, measure that indicator in a fully trained model, and assert that this structure or information is integral to how the model functions. However, we can achieve a much deeper understanding by considering how these indicators emerge from the training process. First, this talk will discuss research on the relationship between interpretable generalization behavior and the presence of multiple basins on the loss landscapes of fine-tuned text classifiers. 
Then, I will describe how manipulating interpretable behaviors during the training process can shed light on the role of syntactic signals in attention distributions and generally on how an optimizer’s bias towards early-learned and simple strategies both helps and hurts language model performance. These results form the basis of a manifesto for exploring developmental explanations when researching interpretability and generalization behavior.</span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"> </font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:black">Bio: </span></b><span style="color:black">Naomi Saphra is a postdoctoral researcher at NYU with Kyunghyun Cho. She is interested in NLP training dynamics: how models learn to encode linguistic patterns or other structures and how we can encode useful inductive biases into the training process. Previously, she earned a PhD from the University of Edinburgh on Training Dynamics of Neural Language Models, worked at Google and Facebook, and attended Johns Hopkins and Carnegie Mellon University. 
Outside of research, she plays roller derby under the name Gaussian Retribution, does stand-up comedy, and shepherds disabled programmers into the world of code dictation.</span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"> </font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:black">Host:</span></b><span style="color:black"> <a href="mailto:klivescu@ttic.edu" target="_blank">Karen Livescu</a></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><br></font></p><p class="MsoNormal" style="margin:0in;font-size:11pt;line-height:normal;font-family:Calibri,sans-serif"><br></p><p class="MsoNormal" style="margin:0in;font-size:11pt;line-height:normal;font-family:Calibri,sans-serif"><br></p><p class="MsoNormal" style="margin:0in;font-size:11pt;line-height:normal;font-family:Calibri,sans-serif"><br></p></div><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. 
Kenwood Avenue, Rm 517</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">773-834-1757</font></i></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div><br></div></div>