<div dir="ltr"><div class="gmail_default" style="font-size:small"><div dir="ltr" style=""><div class="gmail_default" style=""><font face="arial, sans-serif" style=""><font style="color:rgb(80,0,80);vertical-align:inherit"><font style="vertical-align:inherit"><b>When:</b>    </font></font><font style="vertical-align:inherit"><font style="vertical-align:inherit"><font color="#500050">  </font><font color="#000000">Wednesday, March 3rd</font><font color="#500050"> at</font><b style="color:rgb(80,0,80)"> 11:10 am CT</b></font></font><br></font></div></div><div dir="ltr" style="color:rgb(80,0,80)"><p class="MsoNormal" style="margin:0in 0in 0.0001pt;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"> </font></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><font style="vertical-align:inherit"><font style="vertical-align:inherit"><b>Where:</b>     </font></font><font color="#000000">Zoom Virtual Talk (</font><b><font color="#0000ff"><a href="https://uchicagogroup.zoom.us/webinar/register/WN_Sbr9rWroR2i3L_0U8Mt6-w">register in advance here</a></font></b><font color="#000000">)</font></font></p><p class="MsoNormal" style="margin:0in 0in 0.0001pt;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"> </font></p></div><div class="gmail_default" style="color:rgb(80,0,80)"><font face="arial, sans-serif"><font style="vertical-align:inherit"><font style="vertical-align:inherit"><b>Who: </b>       </font></font></font><span style="color:rgb(34,34,34)">Kartik Goyal, Carnegie 
Mellon</span></div></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style=""><div style=""><span id="gmail-m_-3716854320400901739gmail-docs-internal-guid-95fbe0e5-7fff-e4ff-f92a-77d514380bc7" style=""><font face="arial, sans-serif" style=""><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>Title</b></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>:</b>        </span><span style="background-color:transparent;color:rgb(0,0,0);white-space:pre-wrap">Revisiting Training and Decoding in Neural Sequence Models</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="background-color:transparent;white-space:pre-wrap;color:rgb(0,0,0)"><br></span></p></font></span></div><div style=""><span id="gmail-m_-3716854320400901739gmail-docs-internal-guid-5c66b4a7-7fff-80f2-1bb6-247ca56f2a05"><font face="arial, sans-serif"><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>Abstract</b></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>:</b> </span><span style="background-color:transparent;color:rgb(0,0,0);white-space:pre-wrap">Widely used locally-normalized autoregressive neural sequence models, while highly effective for various NLP tasks, suffer from optimization issues including, but not limited to, exposure bias, 
label bias, and inadequate representation of context. Furthermore, during decoding, they yield degenerate sequences and exhibit poor calibration. To address these issues, I first describe our work on designing differentiable training procedures for neural sequence models that take into account the decoding methods used with these models, such as local argmax, sampling, and beam search. Then, I discuss the potential of globally-normalized models to ameliorate these issues and describe their relationship to the highly effective masked token reconstruction objective for training neural sequence models. Specifically, I describe our scheme, inspired by Metropolis-Hastings Monte Carlo, that enables drawing representative samples from these masked language models.</span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><br></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>Bio</b></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>:</b> </span><span class="gmail-il" style="background-color:transparent;color:rgb(0,0,0);white-space:pre-wrap">Kartik</span><span style="background-color:transparent;color:rgb(0,0,0);white-space:pre-wrap"> Goyal is a PhD candidate at the Language Technologies Institute, Carnegie Mellon University, where he is </span>co-advised<span style="background-color:transparent;color:rgb(0,0,0);white-space:pre-wrap"> by Chris Dyer and Taylor Berg-Kirkpatrick. 
He is interested in designing statistical training and inference procedures for modelling artifacts with rich latent structure to address research questions in Natural Language Processing and Digital Humanities.</span></p></font></span><div class="gmail-yj6qo gmail-ajU" style="outline:none;padding:10px 0px;width:22px;margin:2px 0px 0px"><font face="arial, sans-serif"><br class="gmail-Apple-interchange-newline"></font></div></div><font face="arial, sans-serif"><b>Host: </b> <a href="mailto:kgimpel@ttic.edu">Kevin Gimpel</a></font></div><div class="gmail_default" style=""><br></div><div class="gmail_default" style=""><br></div><div class="gmail_default" style=""><br></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><font face="arial, helvetica, sans-serif">Mary C. Marre</font><div><font face="arial, helvetica, sans-serif">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6">6045 S. Kenwood Avenue</font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Room 517</font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL  60637</font></i></div><div><i><font face="arial, helvetica, sans-serif">p:(773) 834-1757</font></i></div><div><i><font face="arial, helvetica, sans-serif">f: (773) 357-6970</font></i></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div></div>