<div dir="ltr"><div dir="ltr"><div class="gmail_default" style="font-size:small"><div dir="ltr"><div><b style="font-family:verdana,sans-serif;font-size:large;color:rgb(80,0,80);background-color:rgb(207,226,243)">Thesis Defense: Shubham Toshniwal, TTIC</b><br></div></div><div dir="ltr"><div><div style="color:rgb(80,0,80);font-family:arial,helvetica,sans-serif"><br></div><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">When: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Wednesday, August 24th at </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap;background-color:rgb(255,255,0)"><b>12:00 - 2:00 pm CT</b></span></font></p><p class="MsoNormal" style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><span id="gmail-m_-5414898035933464851gmail-m_8258976848810308636gmail-m_-4777337093436806164gmail-m_8936831616554927923gmail-m_-6065660192932035087gmail-docs-internal-guid-b1ddca98-7fff-5c00-95c7-64a7230be94a"><font face="arial, sans-serif"><br></font></span></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Where</span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">: Talk will be given </span><span 
style="font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><font color="#0000ff"><u>live, in person</u></font></b></span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> at</span></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif"> TTIC, 6045 S. Kenwood Avenue</font></span></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif"> 5th Floor, Room 530 </font></span></p><p class="MsoNormal" style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Virtually: </span><span style="font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font color="#0000ff"><a href="https://uchicagogroup.zoom.us/meeting/register/tJUtcOqtrj0uHdSu3hNQyRXrEkZis5pz3vBa" target="_blank">attend virtually here</a></font></span></font></p><p class="MsoNormal" 
style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Who: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Shubham Toshniwal, TTIC</span></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><br></span></font></p><p class="MsoNormal" style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Thesis Title: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Efficient and Interpretable Neural Models for Entity Tracking</span></font></p><p class="MsoNormal" 
style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Abstract: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">What would it take for a natural language model to understand a novel, such as The Lord of the Rings? Among other things, such a model must be able to: (a) identify and record new characters (entities) and their attributes as they are introduced in the text, and (b) identify subsequent references to previously introduced characters and update their attributes. This problem of entity tracking is essential for language understanding and is thus useful for a wide array of downstream NLP applications such as question answering and summarization. 
</span></font></p><p class="MsoNormal" style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">In this thesis, we focus on two key problems in facilitating the use of entity tracking models: (i) scaling entity tracking models to long documents, such as a novel, and (ii) integrating entity tracking into language models. Applying language technologies to long documents has recently garnered interest, but computational cost is a significant bottleneck in scaling up current methods. We argue that computationally efficient entity tracking models can be developed by representing entities with rich, fixed-dimensional vector representations derived from pretrained language models, and by exploiting the ephemeral nature of entities. We also argue for the integration of entity tracking into language models, as it allows for: (i) wider applicability, given the ubiquitous use of pretrained language models in NLP, and (ii) easier adoption, since swapping in a new pretrained language model is much easier than integrating a separate, standalone entity tracking model. 
</font></span></p><p class="MsoNormal" style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">The thesis is divided into two parts. In the first half, we focus on a specific class of entity tracking problems referred to as coreference resolution, where the goal is to identify text spans referring to the same entity. We propose memory models in which an external memory module is trained to explicitly track the entities mentioned in the text. We first discuss a sparsely supervised memory model for the pronoun resolution task; at the time of its introduction, this model outperformed prior work on both the end task and interpretability measures. We then adapt this memory model to the full coreference resolution task. The proposed memory models scale effectively to long documents, and in particular, the bounded memory model offers linear runtime complexity in document length while remaining competitive with state-of-the-art models. Next, we test the presented models for their generalization capability, specifically via zero-shot evaluation on other coreference benchmarks. We find that domain shift is a challenge in coreference resolution, though annotation differences across datasets partly exaggerate this challenge, and that joint training on multiple datasets moderately alleviates it. Finally, the presented models have achieved state-of-the-art performance on multiple coreference benchmarks. 
</font></span></p><p class="MsoNormal" style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">In the latter half, we focus on integrating entity tracking capability into neural language models. As a first step, we propose the task of language modeling for the game of chess to evaluate the entity tracking capabilities of transformer LMs. Our experiments on chess suggest that augmenting LM training instances with board state information (represented as text tokens) aids both state tracking and language modeling performance. Training LMs with state-augmented instances also allows probing for entity state at inference time simply via prompting. Next, we extend these findings from chess to natural language. We first experiment in a closed domain, where we show that state-augmented training improves state tracking performance and text generation quality. 
Finally, we adapt state-augmented training to bake coreference knowledge into natural language models and show improvements on a popular cloze task.</font></span></p><p class="MsoNormal" style="margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="color:rgb(80,0,80);line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Thesis Committee</span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">: </span><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><font color="#0000ff">Kevin Gimpel</font></b><font color="#000000"> (</font></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>Thesis Advisor</b></span><span 
style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">),</span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> </span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Sam Wiseman, Kenton Lee, Yejin Choi</span></font></p><br></div><div><br></div></div></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. 
Kenwood Avenue</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div><br></div>
This problem of entity tracking is essential for language understanding, and thus, useful for a wide array of downstream applications in NLP such as question-answering, summarization, etc. </span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">In this <span>thesis</span>, we focus on two key problems in relation to facilitating the use of entity tracking models: (i) scaling entity tracking models to long documents, such as a novel, and (ii) integrating entity tracking into language models. Applying language technologies to long documents has garnered interest recently, but computational reasons are a significant bottleneck in scaling up current methods. In this <span>thesis</span>, we argue that computationally efficient entity tracking models can be developed by representing entities with rich, fixed-dimensional vector representations derived from pretrained language models, and by exploiting the ephemeral nature of entities. We also argue for the integration of entity tracking into language models as it will allow for: (i) a wider application given the current ubiquitous use of pretrained language models in NLP applications, and (ii) an easier adoption since it is much easier to swap in a new pretrained language model than integrating a separate standalone entity tracking model. 
</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">The <span>thesis</span> is divided into two parts. In the first half of this <span>thesis</span>, we focus on a specific class of entity tracking problem referred to as coreference resolution. The goal here is to identify text spans referring to the same entity. We propose memory models where the external memory module is trained to explicitly track the entities mentioned in the text. We first discuss a sparsely supervised memory model for the pronoun resolution task. This model outperformed prior work on both the end task and the interpretability measures at the time of its introduction. We then adapt this memory model for the full coreference resolution task. The proposed memory models can effectively scale to long documents, and in particular, the proposed bounded memory model offers a linear runtime complexity in document length while remaining competitive with the state-of-the-art models. Next, we test the presented models for their generalization capability, specifically their zero-shot evaluation on other coreference benchmarks. We find that domain shift is a challenge in coreference resolution though annotation differences across datasets partly exaggerate this challenge. We also find that joint training on multiple datasets moderately alleviates the domain shift challenge. Finally, the presented models have achieved state-of-the-art performance on multiple coreference benchmarks. 
</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">In the latter half, we focus on integrating entity tracking capability into neural language models. As a first step, we propose the task of language modeling for the game of chess to evaluate the entity tracking capabilities of transformer LMs. Our experiments on chess suggest that augmenting LM training instances with board state information (represented as text tokens) aids state tracking and language modeling performance. Training LMs with state-augmented instances also allows probing for entity state at inference time simply via prompting. Next, we extend these findings from chess to natural language. We first experiment in a closed domain where we show that state-augmented training improves the state tracking performance and the text generation quality. 
Finally, we adapt the state-augmented training for baking coreference knowledge into natural language models and show improvements on a popular cloze task.</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><span>Thesis</span> Committee</span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">: </span><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><font color="#0000ff">Kevin Gimpel</font></b><font color="#000000"> (</font></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><span>Thesis</span> Advisor</b></span><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font color="#000000">), </font><b><font color="#0000ff">Karen Livescu </font></b><font color="#000000">(</font></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><span>Thesis</span> Advisor</b></span><span 
style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">),</span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> </span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Sam Wiseman, Kenton Lee, Yejin Choi</span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p></div><span style="color:rgb(80,0,80)"><div><div style="color:rgb(34,34,34)"><font color="#888888" face="arial, sans-serif"><div><br></div></font></div></div></span></div></div><div><div dir="ltr"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. 
Kenwood Avenue</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sun, Aug 21, 2022 at 11:50 AM Mary Marre <<a href="mailto:mmarre@ttic.edu" target="_blank">mmarre@ttic.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div style="font-size:small"><div dir="ltr"><div><b style="font-family:verdana,sans-serif;font-size:large;color:rgb(80,0,80);background-color:rgb(207,226,243)">Thesis Defense: <span>Shubham</span> Toshniwal, TTIC</b><br></div></div><div dir="ltr"><div><div style="color:rgb(80,0,80);font-family:arial,helvetica,sans-serif"><br></div><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">When: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Wednesday, August 24th at </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap;background-color:rgb(255,255,0)"><b>12:00 - 2:00 pm CT</b></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><span 
id="gmail-m_-5414898035933464851gmail-m_8258976848810308636gmail-m_-4777337093436806164gmail-m_8936831616554927923gmail-m_-6065660192932035087gmail-docs-internal-guid-b1ddca98-7fff-5c00-95c7-64a7230be94a"><font face="arial, sans-serif"><br></font></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Where</span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">: Talk will be given </span><span style="font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><font color="#0000ff"><u>live, in-person</u></font></b></span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> at</span></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif"> TTIC, 6045 S. 
Kenwood Avenue</font></span></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif"> 5th Floor, Room 530 </font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Virtually: </span><span style="font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font color="#0000ff"><a href="https://uchicagogroup.zoom.us/meeting/register/tJUtcOqtrj0uHdSu3hNQyRXrEkZis5pz3vBa" target="_blank">attend virtually here</a></font></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Who: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><span>Shubham</span> Toshniwal, TTIC</span></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span 
style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><br></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Thesis Title: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Efficient and Interpretable Neural Models for Entity Tracking</span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Abstract: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">What would it take for a natural language model to understand a novel, such as The Lord of the Rings? Among other things, such a model must be able to: (a) identify and record new characters (entities) and their attributes as they are introduced in the text, and (b) identify subsequent references to the characters previously introduced and update their attributes. 
This problem of entity tracking is essential for language understanding, and thus, useful for a wide array of downstream applications in NLP such as question-answering, summarization, etc. </span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">In this thesis, we focus on two key problems in relation to facilitating the use of entity tracking models: (i) scaling entity tracking models to long documents, such as a novel, and (ii) integrating entity tracking into language models. Applying language technologies to long documents has garnered interest recently, but computational reasons are a significant bottleneck in scaling up current methods. In this thesis, we argue that computationally efficient entity tracking models can be developed by representing entities with rich, fixed-dimensional vector representations derived from pretrained language models, and by exploiting the ephemeral nature of entities. We also argue for the integration of entity tracking into language models as it will allow for: (i) a wider application given the current ubiquitous use of pretrained language models in NLP applications, and (ii) an easier adoption since it is much easier to swap in a new pretrained language model than integrating a separate standalone entity tracking model. 
</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">The thesis is divided into two parts. In the first half of this thesis, we focus on a specific class of entity tracking problem referred to as coreference resolution. The goal here is to identify text spans referring to the same entity. We propose memory models where the external memory module is trained to explicitly track the entities mentioned in the text. We first discuss a sparsely supervised memory model for the pronoun resolution task. This model outperformed prior work on both the end task and the interpretability measures at the time of its introduction. We then adapt this memory model for the full coreference resolution task. The proposed memory models can effectively scale to long documents, and in particular, the proposed bounded memory model offers a linear runtime complexity in document length while remaining competitive with the state-of-the-art models. Next, we test the presented models for their generalization capability, specifically their zero-shot evaluation on other coreference benchmarks. We find that domain shift is a challenge in coreference resolution though annotation differences across datasets partly exaggerate this challenge. We also find that joint training on multiple datasets moderately alleviates the domain shift challenge. Finally, the presented models have achieved state-of-the-art performance on multiple coreference benchmarks. 
</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">In the latter half, we focus on integrating entity tracking capability into neural language models. As a first step, we propose the task of language modeling for the game of chess to evaluate the entity tracking capabilities of transformer LMs. Our experiments on chess suggest that augmenting LM training instances with board state information (represented as text tokens) aids state tracking and language modeling performance. Training LMs with state-augmented instances also allows probing for entity state at inference time simply via prompting. Next, we extend these findings from chess to natural language. We first experiment in a closed domain where we show that state-augmented training improves the state tracking performance and the text generation quality. 
Finally, we adapt the state-augmented training for baking coreference knowledge into natural language models and show improvements on a popular cloze task.</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Thesis Committee</span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">: </span><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><font color="#0000ff">Kevin Gimpel</font></b><font color="#000000"> (</font></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>Thesis Advisor</b></span><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font color="#000000">), </font><b><font color="#0000ff">Karen Livescu </font></b><font color="#000000">(</font></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>Thesis Advisor</b></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">),</span><span 
style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> </span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Sam Wiseman, Kenton Lee, Yejin Choi</span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p></div><span style="color:rgb(80,0,80)"><div><div style="color:rgb(34,34,34)"><font color="#888888" face="arial, sans-serif"><div><br></div></font></div></div></span></div></div><div><div dir="ltr"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. 
Kenwood Avenue</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Aug 13, 2022 at 9:47 AM Mary Marre <<a href="mailto:mmarre@ttic.edu" target="_blank">mmarre@ttic.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div dir="ltr" style="font-size:small"><div><b style="font-family:verdana,sans-serif;font-size:large;color:rgb(80,0,80);background-color:rgb(207,226,243)">Thesis Defense: Shubham Toshniwal, TTIC</b><br></div></div><div dir="ltr"><div><div style="font-size:small;color:rgb(80,0,80);font-family:arial,helvetica,sans-serif"><br></div><p dir="ltr" style="font-size:small;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">When: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Wednesday, August 24th at </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap;background-color:rgb(255,255,0)"><b>12:00 - 2:00 pm CT</b></span></font></p><p class="MsoNormal" style="font-size:small;margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><span 
id="gmail-m_-5414898035933464851gmail-m_8258976848810308636gmail-m_-4777337093436806164gmail-m_8936831616554927923gmail-m_-6065660192932035087gmail-docs-internal-guid-b1ddca98-7fff-5c00-95c7-64a7230be94a"><font face="arial, sans-serif"><br></font></span></p><p dir="ltr" style="font-size:small;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Where</span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">: The talk will be given </span><span style="font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><font color="#0000ff"><u>live, in-person</u></font></b></span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> at</span></font></p><p dir="ltr" style="font-size:small;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif"> TTIC, 6045 S. 
Kenwood Avenue</font></span></p><p dir="ltr" style="font-size:small;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif"> 5th Floor, Room 530 </font></span></p><p class="MsoNormal" style="font-size:small;margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="font-size:small;line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Virtually: </span><span style="font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font color="#0000ff"><a href="https://uchicagogroup.zoom.us/meeting/register/tJUtcOqtrj0uHdSu3hNQyRXrEkZis5pz3vBa" target="_blank">attend virtually here</a></font></span></font></p><p class="MsoNormal" style="font-size:small;margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Who: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Shubham Toshniwal, TTIC</span></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font 
face="arial, sans-serif"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><br></span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Thesis Title: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Efficient and Interpretable Neural Models for Entity Tracking</span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Abstract: </span><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">What would it take for a natural language model to understand a novel, such as The Lord of the Rings? Among other things, such a model must be able to: (a) identify and record new characters (entities) and their attributes as they are introduced in the text, and (b) identify subsequent references to the characters previously introduced and update their attributes. 
This problem of entity tracking is essential for language understanding and thus useful for a wide array of downstream NLP applications, such as question answering and summarization. </span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">In this thesis, we focus on two key problems in facilitating the use of entity tracking models: (i) scaling entity tracking models to long documents, such as a novel, and (ii) integrating entity tracking into language models. Applying language technologies to long documents has garnered interest recently, but computational cost remains a significant bottleneck in scaling up current methods. We argue that computationally efficient entity tracking models can be developed by representing entities with rich, fixed-dimensional vector representations derived from pretrained language models, and by exploiting the ephemeral nature of entities. We also argue for integrating entity tracking into language models, as this allows for: (i) wider application, given the ubiquitous use of pretrained language models in NLP, and (ii) easier adoption, since swapping in a new pretrained language model is much simpler than integrating a separate, standalone entity tracking model. 
</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">The thesis is divided into two parts. In the first half, we focus on a specific class of entity tracking problems referred to as coreference resolution, where the goal is to identify text spans referring to the same entity. We propose memory models in which an external memory module is trained to explicitly track the entities mentioned in the text. We first discuss a sparsely supervised memory model for the pronoun resolution task; at the time of its introduction, this model outperformed prior work on both the end task and interpretability measures. We then adapt this memory model for the full coreference resolution task. The proposed memory models scale effectively to long documents; in particular, the proposed bounded memory model offers runtime linear in document length while remaining competitive with state-of-the-art models. Next, we test the presented models for generalization, specifically via zero-shot evaluation on other coreference benchmarks. We find that domain shift is a challenge in coreference resolution, though annotation differences across datasets partly exaggerate this challenge. We also find that joint training on multiple datasets moderately alleviates the domain shift challenge. Finally, the presented models have achieved state-of-the-art performance on multiple coreference benchmarks. 
</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><span style="color:rgb(0,0,0);font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font face="arial, sans-serif">In the latter half, we focus on integrating entity tracking capability into neural language models. As a first step, we propose the task of language modeling for the game of chess to evaluate the entity tracking capabilities of transformer LMs. Our experiments on chess suggest that augmenting LM training instances with board state information (represented as text tokens) aids both state tracking and language modeling performance. Training LMs on state-augmented instances also allows probing for entity state at inference time simply via prompting. Next, we extend these findings from chess to natural language. We first experiment in a closed domain, where we show that state-augmented training improves both state tracking performance and text generation quality. 
Finally, we adapt state-augmented training to bake coreference knowledge into natural language models and show improvements on a popular cloze task.</font></span></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p><p dir="ltr" style="line-height:1.38;margin-top:0pt;margin-bottom:0pt"><font face="arial, sans-serif"><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Thesis Committee</span><span style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">: </span><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b><font color="#0000ff">Kevin Gimpel</font></b><font color="#000000"> (</font></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>Thesis Advisor</b></span><span style="background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><font color="#000000">), </font><b><font color="#0000ff">Karen Livescu </font></b><font color="#000000">(</font></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"><b>Thesis Advisor</b></span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">),</span><span 
style="color:rgb(0,0,0);background-color:transparent;font-weight:700;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap"> </span><span style="color:rgb(0,0,0);background-color:transparent;font-variant-numeric:normal;font-variant-east-asian:normal;vertical-align:baseline;white-space:pre-wrap">Sam Wiseman, Kenton Lee, Yejin Choi</span></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><br></font></p></div><span style="color:rgb(80,0,80)"><div><div style="color:rgb(34,34,34)"><font color="#888888" face="arial, sans-serif"><div><br></div></font></div></div></span></div><br></div><div><div dir="ltr"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. Kenwood Avenue</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div></div>
</blockquote></div></div>
</blockquote></div></div>
</blockquote></div></div>
</blockquote></div></div>