[Theory] NOW: 3/19 Talks at TTIC: Peter West, University of Washington

Mary Marre mmarre at ttic.edu
Tue Mar 19 10:56:00 CDT 2024


*When:*        Tuesday, March 19, 2024 at 11:00 am CT


*Where:*       Talk will be given *live, in-person* at

                   TTIC, 6045 S. Kenwood Avenue

                   5th Floor, Room 530


*Virtually:*   via Panopto (livestream:
<https://www.google.com/url?q=https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id%3Dc2b3423f-1057-4c5d-8ce1-b134017024ed&sa=D&source=calendar&ust=1710973448052118&usg=AOvVaw2Dc3qDb6aqjgEXUqKXGp5l>)



*Who:*         Peter West, University of Washington


------------------------------

*Title:*       Hidden Capabilities and Counterintuitive Limits in Large
Language Models

*Abstract:*  Massive scale has been a recent winning recipe in natural
language processing and AI, with extreme-scale language models like GPT-4
receiving most of the attention. This is despite staggering energy and
monetary costs, and despite the continuing struggle of even the largest
models with concepts such as compositional problem solving and linguistic
ambiguity. In this talk, I will propose my vision for a research landscape
in which compact language models share the forefront with extreme-scale
models, working in concert with many ingredients besides scale, such as
algorithms, knowledge, information theory, and more.

The first part of my talk will cover alternative ingredients to scale,
including (1) an inference-time algorithm that combines language models
with elements of discrete search and information theory, and (2) a method
for transferring useful knowledge from extreme-scale to compact language
models via synthetically generated data. Next, I will discuss
counterintuitive disparities in the capabilities of even extreme-scale
models, which can meet or exceed human performance on some complex tasks
while trailing behind humans on what seem to be much simpler ones.
Finally, I will discuss implications and next steps for scale-alternative
methods.
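
As a concrete illustration of these two ingredients, here is a rough Python
sketch (using the Hugging Face transformers library, with GPT-2 standing in
for any language model). It is purely illustrative and not the speaker's
actual method: it reranks candidate continuations by pointwise mutual
information (PMI) with the context, one simple way to combine language model
scoring with an information-theoretic criterion. All function names and
model choices are assumptions.

    # Illustrative only: PMI(y; x) = log P(y|x) - log P(y) prefers
    # continuations the context makes more likely, discounting strings
    # that are likely regardless of context.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModelForCausalLM.from_pretrained("gpt2")
    lm.eval()

    def logprob(text, prefix=""):
        # Sum of token log-probabilities of `text`, optionally
        # conditioned on `prefix`.
        ids = tok(prefix + text, return_tensors="pt").input_ids
        n_prefix = len(tok(prefix).input_ids) if prefix else 0
        with torch.no_grad():
            logits = lm(ids).logits
        # logits at position t predict the token at position t+1
        logps = torch.log_softmax(logits[:, :-1], dim=-1)
        token_lp = logps.gather(-1, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
        return token_lp[:, max(n_prefix - 1, 0):].sum().item()

    def pmi_rerank(context, candidates):
        # Score each candidate by log P(y|x) - log P(y) and sort best-first.
        scored = [(y, logprob(y, prefix=context) - logprob(y))
                  for y in candidates]
        return sorted(scored, key=lambda s: -s[1])

    print(pmi_rerank("The capital of France is",
                     [" Paris.", " a city.", " nice."]))

In the same hedged spirit, ingredient (2) might look like the skeleton
below: a large "teacher" model generates synthetic training data that,
after filtering, would be used to fine-tune a much smaller "student" model.
The prompt template and model names here are placeholders, not the actual
recipe.

    from transformers import pipeline, set_seed

    set_seed(0)
    # gpt2-large stands in for an extreme-scale teacher model
    teacher = pipeline("text-generation", model="gpt2-large")
    # hypothetical commonsense-style template
    prompt = "PersonX goes to a restaurant. As a result, PersonX"
    outs = teacher(prompt, max_new_tokens=20,
                   num_return_sequences=8, do_sample=True)
    synthetic = [o["generated_text"] for o in outs]
    # A real pipeline would filter these generations (e.g. with a critic
    # model) and fine-tune the compact student on the cleaned corpus.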

*Bio:*  Peter West is a PhD candidate in the Paul G. Allen School of
Computer Science & Engineering at the University of Washington, working
with Yejin Choi. His research focuses on natural language processing and
language models, particularly on combining language models with elements of
knowledge, search algorithms, and information theory to equip compact
models with new capabilities. In parallel, he studies the limits that even
extreme-scale models have yet to overcome. His work has received multiple
awards, including the best methods paper award at NAACL 2022 and
outstanding paper awards at ACL and EMNLP in 2023, and has been supported
in part by an NSERC PGS-D fellowship. Previously, Peter received a BSc in
computer science from the University of British Columbia.

*Host:*        David McAllester <mcallester at ttic.edu>

Access to the recorded talk is limited to TTIC / UChicago (click the
Panopto link and sign in to your UChicago account with your CNetID).




Mary C. Marre
Faculty Administrative Support
Toyota Technological Institute
6045 S. Kenwood Avenue, Rm 517
Chicago, IL 60637
773-834-1757
mmarre at ttic.edu

