[Theory] 3/19 Talks at TTIC: Peter West, University of Washington

Mary Marre mmarre at ttic.edu
Thu Mar 14 18:19:28 CDT 2024


*When:*        Tuesday, March 19, 2024 at *11:00 am CT*


*Where:*       Talk will be given *live, in-person* at

                   TTIC, 6045 S. Kenwood Avenue

                   5th Floor, Room 530


*Virtually:*   *tba*



*Who:*          Peter West, University of Washington


------------------------------

*Title:*         Hidden Capabilities and Counterintuitive Limits in Large
Language Models

*Abstract:* Massive scale has been a recent winning recipe in natural
language processing and AI, with extreme-scale language models like GPT-4
receiving the most attention. This is despite staggering energy and
monetary costs, and despite the continuing struggle of even the largest
models with concepts such as compositional problem solving and linguistic
ambiguity. In this talk, I will propose my vision for a research landscape
in which compact language models share the forefront with extreme-scale
models, working in concert with many ingredients besides scale, such as
algorithms, knowledge, and information theory.

The first part of my talk will cover alternative ingredients to scale,
including (1) an inference-time algorithm that combines language models
with elements of discrete search and information theory and (2) a method
for transferring useful knowledge from extreme-scale to compact language
models with synthetically generated data. Next, I will discuss
counterintuitive disparities in the capabilities of even extreme-scale
models, which can meet or exceed human performance in some complex tasks
while trailing behind humans in what seem to be much simpler tasks.
Finally, I will discuss implications and next steps in scale-alternative
methods.

*Bio:* Peter West is a PhD candidate in the Paul G. Allen School of
Computer Science & Engineering at the University of Washington, working
with Yejin Choi. His research focuses on natural language processing and
language models, particularly on combining language models with elements of
knowledge, search algorithms, and information theory to equip compact
models with new capabilities. In parallel, he studies the limitations that
even extreme-scale models have yet to overcome. His work has received
multiple awards, including a best methods paper award at NAACL 2022 and
outstanding paper awards at ACL and EMNLP in 2023. His work has been
supported in part by an NSERC PGS-D fellowship. Previously, Peter received
a BSc in computer science from the University of British Columbia.

*Host:* David McAllester <mcallester at ttic.edu>





Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue, Rm 517*
*Chicago, IL  60637*
*773-834-1757*
*mmarre at ttic.edu <mmarre at ttic.edu>*

