[Colloquium] TODAY: 4/12 TTIC/UChicago NLP Seminar: Hao Peng, UIUC

Mary Marre mmarre at ttic.edu
Fri Apr 12 09:00:00 CDT 2024


*When:*         Friday, April 12, 2024 at 11:00 am CT


*Where:*       Talk will be given *live, in-person* at

                   TTIC, 6045 S. Kenwood Avenue

                   5th Floor, *Room 529*

*Virtually:*    Zoom
<https://uchicago.zoom.us/j/98297764499?pwd=ajNQSTZnMHRmMENkd1hjdjlNeW1xdz09>



*Who:*          Hao Peng, UIUC
------------------------------
*Title:* Pushing the Boundaries of Length Generalization and Reasoning
Capabilities of Open LLMs

*Abstract:* Recent advances in open-source pretrained large language
models (LLMs) have created new opportunities for exploring
post-pretraining innovations. This talk shares some of our recent work.
The first part focuses on context-length generalization in LLMs. I will
begin with a theoretical analysis that identifies the major factors
behind the failures of several commonly used techniques, which leads to
a simple yet effective algorithm that enables pretrained LLMs to
generalize to extreme context lengths without any parameter updates. I
will then shift focus to continual pretraining for length
generalization, and share our recent findings highlighting the
importance of the training data mixture, a crucial yet previously
overlooked factor. The second part of my talk will be about Eurus, our
recently released suite of open LLMs. On diverse benchmarks covering
challenging math, coding, and reasoning problems, Eurus achieves
state-of-the-art performance among open-source models and outperforms
GPT-3.5 Turbo. On two established reward-modeling benchmarks, our 7B
reward model correlates better with human judgment than all existing
models, including GPT-4. I will especially highlight UltraInteract, our
newly curated alignment dataset that enables Eurus's strong performance.
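
As background for the first part of the abstract: one commonly used
training-free approach to context-length extension is position
interpolation, which rescales rotary-embedding (RoPE) positions so that
a longer input maps back into the position range seen during
pretraining. The minimal sketch below illustrates that general idea
only; it is not the specific algorithm presented in the talk, and the
function name and parameters are illustrative.

import torch

def rope_angles(positions: torch.Tensor, dim: int, base: float = 10000.0,
                scale: float = 1.0) -> torch.Tensor:
    """Rotary-embedding angles; scale < 1 compresses positions so a longer
    sequence reuses the position range the model was pretrained on."""
    # Standard RoPE inverse frequencies, one per pair of dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2).float() / dim))
    # Position interpolation: rescale positions instead of updating weights.
    return torch.outer(positions.float() * scale, inv_freq)

# A model pretrained on 4k tokens, run on a 16k input: scale = 4k / 16k.
pretrained_len, target_len = 4096, 16384
angles = rope_angles(torch.arange(target_len), dim=128,
                     scale=pretrained_len / target_len)
print(angles.shape)  # torch.Size([16384, 64])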

*Bio:* Hao Peng is an Assistant Professor in the Department of Computer
Science at the University of Illinois Urbana-Champaign (UIUC). He
received his PhD from the University of Washington and his bachelor's
degree from Peking University. Before joining UIUC, he spent one year at
the Allen Institute for Artificial Intelligence as a Young Investigator
and interned at Microsoft Research, Google, and DeepMind. His research
interests broadly span natural language processing and machine learning.

*Host:* Jiawei Zhou <jzhou at ttic.edu>



Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue, Rm 517*
*Chicago, IL  60637*
*773-834-1757*
*mmarre at ttic.edu*



