[Theory] TOMORROW: 1/27 Talks at TTIC: Ruiqi Zhong, UC Berkeley

Sun Jan 26 14:30:56 CST 2025

*When:*        Monday, January 27, 2025 at 11:30 am CT

*Where:*       Talk will be given *live, in-person* at

                   TTIC, 6045 S. Kenwood Avenue

                   5th Floor, Room 530

*Virtually:*   via Panopto livestream
<https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=9066f6c5-1339-49c3-804c-b26b01470647>

                   *Note: This has been restricted to TTIC/UChicago only*

*Who:*          Ruiqi Zhong, UC Berkeley

*Title:*          Building Strong Language Models from Weak Validators

*Abstract:* Language models (LMs) can process large volumes of information
and perform complex reasoning. They hold the promise of doing what experts
struggle with, such as accelerating science or developing complex software.
However, building these LMs requires humans to validate their outputs,
which is challenging; e.g., developers cannot easily validate whether
complex software is bug-free. If LMs optimize against weak human
validations --- "appearing correct to humans" rather than being actually
correct --- LMs will create a false impression that they can solve complex
tasks, instead of actually solving them.

I propose three general approaches to assist human validators: 1)
validating implications (e.g., the result of executing a program), 2)
validating decompositions (e.g., well-factored programs), and 3) validating
weak points (e.g., corner cases). Given the increasing capabilities of AI
systems, developing effective validation strategies is critical to deploying
them safely and preventing silent failures.

*Bio:* Ruiqi Zhong is a final-year Ph.D. student at UC Berkeley, co-advised
by Jacob Steinhardt and Dan Klein. He was previously a part-time member of
technical staff at Anthropic, where he worked on the automated red teaming
team. His research is at the intersection of machine learning and NLP, and
he develops language model systems to advance the frontier of human
capabilities.

*Host:* Karen Livescu <klivescu at ttic.edu>

Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue, Rm 517*
*Chicago, IL  60637*
*773-834-1757*
*mmarre at ttic.edu*
