[Colloquium] TODAY, 3PM: Data Science/CS Candidate Rowan Zellers (U. of Washington)

Rob Mitchum rdmitchum at gmail.com
Wed Feb 2 11:10:41 CST 2022


*Data Science Institute/Computer Science Candidate Seminar*

*Rowan Zellers*
*Ph.D. Candidate*
*University of Washington*

*Wednesday, February 2nd*
*3:00 p.m. - 4:00 p.m.*
*In-Person: John Crerar Library, Room 390 (yes, it's still on!)*
*Remote: Live Stream <http://live.cs.uchicago.edu/rowanzellers/> or Zoom
<https://uchicago.zoom.us/j/94411057570?pwd=dXFiZVI0RTJpekdtbmF5eTNWODdQQT09>
(details below)*


*Grounding Language by Seeing, Hearing, and Interacting*

As humans, our understanding of language is grounded in a rich mental model
about “how the world works” – that we learn through perception and
interaction. We use this understanding to reason beyond what is literally
said, imagining how situations might unfold in the world. Machines today
struggle to make such connections, which limits how safely they can be
used.

In my talk, I will discuss three lines of work to bridge this gap between
machines and humans. I will first discuss how we might measure grounded
understanding. I will introduce a suite of approaches for constructing
benchmarks, using machines in the loop to filter out spurious biases. Next,
I will introduce PIGLeT: a model that learns physical commonsense
understanding by interacting with the world through simulation, using this
knowledge to ground language. PIGLeT learns linguistic form and meaning –
together – and outperforms text-to-text only models that are orders of
magnitude larger. Finally, I will introduce MERLOT, which learns about
situations in the world by watching millions of YouTube videos with
transcribed speech. The model learns to jointly represent video, audio, and
language, together and over time – learning multimodal neural script
knowledge representations. Together, these directions suggest a path
forward for building machines that learn language rooted in the world.

*Bio*: Rowan Zellers <https://rowanzellers.com/> is a final-year Ph.D.
candidate at the University of Washington in Computer Science &
Engineering, advised by Yejin Choi and Ali Farhadi. His research focuses on
enabling machines to understand language, vision, sound, and the world
beyond these modalities. He has been recognized through NSF and ARCS
Graduate Fellowships, and a NeurIPS 2021 outstanding paper award. His work
has appeared in several media outlets, including Wired, the Washington
Post, and the New York Times. He graduated from Harvey Mudd College with a
B.S. in Computer Science & Mathematics and has interned at the Allen
Institute for AI.

*Host*: Chenhao Tan

*Zoom Info:*
https://uchicago.zoom.us/j/94411057570?pwd=dXFiZVI0RTJpekdtbmF5eTNWODdQQT09
Meeting ID: 944 1105 7570
Passcode: ds2022