[Theory] REMINDER: 2/8 Talks at TTIC: Pedro Morgado, UC San Diego

Mary Marre mmarre at ttic.edu
Sun Feb 7 16:00:00 CST 2021


*When:*      Monday, February 8th at *11:10 am CT*



*Where:*     Zoom Virtual Talk (*register in advance here
<https://uchicagogroup.zoom.us/webinar/register/WN_FroU-b4KRhmuwNorgOzuRg>*)



*Who:*        Pedro Morgado, UC San Diego


*Title:*        Learning to See and Hear from Audio-Visual Co-occurrence

*Abstract:* Imagine the sound of crashing waves. This sound may evoke the
image of a beach. A single sound serves as a bridge connecting multiple
instances of a visual scene: it can group scenes that ‘go together’ and set
apart the ones that do not. Audio can thus serve as a target for learning
powerful representations of visual inputs without relying on costly human
annotations. As computer vision systems become more capable, human
annotations become the bottleneck to further progress. My goal is to
develop effective training procedures that curb the need for direct human
supervision.

In this talk, I will discuss several tasks that benefit from audio-visual
learning, including representation learning for action and object
recognition, visually driven sound source localization, and spatial sound
generation. I will introduce an effective contrastive learning framework
that learns audio-visual models by answering multiple-choice audio-visual
association questions. I will also discuss important challenges that arise
when learning from audio supervision, most notably frequently noisy
audio-visual associations, and how to overcome them using robust learning
algorithms.
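
To make the "multiple-choice" framing concrete, here is a minimal sketch
(in PyTorch) of a cross-modal contrastive objective of this kind: each
video clip must pick out its own audio from among the other clips' audio
in the batch. This is an illustrative reconstruction, not the speaker's
implementation; the encoder outputs, batch construction, and temperature
value are assumptions.

    import torch
    import torch.nn.functional as F

    def audio_visual_contrastive_loss(video_emb, audio_emb, temperature=0.07):
        """Batch-wide multiple choice: each video's own audio is the correct
        answer; every other clip's audio serves as a distractor."""
        # L2-normalize so dot products are cosine similarities.
        v = F.normalize(video_emb, dim=1)   # (B, D) video embeddings
        a = F.normalize(audio_emb, dim=1)   # (B, D) audio embeddings
        logits = v @ a.t() / temperature    # (B, B) similarity matrix
        # Matched audio-visual pairs lie on the diagonal.
        targets = torch.arange(v.size(0), device=v.device)
        # Symmetric loss: video picks its audio, and audio picks its video.
        return 0.5 * (F.cross_entropy(logits, targets) +
                      F.cross_entropy(logits.t(), targets))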

*Bio:* Pedro Morgado is a Ph.D. candidate in the Electrical and Computer
Engineering department at the University of California, San Diego, advised
by Prof. Nuno Vasconcelos. He has also spent time at Adobe Research,
working with Oliver Wang, and at Facebook AI Research, working with Ishan
Misra. His research lies at the intersection of computer vision and
machine learning, focusing on multi-modal self-supervised learning. His
work aims to develop algorithms that make the power of computer vision
accessible by lowering the two major costs of deep learning: the
dependence on human annotations and the high compute requirements of
training and deployment. Pedro is the recipient of a four-year graduate
scholarship from the Portuguese Science and Technology Foundation. Before
arriving in San Diego, he received his Bachelor’s and Master’s degrees
from Instituto Superior Técnico in Lisbon, Portugal.

*Host:* Greg Shakhnarovich <greg at ttic.edu>



Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue*
*Room 517*
*Chicago, IL  60637*
*p: (773) 834-1757*
*f: (773) 357-6970*
*mmarre at ttic.edu <mmarre at ttic.edu>*

