[Colloquium] REMINDER: 4/18 TTIC Colloquium: Kristen Grauman, UT - Austin

Mary Marre mmarre at ttic.edu
Mon Apr 18 10:14:13 CDT 2016


When:     Monday, April 18th at 11:00 a.m.

Where:    TTIC, 6045 S. Kenwood Avenue, 5th Floor, Room 526

Speaker:  Kristen Grauman, UT - Austin


Title:          Learning Image Representations from Unlabeled Video


Abstract:
The status quo in visual recognition is to learn from “disembodied” bags of
labeled snapshots from the Web.  Yet cognitive science tells us that
perception develops in the context of acting and moving in the world---and
without intensive supervision.  This discrepancy prompts the question: How
can unlabeled video augment visual learning?

I’ll describe our recent work exploring how a system can learn effective
representations by watching video.  First, I will show how a camera’s
egomotion lends important side information. Given unlabeled video captured
on mobile egocentric cameras, the goal is to internalize the link between
“how I move” and “what I see”.  We develop a deep feature learning approach
that embeds information not only from the video stream the observer sees,
but also from the motor actions it simultaneously makes.  Specifically, we
enforce that the learned features exhibit equivariance, meaning they
respond predictably to transformations associated with distinct
egomotions.  We demonstrate the impact for recognition and next-best-view
prediction, including a scenario where features learned from ego-video on
an autonomous car substantially improve large-scale scene recognition.
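To make the equivariance idea concrete: it asks that each egomotion g correspond to a predictable transformation of the feature space, e.g. f(x_after) ≈ M_g f(x_before) for some learned map M_g. The following is a minimal toy sketch of such a loss, not the paper's actual training objective; the linear-map formulation, dimensions, and all variable names here are illustrative assumptions.

```python
import numpy as np

# Toy sketch (not the paper's method): for each egomotion class g,
# a matrix M_g should map pre-motion features to post-motion features,
# i.e. f(x_after) ~= M_g f(x_before).

rng = np.random.default_rng(0)

def equivariance_loss(f_before, f_after, M_g):
    """Squared error between predicted and actual post-motion features."""
    pred = f_before @ M_g.T          # predict features after egomotion g
    return float(np.mean((pred - f_after) ** 2))

# Synthetic data: 5 frame pairs with 8-dim features related by a known map.
M_true = np.eye(8) + 0.1 * rng.standard_normal((8, 8))
f_before = rng.standard_normal((5, 8))
f_after = f_before @ M_true.T

# The correct motion map gives (near-)zero loss; an identity map,
# which ignores the egomotion, does not.
loss_good = equivariance_loss(f_before, f_after, M_true)
loss_bad = equivariance_loss(f_before, f_after, np.eye(8))
print(loss_good < loss_bad)  # True
```

In training one would minimize such a loss jointly over the feature network and the per-egomotion maps, so that the features "respond predictably" to each motion.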
Second, turning to arbitrary unlabeled video, I will present a
generalization of slow feature analysis that captures higher order temporal
coherence.  The resulting representation captures that not only do visual
signals change slowly over time, but also the changes themselves are often
smooth.  We demonstrate the impact for object, scene, and action
recognition.  In the most promising result, we see how features learned
from unlabeled YouTube videos can surpass a standard heavily supervised CNN
pretraining approach for image classification.
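A rough way to see the difference between classic slow feature analysis and this higher-order generalization: slowness penalizes first differences of features over time, while higher-order coherence additionally penalizes second differences, so that the changes themselves are smooth. The sketch below is an illustrative simplification, not the paper's objective; the function names and toy sequences are my own assumptions.

```python
import numpy as np

def slowness_loss(feats):
    """First-order coherence: features should change slowly
    (small first differences along time)."""
    d1 = np.diff(feats, n=1, axis=0)
    return float(np.mean(d1 ** 2))

def steadiness_loss(feats):
    """Second-order coherence: the changes themselves should be smooth
    (small second differences along time)."""
    d2 = np.diff(feats, n=2, axis=0)
    return float(np.mean(d2 ** 2))

# Toy 2-D feature trajectories over 6 time steps.
t = np.arange(6, dtype=float)
linear = np.stack([t, 2 * t], axis=1)         # steady, uniform motion
erratic = np.stack([t, (-1.0) ** t], axis=1)  # one dim flips sign each step

# Both sequences change, so both incur first-order loss; only the
# erratic one incurs second-order loss.
print(slowness_loss(linear) > 0)      # True
print(steadiness_loss(linear))        # 0.0
print(steadiness_loss(erratic) > 0)   # True
```

The second-order term is what distinguishes smoothly evolving signals from merely slowly varying but jerky ones, which is the intuition behind "higher order temporal coherence."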

This is work with Dinesh Jayaraman.


Bio:
Kristen Grauman is an Associate Professor in the Department of Computer
Science at the University of Texas at Austin.  Her research in computer
vision and machine learning focuses on visual search and object
recognition.  Before joining UT-Austin in 2007, she received her Ph.D. at
MIT.  She is a recipient of a Sloan Fellowship, NSF CAREER, PAMI Young
Researcher Award, a Presidential Early Career Award for Scientists and
Engineers, and the 2013 IJCAI Computers and Thought Award.  She and her
collaborators were recognized with the CVPR Best Student Paper Award in
2008 for their work on hashing algorithms for large-scale image retrieval,
and the Marr Best Paper Prize at ICCV in 2011 for their work on modeling
relative visual attributes.  She was a Program Chair of CVPR 2015 and
currently serves as an Associate Editor-in-Chief for PAMI.


Host: Greg Shakhnarovich, greg at ttic.edu


For more information on the colloquium series or to subscribe to the
mailing list, please see http://www.ttic.edu/colloquium.php





Mary C. Marre
Administrative Assistant
Toyota Technological Institute
6045 S. Kenwood Avenue
Room 504
Chicago, IL  60637
p: (773) 834-1757
f: (773) 357-6970
mmarre at ttic.edu