[Colloquium] Talks at TTIC: Ross Girshick, UC Berkeley

Mon Feb 17 11:18:18 CST 2014

When:     Friday, February 21st at 11am

Where:    TTIC, 6045 S Kenwood Avenue, 5th Floor, Room #526

Speaker:  Ross Girshick, UC Berkeley

Title:    Learning Architectures for Visual Object Recognition

Abstract:

It's an exciting time in computer vision. We're rapidly making progress on
fundamental problems such as object recognition and human pose estimation.
When I started my Ph.D. in 2007, the best object detection system could
achieve a mean average precision (mAP) of only 21% on our standard
benchmark dataset (PASCAL VOC 2007). In this talk I'll describe two
systems, one developed during my Ph.D. and the other during my postdoc,
that have more than doubled object detection performance (to 54% mAP) over
the last seven years.

The first system, Deformable Part Models (or DPM), is based on an elegant
framework in which object categories are represented by a type of
context-free grammar. These grammars allow object detectors to be specified
recursively in terms of parts and subparts. Grammars can also naturally
model object classes with variable structure and distinct subclasses. I
will describe how we systematically improved object detection performance
by increasing the structural sophistication of our detectors within this
framework.

In the second part of the talk, I will describe a new approach to object
detection that is already achieving remarkable results. This approach,
Regions with Convolutional Neural Network Features (or R-CNN), applies a
large convolutional neural network to image regions generated by a
bottom-up segmentation algorithm. The key insight behind this work is that
one can train a ConvNet on a large-scale image classification dataset
(ImageNet) and then transfer the learned representation to the problem of
object detection, where we are typically short on labeled training data.

Bio:

Ross Girshick finished his Ph.D. in computer vision at The University of
Chicago under the supervision of Pedro Felzenszwalb in April 2012. Since
then, he's been a postdoctoral fellow working with Jitendra Malik at UC
Berkeley.

Ross's main research interests are in computer vision, AI, and machine
learning. His work focuses on building models for object detection and
recognition that aim to incorporate the "right" biases so that machine
learning algorithms can understand image content from moderate- to
large-scale datasets.

During his Ph.D., Ross spent time as a research intern at Microsoft
Research Cambridge, UK, working on human pose estimation from depth images.
He also participated in several first-place entries into the PASCAL VOC
object detection challenge, and in 2010 was awarded the PASCAL VOC
"lifetime achievement" prize for his work on Deformable Part Models.

Host:     David McAllester, McAllester at ttic.edu

-- 
*Dawn Ellis*
Administrative Coordinator,
Bookkeeper
773-834-1757
dellis at ttic.edu

TTIC
6045 S. Kenwood Ave.
Chicago, IL 60637