[Colloquium] 1/11 Talks at TTIC: Sitan Chen, MIT

Mary Marre mmarre at ttic.edu
Tue Jan 5 11:59:35 CST 2021


*When:*      Monday, January 11th at *11:10 am CT*



*Where:*     Zoom Virtual Talk (*register in advance here
<https://uchicagogroup.zoom.us/webinar/register/WN_96FR8EkDRxS7o7c4oO4nxw>*)



*Who:*        Sitan Chen, MIT


*Title:*       Learning When Gradient Descent Fails

*Abstract:* What are the most powerful algorithms for taking labeled
examples and producing classifiers with good out-of-sample performance?
Stochastic gradient descent is the practitioner's tool of choice, and an
explosion of work in recent years has sought out theoretical justification
for the striking empirical successes of this approach. Yet one can also ask
an orthogonal question: are there natural learning problems where one can
do even better, and provably so?

In this talk, I will first give an overview of my work on supervised learning in the
presence of corruptions, focusing on the classic problem of learning a
halfspace under Massart noise. This is a setting where an adversary can
arbitrarily corrupt a randomly chosen fraction of labels, and for which
minimizing any convex surrogate provably fails. We give a simple and
practical algorithm, which is also the first proper learning algorithm for
this problem to work without any assumptions on the input distribution.
Time permitting, I will mention some of our related results on generalized
linear models and contextual bandits.
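To make the Massart noise model concrete, here is a minimal simulation sketch (not from the talk; the dimension, sample size, and noise bound below are illustrative choices): labels of a halfspace are flipped independently, each point with its own flip probability bounded by some eta < 1/2.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, eta = 10, 1000, 0.3   # dimension, sample size, Massart noise bound (illustrative)

w = rng.standard_normal(d)          # unknown halfspace normal vector
X = rng.standard_normal((n, d))     # inputs (the Massart model allows any input distribution)
clean = np.sign(X @ w)              # noiseless halfspace labels in {-1, +1}

# Massart adversary: each point x gets its own flip probability eta_x <= eta < 1/2,
# chosen arbitrarily; here we draw them at random for illustration.
eta_x = rng.uniform(0, eta, size=n)
flips = rng.random(n) < eta_x
y = np.where(flips, -clean, clean)  # observed, corrupted labels

print(np.mean(y != clean))          # empirical flip rate; at most eta in expectation
```

Because the flip probabilities can depend on the point, minimizing a convex surrogate loss on `(X, y)` can fail to recover `w`, which is the phenomenon the talk addresses.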

In the second half, I will present progress on the well-studied problem of
learning a neural network over Gaussian space: given normally distributed
examples labeled by an unknown teacher network, output a network with high
test accuracy. All previous works on this pertain to depth-two networks and
make some assumption about the weights. We give the first algorithm that
works for arbitrary depths, makes no such assumptions, and runs in time
polynomial in the dimension for any bounded-size network. Notably, a large
class of algorithms, including gradient descent on an overparametrized
student network, provably cannot achieve such a guarantee.

Based on joint works with Adam Klivans, Frederic Koehler, Raghu Meka, Ankur
Moitra, and Morris Yau.

*Bio:* Sitan Chen is a fifth-year graduate student in theoretical computer
science at MIT, advised by Ankur Moitra. His research focuses on designing
provable algorithms for fundamental learning problems, with an emphasis on
robustness, multi-index regression, and mixture models. He has been
supported by a Paul and Daisy Soros Fellowship and an MIT Presidential
Fellowship.

*Host:* Madhur Tulsiani <madhurt at ttic.edu>



Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue*
*Room 517*
*Chicago, IL  60637*
*p: (773) 834-1757*
*f: (773) 357-6970*
*mmarre at ttic.edu <mmarre at ttic.edu>*