[Theory] [TTIC Talks] Talks at TTIC: 3/4 Gon Buzaglo, Technion
Brandie Jones
bjones at ttic.edu
Tue Feb 27 08:00:00 CST 2024
*When:* Monday, March 4th at *11:30AM CT*
*Where:* Talk will be given *live, in-person* at
TTIC, 6045 S. Kenwood Avenue
5th Floor, Room 530
*Virtually:* via Panopto (Livestream
<https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=eee0e89c-b2cc-4696-86cb-b11d011177ce>
)
*Who:* Gon Buzaglo, Technion
*Title:* How Uniform Random Weights Induce Non-uniform Bias:
Typical Interpolating Neural Networks Generalize with Narrow Teachers
*Abstract:* A central theoretical puzzle is why over-parameterized Neural
Networks (NNs) generalize well when trained to zero error (i.e., so that they
interpolate the data). Usually, the NN is trained with Stochastic Gradient
Descent (SGD) or one of its variants. However, recent empirical work
examined the generalization of a random NN that interpolates the data: the
NN was sampled from a seemingly uniform prior over the parameters,
conditioned on the NN perfectly classifying the training set.
Interestingly, such an NN sample typically generalized as well as
SGD-trained NNs.
I will talk about our new paper, where we prove that such a random NN
interpolator typically generalizes well if there exists an underlying
narrow “teacher NN” that agrees with the labels. Specifically, we show that
such a ‘flat’ prior over the NN parametrization induces a rich prior over
the NN functions, due to the redundancy in the NN structure. In particular,
this creates a bias towards simpler functions, which require fewer relevant
parameters to represent, enabling learning with a sample complexity
approximately proportional to the complexity of the teacher (roughly, the
number of non-redundant parameters) rather than the student's.
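
To make the setup concrete, here is a minimal, hypothetical guess-and-check
sketch (not the paper's actual experiments or code): it rejection-samples a
small over-parameterized "student" MLP from a uniform prior over its weights,
keeps only the samples that perfectly classify a few points labelled by a
narrow one-hidden-unit "teacher", and reports the test accuracy of the
accepted interpolators. All names, sizes, and priors below are illustrative
assumptions.

import numpy as np

rng = np.random.default_rng(0)

def mlp_predict(params, X):
    # One-hidden-layer tanh network with a sign readout: X -> {-1, +1}.
    W1, b1, W2, b2 = params
    h = np.tanh(X @ W1 + b1)
    return np.sign(h @ W2 + b2)

def sample_params(d_in, width, scale=1.0):
    # Draw all weights independently from a uniform prior over a bounded box.
    return (rng.uniform(-scale, scale, (d_in, width)),
            rng.uniform(-scale, scale, width),
            rng.uniform(-scale, scale, width),
            rng.uniform(-scale, scale, ()))

d_in = 5
teacher = sample_params(d_in, width=1)       # narrow teacher: one hidden unit

X_train = rng.standard_normal((8, d_in))     # small training set labelled by the teacher
X_test = rng.standard_normal((500, d_in))
y_train = mlp_predict(teacher, X_train)
y_test = mlp_predict(teacher, X_test)

accepted, test_accs = 0, []
for _ in range(50_000):
    student = sample_params(d_in, width=20)  # over-parameterized student network
    if np.all(mlp_predict(student, X_train) == y_train):  # keep only perfect interpolators
        accepted += 1
        test_accs.append(np.mean(mlp_predict(student, X_test) == y_test))

if accepted:
    print(f"accepted {accepted} interpolating students; "
          f"mean test accuracy {np.mean(test_accs):.3f}")
else:
    print("no interpolating students found; increase the number of draws")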
*Bio:* Gon Buzaglo is currently pursuing an MSc in Electrical & Computer
Engineering at the Technion, under the guidance of Prof. Daniel Soudry.
Concurrently, he is completing his final year of undergraduate studies at
the Technion, with a dual major in Computer Science and Physics. His
undergraduate research included a collaboration with Prof. Michal Irani in
the Department of Computer Science and Applied Mathematics at the Weizmann
Institute of Science. Gon's research interests are centered around
theoretical machine learning, specifically employing mathematical insights
and experiments to explore concepts such as generalization, memorization,
and forgetting in neural networks.
*Host:* Nati Srebro <nati at ttic.edu>
--
*Brandie Jones*
*Executive Administrative Assistant*
Toyota Technological Institute
6045 S. Kenwood Avenue
Chicago, IL 60637
www.ttic.edu