[Theory] [Theory Lunch] Dravy Sharma
Alec Sun via Theory
theory at mailman.cs.uchicago.edu
Tue Apr 1 13:38:56 CDT 2025
Please join us for the second theory lunch of the quarter tomorrow!
*Time: *Wednesday, April 2, 2025, 12pm-1pm in JCL 390
*Speaker: *Dravy Sharma
*Title: *Provable tuning of deep learning model hyperparameters
*Abstract*: Modern machine learning algorithms, especially deep
learning-based techniques, typically involve careful hyperparameter tuning
to achieve the best performance. Despite the surge of intense interest in
practical techniques like Bayesian optimization and random search-based
approaches to automating this laborious and compute-intensive task, the
fundamental learning-theoretic complexity of tuning hyperparameters for
deep neural networks is poorly understood. Inspired by this glaring gap, we
initiate the formal study of hyperparameter tuning complexity in deep
learning under a powerful data-driven paradigm. A major difficulty is that
the utility function as a function of the hyperparameter is very volatile
and furthermore, it is given implicitly by an optimization problem over the
model parameters. To tackle this challenge, we employ subtle concepts from
differential/algebraic geometry and constrained optimization to show that
the learning-theoretic complexity of the corresponding family of utility
functions is bounded. We instantiate our results and provide sample
complexity bounds for concrete applications—tuning a hyperparameter that
interpolates neural activation functions and setting the kernel parameter
in graph neural networks. The talk is based on joint work with Nina Balcan
and Anh Nguyen.
On Tue, Mar 25, 2025, 4:52 PM Alec Sun <alecsun at uchicago.edu> wrote:
> Please join us for the first theory lunch of the quarter tomorrow!
>
> *Time: *Wednesday, March 26, 2025, 12pm-1pm in JCL 390
>
> *Speaker: *Olga Medrano
>
> *Title: *Short Visit to Regularity Lemmas
>
> *Abstract: *In this expository talk, we will overview Szémerédi’s
> regularity lemma, a result from extremal graph theory with different
> applications. We will then mention some considerations about this result,
> including the existence of irregular pairs in the partitions obtained, as
> well as the large size of those partitions. Time permitting, we very
> briefly reflect on how the proof of Szémerédi's regularity lemma is not
> algorithmic and mention a few lines of work that were focused on finding
> algorithms to output regular partitions. In the second half of this talk,
> we will describe versions of this lemma over certain classes of graphs. In
> particular, we state both the Ultra-strong regularity lemma, which works
> for the class of graphs of bounded VC dimension, and the Stable regularity
> lemma, which works for the class of k-edge stable graphs (namely, those not
> containing a bi-induced bipartite graph). We conclude by acknowledging that
> algorithmic questions for both results remain open.
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.cs.uchicago.edu/pipermail/theory/attachments/20250401/530a0444/attachment.html>
More information about the Theory
mailing list