[Theory] TODAY - Talk Information Updated!: [TTIC Talks] 5/23 Research at TTIC: Sam Buchanan, TTIC
Brandie Jones via Theory
theory at mailman.cs.uchicago.edu
Fri May 23 08:46:22 CDT 2025
*When:* May 23rd at 12:30pm CT
*Where:* Talk will be given *live, in-person* at
TTIC, 6045 S. Kenwood Avenue
5th Floor, Room 530
*Virtually:* via Panopto (Livestream
<https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=1f1d2eb6-4976-4aac-9ed6-b1f701151572>)
*Who:* Sam Buchanan, TTIC
*Title:* White-Box Transformers via Sparse Rate Reduction
*Abstract:* In this talk, we contend that a natural objective of
representation learning is to compress and transform the distribution of
the data, say sets of tokens, towards a low-dimensional Gaussian mixture
supported on incoherent subspaces. The goodness of such a representation
can be evaluated by a principled measure, called sparse rate reduction,
that simultaneously maximizes the intrinsic information gain and extrinsic
sparsity of the learned representation.
From this perspective, popular deep network architectures, including
transformers, can be viewed as realizing iterative schemes to optimize this
measure. In particular, we derive a transformer block from alternating
optimization on parts of this objective: the multi-head self-attention
operator compresses the representation by implementing an approximate
gradient descent step on the coding rate of the features, and the
subsequent multi-layer perceptron sparsifies the features.
This leads to a family of mathematically interpretable, transformer-like
deep network architectures, which we call CRATE. Experiments
show that these networks, despite their simplicity, indeed learn to
compress and sparsify representations of large-scale real-world image and
text datasets, and achieve performance close to highly engineered
transformer-based models, including ViT and GPT2.
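
For readers who want a concrete handle on the two quantities the abstract
mentions, here is a minimal sketch, not the CRATE implementation. It assumes
the coding rate takes the form commonly used in the rate reduction
literature, R(Z) = 1/2 logdet(I + d/(n eps^2) Z Z^T) for d x n features Z,
and it uses plain soft-thresholding as a stand-in for the sparsifying step.
The names coding_rate, soft_threshold, eps, and lam are illustrative and do
not come from the talk.

```python
import math

def coding_rate(Z, eps=0.5):
    """Assumed form: 0.5 * logdet(I + d/(n*eps^2) * Z Z^T).

    Z is a list of d rows, each holding n token coordinates (d x n).
    """
    d, n = len(Z), len(Z[0])
    alpha = d / (n * eps ** 2)
    # A = I + alpha * Z Z^T is symmetric positive definite.
    A = [[(1.0 if i == j else 0.0)
          + alpha * sum(Z[i][k] * Z[j][k] for k in range(n))
          for j in range(d)] for i in range(d)]
    # Cholesky factor A = L L^T, so logdet(A) = 2 * sum(log L[i][i]).
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    # 0.5 * logdet(A) equals the sum of the log-diagonal of L.
    return sum(math.log(L[i][i]) for i in range(d))

def soft_threshold(Z, lam):
    """Shrink every entry toward zero by lam (illustrative sparsifier)."""
    return [[math.copysign(max(abs(z) - lam, 0.0), z) for z in row]
            for row in Z]

# Two token sets with the same total energy: one spread over both
# coordinates, one concentrated on a single direction.
spread = [[1.0, 0.0], [0.0, 1.0]]
low_rank = [[math.sqrt(2.0), 0.0], [0.0, 0.0]]
```

With these inputs, coding_rate(low_rank) comes out smaller than
coding_rate(spread), which is the sense in which concentrating features on a
low-dimensional subspace "compresses" them, and soft_threshold zeroes any
entry whose magnitude is below lam.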
***********************************************************************************************
*Masks are optional in all common areas. Full visitor guidance is
available at ttic.edu/visitors <http://ttic.edu/visitors>.*
***********************************************************************************************
*Research at TTIC Seminar Series*
TTIC is hosting a weekly seminar series presenting the research currently
underway at the Institute. Every week a different TTIC faculty member will
present their research. The lecture
--
*Brandie Jones*
*Executive Administrative Assistant*
Toyota Technological Institute
6045 S. Kenwood Avenue
Chicago, IL 60637
www.ttic.edu