[Theory] NOW: [TTIC Talks] 5/23 Research at TTIC: Sam Buchanan, TTIC

Brandie Jones via Theory theory at mailman.cs.uchicago.edu
Fri May 23 12:25:00 CDT 2025


*When:*         May 23rd at 12:30pm CT


*Where:*        Talk will be given *live, in-person* at

                       TTIC, 6045 S. Kenwood Avenue

                        5th Floor, Room 530


*Virtually:*    via Panopto (Livestream
<https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=1f1d2eb6-4976-4aac-9ed6-b1f701151572>)



*Who:*            Sam Buchanan, TTIC



*Title*:             White-Box Transformers via Sparse Rate Reduction


*Abstract:*   In this talk, we contend that a natural objective of
representation learning is to compress and transform the distribution of
the data, say sets of tokens, towards a low-dimensional Gaussian mixture
supported on incoherent subspaces. The goodness of such a representation
can be evaluated by a principled measure, called sparse rate reduction,
that simultaneously maximizes the intrinsic information gain and extrinsic
sparsity of the learned representation.
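As a rough illustration of the "information" side of such a measure, here is
a minimal sketch of a coding rate function. The abstract does not state a
formula, so the log-det expression below (common in the rate-reduction
literature), the function name coding_rate, and the default eps are all
assumptions made for illustration.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Assumed form: R(Z) = 1/2 * logdet(I + d / (n * eps^2) * Z Z^T).

    Z holds n feature vectors of dimension d as columns; eps is a
    hypothetical distortion level, not a value taken from the talk.
    """
    d, n = Z.shape
    alpha = d / (n * eps**2)
    # slogdet is numerically safer than log(det(...)) for larger d.
    _, logdet = np.linalg.slogdet(np.eye(d) + alpha * Z @ Z.T)
    return 0.5 * logdet

# Features spread over four directions versus collapsed onto one direction.
spread = np.eye(4)
collapsed = np.tile(np.array([[1.0], [0.0], [0.0], [0.0]]), (1, 4))
print(coding_rate(spread), coding_rate(collapsed))
```

Under this assumed formula, features collapsed onto a single direction get a
lower rate than features spread over several directions, which is the sense
in which a low-dimensional representation counts as "compressed".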


From this perspective, popular deep network architectures, including
transformers, can be viewed as realizing iterative schemes to optimize this
measure. In particular, we derive a transformer block from alternating
optimization on parts of this objective: the multi-head self-attention
operator compresses the representation by implementing an approximate
gradient descent step on the coding rate of the features, and the
subsequent multi-layer perceptron sparsifies the features.
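The two alternating steps can be sketched in a toy form as follows. This is
not the CRATE block itself: compress_step takes an exact gradient step on an
assumed log-det coding rate of the whole feature set (no heads or subspaces),
and sparsify_step swaps in a standard non-negative ISTA step on a lasso
objective. The function names, the dictionary D, the step sizes, and lam are
hypothetical choices for illustration.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    # Assumed form: 1/2 * logdet(I + alpha * Z Z^T), alpha = d / (n * eps^2).
    d, n = Z.shape
    alpha = d / (n * eps**2)
    _, logdet = np.linalg.slogdet(np.eye(d) + alpha * Z @ Z.T)
    return 0.5 * logdet

def compress_step(Z, step=0.1, eps=0.5):
    # Gradient of the assumed coding rate: alpha * (I + alpha Z Z^T)^{-1} Z.
    d, n = Z.shape
    alpha = d / (n * eps**2)
    grad = alpha * np.linalg.solve(np.eye(d) + alpha * Z @ Z.T, Z)
    return Z - step * grad

def lasso_objective(Z, X, D, lam=0.1):
    # 0.5 * ||X - D Z||_F^2 + lam * ||Z||_1
    return 0.5 * np.sum((X - D @ Z) ** 2) + lam * np.sum(np.abs(Z))

def sparsify_step(Z, X, D, step, lam=0.1):
    # One proximal-gradient (ISTA) step with a non-negativity constraint:
    # gradient step on the quadratic term, then shift by step*lam and clip.
    grad = D.T @ (D @ Z - X)
    return np.maximum(Z - step * grad - step * lam, 0.0)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 16))      # 16 toy "tokens" of dimension 4
D = rng.standard_normal((4, 4))       # hypothetical dictionary
half = compress_step(X)               # compression half-step
step = 1.0 / np.linalg.norm(D, 2) ** 2  # 1 / (spectral norm squared)
codes = sparsify_step(np.zeros_like(half), half, D, step)  # sparsifying half-step
```

In this sketch the compression step lowers the assumed coding rate of the
features, and the sparsifying step returns non-negative codes whose lasso
objective is no larger than that of the all-zero starting point.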


This leads to a family of mathematically interpretable, transformer-like
deep network architectures, which we call CRATE. Experiments
show that these networks, despite their simplicity, indeed learn to
compress and sparsify representations of large-scale real-world image and
text datasets, and achieve performance close to highly engineered
transformer-based models, including ViT and GPT2.


***********************************************************************************************

*Masks are optional in all common areas. Full visitor guidance is
available at ttic.edu/visitors <http://ttic.edu/visitors>.*

***********************************************************************************************

*Research at TTIC Seminar Series*



TTIC is hosting a weekly seminar series presenting the research currently
underway at the Institute. Every week a different TTIC faculty member will
present their research.  The lecture


-- 
*Brandie Jones*
*Executive Administrative Assistant*
Toyota Technological Institute
6045 S. Kenwood Avenue
Chicago, IL  60637
www.ttic.edu