[Theory] TODAY: [TTIC Talks] 2/14 Research at TTIC: Zhiyuan Li, TTIC

Brandie Jones via Theory theory at mailman.cs.uchicago.edu
Fri Feb 14 09:00:00 CST 2025


*When:         *February 14th *at 12:30pm CT  *


*Where:*        Talk will be given *live, in-person* at

                       TTIC, 6045 S. Kenwood Avenue

                        5th Floor, Room 530


*Virtually:*    via Panopto (Livestream <https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=f41ea02d-3f4d-4e87-94a2-b1a901057aae>)



*Who:*           Zhiyuan Li, TTIC



*Title*:           Pencil: Long Thoughts with Short Memory


*Abstract:  *  Recent work (e.g., DeepSeek R1) shows that long CoT (Chain
of Thought) greatly improves the reasoning capability of Large Language Models
(LLMs). However, it also poses significant challenges for memory efficiency,
and consequently time efficiency, during inference, even for problems
solvable with small space. This limitation stems from the non-erasable
nature of standard CoT, which makes the space complexity (context length)
equal to the time complexity (CoT length).


In this talk, we will introduce a new method, PENCIL, to improve the
efficiency of CoT by incorporating a reduction mechanism into the
autoregressive generation process. PENCIL enables the model to actively
discard obsolete tokens by outputting special tokens that trigger the
reduction mechanism. We show PENCIL can perform universal space-efficient
computation, that is, PENCIL can simulate Turing machines with maximal
context length matching its space complexity and total number of generated
tokens matching its time complexity. By effectively reducing the maximal
context length, PENCIL also decreases per-token generation time, enabling
improved scalability compared to standard CoT. This efficiency gain
translates into enhanced performance on complex reasoning tasks, including
97% accuracy on the challenging 5×5 Einstein’s puzzle, using a
25M-parameter transformer with a 2048-token context length.
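
As a rough illustration of the reduction idea described above, here is a
minimal Python sketch. It is not the speaker's implementation. The special
token names ([CALL], [SEP], [RETURN]), the rewrite rule (drop everything
between a call marker and its separator, keep only the answer), and the
fixed token stream standing in for a model are all assumptions made for
this example. It only shows how the peak context length can stay below the
total number of generated tokens once obsolete tokens are erased.

```python
# Hypothetical special tokens; the announcement does not name them.
CALL, SEP, RETURN = "[CALL]", "[SEP]", "[RETURN]"


def reduce_context(context):
    """Apply one reduction if the context ends with RETURN.

    Assumed rule: C [CALL] T [SEP] A [RETURN]  ->  C A
    i.e. the intermediate thoughts T are discarded and the answer A is kept.
    Raises ValueError if RETURN appears without a preceding CALL ... SEP.
    """
    if not context or context[-1] != RETURN:
        return context
    # Most recent separator, then the most recent call marker before it.
    sep = max(i for i, tok in enumerate(context) if tok == SEP)
    call = max(i for i, tok in enumerate(context[:sep]) if tok == CALL)
    return context[:call] + context[sep + 1:-1]


def run(stream):
    """Feed a fixed token stream (a stand-in for model sampling).

    Returns the final context, the peak context length (space used),
    and the total number of tokens generated (time used).
    """
    context, peak, total = [], 0, 0
    for token in stream:
        context.append(token)
        total += 1
        peak = max(peak, len(context))
        context = reduce_context(context)
    return context, peak, total


stream = [
    "Q",
    CALL, "t1", "t2", "t3", SEP, "a", RETURN,
    CALL, "u1", "u2", SEP, "b", RETURN,
]
final, peak, total = run(stream)
print(final, peak, total)  # ['Q', 'a', 'b'] 8 14
```

In this toy run 14 tokens are generated, but the context never holds more
than 8 at once, because each [RETURN] erases the thoughts that preceded it.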



***********************************************************************************************

*Masks are optional in all common areas. Full visitor guidance is
available at ttic.edu/visitors <http://ttic.edu/visitors>.*

***********************************************************************************************

*Research at TTIC Seminar Series*



TTIC is hosting a weekly seminar series presenting the research currently
underway at the Institute. Every week a different TTIC faculty member will
present their research.  The lectures are intended for students
seeking research topics and advisors and for the general TTIC and
University of Chicago communities interested in hearing what their
colleagues are up to.



-- 
*Brandie Jones *
*Executive Administrative Assistant*
Toyota Technological Institute
6045 S. Kenwood Avenue
Chicago, IL  60637
www.ttic.edu