Please remove me from this mailing list. I've tried to unsubscribe multiple times without success.

Thank you,

Lucas Awad (https://www.linkedin.com/in/lucasawad/)
University of Chicago '28
B.A. Economics, Computer Science
(773) 322 2429 | lawad@uchicago.edu

________________________________
From: cs <cs-bounces+lawad=cs.uchicago.edu@mailman.cs.uchicago.edu> on behalf of via cs <cs@mailman.cs.uchicago.edu>
Sent: Thursday, June 12, 2025 7:00 PM
To: cs@cs.uchicago.edu; colloquium@cs.uchicago.edu
Subject: [CS] Zain Sarwar MS Presentation, Jun 26, 2025

This is an announcement of Zain Sarwar's MS Presentation
===============================================
Candidate: Zain Sarwar

Date: Thursday, June 26, 2025

Time: 9 am CDT

Remote Location: https://uchicago.zoom.us/j/94869095059?pwd=DV4gGttLyWg6qZANktVJRZXV60BL1N.1

Title: Continual Pretraining, Dense Backpropagation, and Hierarchical Routing in Mixture of Experts

Abstract: Mixture-of-Experts (MoE) models have emerged as a powerful scaling strategy for large language models (LLMs), enabling sparse activation of parameters to achieve improved compute efficiency and performance. However, their adoption introduces a unique set of challenges across training stability, scaling dynamics, and continual learning. In this thesis, we present three contributions aimed at advancing the robustness and scalability of MoE models.
First, we explore the continual pretraining of MoE transformers, investigating whether sparse routing mechanisms hinder adaptation to new data. Our findings demonstrate that, with appropriate strategies, MoEs maintain their sample efficiency and expert balance across distribution shifts, offering practical alternatives to full model retraining. Second, we address a fundamental limitation of sparse learning: the router's exposure to only partial gradient signals. We introduce DefaultMoE, a method that approximates dense gradients via expert-wise exponential moving averages (EMAs), yielding improved pretraining efficiency without sacrificing sparsity. Third, we propose StructMoE, a hierarchical architecture that augments each expert with multiple low-rank submodules selected via a secondary router. This design introduces dynamic intra-expert routing, enabling structured parameter growth and improved expressivity. Empirical results demonstrate superior performance over standard MoEs.
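
[Editor's note: to make the second contribution concrete, here is a minimal PyTorch sketch of the DefaultMoE idea as described in the abstract. It is an illustration, not the thesis implementation; the class name, expert shape, and hyperparameters (top_k, ema_decay) are all assumptions. The forward pass stays sparse, but the router receives a dense gradient because an EMA of each expert's output stands in for the experts a token never visits.]

# Hedged sketch of the DefaultMoE idea (illustrative names, not thesis code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DefaultMoESketch(nn.Module):
    def __init__(self, d_model, n_experts, top_k=1, ema_decay=0.99):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts))
        self.top_k = top_k
        self.ema_decay = ema_decay
        # Per-expert EMA of that expert's mean output: the "default" answer
        # used for tokens the expert never sees.
        self.register_buffer("ema_out", torch.zeros(n_experts, d_model))

    def forward(self, x):                                # x: (tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)        # (T, E)
        _, top_i = probs.topk(self.top_k, dim=-1)        # sparse assignment

        sparse_out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            hit = (top_i == e).any(dim=-1)               # tokens routed to e
            if hit.any():
                y = expert(x[hit])
                sparse_out[hit] += probs[hit, e].unsqueeze(-1) * y
                with torch.no_grad():                    # refresh the default
                    self.ema_out[e].lerp_(y.mean(dim=0), 1 - self.ema_decay)

        # Dense proxy: every expert "answers" every token through its EMA.
        dense_proxy = probs @ self.ema_out               # (T, d_model)
        # Straight-through-style sum: the forward value equals the sparse
        # output, but the backward pass sees a dense router gradient.
        return sparse_out + dense_proxy - dense_proxy.detach()

The last line is the whole trick in this reading: adding dense_proxy and subtracting its detached copy leaves activations unchanged while routing gradient through every router logit, so no extra expert compute is spent at inference.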
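[Editor's note: likewise, a minimal sketch of a StructMoE-style expert, under the same caveat that every name and dimension here is an assumption. The base expert is augmented with a bank of low-rank submodules, and a secondary router picks one per token; an outer MoE router choosing among such experts would make the routing hierarchical.]

# Hedged sketch of a StructMoE-style expert (illustrative, not thesis code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StructExpertSketch(nn.Module):
    def __init__(self, d_model, n_sub=4, rank=8):
        super().__init__()
        self.base = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                  nn.Linear(4 * d_model, d_model))
        self.sub_router = nn.Linear(d_model, n_sub, bias=False)  # 2nd-level router
        # n_sub low-rank maps d_model -> rank -> d_model; up starts at zero
        # so every submodule begins as a no-op correction.
        self.down = nn.Parameter(torch.randn(n_sub, d_model, rank) * 0.02)
        self.up = nn.Parameter(torch.zeros(n_sub, rank, d_model))

    def forward(self, x):                                 # x: (tokens, d_model)
        gate = F.softmax(self.sub_router(x), dim=-1)      # (T, n_sub)
        top_g, top_s = gate.max(dim=-1)                   # top-1 submodule
        down, up = self.down[top_s], self.up[top_s]       # (T, d, r), (T, r, d)
        delta = torch.bmm(torch.bmm(x.unsqueeze(1), down), up).squeeze(1)
        # Base expert output plus the gated low-rank correction chosen by
        # the intra-expert router.
        return self.base(x) + top_g.unsqueeze(-1) * delta

Structured growth then comes from adding submodules (rows of down/up) rather than whole experts, which grows parameters in rank-sized increments while keeping per-token compute nearly flat.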

Advisor: Michael Maire

Committee Members: Michael Maire, Risi Kondor, and Chenhao Tan

When unsubscribing, use your cnetid@cs.uchicago.edu address if your cnetid@uchicago.edu address does not work.

cs mailing list - cs@mailman.cs.uchicago.edu
Edit Options and/or Unsubscribe: https://mailman.cs.uchicago.edu/mailman/listinfo/cs
More information here: https://howto.cs.uchicago.edu/techstaff:mailinglist