<div dir="ltr"><div dir="ltr"><div><div class="gmail_default" style="font-size:small"><b style="font-family:arial,sans-serif">When</b><span style="font-family:arial,sans-serif">: Wednesday, September 24<font size="1">th</font></span><span style="font-family:arial,sans-serif"> </span><span style="font-family:arial,sans-serif">from</span><b style="font-family:arial,sans-serif"> <span style="background-color:rgb(255,255,0)">2:3</span><span style="background-color:rgb(255,255,0)">0 - 3:30p</span></b><b style="font-family:arial,sans-serif;background-color:rgb(255,255,0)">m CT</b></div><div><div><div class="gmail_default"><div class="gmail_default"><div><b><font face="arial, sans-serif"><br></font></b></div><div><b style="font-family:arial,sans-serif">Virtually</b><span style="font-family:arial,sans-serif">: via </span><a href="https://uchicagogroup.zoom.us/j/95222100221?pwd=Mdam0MtHKq34h0ipKA1bNx3DivTMww.1" style="font-family:arial,sans-serif" target="_blank"><b>Zoom</b></a><span style="font-family:arial,sans-serif"> </span></div><div><font face="arial, sans-serif"> </font></div><div><font face="arial, sans-serif"><b>Who: </b> </font>Shengjie Lin, TTIC</div><div><font face="arial, sans-serif"><br></font></div></div><div class="gmail_default"><div style="border-top:none;border-right:none;border-left:none;border-bottom:2.25pt solid rgb(11,118,159);padding:0in 0in 1pt"></div><div><font face="arial, sans-serif"><b><br></b></font></div><div><div><b style="color:rgb(0,0,0);font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:16px">Title: </b><span style="color:rgb(0,0,0);font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:16px">Scalable 3D Scene Understanding and Embodied Reasoning for Robotics</span><font face="arial, sans-serif"><br></font><div 
style="margin-top:1em;margin-bottom:1em;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)"><b>Abstract: </b><span style="font-size:12pt">For robots to become truly useful partners in our world, they must first build a rich, actionable understanding of it. This thesis introduces a complete framework to make that happen, enabling an agent to perceive its environment, understand how objects function, and act on complex human instructions.</span></div><div style="line-height:19px;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">The work first tackles perception at scale. We present a novel method for building large, high-fidelity 3D maps by seamlessly stitching together multiple independently captured scene representations. This provides the robot with foundational spatial awareness. Building on this static map, we then address object dynamics. From just two sparse observations—like a cabinet door open and closed—our system can reconstruct an articulated object and infer its kinematics, creating a world model that is not just descriptive, but functional.</div><div style="line-height:19px;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)"><br></div><div style="line-height:19px;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">With this rich world model in place, we use a Large Language Model (LLM) as the robot's cognitive engine. This system empowers the agent to robustly ground complex, free-form human commands within its 3D map. 
Crucially, it also maintains an explicit "memory" of the world's state, allowing it to track changes as it acts and successfully perform long-horizon, multi-step tasks.</div><div style="line-height:19px;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)"><br></div><div style="line-height:19px;font-family:Aptos,Aptos_EmbeddedFont,Aptos_MSFontService,Calibri,Helvetica,sans-serif;font-size:12pt;color:rgb(0,0,0)">Together, these contributions form a complete pipeline from pixels to actions, paving the way for more capable and collaborative robots that can operate effectively in human-centric environments.</div></div><div><br></div><div><div class="gmail_default"><span style="font-weight:bold;font-family:arial,sans-serif">Thesis</span><font face="arial, sans-serif" style=""><b> Committee: </b><b>Chair:</b> </font><a href="mailto:mwalter@ttic.edu" target="_blank" style="font-family:arial,sans-serif">Matthew Walter</a><font face="arial, sans-serif" style="">; </font><font face="arial, sans-serif" style=""><b>Members:</b> </font><span style="font-family:arial,sans-serif;color:rgb(0,0,0)">Greg Shakhnarovich, Vitor Guizilini</span></div><div class="gmail_default"><br></div></div></div></div></div></div><br clear="all"></div><div><br></div><br clear="all"></div><div><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. 
Kenwood Avenue, Rm 517</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">773-834-1757</font></i></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div></div>
</div>