<div dir="ltr"><div dir="ltr"><div><div class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(0,0,0)"><b>When: </b>Tuesday, April 29th at <b style="background-color:rgb(255,255,0)">10am CT</b></div><div><div><font face="georgia, serif" color="#000000"><br></font></div><div><font face="georgia, serif" color="#000000"><b>Where:</b> <b> </b><span style="background-color:rgb(255,255,0)">Talk will be given <font style="font-weight:bold"><u>live, in-person</u></font><font style="font-weight:bold"> </font>at<span class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(0,0,0)"></span></span></font></div><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font color="#000000"><font face="georgia, serif"> </font><font face="georgia, serif" style="background-color:rgb(255,255,0)"> </font><font face="georgia, serif" style="background-color:rgb(255,255,0)">TTIC, 6045 S. 
Kenwood Avenue</font></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="georgia, serif" color="#000000"> <span style="background-color:rgb(255,255,0)">5th Floor, <b>Room 5<span class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(0,0,0)">29</span></b></span><b style="background-color:rgb(255,255,0)"> </b><b style=""><span style="background-color:rgb(255,255,255)"> </span><span class="gmail_default" style="font-family:georgia,serif;font-size:small;color:rgb(0,0,0)"><span style="background-color:rgb(255,255,255)"> </span><span style="background-color:rgb(255,255,0)">*Please note the room location has changed*</span></span></b></font></p><p class="MsoNormal" style="margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="georgia, serif" color="#000000"><br></font></p><div><font face="georgia, serif" color="#000000"><b>Who</b>: <span class="gmail_default"></span>G<span class="gmail_default">ene Li</span>, TTIC</font></div><div><font face="georgia, serif" color="#000000"><br></font></div><div><font face="georgia, serif" color="#000000"><b>Title: </b>Agnostic Reinforcement Learning: Foundations and Algorithms</font></div><div><font face="georgia, serif" color="#000000"><br></font></div><div><font face="georgia, serif" color="#000000"><b>Abstract:</b> Reinforcement Learning (RL) has demonstrated tremendous empirical success across numerous challenging domains. However, we lack a strong theoretical understanding of the statistical complexity of RL in environments with large state spaces, where function approximation is required for sample-efficient learning. 
This thesis addresses this gap by rigorously examining the statistical complexity of RL with function approximation from a learning-theoretic perspective. Departing from a long history of prior work, we consider the weakest form of function approximation, called agnostic policy learning, in which the learner seeks to find the best policy in a given class $\Pi$, with no guarantee that $\Pi$ contains an optimal policy for the underlying task.</font></div><font face="georgia, serif" color="#000000"><br>We systematically explore agnostic policy learning along three key axes: environment access---how a learner collects data from the environment; coverage conditions---intrinsic properties of the underlying MDP measuring the expansiveness of state-occupancy measures for policies in the class $\Pi$; and representational conditions---structural assumptions on the class $\Pi$ itself. Within this comprehensive framework, we (1) design new learning algorithms with theoretical guarantees and (2) characterize fundamental performance bounds of any algorithm. 
Our results reveal significant statistical separations that highlight the power and limitations of agnostic policy learning.</font><div><br></div><div><font face="georgia, serif" color="#000000"><b>Thesis Committee:<span class="gmail_default"> </span></b>Nathan Srebro (Thesis Advisor), Avrim Blum, Akshay Krishnamurthy, Cong Ma</font></div></div><div><div class="gmail_default"><br></div><div class="gmail_default"><br></div><div class="gmail_default"><font face="georgia, serif" color="#000000">Thanks, </font></div><font color="#888888"><div class="gmail_default"><font face="georgia, serif" color="#000000">Brandie </font></div></font></div><br clear="all"></div><div><br></div><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature" data-smartmail="gmail_signature"><div dir="ltr"><b style="background-color:rgb(255,255,255)"><font color="#3d85c6">Brandie Jones </font></b><div><div><div><font color="#3d85c6"><b><i>Executive </i></b></font><b style="color:rgb(61,133,198)"><i>Administrative Assistant</i></b></div></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6">Toyota Technological Institute</font></span></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6">6045 S. Kenwood Avenue</font></span></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6">Chicago, IL 60637</font></span></div></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6"><a href="http://www.ttic.edu" target="_blank">www.ttic.edu</a> </font></span></div><div><span style="background-color:rgb(255,255,255)"><font color="#3d85c6"><br></font></span></div></div></div></div>
</div>