<div dir="ltr"><div dir="ltr"><div class="gmail_default"><p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:black">When:</span></b><span style="color:black">
Wednesday, October 25th, 2023 at<b> <u style="background-color:rgb(255,255,0)">10:30 am
CT</u> </b></span></font></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"> </font></span></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:rgb(80,0,80)">Where:
</span></b><span style="color:black">Talk will be given <b><u>live, in-person</u></b></span><b><span style="color:rgb(80,0,80)"> </span></b><span style="color:rgb(80,0,80)">at</span><span style="color:rgb(80,0,80)"></span></font></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><span style="color:rgb(80,0,80)">
</span><span style="color:black"> TTIC,
6045 S. Kenwood Avenue</span><span style="color:rgb(80,0,80)"></span></font></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><span style="color:black">
5th Floor, Room
530<b> </b></span><span style="color:rgb(80,0,80)"></span></font></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"> </font></span></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:rgb(60,64,67);letter-spacing:0.15pt">Virtually:</span></b><span style="color:rgb(60,64,67);letter-spacing:0.15pt">
<i>via</i> Panopto <a href="https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=0147fe27-678e-4a39-b17f-b0a000173db7" target="_blank"><b>livestream</b></a></span></font></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"> </font></span></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><font face="arial, sans-serif"><b><span style="color:rgb(80,0,80)">Who: </span></b><span style="color:rgb(80,0,80)">
</span><span style="color:rgb(60,64,67);letter-spacing:0.15pt">Noah
Golowich, </span>Massachusetts Institute of Technology</font></p>
<p class="MsoNormal" style="margin:0in;line-height:normal"><span style="color:rgb(80,0,80)"><font face="arial, sans-serif"> </font></span></p>
<div class="MsoNormal" align="center" style="text-align:center;margin:0in 0in 8pt;line-height:107%"><span style="line-height:107%"><font face="arial, sans-serif">
<hr size="1" width="100%" align="center">
</font></span></div>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"><b><span style="line-height:107%">Title:</span></b><span style="line-height:107%">
Exploring and Learning in Sparse Linear MDPs without
Computationally Intractable Oracles</span></font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"><b><span style="line-height:107%"><br>
Abstract:</span></b><span style="line-height:107%"> The key assumption underlying
linear Markov Decision Processes (MDPs) is that the learner has access to a
known feature map <i>ϕ</i>(<i>x</i>,<i>a</i>) that maps state-action
pairs to <i>d</i>-dimensional vectors, and that the rewards and
transitions are linear functions in this representation. But where do these
features come from? In the absence of expert domain knowledge, a tempting
strategy is to use the ``kitchen sink" approach and hope that the true
features are included in a much larger set of potential features. In this talk
I will discuss our recent work revisiting linear MDPs from the perspective of
feature selection. In a <i>k</i>-sparse linear MDP, there is an unknown
subset <i>S</i></span><span style="line-height:107%">⊂</span><span style="line-height:107%">[<i>d</i>] of size <i>k</i> containing
all the relevant features, and the goal is to learn a near-optimal policy in
only poly(<i>k</i>,log<i>d</i>) interactions with the environment. Our
main result is the first polynomial-time algorithm for this problem. In
contrast, earlier works either made prohibitively strong assumptions that
obviated the need for exploration, or required solving computationally
intractable optimization problems. As a corollary of our main result, we give
an algorithm for learning a near-optimal policy in block MDPs whose decoding
function is a low-depth decision tree; the algorithm runs in quasi-polynomial
time and takes a polynomial number of samples. This can be seen as a
reinforcement learning analogue of classic results in computational learning
theory. Furthermore, it gives a natural model where improving the sample
complexity via representation learning is computationally feasible. </span></font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><span style="line-height:107%"><font face="arial, sans-serif">Based on joint work with Ankur Moitra
and Dhruv Rohatgi. Paper link: <a href="https://arxiv.org/abs/2309.09457" target="_blank"><span style="color:rgb(5,99,193)">https://arxiv.org/abs/2309.09457</span></a>. </font></span></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"><b><span style="line-height:107%">Bio:</span></b><span style="line-height:107%"> Noah Golowich is a PhD Student at
Massachusetts Institute of Technology, advised by Constantinos Daskalakis and
Ankur Moitra. His research interests are in theoretical machine learning, with
particular focus on the connections between multi-agent learning, game theory,
and online learning, and on the theory of reinforcement learning. He is
supported by a Fannie & John Hertz Foundation Fellowship and an NSF
Graduate Fellowship, and was supported by an MIT Akamai Fellowship.</span></font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><font face="arial, sans-serif"><b><span style="line-height:107%">Host: </span></b><span style="line-height:107%"><a href="mailto:vakilian@ttic.edu" target="_blank"><b>Ali
Vakilian</b></a></span></font></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><span style="line-height:107%"><font face="arial, sans-serif"> </font></span></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><span style="line-height:107%"><font face="arial, sans-serif">**********************************************************************************</font></span></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><span style="line-height:107%"><font face="arial, sans-serif"> The <b><i><span style="color:rgb(61,133,198)">TTIC Young Researcher Seminar Series</span></i> </b>(<a href="http://www.ttic.edu/young-researcher.php" target="_blank">http://www.ttic.edu/young-researcher.php</a>) features
talks by Ph.D. students and postdocs whose research is of broad
interest to the computer science community. The series provides an
opportunity for early-career researchers to present recent work
to and meet with students and faculty at TTIC and nearby universities.</font></span></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><span style="line-height:107%"><font face="arial, sans-serif"> </font></span></p>
<p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><span style="line-height:107%"><font face="arial, sans-serif"> </font></span></p></div><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. Kenwood Avenue, Rm 517</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">773-834-1757</font></i></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Oct 18, 2023 at 8:34 PM Mary Marre <<a href="mailto:mmarre@ttic.edu">mmarre@ttic.edu</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div><div style="font-size:small"><font style="font-family:arial,sans-serif;color:rgb(0,0,0);vertical-align:inherit"><font style="vertical-align:inherit"><b>When:</b> </font></font><font style="font-family:arial,sans-serif;vertical-align:inherit"><font style="vertical-align:inherit"><font style="color:rgb(0,0,0)"> Wednesday, October 25th</font><span class="gmail_default" style="color:rgb(0,0,0)">, 2023</span><font style="color:rgb(0,0,0)"> at</font><b style="color:rgb(0,0,0)"> <u>10:30</u></b><b><u><font color="#000000"> a</font></u></b><b><u><font color="#000000">m CT</font></u><font color="#000000"> </font></b></font></font><br></div><div><p style="font-size:small;color:rgb(80,0,80);font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal;margin:0px"><b style="font-family:arial,sans-serif"><font color="#500050"><br></font></b></p><p style="font-size:small;color:rgb(80,0,80);font-variant-numeric:normal;font-variant-east-asian:normal;font-stretch:normal;line-height:normal;margin:0px"><b style="font-family:arial,sans-serif"><font color="#500050">Where: </font></b><font color="#000000" style="font-family:arial,sans-serif"><span>Talk</span> will be given </font><font color="#000000" style="font-family:arial,sans-serif;font-weight:bold"><u>live, in-person</u></font><font style="font-family:arial,sans-serif;font-weight:bold"> </font><span style="font-family:arial,sans-serif">at</span><br></p><p class="MsoNormal" style="font-size:small;margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif"><font color="#500050"> </font><font color="#000000"> <span>TTIC</span>, 6045 S. Kenwood Avenue</font></font></p><p class="MsoNormal" style="font-size:small;margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif" color="#000000"> 5th Floor, Room 530<b> </b></font></p><p class="MsoNormal" style="font-size:small;margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font face="arial, sans-serif" color="#000000"><b><br></b></font></p><p class="MsoNormal" style="font-size:small;margin:0in;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><b style="color:rgb(60,64,67);font-family:arial,sans-serif;letter-spacing:0.2px">Virtually:</b><span style="font-family:arial,sans-serif;letter-spacing:0.2px"><font color="#3c4043"> <i>via</i> Panopto <a href="https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=0147fe27-678e-4a39-b17f-b0a000173db7" target="_blank"><b>livestream</b></a></font></span><font face="arial, sans-serif" style="color:rgb(80,0,80)"><b><br></b></font></p><p class="MsoNormal" style="font-size:small;margin:0in;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><br></p><p class="MsoNormal" style="font-size:small;margin:0in 0in 0.0001pt;line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><font style="font-family:arial,sans-serif;vertical-align:inherit"><font style="vertical-align:inherit"><b style="color:rgb(80,0,80)">Who: </b><font color="#500050" style="color:rgb(80,0,80)"> </font><font style="letter-spacing:0.2px" color="#3c4043">Noah Golowich, </font></font></font>Massachusetts Institute of Technology</p><p class="MsoNormal" style="font-size:small;margin:0in 0in 0.0001pt;color:rgb(80,0,80);line-height:normal;background-image:initial;background-position:initial;background-size:initial;background-repeat:initial;background-origin:initial;background-clip:initial"><span style="color:rgb(34,34,34)"><br></span></p><div class="MsoNormal" align="center" style="font-size:small;margin:0in 0in 8pt;color:rgb(80,0,80);text-align:center;line-height:15.6933px"><hr size="2" width="100%" align="center"></div><div><span id="m_4670829903810771367m_-3766314914305244934m_261722529860461756m_-6208928311547511757m_-4775192176943369966m_-5987187406432406688m_4134175150425154507m_-9092562243106946680m_4054032379435658787m_-9136831640199693879m_159455563385457554m_6781907569219318678m_2917498327190496145m_6658573597948212504m_6522692326762214755m_4171898695171381519m_-7622701669742356547m_-6967680496673197251m_8731442605182037865m_-6294525540622960617gmail-docs-internal-guid-fa76653a-7fff-1406-a32b-0d9b384fe884" style="color:rgb(0,0,0)"><font face="arial, sans-serif"><div><div dir="ltr"><div class="gmail_quote"><div dir="ltr"><div><div class="gmail_quote"><div><div><b>Title:</b> <span class="gmail_default"></span>Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles</div></div></div></div></div></div></div></div><p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><b><br>Abstract:</b> The key assumption underlying linear
Markov Decision Processes (MDPs) is that the learner has access to a known
feature map <i>ϕ</i>(<i>x</i>,<i>a</i>) that maps state-action pairs
to <i>d</i>-dimensional vectors, and that the rewards and transitions are
linear functions in this representation. But where do these features come from?
In the absence of expert domain knowledge, a tempting strategy is to use the
``kitchen sink" approach and hope that the true features are included in a
much larger set of potential features. In this talk I will discuss our recent
work revisiting linear MDPs from the perspective of feature selection. In
a <i>k</i>-sparse linear MDP, there is an unknown subset <i>S</i>⊂[<i>d</i>] of
size <i>k</i> containing all the relevant features, and the goal is
to learn a near-optimal policy in only poly(<i>k</i>,log<i>d</i>) interactions
with the environment. Our main result is the first polynomial-time algorithm
for this problem. In contrast, earlier works either made prohibitively strong
assumptions that obviated the need for exploration, or required solving computationally
intractable optimization problems. As a corollary of our main result, we give
an algorithm for learning a near-optimal policy in block MDPs whose decoding
function is a low-depth decision tree; the algorithm runs in quasi-polynomial
time and takes a polynomial number of samples. This can be seen as a
reinforcement learning analogue of classic results in computational learning
theory. Furthermore, it gives a natural model where improving the sample
complexity via representation learning is computationally feasible. </p><p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%">Based on
joint work with Ankur Moitra and Dhruv Rohatgi. Paper link: <a href="https://arxiv.org/abs/2309.09457" style="color:rgb(5,99,193)" target="_blank">https://arxiv.org/abs/2309.09457</a>. </p><p class="MsoNormal" style="margin:0in 0in 8pt;line-height:107%"><b>Bio:</b> Noah Golowich is a PhD Student at
Massachusetts Institute of Technology, advised by Constantinos Daskalakis and
Ankur Moitra. His research interests are in theoretical machine learning, with
particular focus on the connections between multi-agent learning, game theory,
and online learning, and on the theory of reinforcement learning. He is
supported by a Fannie & John Hertz Foundation Fellowship and an NSF
Graduate Fellowship, and was supported by an MIT Akamai Fellowship.</p><div style="color:rgb(34,34,34);font-family:Arial,Helvetica,sans-serif"><b style="color:rgb(0,0,0);font-family:arial,sans-serif">Host: </b><a href="mailto:vakilian@ttic.edu" style="font-family:arial,sans-serif" target="_blank"><b>Ali Vakilian</b></a><br></div></font></span></div></div><div style="font-size:small"><br></div><div style="font-size:small">**********************************************************************************</div><div><div style="font-size:small"><span style="font-family:georgia,serif"> </span><span style="font-family:arial,sans-serif">The </span><b style="font-family:arial,sans-serif"><font color="#3d85c6"><i>TTIC Young Researcher Seminar Series</i></font> </b><span style="font-family:arial,sans-serif">(</span><a href="http://www.ttic.edu/young-researcher.php" style="font-family:arial,sans-serif" target="_blank">http://www.ttic.edu/young-researcher.php</a><span style="font-family:arial,sans-serif">)</span><span style="font-family:arial,sans-serif"> features talks by Ph.D. students and postdocs whose </span><span style="font-family:arial,sans-serif">research</span><span style="font-family:arial,sans-serif"> is of broad interest to the computer science community. The series provides an opportunity for early-career </span><span style="font-family:arial,sans-serif">researchers</span><span style="font-family:arial,sans-serif"> </span><span style="font-family:arial,sans-serif">to present recent work to and meet with students and faculty at TTIC and nearby universities.</span></div></div><div style="font-size:small"><br></div><div style="font-size:small"><br></div><div style="font-size:small"><br></div><div style="font-size:small"><br></div></div><div><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><span style="font-family:arial,helvetica,sans-serif;font-size:x-small">Mary C. Marre</span><br></div><div><div><font face="arial, helvetica, sans-serif" size="1">Faculty Administrative Support</font></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1"><b>Toyota Technological Institute</b></font></i></div><div><i><font face="arial, helvetica, sans-serif" color="#3d85c6" size="1">6045 S. Kenwood Avenue, Rm 517</font></i></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">Chicago, IL 60637</font></i><br></font></div><div><font size="1"><i><font face="arial, helvetica, sans-serif" color="#3d85c6">773-834-1757</font></i></font></div><div><b><i><a href="mailto:mmarre@ttic.edu" target="_blank"><font face="arial, helvetica, sans-serif" size="1">mmarre@ttic.edu</font></a></i></b></div></div></div></div></div></div>
</blockquote></div></div>