[Colloquium] REMINDER: 10/25 Young Researcher Seminar Series: Noah Golowich, MIT

Mary Marre mmarre at ttic.edu
Tue Oct 24 15:44:29 CDT 2023


*When:*        Wednesday, October 25th, 2023 at* 10:30 am CT   *



*Where:       *Talk will be given *live, in-person* at

                   TTIC, 6045 S. Kenwood Avenue

                   5th Floor, Room 530



*Virtually:*   *via* Panopto *livestream*
<https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=0147fe27-678e-4a39-b17f-b0a000173db7>



*Who: *         Noah Golowich, Massachusetts Institute of Technology


------------------------------

*Title:*          Exploring and Learning in Sparse Linear MDPs without
Computationally Intractable Oracles


* Abstract:* The key assumption underlying linear Markov Decision Processes
(MDPs) is that the learner has access to a known feature map *ϕ*(*x*,*a*) that
maps state-action pairs to *d*-dimensional vectors, and that the rewards
and transitions are linear functions in this representation. But where do
these features come from? In the absence of expert domain knowledge, a
tempting strategy is to use the ``kitchen sink" approach and hope that the
true features are included in a much larger set of potential features. In
this talk I will discuss our recent work revisiting linear MDPs from the
perspective of feature selection. In a *k*-sparse linear MDP, there is an
unknown subset *S*⊂[*d*] of size *k* containing all the relevant features,
and the goal is to learn a near-optimal policy in only
poly(*k*,log*d*) interactions
with the environment. Our main result is the first polynomial-time
algorithm for this problem. In contrast, earlier works either made
prohibitively strong assumptions that obviated the need for exploration, or
required solving computationally intractable optimization problems. As a
corollary of our main result, we give an algorithm for learning a
near-optimal policy in block MDPs whose decoding function is a low-depth
decision tree; the algorithm runs in quasi-polynomial time and takes a
polynomial number of samples. This can be seen as a reinforcement learning
analogue of classic results in computational learning theory. Furthermore,
it gives a natural model where improving the sample complexity via
representation learning is computationally feasible.

Based on joint work with Ankur Moitra and Dhruv Rohatgi. Paper link:
https://arxiv.org/abs/2309.09457.

*Bio:* Noah Golowich is a PhD Student at Massachusetts Institute of
Technology, advised by Constantinos Daskalakis and Ankur Moitra. His
research interests are in theoretical machine learning, with particular
focus on the connections between multi-agent learning, game theory, and
online learning, and on the theory of reinforcement learning. He is
supported by a Fannie & John Hertz Foundation Fellowship and an NSF
Graduate Fellowship, and was supported by an MIT Akamai Fellowship.

*Host: **Ali Vakilian* <vakilian at ttic.edu>



**********************************************************************************

 The *TTIC Young Researcher Seminar Series *(
http://www.ttic.edu/young-researcher.php) features talks by Ph.D. students
and postdocs whose research is of broad interest to the computer science
community. The series provides an opportunity
for early-career researchers to present recent work to and meet with
students and faculty at TTIC and nearby universities.




Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue, Rm 517*
*Chicago, IL  60637*
*773-834-1757*
*mmarre at ttic.edu <mmarre at ttic.edu>*


On Wed, Oct 18, 2023 at 8:34 PM Mary Marre <mmarre at ttic.edu> wrote:

> *When:*        Wednesday, October 25th, 2023 at* 10:30** a**m CT   *
>
>
> *Where:       *Talk will be given *live, in-person* at
>
>                    TTIC, 6045 S. Kenwood Avenue
>
>                    5th Floor, Room 530
>
>
> *Virtually:*   *via* Panopto *livestream*
> <https://uchicago.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=0147fe27-678e-4a39-b17f-b0a000173db7>
>
>
> *Who: *         Noah Golowich, Massachusetts Institute of Technology
>
>
> ------------------------------
> *Title:*          Exploring and Learning in Sparse Linear MDPs without
> Computationally Intractable Oracles
>
>
> *Abstract:* The key assumption underlying linear Markov Decision
> Processes (MDPs) is that the learner has access to a known feature map *ϕ*
> (*x*,*a*) that maps state-action pairs to *d*-dimensional vectors, and
> that the rewards and transitions are linear functions in this
> representation. But where do these features come from? In the absence of
> expert domain knowledge, a tempting strategy is to use the ``kitchen sink"
> approach and hope that the true features are included in a much larger set
> of potential features. In this talk I will discuss our recent work
> revisiting linear MDPs from the perspective of feature selection. In a *k*-sparse
> linear MDP, there is an unknown subset *S*⊂[*d*] of size *k* containing
> all the relevant features, and the goal is to learn a near-optimal policy
> in only poly(*k*,log*d*) interactions with the environment. Our main
> result is the first polynomial-time algorithm for this problem. In
> contrast, earlier works either made prohibitively strong assumptions that
> obviated the need for exploration, or required solving computationally
> intractable optimization problems. As a corollary of our main result, we
> give an algorithm for learning a near-optimal policy in block MDPs whose
> decoding function is a low-depth decision tree; the algorithm runs in
> quasi-polynomial time and takes a polynomial number of samples. This can be
> seen as a reinforcement learning analogue of classic results in
> computational learning theory. Furthermore, it gives a natural model where
> improving the sample complexity via representation learning is
> computationally feasible.
>
> Based on joint work with Ankur Moitra and Dhruv Rohatgi. Paper link:
> https://arxiv.org/abs/2309.09457.
>
> *Bio:* Noah Golowich is a PhD Student at Massachusetts Institute of
> Technology, advised by Constantinos Daskalakis and Ankur Moitra. His
> research interests are in theoretical machine learning, with particular
> focus on the connections between multi-agent learning, game theory, and
> online learning, and on the theory of reinforcement learning. He is
> supported by a Fannie & John Hertz Foundation Fellowship and an NSF
> Graduate Fellowship, and was supported by an MIT Akamai Fellowship.
> *Host: **Ali Vakilian* <vakilian at ttic.edu>
>
>
> **********************************************************************************
>  The *TTIC Young Researcher Seminar Series *(
> http://www.ttic.edu/young-researcher.php) features talks by Ph.D.
> students and postdocs whose research is of broad interest to the computer
> science community. The series provides an opportunity for early-career
> researchers to present recent work to and meet with students and faculty
> at TTIC and nearby universities.
>
>
>
>
> Mary C. Marre
> Faculty Administrative Support
> *Toyota Technological Institute*
> *6045 S. Kenwood Avenue, Rm 517*
> *Chicago, IL  60637*
> *773-834-1757*
> *mmarre at ttic.edu <mmarre at ttic.edu>*
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.cs.uchicago.edu/pipermail/colloquium/attachments/20231024/37375a22/attachment-0001.html>


More information about the Colloquium mailing list