[Colloquium] 9/7 Thesis Defense: Falcon Dai, TTIC

Mary Marre mmarre at ttic.edu
Tue Aug 30 16:07:30 CDT 2022


*Thesis Defense: Falcon Dai, TTIC*

When: Wednesday, September 7th from *12:30 - 2:30 pm CT*

Virtually: *Join Virtually Here*
<https://uchicago.zoom.us/j/98534120153?pwd=SmRDMFo1UTA1M3pNZEZOblhkWG9yQT09>

Who: Falcon Dai, TTIC


Thesis Title: On Reward Structures of Markov Decision Processes

*Abstract*:
A Markov decision process can be parameterized by a transition kernel and a
reward function. Both play essential roles in the study of reinforcement
learning, as evidenced by their presence in the Bellman equations. In my
inquiry into various kinds of "costs" associated with reinforcement learning,
inspired by the demands of robotic applications, I discovered that rewards
prove central to understanding the structure of a Markov decision process
and reward-centric notions can elucidate important concepts in
reinforcement learning.

Specifically, I studied the sample complexity of policy evaluation and
developed a novel estimator with an instance-specific error bound of
$\widetilde{O}(\sqrt{\nicefrac{\tau_s}{n}})$ for estimating a single state
value. Under the online regret minimization setting, I refined the
transition-based MDP constant, diameter, into a reward-based constant,
maximum expected hitting cost, and with it, provided a theoretical
explanation for how a well-known technique, potential-based reward shaping,
could accelerate learning with expert knowledge. In an attempt to study
safe reinforcement learning, I modeled hazardous environments with
irrecoverability and proposed a quantitative notion of safe learning via
reset efficiency. In this setting, I modified a classic algorithm to
account for resets, achieving promising preliminary numerical results.
Lastly, for MDPs with multiple reward functions, I developed a
computationally efficient planning algorithm that finds Pareto optimal
stochastic policies.

*Thesis Advisor:* Matthew Walter <mwalter at ttic.edu>




Mary C. Marre
Faculty Administrative Support
*Toyota Technological Institute*
*6045 S. Kenwood Avenue*
*Chicago, IL  60637*
*mmarre at ttic.edu*