Talk by John Moody, Oregon Graduate Institute, 9 February

Margery Ishmael marge at cs.uchicago.edu
Thu Feb 1 09:22:02 CST 2001


Friday, 9 February at 2:30 pm in Ryerson 251

Learning to Trade via Direct Reinforcement

                 John Moody
                 Computational Finance Program
                 Department of Computer Science & Engineering
                 Oregon Graduate Institute

Abstract:

I present new methods for optimizing portfolios, asset allocations
and trading systems.  In this approach, investment decision making is
viewed as a stochastic control problem, and strategies are discovered
via reinforcement learning.

Reinforcement learning methods were developed in the machine learning,
neural networks and control engineering communities.  These methods optimize
sequences of interdependent decisions, and can be used as adaptive algorithms
for solving stochastic control problems.

The approach we propose is Direct Reinforcement, whereby the control policy
is learned directly.  This differs from dynamic programming and from
reinforcement learning algorithms such as TD-learning and Q-learning, which
attempt to estimate a value function for the control problem.  I present an
adaptive algorithm called Recurrent Reinforcement Learning (RRL) that has its
roots in stochastic approximation, adaptive control and neural computing.
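The flavor of such a policy-learning approach can be conveyed with a small sketch. This is not Moody's RRL algorithm itself: RRL uses an exact recurrent gradient, whereas this toy version uses finite differences and a best-so-far rule to stay short; the lookback length, cost level and feature choice are all assumptions for illustration.

```python
import numpy as np

def rrl_returns(theta, prices, delta=0.001):
    """Cumulative trading return of a simple recurrent trader (sketch).

    Position F_t = tanh(w . x_t + u * F_{t-1} + b), where x_t holds the
    last few price changes and the previous position feeds back in
    (hence "recurrent"); delta is an assumed proportional transaction cost.
    """
    r = np.diff(prices)                      # one-period price changes
    m = 5                                    # lookback window (assumed)
    w, u, b = theta[:m], theta[m], theta[m + 1]
    F_prev, total = 0.0, 0.0
    for t in range(m, len(r)):
        x = r[t - m:t]                       # recent changes as features
        F = np.tanh(w @ x + u * F_prev + b)  # position in [-1, 1]
        # profit from holding F_prev over period t, minus cost of trading
        total += F_prev * r[t] - delta * abs(F - F_prev)
        F_prev = F
    return total

def train(prices, steps=50, lr=0.05, eps=1e-4, seed=0):
    """Ascend the profit criterion directly in policy-parameter space.

    Numerical gradients replace RRL's exact recurrent gradient to keep
    the sketch compact; the best parameters seen are returned.
    """
    theta = np.random.default_rng(seed).normal(scale=0.1, size=7)
    best, best_val = theta.copy(), rrl_returns(theta, prices)
    for _ in range(steps):
        grad = np.zeros_like(theta)
        for i in range(len(theta)):
            d = np.zeros_like(theta)
            d[i] = eps
            grad[i] = (rrl_returns(theta + d, prices)
                       - rrl_returns(theta - d, prices)) / (2 * eps)
        theta = theta + lr * grad / (np.linalg.norm(grad) + 1e-12)
        val = rrl_returns(theta, prices)
        if val > best_val:
            best, best_val = theta.copy(), val
    return best
```

Note that no value function is ever estimated: the gradient of the performance criterion is taken with respect to the policy parameters themselves.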

While value function methods have proven effective for certain problems in
computer games and robotics, I will argue that policy based methods are more
natural for some domains.  Direct Reinforcement can enable a simpler problem
representation, avoid Bellman's curse of dimensionality, and offer compelling
advantages in efficiency.

I will demonstrate how Direct Reinforcement can be used to optimize investment
performance criteria such as profit, economic utility or risk-adjusted returns.
The effects of transaction costs and market impact can be included in the
optimizations.  The strategies that the learned traders develop depend upon
the choice of performance measure and the level of transaction costs.
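To make the role of the performance criterion concrete, here is a minimal sketch of two such objectives evaluated on a position sequence, net of a proportional transaction cost. The cost model and function names are assumptions for illustration, not the forms used in the talk.

```python
import numpy as np

def trading_returns(positions, price_changes, cost=0.001):
    """Per-period returns of a position sequence, net of trading costs.

    positions[t] is the holding over period t (in [-1, 1]); `cost` is a
    proportional charge on each change of position, a deliberately
    simplified stand-in for transaction costs and market impact.
    """
    pos = np.asarray(positions, dtype=float)
    dp = np.asarray(price_changes, dtype=float)
    trades = np.abs(np.diff(pos, prepend=0.0))  # size of each position change
    return pos * dp - cost * trades

def profit(R):
    """Total-profit criterion."""
    return R.sum()

def sharpe(R):
    """A simple risk-adjusted criterion: mean return over its volatility."""
    return R.mean() / (R.std() + 1e-12)
```

A buy-and-hold position pays the cost once, while a strategy that flips position every period is charged on every switch, so raising `cost` pushes the optimal policy toward less frequent trading; likewise, optimizing `sharpe` rather than `profit` favors steadier return streams.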

In extensive simulation work, we find that Direct Reinforcement produces
better trading strategies than systems utilizing Q-Learning (a value function
method) or trading based on forecasts.  Real-world applications include a
monthly asset allocation system and an intra-daily currency trader.
----
Speaker Biography:

John Moody is the Director of the Computational Finance program and a
Professor of Computer Science at the Oregon Graduate Institute. His research
interests include computational finance, time series analysis and machine
learning algorithms.  He served as Program Co-Chair for Computational Finance
2000 in London, and previously served as Program Chair and General Chair of the
Neural Information Processing Systems conferences.  Moody has authored over 50
scientific papers in the fields of finance, forecasting, machine learning,
neural computation and physics. Prior to joining the Oregon Graduate
Institute, he held positions at the Institute for Theoretical Physics in
Santa Barbara
and in Computer Science and Neuroscience at Yale University.  Moody received
his B.A. in Physics from the University of Chicago, and earned his Ph.D. in
Theoretical Physics at Princeton University.

(The talk will be followed by refreshments in Ryerson 255)
