[Colloquium] Yishay Mansour talk today 2pm at TTI
Meridel Trimble
mtrimble at tti-c.org
Wed Aug 11 10:40:06 CDT 2004
TOYOTA TECHNOLOGICAL INSTITUTE TALK
Speaker: Yishay Mansour, Tel Aviv University
Title: From external to internal regret
Time: Today, Wednesday, August 11th, 2pm
Place: TTI-C (1427 E. 60th St., 2nd Floor)
Speaker's Homepage: http://www.math.tau.ac.il/~mansour/
Abstract: External regret compares the performance of an online algorithm,
selecting among N actions, to the performance of the best of those actions
in hindsight. Internal regret compares the loss of an online algorithm to
the loss of a modified online algorithm, which consistently replaces one
action by another.
In this paper, we give a simple generic reduction that, given an algorithm
for the external regret problem, converts it to an efficient online
algorithm for the internal regret problem. We provide methods that work
both in the full information model, in which the loss of every action is
observed at each time step, and the partial information (bandit) model,
where at each time step only the loss of the selected action is observed.
The importance of internal regret in game theory is due to the fact that in
a general game, if each player has sublinear internal regret, then the
empirical frequencies converges to a correlated equilibrium. Our internal
regret bounds guarantee that if in a general game all the players follow
our algorithm, the empirical frequencies converge to an epsilon-correlated
equilibrium. For external regret we also derive a quantitative regret bound
for a very general setting of regret, which includes an arbitrary set of
modification rules (that possibly modify the online algorithm) and an
arbitrary set of time selection functions (each giving different weight to
each time step). The regret for a given time selection and modification
rule is the difference between the cost of the online algorithm and the
cost of the modified online algorithm, where the costs are weighted by the
time selection function. This can be viewed as a generalization of the
previously-studied sleeping experts setting. Finally, we give a
deterministic algorithm for Boolean prediction in the presence of time
selection functions,whose number of weighted errors, with respect to any
time selection function, is at most three times the optimum.
This is joint work with Avrim Blum at CMU.
If you have questions, or would like to meet the speaker, please contact
Meridel at 4-9873 or mtrimble at tti-c.org
More information about the Colloquium
mailing list