[Colloquium] Yishay Mansour talk today 2pm at TTI

Wed Aug 11 10:40:06 CDT 2004

TOYOTA TECHNOLOGICAL INSTITUTE TALK 

Speaker: Yishay Mansour, Tel Aviv University

Title: From external to internal regret

Time: Today, Wednesday, August 11th, 2pm 
Place: TTI-C (1427 E. 60th St., 2nd Floor) 

Speaker's Homepage: http://www.math.tau.ac.il/~mansour/

Abstract: External regret compares the performance of an online algorithm, 
selecting among N actions, to the performance of the best of those actions 
in hindsight. Internal regret compares the loss of an online algorithm to 
the loss of a modified online algorithm, which consistently replaces one 
action by another. 

In this paper, we give a simple generic reduction that, given an algorithm 
for the external regret problem, converts it to an efficient online 
algorithm for the internal regret problem. We provide methods that work 
both in the full information model, in which the loss of every action is 
observed at each time step, and the partial information (bandit) model, 
where at each time step only the loss of the selected action is observed. 
The importance of internal regret in game theory is due to the fact that in 
a general game, if each player has sublinear internal regret, then the 
empirical frequencies converges to a correlated equilibrium. Our internal 
regret bounds guarantee that if in a general game all the players follow 
our algorithm, the empirical frequencies converge to an epsilon-correlated 
equilibrium. For external regret we also derive a quantitative regret bound 
for a very general setting of regret, which includes an arbitrary set of 
modification rules (that possibly modify the online algorithm) and an 
arbitrary set of time selection functions (each giving different weight to 
each time step). The regret for a given time selection and modification 
rule is the difference between the cost of the online algorithm and the 
cost of the modified online algorithm, where the costs are weighted by the 
time selection function. This can be viewed as a generalization of the 
previously-studied sleeping experts setting. Finally, we give a 
deterministic algorithm for Boolean prediction in the presence of time 
selection functions,whose number of weighted errors, with respect to any 
time selection function, is at most three times the optimum. 

This is joint work with Avrim Blum at CMU. 

If you have questions, or would like to meet the speaker, please contact 
Meridel at 4-9873 or mtrimble at tti-c.org