Negatively Correlated Bandits

(by Sven Rady, with Nicolas Klein)

Review of Economic Studies 78(2), 2011, pp. 693-732

Earlier versions of this paper were circulated as:
Munich Department of Economics Discussion Paper No. 2008-16
SFB/Transregio 15 Discussion Paper No. 243
CEPR Discussion Paper No. 6983

We analyze a two-player game of strategic experimentation with two-armed bandits. Either player has to decide in continuous time whether to use a safe arm with a known payoff or a risky arm whose expected payoff per unit of time is initially unknown. This payoff can be high or low, and is negatively correlated across players. We characterize the set of all Markov perfect equilibria in the benchmark case where the risky arms are known to be of opposite type, and construct equilibria in cutoff strategies for arbitrary negative correlation. All strategies and payoffs are in closed form. In marked contrast to the case where both risky arms are of the same type, there always exists an equilibrium in cutoff strategies, and there always exists an equilibrium exhibiting efficient long-run patterns of learning. These results extend to a three-player game with common knowledge that exactly one risky arm is of the high payoff type.

Keywords: Strategic Experimentation, Two-Armed Bandit, Exponential Distribution, Poisson Process, Bayesian Learning, Markov Perfect Equilibrium
JEL Classification: C73, D83, O32

