Nonconvergence to saddle boundary points under perturbed reinforcement learning

Authors

Summary, in English

For several reinforcement learning models in strategic-form games, convergence to action profiles that are not Nash equilibria may occur with positive probability under certain conditions on the payoff function. In this paper, we explore how an alternative reinforcement learning model, in which each agent's strategy is perturbed by a strategy-dependent perturbation (or mutation) function, may exclude convergence to non-Nash pure strategy profiles. This approach extends prior analysis of reinforcement learning in games that addresses convergence to saddle boundary points. It further provides a framework in which the effect of mutations can be analyzed in the context of reinforcement learning.
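As a rough illustration of the idea in the summary, the following is a minimal sketch of a perturbed reinforcement learning loop for a two-player game. It is not taken from the paper: the linear reward-inaction update, the uniform mixing in `perturbed_strategy`, the threshold rule in `perturbation_level`, and all parameter values are assumptions chosen only to show what a strategy-dependent perturbation could look like. Payoffs are assumed to lie in [0, 1] so that each update stays on the probability simplex.

```python
import random


def perturbed_strategy(x, lam):
    """Mix strategy x with the uniform distribution using weight lam (assumed form)."""
    n = len(x)
    return [(1.0 - lam) * p + lam / n for p in x]


def perturbation_level(x, threshold=0.9, lam_max=0.05):
    """Hypothetical strategy-dependent perturbation: active only near a pure strategy."""
    return lam_max if max(x) > threshold else 0.0


def reinforce(x, action, reward, step):
    """Linear reward-inaction update: shift x toward the played action.

    With reward in [0, 1] and step in (0, 1), the result is a convex
    combination of x and a vertex, so it remains a probability vector.
    """
    return [p + step * reward * ((1.0 if k == action else 0.0) - p)
            for k, p in enumerate(x)]


def simulate(payoffs, n_steps=2000, step=0.05, seed=0):
    """Run the sketch; payoffs[i][a0][a1] is player i's reward in [0, 1]."""
    rng = random.Random(seed)
    n = [len(payoffs[0]), len(payoffs[0][0])]
    x = [[1.0 / n[i]] * n[i] for i in range(2)]  # start from uniform strategies
    for _ in range(n_steps):
        actions = []
        for i in range(2):
            # Each agent samples its action from the perturbed strategy.
            sigma = perturbed_strategy(x[i], perturbation_level(x[i]))
            actions.append(rng.choices(range(n[i]), weights=sigma)[0])
        for i in range(2):
            reward = payoffs[i][actions[0]][actions[1]]
            x[i] = reinforce(x[i], actions[i], reward, step)
    return x


if __name__ == "__main__":
    # Illustrative 2x2 coordination game with made-up payoffs in [0, 1].
    game = [
        [[1.0, 0.1], [0.1, 0.6]],  # player 0
        [[1.0, 0.1], [0.1, 0.6]],  # player 1
    ]
    print(simulate(game))
```

In this sketch the perturbation is zero away from the boundary of the simplex and switches on only when a strategy is close to a pure action, so an agent that drifts toward a pure profile keeps occasionally sampling its other actions.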

Publication year

2015

Language

English

Pages

667-699

Publication/Journal/Series

International Journal of Game Theory

Volume

44

Issue

3

Document type

Journal article

Publisher

Springer

Subject

  • Control Engineering

Keywords

  • Learning in games
  • Reinforcement learning
  • Replicator dynamics

Status

Published

ISBN/ISSN/Other

  • ISSN: 1432-1270