Multi-armed Bandit Problem

In computer science, one of the most popular problems in the realm of the explore-exploit trade-off is the multi-armed bandit problem. The term derives from the casino slot machine nicknamed the “one-armed bandit”. As the name suggests, the problem involves multiple slot machines and the question of how to win as much as possible across them. [1]

The real problem, however, is estimating the odds of each slot machine well enough to know when to keep playing it and when to move on to the next one.
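One simple way to make the keep-playing-or-move-on decision concrete is an epsilon-greedy strategy: most of the time play the machine with the best observed payout so far, and occasionally try a random machine to refine the estimates. The sketch below is only an illustration under stated assumptions, not a method taken from the sources cited here. The function name `epsilon_greedy`, the parameter `epsilon`, and the payout probabilities are all hypothetical, and each machine is assumed to pay 1 with a fixed probability and 0 otherwise.

```python
import random


def epsilon_greedy(payout_probs, pulls=10_000, epsilon=0.1, seed=0):
    """Play `pulls` rounds on simulated slot machines (hypothetical setup).

    payout_probs: assumed true win probability of each machine (hidden from the player).
    epsilon: fraction of rounds spent exploring a random machine.
    Returns (estimates, counts, total_reward).
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    counts = [0] * n_arms        # how often each machine was played
    estimates = [0.0] * n_arms   # observed average payout per machine
    total_reward = 0

    for _ in range(pulls):
        if rng.random() < epsilon:
            # Explore: try a machine at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: play the machine that looks best so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Simulated pull: win 1 with the machine's hidden probability.
        reward = 1 if rng.random() < payout_probs[arm] else 0

        # Incremental update of the running average for this machine.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return estimates, counts, total_reward


if __name__ == "__main__":
    estimates, counts, total = epsilon_greedy([0.2, 0.5, 0.8])
    print("estimated odds:", [round(e, 2) for e in estimates])
    print("pulls per machine:", counts)
    print("total winnings:", total)
```

With a small `epsilon`, most pulls should end up on the machine with the best odds, while the occasional random pull keeps the estimates for the other machines from going stale. Larger values explore more and exploit less.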

Multi-Armed Bandits: A Cartoon Introduction - DCBA #12

  1. Algorithms to Live By by Brian Christian and Tom Griffiths - Explore/Exploit
  2. Multi-Armed Bandits: A Cartoon Introduction - DCBA, https://www.youtube.com/watch?v=bkw6hWvh_3k&ab_channel=AcademicGamer