Available games

(no marker): thoroughly tested. In many cases, we verified against known values and/or reproduced results from papers.

~: implemented but lightly tested.

X: known issues (see code for details).

Status	Game
	Backgammon
~	Battleship
~	Blackjack
	Breakthrough
	Bridge
	(Uncontested) Bridge bidding
~	Catch
~	Cliff Walking
~	Clobber
~	Coin Game
	Connect Four
~	Cooperative Box-Pushing
	Chess
~	Deep Sea
	First-price Sealed-Bid Auction
	Gin Rummy
	Go
	Goofspiel
	Hanabi
	Havannah
~	Hearts
~	Hex
	Kuhn poker
~	Laser Tag
	Leduc poker
~	Lewis Signaling
	Liar's Dice
~	Markov Soccer
	Matching Pennies (Three-player)
	Negotiation
X	Oh Hell
	Oshi-Zumo
	Oware
	Pentago
~	Phantom Tic-Tac-Toe
	Pig
~	Poker (Hold 'em)
	Quoridor
~	Sheriff
~	Slovenian Tarok
~	Skat (simplified bidding)
~	Solitaire (K+)
	Tic-Tac-Toe
	Tiny Bridge
	Tiny Hanabi
	Trade Comm
	Y

Details

Backgammon

Players move their pieces through the board based on the rolls of dice.
Idiosyncratic format.
Traditional game.
Non-deterministic.
Perfect information.
2 players.
Wikipedia

Battleship

Players place ships and shoot at each other in turns.
Pieces on a board.
Traditional game.
Deterministic.
Imperfect information.
2 players.
Good for correlated equilibria.
Farina et al. '19, Correlation in Extensive-Form Games: Saddle-Point Formulation and Benchmarks. Based on the original game (Wikipedia).

Blackjack

Simplified version of blackjack, with only HIT/STAND moves.
Traditional game.
Non-deterministic.
Imperfect information.
1 player.
Wikipedia

Breakthrough

Simplified chess using only pawns.
Pieces on a grid.
Modern game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Bridge

A card game where players compete in pairs.
Card game.
Traditional game.
Non-deterministic.
Imperfect information.
4 players.
Wikipedia

(Uncontested) Bridge bidding

Two partners bid to reach a contract, with no bidding from the opponents.
Card game.
Research game.
Non-deterministic.
Imperfect information.
2 players.
Wikipedia

Catch

Agent must move horizontally to 'catch' a descending ball. Designed to test basic learning.
Agent on a grid.
Research game.
Non-deterministic.
Perfect information.
1 player.
Mnih et al. 2014, Recurrent Models of Visual Attention,
Osband et al. '19, Behaviour Suite for Reinforcement Learning, Appendix A

Cliff Walking

Agent must find goal without falling off a cliff. Designed to demonstrate exploration-with-danger.
Agent on a grid.
Research game.
Deterministic.
Perfect information.
1 player.
Sutton et al. '18, page 132

Clobber

Simplified checkers, where tokens can capture neighbouring tokens. Designed to be amenable to combinatorial analysis.
Pieces on a grid.
Research game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Coin Game

Agents must collect their own and their collaborator's tokens while avoiding a third kind of token. Designed to test inference of a collaborator's intentions.
Agents on a grid.
Research game.
Non-deterministic.
Perfect, incomplete information.
2 players.
Raileanu et al. '18, Modeling Others using Oneself in Multi-Agent Reinforcement Learning

Connect Four

Players drop tokens into columns to try and form a pattern.
Tokens on a grid.
Traditional game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Cooperative Box-Pushing

Agents must collaborate to push a box into the goal. Designed to test collaboration.
Agents on a grid.
Research game.
Deterministic.
Perfect information.
2 players.
Seuken & Zilberstein '12, Improved Memory-Bounded Dynamic Programming for Decentralized POMDPs

Chess

Players move pieces around the board with the goal of checkmating the opposing king.
Pieces on a grid.
Traditional game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Deep Sea

Agent must explore to find reward (first version) or penalty (second version). Designed to test exploration.
Agent on a grid.
Research game.
Deterministic.
Perfect information.
1 player.
Osband et al. '17, Deep Exploration via Randomized Value Functions

First-price Sealed-Bid Auction

Agents submit bids simultaneously; the highest bidder wins and pays their own bid.
Idiosyncratic format.
Research game.
Non-deterministic.
Imperfect, incomplete information.
2-10 players.
Wikipedia
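The resolution rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the library's implementation: the function name `first_price_auction`, the payoff convention (winner receives valuation minus bid, everyone else receives zero), and the uniformly random tie-break are all assumptions made for the example.

```python
import random
from typing import List, Optional, Sequence, Tuple


def first_price_auction(
    valuations: Sequence[float],
    bids: Sequence[float],
    rng: Optional[random.Random] = None,
) -> Tuple[int, List[float]]:
    """Resolve one sealed-bid round and return (winner_index, payoffs).

    Assumptions (not taken from the source): ties are broken uniformly
    at random, and a losing bidder's payoff is zero.
    """
    if not bids or len(valuations) != len(bids):
        raise ValueError("need one valuation per bid, and at least one bid")
    rng = rng or random.Random()

    # The highest bid wins; collect every bidder who submitted it.
    high = max(bids)
    tied = [i for i, b in enumerate(bids) if b == high]
    winner = rng.choice(tied)

    # First-price rule: the winner pays exactly what they bid.
    payoffs = [0.0] * len(bids)
    payoffs[winner] = float(valuations[winner] - bids[winner])
    return winner, payoffs
```

For example, `first_price_auction([10.0, 8.0], [6.0, 7.0])` returns `(1, [0.0, 1.0])`: the second bidder wins with a bid of 7 against a valuation of 8.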

Gin Rummy

Players score points by forming specific sets with the cards in their hands.
Card game.
Traditional game.
Non-deterministic.
Imperfect information.
2 players.
Wikipedia

Go

Players place tokens on the board with the goal of encircling territory.
Tokens on a grid.
Traditional game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Goofspiel

Players bid with their cards to win other cards.
Card game.
Traditional game.
Non-deterministic.
Imperfect information.
2-10 players.
Wikipedia

Hanabi

Players can see only the other players' pieces, and everyone must cooperate to win.
Idiosyncratic format.
Modern game.
Non-deterministic.
Imperfect information.
2-5 players.
Wikipedia and Bard et al. '19, The Hanabi Challenge: A New Frontier for AI Research
Implemented via Hanabi Learning Environment

Havannah

Players add tokens to a hex grid to try and form a winning structure.
Tokens on a hex grid.
Modern game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Hearts

A card game where players try to avoid playing the highest card in each round.
Card game.
Traditional game.
Non-deterministic.
Imperfect information.
3-6 players.
Wikipedia

Hex

Players add tokens to a hex grid to try and link opposite sides of the board.
Uses tokens on a hex grid.
Modern game.
Deterministic.
Perfect information.
2 players.
Wikipedia
Hex, the full story by Ryan Hayward and Bjarne Toft

Kuhn poker

Simplified poker amenable to game-theoretic analysis.
Cards with bidding.
Research game.
Non-deterministic.
Imperfect information.
2 players.
Wikipedia

Laser Tag

Agents see a local part of the grid, and attempt to tag each other with beams.
Agents on a grid.
Research game.
Non-deterministic.
Imperfect information.
2 players.
Leibo et al. '17, Lanctot et al. '17

Leduc poker

Simplified poker amenable to game-theoretic analysis.
Cards with bidding.
Research game.
Non-deterministic.
Imperfect information.
2 players.
Southey et al. '05, Bayes’ bluff: Opponent modelling in poker

Lewis Signaling

Receiver must choose an action dependent on the sender's hidden state. Designed to demonstrate the use of conventions.
Idiosyncratic format.
Research game.
Non-deterministic.
Imperfect information.
2 players.
Wikipedia

Liar's Dice

Players bid and bluff on the state of all the dice together, given only the state of their dice.
Dice with bidding.
Traditional game.
Non-deterministic.
Imperfect information.
2 players.
Wikipedia

Markov Soccer

Agents must take the ball to their goal, and can 'tackle' the opponent by predicting their next move.
Agents on a grid.
Research game.
Non-deterministic.
Imperfect information.
2 players.
Littman '94, Markov games as a framework for multi-agent reinforcement learning,
He et al. '16, Opponent Modeling in Deep Reinforcement Learning

Matching Pennies (Three-player)

Players must predict and match/oppose another player. Designed to have an unstable Nash equilibrium.
Idiosyncratic format.
Research game.
Deterministic.
Imperfect information.
3 players.
"Three problems in learning mixed-strategy Nash equilibria"

Negotiation

Agents with different utilities must negotiate an allocation of resources.
Idiosyncratic format.
Research game.
Non-deterministic.
Imperfect information.
2 players.
Lewis et al. '17, Cao et al. '18

Oh Hell

A card game where players try to win exactly a declared number of tricks.
Card game.
Traditional game.
Non-deterministic.
Imperfect information.
3-7 players.
Wikipedia

Oshi-Zumo

Players must repeatedly bid to push a token off the other side of the board.
Idiosyncratic format.
Traditional game.
Deterministic.
Imperfect information.
2 players.
Buro, 2004. Solving the oshi-zumo game
Bosansky et al. '16, Algorithms for Computing Strategies in Two-Player Simultaneous Move Games

Oware

Players redistribute tokens from their half of the board to capture tokens in the opponent's part of the board.
Idiosyncratic format.
Traditional game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Pentago

Players place tokens on the board, then rotate part of the board to a new orientation.
Uses tokens on a grid.
Modern game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Phantom Tic-Tac-Toe

Tic-tac-toe, except the opponent's tokens are hidden. Designed as a simple, imperfect-information game.
Uses tokens on a grid.
Research game.
Deterministic.
Imperfect information.
2 players.
Auger '11, Multiple Tree for Partially Observable Monte-Carlo Tree Search,
Lisy '14, Alternative Selection Functions for Information Set Monte Carlo Tree Search,
Lanctot '13

Pig

Each player rolls a die repeatedly until they roll a 1, which forfeits the turn's total, or they 'hold', which adds the turn's total to their score.
Dice game.
Traditional game.
Non-deterministic.
Perfect information.
2-10 players.
Wikipedia
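A single turn of Pig can be sketched as below, assuming the standard rule that rolling a 1 forfeits the turn's total. The function name `play_turn` and the "hold once the turn total reaches `hold_at`" policy are hypothetical choices made for illustration; they are not part of the library. The die is passed in as a callable so the turn can be replayed with fixed rolls.

```python
import random
from typing import Callable


def play_turn(
    hold_at: int,
    roll_die: Callable[[], int] = lambda: random.randint(1, 6),
) -> int:
    """Play one turn and return the points banked (0 if a 1 was rolled).

    `hold_at` is a hypothetical fixed policy: keep rolling until the
    turn total reaches this threshold, then hold.
    """
    turn_total = 0
    while turn_total < hold_at:
        roll = roll_die()
        if roll == 1:
            # Rolling a 1 ends the turn and forfeits the turn total.
            return 0
        turn_total += roll
    # Holding banks the accumulated turn total.
    return turn_total
```

With the fixed rolls 3, 4, 6 and `hold_at=10`, the turn totals run 3, 7, 13, so the player holds and banks 13; with the rolls 2, 1 the turn ends with nothing banked.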

Poker (Hold 'em)

Players bet on whether their hand of cards plus some communal cards will form a special set.
Cards with bidding.
Traditional game.
Non-deterministic.
Imperfect information.
2-10 players.
Wikipedia
Implemented via ACPC.

Quoridor

Each turn, players can either move their agent or add a small wall to the board.
Idiosyncratic format.
Modern game.
Deterministic.
Perfect information.
2-4 players.
Wikipedia

Sheriff

Bargaining game.
Deterministic.
Imperfect information.
2 players.
Good for correlated equilibria.
Farina et al. '19, Correlation in Extensive-Form Games: Saddle-Point Formulation and Benchmarks.
Based on the board game "Sheriff of Nottingham" (BoardGameGeek).

Slovenian Tarok

Trick-based card game with bidding.
Traditional game.
Non-deterministic.
Imperfect information.
3-4 players.
Wikipedia
Luštrek et al. 2003, A program for playing Tarok

Skat (simplified bidding)

Each turn, players bid to compete against the other two players.
Cards with bidding.
Traditional game.
Non-deterministic.
Imperfect information.
3 players.
Wikipedia

Solitaire (K+)

A single-player card game.
Card game.
Traditional game.
Non-deterministic.
Imperfect information.
1 player.
Wikipedia and Bjarnason et al. '07, Searching solitaire in real time

Tic-Tac-Toe

Players place tokens to try and form a pattern.
Uses tokens on a grid.
Traditional game.
Deterministic.
Perfect information.
2 players.
Wikipedia

Tiny Bridge

Simplified Bridge with fewer cards and tricks.
Cards with bidding.
Research game.
Non-deterministic.
Imperfect information.
2 or 4 players.
See implementation for details.

Tiny Hanabi

Simplified Hanabi with just two turns.
Idiosyncratic format.
Research game.
Non-deterministic.
Imperfect information.
2-10 players.
Foerster et al. 2018, Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning

Trade Comm

Players with different utilities and items communicate and then trade.
Idiosyncratic format.
Research game.
Non-deterministic.
Imperfect information.
2 players.
A simple emergent communication game based on trading.

Y

Players place tokens to try and connect sides of a triangular board.
Tokens on a hex grid.
Modern game.
Deterministic.
Perfect information.
2 players.
Wikipedia
