Playing Othello (Reversi) by Reinforcement Learning

Introduction

This is a simple application that learns to play Othello by reinforcement learning.

The policy is evaluated with TD(0), the one-step temporal-difference method (Sutton & Barto, 1998).
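As a rough illustration of the TD(0) update, and not the code in tdl.py, here is a minimal sketch. It assumes a tabular value function stored in a dict; the names `td0_update`, `alpha` (step size) and `gamma` (discount factor) are hypothetical and chosen only for this example.

```python
def td0_update(values, state, reward, next_state,
               alpha=0.1, gamma=1.0, terminal=False):
    """Apply one TD(0) update to values[state] and return the TD error.

    Hypothetical helper for illustration: `values` maps a hashable state
    to its estimated value, and unseen states default to 0.0.
    """
    v_s = values.get(state, 0.0)
    # A terminal successor has value 0 by definition.
    v_next = 0.0 if terminal else values.get(next_state, 0.0)
    # TD error: one-step bootstrapped target minus the current estimate.
    td_error = reward + gamma * v_next - v_s
    values[state] = v_s + alpha * td_error
    return td_error


# Two updates along a toy trajectory s0 -> s1 -> (end, reward 1).
values = {}
td0_update(values, "s1", 1.0, None, alpha=0.5, terminal=True)
td0_update(values, "s0", 0.0, "s1", alpha=0.5)
```

In self-play for a game such as Othello, the reward is typically zero until the game ends, so the final result propagates backwards one state at a time over repeated games.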

The value function approximator is based on the n-tuple network described in Jaśkowski (2014).
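The general idea of an n-tuple network can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation in value.py: the board is assumed to be a flat list of 64 cells encoded as 0 (empty), 1 (black) and 2 (white), the class name `NTupleNetwork` and the tuple layouts are hypothetical, and refinements discussed in the paper are omitted.

```python
class NTupleNetwork:
    """Sum of lookup tables, one table per tuple of board cells."""

    def __init__(self, tuples):
        # Each tuple lists the board indices it reads.
        self.tuples = [tuple(t) for t in tuples]
        # Every cell has 3 possible states, so a tuple of n cells
        # needs a table with 3**n weights.
        self.weights = [[0.0] * (3 ** len(t)) for t in self.tuples]

    def _index(self, board, cells):
        # Read the cell states as the digits of a base-3 number.
        idx = 0
        for c in cells:
            idx = idx * 3 + board[c]
        return idx

    def value(self, board):
        # The position value is the sum of one weight per tuple.
        return sum(w[self._index(board, t)]
                   for t, w in zip(self.tuples, self.weights))

    def update(self, board, delta):
        # Add delta to every weight that is active for this board.
        for t, w in zip(self.tuples, self.weights):
            w[self._index(board, t)] += delta


# Example board with four discs in the centre (encoding is an assumption).
board = [0] * 64
board[27], board[28], board[35], board[36] = 2, 1, 1, 2

net = NTupleNetwork([(0, 1, 8, 9), (27, 28, 35, 36)])
net.update(board, 0.5)
```

Because the value is linear in the active weights, a TD error scaled by the step size can be passed directly as `delta` to adjust the estimate for the current position.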

Run python tdl.py to learn a policy by self-play.

Edit config/config.ini to set up the players, then run python run.py to play Othello from the command line.

Or you can try the simple web app:

Jaśkowski, W. (2014). Systematic n-tuple networks for Othello position evaluation. ICGA Journal, 37(2), 85–96.
Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press.
