This is a simple application that learns to play Othello by reinforcement learning.
TD(0) is used to evaluate a policy.
The value function is approximated by the n-tuple network described in Jaśkowski's paper (2014).
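As a rough illustration of how TD(0) and an n-tuple network fit together, here is a minimal sketch. It is not the code in `tdl.py`: the class name `NTupleNetwork`, the board encoding (0 = empty, 1 = black, 2 = white, flattened to 64 cells), the example tuples, and the default `alpha` and `gamma` are all assumptions made for this example, and it leaves out board symmetries.

```python
import numpy as np

N_STATES = 3  # assumed cell encoding: 0 = empty, 1 = black, 2 = white


class NTupleNetwork:
    """Hypothetical linear value function: one lookup table per tuple of cells."""

    def __init__(self, tuples):
        # Each tuple is a list of board indices (0..63) read together as one feature.
        self.tuples = [np.asarray(t) for t in tuples]
        # One weight per possible configuration of the tuple's cells.
        self.weights = [np.zeros(N_STATES ** len(t)) for t in self.tuples]
        # Powers of 3 turn a tuple's cell contents into a base-3 table index.
        self.powers = [N_STATES ** np.arange(len(t)) for t in self.tuples]

    def _indices(self, board):
        return [int(board[t] @ p) for t, p in zip(self.tuples, self.powers)]

    def value(self, board):
        # The state value is the sum of the weights selected by each tuple.
        return sum(w[i] for w, i in zip(self.weights, self._indices(board)))

    def td0_update(self, board, reward, next_board, terminal,
                   alpha=0.01, gamma=1.0):
        # TD(0) target: reward plus the discounted value of the next state,
        # or just the reward if the game has ended.
        target = reward if terminal else reward + gamma * self.value(next_board)
        delta = target - self.value(board)
        # The model is linear, so each active weight moves by alpha * delta.
        for w, i in zip(self.weights, self._indices(board)):
            w[i] += alpha * delta
        return delta


if __name__ == "__main__":
    net = NTupleNetwork([[0, 1, 2], [0, 8, 16]])  # two arbitrary 3-tuples
    board = np.zeros(64, dtype=int)
    board[0], board[1] = 1, 2
    net.td0_update(board, reward=1.0, next_board=board, terminal=True, alpha=0.1)
    print(net.value(board))
```

Each call to `td0_update` nudges only the weights that the current position selects, which keeps one learning step cheap no matter how large the lookup tables are.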
Run `python tdl.py` to learn a policy by self-play.
Edit `config/config.ini` to set up the players, then run `python run.py` to play Othello on the command line.
Or you can try the simple web app:
- Run `npm install && npm run build` in `web/ui`.
- Install `gevent` and `flask`: `pip install gevent flask`
- Run `python run_server.py`
- Open http://localhost:44399/othello and play!
- Jaśkowski, Wojciech (2014). Systematic n-tuple networks for othello position evaluation. ICGA Journal, 37(2), 85–96.
- Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: an introduction. Cambridge: MIT Press.