IaGo is an Othello AI using an SL (supervised learning) policy network, a value network, and MCTS (Monte Carlo tree search), inspired by AlphaGo.
Short description in English:
IaGo: an Othello AI inspired by AlphaGo
Description in Japanese (Qiita articles):
Building an Othello AI modeled on AlphaGo (1): SL policy network - Qiita
Building an Othello AI modeled on AlphaGo (2): RL policy network - Qiita
Building an Othello AI modeled on AlphaGo (3): Value network - Qiita
Building an Othello AI modeled on AlphaGo (4): Monte Carlo tree search - Qiita
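For orientation, here is a minimal sketch in Chainer of what an SL policy network for the 8x8 Othello board can look like; the input encoding and layer sizes are illustrative assumptions, not IaGo's actual architecture.

# Sketch of an SL policy network (layer sizes are assumptions, not the IaGo model).
# Input: a 2-plane 8x8 board encoding (own stones / opponent stones).
# Output: one logit per square (64 possible moves).
import chainer
import chainer.functions as F
import chainer.links as L

class SLPolicySketch(chainer.Chain):
    def __init__(self):
        super(SLPolicySketch, self).__init__()
        with self.init_scope():
            self.conv1 = L.Convolution2D(2, 64, ksize=3, pad=1)
            self.conv2 = L.Convolution2D(64, 64, ksize=3, pad=1)
            self.fc = L.Linear(64 * 8 * 8, 64)

    def __call__(self, x):
        h = F.relu(self.conv1(x))
        h = F.relu(self.conv2(h))
        return self.fc(h)  # apply F.softmax to turn logits into move probabilities

Calling F.softmax(model(x)) on a float32 batch shaped (N, 2, 8, 8) gives a move distribution over the 64 squares.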
- Install Chainer
$ pip install chainer
- Download this repository
$ git clone git@github.com:shionhonda/IaGo.git
- Move to the IaGo directory and execute game.py
$ python game.py
You can set the following option:
--auto=False (or --a=False)
If this is set to True, autoplay begins between SLPolicy and PV-MCTS; if it is False (default), the game is played between you and PV-MCTS. The thinking time is 10 seconds.
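As a rough illustration (an assumption about how such a flag can be handled, not necessarily how game.py implements it), a string-valued --auto option could be parsed like this:

# Hypothetical option parsing for --auto / --a; the actual game.py may differ.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--auto", "--a", default="False",
                    help="'True' for autoplay between SLPolicy and PV-MCTS")
args = parser.parse_args()
autoplay = (args.auto == "True")  # compare strings: bool("False") would be True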
- When placing a stone, input two numbers separated by a comma. For example:
4,3
The first number corresponds to the vertical position and the second to the horizontal (one-origin).
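For illustration, the input above can be mapped to array indices like this (a hypothetical helper, not necessarily the code in game.py):

# Turn an input such as "4,3" into zero-based (row, column) indices.
def parse_move(text):
    vertical, horizontal = (int(s) for s in text.split(","))  # one-origin
    return vertical - 1, horizontal - 1                       # zero-origin

row, col = parse_move("4,3")  # -> (3, 2)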
- Download data from http://meipuru-344.hatenablog.com/entry/2017/11/27/205448
- Save it as "IaGo/data/data.txt"
- Augment data
$ python load.py
You need at least 32MB RAM to complete this step.
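The augmentation presumably exploits the symmetry of the board; here is a sketch of that idea (an assumption about what load.py does, not its actual code):

# Symmetry-based augmentation sketch: each (board, move) pair yields 8 variants
# via rotations and mirror images of the square board.
import numpy as np

def augment(board, move_plane):
    """board, move_plane: 8x8 numpy arrays; the move is one-hot in move_plane."""
    for k in range(4):
        b = np.rot90(board, k)
        m = np.rot90(move_plane, k)
        yield b, m
        yield np.fliplr(b), np.fliplr(m)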
- Execute train_policy.py to train the SL policy network.
$ python train_policy.py --policy=sl --epoch=10 --gpu=0
You need GPUs to complete this step. It will take about 12 hours.
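Conceptually, each update in this step treats the expert move as a 64-class classification target; a minimal sketch of one such update (names and the dummy batch are placeholders, not train_policy.py itself):

# One supervised update on a dummy minibatch, using the SLPolicySketch model
# shown earlier in this README. Real training iterates this over the dataset.
import numpy as np
import chainer.functions as F
from chainer import optimizers

model = SLPolicySketch()
optimizer = optimizers.Adam()
optimizer.setup(model)

boards = np.zeros((32, 2, 8, 8), dtype=np.float32)  # placeholder board batch
moves = np.zeros(32, dtype=np.int32)                # expert move index, 0..63

loss = F.softmax_cross_entropy(model(boards), moves)
model.cleargrads()
loss.backward()
optimizer.update()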
- Execute train_policy.py to train the rollout policy.
$ python train_policy.py --policy=rollout --epoch=1 --gpu=0
This is fast.
- Execute train_rl.py to reinforce the SL policy network with REINFORCE (a kind of policy gradient method).
$ python train_rl.py --set=10000
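The REINFORCE idea, in brief: moves played in a self-play game are made more likely if the game was won and less likely if it was lost. A sketch of the per-game loss (an illustration of the algorithm, not the code in train_rl.py):

# REINFORCE loss for one self-play game: -z * sum_t log pi(a_t | s_t),
# where z is +1 for a win and -1 for a loss. Variable names are placeholders.
import chainer.functions as F

def reinforce_loss(model, states, moves, z):
    """states: (T, 2, 8, 8) float32; moves: (T,) int32; z: +1.0 or -1.0."""
    log_p = F.log_softmax(model(states))   # (T, 64) log move probabilities
    chosen = F.select_item(log_p, moves)   # log pi(a_t | s_t) for the played moves
    return -z * F.sum(chosen)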
- Execute train_value.py to train the value network.
$ python train_value.py --epoch=20 --gpu=0
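The value network is trained as a regression: predict the final outcome of the game from a position. A minimal sketch of such a network and its loss (an assumed architecture, not IaGo's actual value network):

# Value network sketch: board in, predicted outcome in [-1, 1] out.
import chainer
import chainer.functions as F
import chainer.links as L

class ValueSketch(chainer.Chain):
    def __init__(self):
        super(ValueSketch, self).__init__()
        with self.init_scope():
            self.conv = L.Convolution2D(2, 64, ksize=3, pad=1)
            self.fc1 = L.Linear(64 * 8 * 8, 256)
            self.fc2 = L.Linear(256, 1)

    def __call__(self, x):
        h = F.relu(self.conv(x))
        h = F.relu(self.fc1(h))
        return F.tanh(self.fc2(h))

# Training minimizes F.mean_squared_error(model(boards), outcomes), where
# outcomes holds the observed game result for each board position.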
- Training done!
Special thanks to:
@Rochester-NRT for the replication of AlphaGo (especially MCTS).
@lazmond3 for giving lots of feedback!