Welcome to the Othello AI project! This project implements an AI to play the game of Othello (also known as Reversi) with a graphical user interface (GUI). The AI supports various difficulty levels and different algorithms including MinMax and Monte Carlo Tree Search (MCTS).
- The main screen is where you choose to start a game or exit.
- The players screen is where each player selects the algorithm to use.
- The difficulty screen is where you set the difficulty of the chosen algorithm.
- The game screen is where the game is played and the Othello board is displayed. The board is an 8x8 grid on which the game pieces are placed. From this screen you can:
- Pause the game
- Resume the game
- Restart the game
- Go back to the Main Menu
- MinMax Algorithm:
Overview
- A classic AI technique for perfect information games like Othello, where both players have complete knowledge of the game state.
- It works by recursively building a game tree, exploring all possible future moves for both players.
- At each node in the tree, a score is assigned based on how favorable the position is for the AI player (usually the number of discs flipped).
- The algorithm then chooses the move that leads to the highest score for the AI, assuming the opponent plays optimally (see the sketch after the arguments table below).
Strengths
- Easy to implement.
- Provides the best move for a given search depth: within that depth it plays optimally against the evaluation function, and it can find globally optimal moves when the chosen depth is deep enough to reach terminal states.
Weaknesses
- Computationally expensive for deep searches.
- Doesn't consider opponent's potential mistakes.
Arguments
| Argument | Description |
| --- | --- |
| --depth | The number of turns (plies) considered for future moves, impacting the balance between speed and strategic evaluation. Default: 2 |
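For illustration only, here is a minimal MinMax sketch in Python. The `game` object and its `legal_moves`, `apply_move`, `is_terminal`, and `evaluate` helpers are hypothetical placeholders (not this project's actual API), and pass moves are ignored:

```python
def minimax(state, depth, maximizing, game):
    """Return (score, best_move) for `state`, searching `depth` plies ahead.

    `game` is a hypothetical helper object exposing legal_moves(state),
    apply_move(state, move), is_terminal(state) and evaluate(state).
    """
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state), None

    best_move = None
    if maximizing:                      # AI to move: pick the highest-scoring option
        best_score = float("-inf")
        for move in game.legal_moves(state):
            score, _ = minimax(game.apply_move(state, move), depth - 1, False, game)
            if score > best_score:
                best_score, best_move = score, move
    else:                               # opponent to move: assume the best reply
        best_score = float("inf")
        for move in game.legal_moves(state):
            score, _ = minimax(game.apply_move(state, move), depth - 1, True, game)
            if score < best_score:
                best_score, best_move = score, move
    return best_score, best_move
```

With the default depth of 2, the AI considers one move of its own plus one opponent reply before evaluating the resulting positions.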
- MinMax Alpha-Beta Pruning:
Overview
- An enhancement of the MinMax algorithm that improves efficiency.
- It utilizes alpha and beta values to prune unnecessary branches in the game tree.
- Alpha represents the best score the maximizing (AI) player can already guarantee, while beta represents the best score the minimizing opponent can guarantee.
- While evaluating a node, if a move's score reaches or exceeds beta, the rest of that branch can be skipped: the opponent already has a better alternative elsewhere and would never allow this position.
- Similarly, whenever a move's score beats the current alpha, alpha is updated; this tightens the search window and lets later branches that cannot improve on it be cut off (see the pruning step in the sketch after the arguments table below).
Strengths
- Significantly faster than vanilla MinMax for complex games.
Weaknesses
- Still limited by search depth; pruning speeds up the search but does not see beyond the horizon of the chosen depth.
Arguments
| Argument | Description |
| --- | --- |
| --depth | The number of turns (plies) considered for future moves, impacting the balance between speed and strategic evaluation. Default: 2 |
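Building on the MinMax sketch above, this shows where the pruning happens (same hypothetical `game` helpers; a sketch, not the project's implementation):

```python
def alphabeta(state, depth, alpha, beta, maximizing, game):
    """MinMax with alpha-beta pruning over the same hypothetical `game` helpers."""
    if depth == 0 or game.is_terminal(state):
        return game.evaluate(state), None

    best_move = None
    if maximizing:
        best_score = float("-inf")
        for move in game.legal_moves(state):
            score, _ = alphabeta(game.apply_move(state, move), depth - 1, alpha, beta, False, game)
            if score > best_score:
                best_score, best_move = score, move
            alpha = max(alpha, best_score)
            if alpha >= beta:   # the opponent already has a better option elsewhere,
                break           # so the remaining moves in this branch are pruned
    else:
        best_score = float("inf")
        for move in game.legal_moves(state):
            score, _ = alphabeta(game.apply_move(state, move), depth - 1, alpha, beta, True, game)
            if score < best_score:
                best_score, best_move = score, move
            beta = min(beta, best_score)
            if alpha >= beta:
                break
    return best_score, best_move
```

The search starts with the widest possible window, e.g. `alphabeta(state, 2, float("-inf"), float("inf"), True, game)`, and returns the same move as plain MinMax while visiting fewer nodes.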
- Monte Carlo Tree Search (MCTS):
Overview
- A powerful algorithm that utilizes simulations to explore the game tree.
- It starts by building a tree of possible moves, then iteratively selects branches based on a balance between exploitation (choosing proven good moves) and exploration (trying new possibilities).
- At each selected node, the AI plays out a simulated game from that point, using random moves for both players.
- The results of these simulations are used to update the tree, favoring nodes that lead to better outcomes for the AI in the simulations.
- Over time, MCTS converges towards the best move by focusing on promising branches while still exploring new possibilities (see the sketch after the arguments table below).
Strengths
- Efficiently explores the game tree.
- Adapts to opponent's playing style.
Weaknesses
- Requires more computation than MinMax for a single move.
- Might not guarantee optimal play.
Arguments
| Argument | Description |
| --- | --- |
| --search | The number of simulations used for searching and exploring the game space during inference. Default: 1000 |
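A compact sketch of the four MCTS phases described above. The `game` helpers and `result_for_ai` are hypothetical placeholders, and Othello-specific details (pass moves, flipping the reward between players) are glossed over, so treat this as an illustration rather than the project's implementation:

```python
import math
import random

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children = []
        self.visits = 0
        self.wins = 0.0

def uct(node, c=1.41):
    # Unvisited children are tried first; otherwise balance the average reward
    # (exploitation) against an exploration bonus for rarely visited nodes.
    if node.visits == 0:
        return float("inf")
    return node.wins / node.visits + c * math.sqrt(math.log(node.parent.visits) / node.visits)

def mcts(root_state, game, n_search=1000):
    root = Node(root_state)
    for _ in range(n_search):
        # 1. Selection: descend by always taking the child with the best UCT score.
        node = root
        while node.children:
            node = max(node.children, key=uct)
        # 2. Expansion: create children for the legal moves at the selected leaf.
        if not game.is_terminal(node.state):
            for move in game.legal_moves(node.state):
                node.children.append(Node(game.apply_move(node.state, move), parent=node, move=move))
            if node.children:
                node = random.choice(node.children)
        # 3. Simulation: play random moves from here until the game ends.
        state = node.state
        while not game.is_terminal(state):
            state = game.apply_move(state, random.choice(game.legal_moves(state)))
        reward = game.result_for_ai(state)  # e.g. 1.0 for an AI win, 0.0 otherwise (hypothetical)
        # 4. Backpropagation: update statistics along the selected path.
        while node is not None:
            node.visits += 1
            node.wins += reward
            node = node.parent
    # Play the most-visited move at the root.
    return max(root.children, key=lambda n: n.visits).move
```

In this sketch the `--search` argument corresponds to the number of loop iterations: more simulations give better estimates at the cost of more computation per move.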
- Actor-Critic (Reinforcement Learning):
Overview
- An approach from Reinforcement Learning (RL) where two neural networks work together.
- The Actor network predicts the best move for the AI in a given state.
- The Critic network evaluates the value of the state-action pair, indicating how good it was for the AI to take a specific move in a particular situation.
- Through training, both networks learn from the rewards received during gameplay (e.g., winning, losing, or intermediate rewards for disc advantage).
- The Critic's feedback helps the Actor improve its move selection over time (a minimal network sketch follows the arguments table below).
Strengths
- Adapts to different strategies.
- Learns from experience, can be very strong with sufficient training data.
Weaknesses
- Requires significant training time and data.
- Might not be interpretable in terms of specific decision-making logic.
Arguments
| Argument | Description |
| --- | --- |
| --search | The number of simulations used for searching and exploring the game space during inference. Default: 1000 |
| --NH | Number of hidden layers in the neural network architecture. Default: 64 |
| --NB | Number of residual blocks in the neural network architecture. Default: 4 |
| --BatchSize | Batch size used during training. Default: 64 |
| --epochs | Number of epochs for training. Default: 5 |
| --self_play | Number of games the model plays against itself before training. Default: 500 |
| --iterations | Number of iterations the model plays against itself before training. Default: 5 |
| --lr | Learning rate used during training. Default: 0.001 |
| --model_file | Path to the saved model file. Default: "./Othello-AI/RL/SavedModels/model.pt" |
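For intuition, here is a minimal actor-critic network in PyTorch with a shared body and separate policy (actor) and value (critic) heads. The input encoding, layer sizes, and class name are assumptions made for this sketch (residual blocks are omitted for brevity); see RL/README.md for the project's actual architecture:

```python
import torch
import torch.nn as nn

class ActorCriticSketch(nn.Module):
    """Illustrative actor-critic model for an 8x8 Othello board (not the project's real network)."""

    def __init__(self, n_hidden=64):
        super().__init__()
        # Shared body: the flattened 8x8 board passes through two fully connected layers.
        self.body = nn.Sequential(
            nn.Linear(8 * 8, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_hidden), nn.ReLU(),
        )
        self.actor = nn.Linear(n_hidden, 8 * 8)   # policy head: one logit per board square
        self.critic = nn.Linear(n_hidden, 1)      # value head: scalar estimate of the state

    def forward(self, board):
        x = self.body(board.flatten(start_dim=1))
        policy_logits = self.actor(x)
        value = torch.tanh(self.critic(x))        # squashed to [-1, 1]: loss .. win
        return policy_logits, value

# Example: a batch containing one empty board.
# policy_logits, value = ActorCriticSketch()(torch.zeros(1, 8, 8))
```

During training, the critic's value estimate is compared against the game outcome, and the actor's policy is pushed towards moves that led to good outcomes, which is the feedback loop described above.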
- Clone the repository:
git clone https://github.com/WaleedAlzamil80/Othello-AI.git
- Run the following command:
python Othello-AI/generate_eps.py --self_play 100
If it takes too long, reduce the number of games (`--self_play`).
- The results will be saved in the `results` folder, from which you can download the three NumPy arrays corresponding to (State, Policy, Value).
python training_eps.py --data_path ./path/to/custom_data --BatchSize 32 --lr 0.0001 --epochs 10
python run.py --BatchSize 32 --lr 0.0001 --epochs 10 --self_play 1000 --search 1500 --iterations 10
For more detailed documentation, see MCTs/README.md and RL/README.md.