This project was developed for the Artificial Intelligence course at FEUP. It solves a simplified version of the game Lines of Action using reinforcement learning.
Project Grade: 19.5/20
Random Agent (Before Training) | Trained Agent using TRPO |
---|---|
- Install Python 3 (see the official website)
- Running inside a conda environment is recommended; we suggest Miniconda
- After installing Python 3, run the following command to install the necessary libraries:
pip install -r requirements.txt
- [OPTIONAL] An error about the protobuf version may appear when installing or importing TensorBoard. If so, run the following command to fix it:
pip install protobuf~=3.19.0
On Windows, from the command line inside the /src directory:
python main.py [--board=BOARD_SIZE]
On Linux or macOS, from the command line inside the /src directory:
python3 main.py [--board=BOARD_SIZE]
Options:
[--board=BOARD_SIZE] options:
--board=4 : For 4x4 Board Size
--board=5 : For 5x5 Board Size
--board=6 : For 6x6 Board Size
default= --board=5
For example:
python3 main.py
# or
python3 main.py --board=4
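
The --board option described above could be parsed along these lines. This is only a sketch using the standard argparse module, not the project's actual main.py; the function name parse_args and the help text are assumptions made for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the --board option; only 4, 5 and 6 are accepted, default is 5."""
    parser = argparse.ArgumentParser(
        description="Train and run RL agents on a simplified Lines of Action board."
    )
    parser.add_argument(
        "--board",
        type=int,
        choices=[4, 5, 6],  # the three board sizes listed in the options
        default=5,          # documented default
        help="board size N for an NxN board",
    )
    return parser.parse_args(argv)


# Equivalent to running: python3 main.py --board=4
args = parse_args(["--board=4"])
print(f"Board size: {args.board}x{args.board}")  # prints "Board size: 4x4"
```

With choices set, argparse rejects any other value (for example --board=7) with a usage error instead of starting a training run on an unsupported board.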
- After installing the prerequisites, running the command shown above does the following:
  - The following 3 RL models are trained with a default of 15000 timesteps (this can be changed in main.py by editing the TIMESTEPS variable):
    - Proximal Policy Optimization (PPO)
    - Advantage Actor Critic (A2C)
    - Trust Region Policy Optimization (TRPO)
  - Immediately after training, an agent runs each model in turn, and a UI window pops up showing the chosen moves
  - The terminal also prints the details of the system's actions, rewards, and observations
- To view graphs and plots for the trained models, run the command:
tensorboard --logdir=logs
- Open a browser and go to http://localhost:6006 (the port number may vary; check the terminal output for details)
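
The training step described above (three models, 15000 timesteps each, with logs written for TensorBoard) can be sketched roughly as follows. This is not the project's main.py. It assumes the models come from Stable-Baselines3 (PPO, A2C) and sb3-contrib (TRPO), which this README does not confirm, and env is a hypothetical placeholder for the project's Gym-compatible Lines of Action environment. The names MODEL_NAMES, load_algorithms and train_all are also illustrative.

```python
TIMESTEPS = 15000  # default number of training timesteps per model
LOG_DIR = "logs"   # matches `tensorboard --logdir=logs`
MODEL_NAMES = ("PPO", "A2C", "TRPO")  # training order


def load_algorithms():
    """Import the algorithm classes lazily (assumed libraries, see above)."""
    from stable_baselines3 import A2C, PPO  # assumption: Stable-Baselines3
    from sb3_contrib import TRPO            # assumption: sb3-contrib

    return {"PPO": PPO, "A2C": A2C, "TRPO": TRPO}


def train_all(env, timesteps=TIMESTEPS, log_dir=LOG_DIR):
    """Train each model in turn on `env` and return them keyed by name.

    `env` is a hypothetical Gym-compatible environment supplied by the caller.
    """
    algorithms = load_algorithms()
    trained = {}
    for name in MODEL_NAMES:
        # "MlpPolicy" is the standard fully connected policy in these libraries.
        model = algorithms[name]("MlpPolicy", env, verbose=1, tensorboard_log=log_dir)
        # tb_log_name gives each model its own run folder under log_dir.
        model.learn(total_timesteps=timesteps, tb_log_name=name)
        trained[name] = model
    return trained
```

Under these assumptions, each model's training curves would appear as a separate run in TensorBoard, which makes it easier to compare PPO, A2C and TRPO side by side.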