This project is being done as part of the Udacity Deep Reinforcement Learning Nanodegree, a four-month course that I am taking. The goal of this project is to train an agent to navigate a large square environment, collecting as many yellow bananas as possible while avoiding blue bananas.
The environment is episodic and has a single agent, a continuous state space, and a discrete action space. The details of the environment are:

```
Unity brain name: BananaBrain
Number of Visual Observations (per agent): 0
Vector Observation space type: continuous
Vector Observation space size (per agent): 37
Number of stacked Vector Observations: 1
Vector Action space type: discrete
Vector Action space size (per agent): 4
```
We are working with a state space of 37 dimensions, which contains the agent's velocity, and an action space of size 4 corresponding to:

- `0` - move forward
- `1` - move backward
- `2` - turn left
- `3` - turn right

The environment is considered solved when the agent gets an average score of +13 over 100 consecutive episodes.
- Clone this repo using:

```bash
git clone https://github.com/RihabGorsan/NavigationVanillaDQN.git
```
- Install the necessary packages. First create a conda environment:

```bash
conda create --name navigation python=3.6
```

and then install the following packages:

```bash
conda install -y pytorch -c pytorch
pip install unityagents==0.4.0
pip install mlagents
```
- Download the Banana Unity environment. You can download it from one of the links below; please select the environment that matches your OS:
- Linux: click here
- Mac OSX: click here
- Windows (32-bit): click here
- Windows (64-bit): click here
Unzip the file, then place it in the cloned project.
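To check that everything is wired up, you can load the environment from Python and confirm the state and action space sizes described above. This is a minimal sketch; the `file_name` path is an assumption and depends on which OS build you downloaded and where you unzipped it:

```python
from unityagents import UnityEnvironment

# Adjust file_name to the environment build you downloaded (Linux build assumed here)
env = UnityEnvironment(file_name="Banana_Linux/Banana.x86_64")

# The environment exposes a single default brain
brain_name = env.brain_names[0]
brain = env.brains[brain_name]

# Reset the environment and inspect the observation/action spaces
env_info = env.reset(train_mode=True)[brain_name]
state = env_info.vector_observations[0]
print("State size:", len(state))                        # expected: 37
print("Action size:", brain.vector_action_space_size)   # expected: 4

env.close()
```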
To train the agent, run `python dqn.py` or the `Navigation.ipynb` notebook. This will fire up the Unity environment and output live training statistics to the command line. When training is finished you'll have a saved model in `checkpoint.pth`.
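For reference, training follows the standard episodic DQN loop. The sketch below is an approximation of what `dqn.py` does, not the repo's exact code; the `agent.act`, `agent.step`, and `agent.qnetwork_local` names are assumptions based on common DQN implementations:

```python
from collections import deque
import numpy as np
import torch

def train_dqn(env, agent, brain_name, n_episodes=2000,
              eps_start=1.0, eps_end=0.01, eps_decay=0.995):
    """Run epsilon-greedy DQN training until the environment is solved."""
    scores = []
    scores_window = deque(maxlen=100)      # scores of the last 100 episodes
    eps = eps_start
    for i_episode in range(1, n_episodes + 1):
        env_info = env.reset(train_mode=True)[brain_name]
        state = env_info.vector_observations[0]
        score = 0
        while True:
            action = agent.act(state, eps)                        # epsilon-greedy action
            env_info = env.step(action)[brain_name]               # send the action to Unity
            next_state = env_info.vector_observations[0]
            reward = env_info.rewards[0]
            done = env_info.local_done[0]
            agent.step(state, action, reward, next_state, done)   # store transition and learn
            state = next_state
            score += reward
            if done:
                break
        scores.append(score)
        scores_window.append(score)
        eps = max(eps_end, eps_decay * eps)                       # decay exploration
        if np.mean(scores_window) >= 13.0:
            print(f"Environment solved in {i_episode} episodes! "
                  f"Average Score: {np.mean(scores_window):.2f}")
            torch.save(agent.qnetwork_local.state_dict(), "checkpoint.pth")
            break
    return scores
```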
To watch your trained agent interact with the environment, run `python dqn.py` with the `train` parameter set to `False` and the path of the trained weights assigned to the `filename` parameter. This will load the saved weights from a checkpoint file. A previously trained model is included in this repo, named `saved_model_weights`.
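In essence, evaluation just loads the trained weights and acts greedily with no exploration. The following is a rough sketch under the same naming assumptions as above (`agent.act`, `agent.qnetwork_local`), using the `checkpoint.pth` file produced by training:

```python
import torch

# Load the trained weights into the local Q-network
agent.qnetwork_local.load_state_dict(torch.load("checkpoint.pth"))

# Run a single episode in watch mode (train_mode=False slows the simulation down)
env_info = env.reset(train_mode=False)[brain_name]
state = env_info.vector_observations[0]
score = 0
while True:
    action = agent.act(state, eps=0.0)        # greedy action, no exploration
    env_info = env.step(action)[brain_name]
    state = env_info.vector_observations[0]
    score += env_info.rewards[0]
    if env_info.local_done[0]:
        break
print("Score:", score)
```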
Following are the results of training:

```
Episode 100    Average Score: 4.49
Episode 200    Average Score: 8.02
Episode 300    Average Score: 11.58
Episode 346    Average Score: 13.02

Environment solved in 346 episodes!    Average Score: 13.02
```
Feel free to experiment with the hyperparameters to see how they affect training:
- `model.py`: change the architecture of the Q-network.
- `agent.py`: tune the hyperparameters of the RL agent, e.g. gamma, epsilon, and tau (typical values are sketched below).
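For orientation, a typical set of DQN hyperparameters is shown below. These values are common defaults given here for illustration, not necessarily the values used in `agent.py`:

```python
# Illustrative DQN hyperparameters (assumed defaults, not the repo's exact values)
BUFFER_SIZE = int(1e5)   # replay buffer size
BATCH_SIZE = 64          # minibatch size sampled from the buffer
GAMMA = 0.99             # discount factor
TAU = 1e-3               # soft-update rate for the target network
LR = 5e-4                # learning rate for the optimizer
UPDATE_EVERY = 4         # how often (in steps) to run a learning update
```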
See the report for more details.