2019 Edition - https://www.aicrowd.com/challenges/flatland-challenge
```shell
python src/main.py --train --num-episodes=10000 --prediction-depth=150 --eps=0.9998 --checkpoint-interval=100 --buffer-size=10000
tensorboard --logdir=runs
python src/main.py --render
python src/main.py --plot
```
Observations are obtained by concatenating the "rail occupancy bitmap" of an agent with the "heatmaps".
A "rail occupancy bitmap" shows on which rail and in which direction the agent is traveling at every timestep. It is obtained as follows:
- A directed graph representation of the railway network is generated through BFS; each node is a switch and each edge is a rail between two switches.
- The shortest path for each agent is computed.
- The path is then transformed into a bitmap with the timesteps as columns and the rails as rows. The direction is positive (1) if the agent is traveling the edge from the source node to the destination node, or negative (-1) otherwise.
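The path-to-bitmap conversion above can be sketched as follows. This is a minimal illustration, not the repo's actual code: the `path` structure (a list of `(rail_id, direction)` pairs, one per timestep) and the array shapes are assumptions.

```python
import numpy as np

def path_to_bitmap(path, num_rails, max_timesteps):
    """Build a rail occupancy bitmap: rows = rails, columns = timesteps.

    `path` is a list of (rail_id, direction) pairs, one per timestep,
    where direction is +1 (source -> destination on the edge) or -1.
    """
    bitmap = np.zeros((num_rails, max_timesteps), dtype=np.int8)
    for t, (rail, direction) in enumerate(path[:max_timesteps]):
        bitmap[rail, t] = direction
    return bitmap

# Example: an agent occupying rails 0, 0, 2, 1 over four timesteps.
bm = path_to_bitmap([(0, 1), (0, 1), (2, -1), (1, 1)],
                    num_rails=3, max_timesteps=5)
```

Each column has at most one non-zero entry, since an agent occupies exactly one rail per timestep.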
Heatmaps are used to provide information about how the traffic is distributed across the rails over time.
Each agent computes 2 heatmaps, one positive and one negative, both generated by summing the bitmaps of all the other agents.
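Splitting the other agents' bitmaps by direction sign and summing them could look like this sketch (the helper name and input layout are hypothetical):

```python
import numpy as np

def compute_heatmaps(other_bitmaps):
    """For one agent, count how many *other* agents occupy each rail at
    each timestep, split by travel direction (positive vs. negative)."""
    stacked = np.stack(other_bitmaps)        # (agents, rails, timesteps)
    positive = (stacked == 1).sum(axis=0)    # traffic in the +1 direction
    negative = (stacked == -1).sum(axis=0)   # traffic in the -1 direction
    return positive, negative

# Two other agents, two rails, two timesteps.
a = np.array([[1, 0], [0, -1]])
b = np.array([[1, 0], [-1, 0]])
pos, neg = compute_heatmaps([a, b])
```

The two heatmaps keep opposing traffic separate, which matters for detecting head-on conflicts on the same rail.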
The architecture used is a Dueling DQN: the input is a Conv2D layer that processes a concatenation of the agent's bitmap with the positive and negative heatmaps. The data then flows through two separate streams, the value (red) and the advantage (blue), which are recombined into the final output Q-values (purple).
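The recombination step at the end of the two streams is the standard dueling aggregation (value plus mean-centered advantages); a numeric sketch, not the repo's exact layer code:

```python
import numpy as np

def dueling_q(value, advantages):
    """Combine state value V(s) and per-action advantages A(s, a) into
    Q-values: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    return value + advantages - advantages.mean(axis=-1, keepdims=True)

# Two actions (stop, go). Subtracting the mean advantage makes the
# decomposition identifiable: V carries the state value, A only the
# relative ranking of the actions.
q = dueling_q(np.array([0.5]), np.array([0.2, -0.2]))
```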
The training algorithm follows Double Q-Learning with a random (uniform) replay buffer, where the action space is reduced to 2 actions (stop and go) and the agent's choices are based on a number of alternative paths that can be generated at every bifurcation point; at every fork the most promising path is chosen.
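The Double Q-Learning target with the reduced stop/go action space can be sketched as below; the function and argument names are illustrative, not the repo's API.

```python
import numpy as np

STOP, GO = 0, 1  # reduced action space

def double_q_target(reward, gamma, q_online_next, q_target_next, done):
    """Double Q-Learning target: the online network *selects* the next
    action, the target network *evaluates* it, which decouples action
    selection from evaluation and reduces overestimation bias."""
    if done:
        return reward
    best_action = int(np.argmax(q_online_next))   # argmax over {stop, go}
    return reward + gamma * q_target_next[best_action]

# Online net prefers GO; the target net supplies its value estimate.
t = double_q_target(1.0, 0.99,
                    q_online_next=np.array([0.2, 0.8]),
                    q_target_next=np.array([0.5, 0.4]),
                    done=False)
```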
For more detailed information on the approaches see: