QLearning-RL-Maze

Find all 5 treasures and reach the exit in the minimum number of steps, using Q-learning.

wall positions: [4,5,7,9,22,23,25,30,31,35,39,43,45,47,49,50,51,53,55,57,58,59,61,65,71,74,80,85,88,90,94,97,100,101,102,104,109,110,111,113,114,119,120,127,128,129,132,134,136,141,142,143,145,151,153,155,157,158,164,166,169,172,176,178,181,183,186,187,190,191,193,195,196,206,211,214,226,229]
treasure positions: [6, 79, 170, 212, 227]
exit position: 230
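
The layout above can be held in a few sets so that classifying a cell is a constant-time lookup. This is only a minimal sketch: the names `WALLS`, `TREASURES`, `EXIT`, and `cell_type` are hypothetical and may not match the notebook, and the grid width (needed to turn a position into a row and column) is not stated here, so it is left out.

```python
# Maze layout copied from the lists above; all names here are illustrative.
WALLS = frozenset([
    4, 5, 7, 9, 22, 23, 25, 30, 31, 35, 39, 43, 45, 47, 49, 50, 51, 53, 55,
    57, 58, 59, 61, 65, 71, 74, 80, 85, 88, 90, 94, 97, 100, 101, 102, 104,
    109, 110, 111, 113, 114, 119, 120, 127, 128, 129, 132, 134, 136, 141,
    142, 143, 145, 151, 153, 155, 157, 158, 164, 166, 169, 172, 176, 178,
    181, 183, 186, 187, 190, 191, 193, 195, 196, 206, 211, 214, 226, 229,
])
TREASURES = frozenset([6, 79, 170, 212, 227])
EXIT = 230


def cell_type(pos):
    """Classify a position as 'exit', 'treasure', 'wall', or 'path'."""
    if pos == EXIT:
        return "exit"
    if pos in TREASURES:
        return "treasure"
    if pos in WALLS:
        return "wall"
    return "path"
```

For example, `cell_type(6)` returns `"treasure"` and `cell_type(4)` returns `"wall"`.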

Hyperparameter settings:
-> EPSILON = 0.9 (randomness of action selection)
-> ALPHA = 0.1 (learning rate)
-> GAMMA = 1 (weight on future rewards: 0 ignores future rewards, 1 values long-term rewards as much as immediate ones)
-> MAX_EPISODES = 1000 (number of times the agent walks through the maze)
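
To show where these values enter the algorithm, here is a minimal sketch of epsilon-greedy action selection and the standard tabular Q-learning update. It makes several assumptions: `EPSILON` is treated as the probability of taking a random action (matching "randomness" above, although some implementations use it as the greedy probability instead), the four action names are hypothetical, and the Q-table is a plain dict of dicts, which may differ from the notebook.

```python
import random

EPSILON = 0.9  # assumed here: probability of picking a random action
ALPHA = 0.1    # learning rate
GAMMA = 1.0    # discount factor
ACTIONS = ["up", "down", "left", "right"]  # hypothetical action names


def choose_action(q_row, epsilon=EPSILON, rng=random):
    """Epsilon-greedy choice; q_row maps each action to its Q-value."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)          # explore
    best = max(q_row.values())
    # Exploit, breaking ties between equally good actions at random.
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def q_update(q_table, state, action, reward, next_state, done,
             alpha=ALPHA, gamma=GAMMA):
    """Apply Q(s,a) += alpha * (target - Q(s,a)) and return the new value."""
    if done:
        target = reward                     # no future value after the end
    else:
        target = reward + gamma * max(q_table[next_state].values())
    q_table[state][action] += alpha * (target - q_table[state][action])
    return q_table[state][action]
```

With `ALPHA = 0.1`, each update moves the stored value only 10% of the way toward the target, so a single bad step does not overwrite what earlier episodes learned.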

Reward settings:
-> goal_reward (exit found): 5000
-> wall_punish (hit a wall): -300
-> out_punishment (moved off the map): -200
-> treasure_found: 1000
-> normal_reward (ordinary step on an open path): -100
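
The reward table can be expressed as a simple lookup, which also makes it easy to total the return of an episode. This is a sketch under assumptions: the event labels (`"exit"`, `"wall"`, `"out"`, `"treasure"`, `"path"`) and the helper `episode_return` are hypothetical, and it assumes each step earns exactly one of the rewards listed above.

```python
# Reward values from the list above; the dictionary keys are illustrative.
REWARDS = {
    "exit": 5000,      # goal_reward
    "wall": -300,      # wall_punish
    "out": -200,       # out_punishment
    "treasure": 1000,  # treasure_found
    "path": -100,      # normal_reward
}


def episode_return(events, gamma=1.0):
    """Sum the (optionally discounted) rewards of a sequence of step events."""
    total, discount = 0.0, 1.0
    for event in events:
        total += discount * REWARDS[event]
        discount *= gamma
    return total


# A made-up five-step episode, not taken from the notebook's results.
example = ["path", "path", "wall", "treasure", "exit"]
print(episode_return(example))  # -100 - 100 - 300 + 1000 + 5000 = 5500.0
```

Because every ordinary step costs 100, shorter routes score higher, which is what pushes the agent toward the minimum number of steps.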

Steps = 60 (at episode 442)
Path: [figure: route taken through the maze]

Steps = 501 (at episode 80)
Path: [figure: route taken through the maze]
