Krusty Krabs Navigator

This project is a simulation in which a robotic agent, "Mr. Krabs," uses a Double Deep Q-Network (DDQN) to locate and collect a forgotten stash of money inside a recreation of the Krusty Krabs restaurant. The agent learns to navigate a dynamic environment containing both static and moving obstacles using reinforcement learning principles.

Last Updated: January 4th, 2024

Table of Contents

  1. Installation
  2. Objectives
  3. Environment Design
  4. Key Features
  5. Implementation Details
  6. Challenges and Solutions
  7. Testing and Results

Installation

Make sure you have Python installed.

Follow these steps to set up the environment and run the application:

  1. Clone the Repository:

    git clone https://github.com/Sambonic/krusty-krabs-navigator
    cd krusty-krabs-navigator

  2. Create a Python Virtual Environment:

    python -m venv env

  3. Activate the Virtual Environment:

  • On Windows:

    env\Scripts\activate

  • On macOS and Linux:

    source env/bin/activate

  4. Ensure Pip is Up-to-Date:

    python -m pip install --upgrade pip

  5. Install Dependencies:

    pip install .

Alternatively, you can simply run the pip install command in the notebook.

Objectives

  • Implement a DDQN algorithm to train the agent to navigate the environment efficiently.
  • Develop a reward system that incentivizes the agent to reach the target quickly while avoiding obstacles.
  • Create a lore-accurate environment based on the Krusty Krabs restaurant for simulation purposes.

Environment Design

  • The environment and assets were designed using Ibis Paint X to resemble the Krusty Krabs restaurant.
  • Static and dynamic obstacles, including customers, were added to create realistic challenges for the agent.

Key Features

Reinforcement Learning Algorithm

  • Utilized a Double Deep Q-Network (DDQN) to improve training stability and reduce Q-value overestimation.
  • Employed separate policy and target networks, with the target network updated periodically for stable training.
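
The core DDQN idea can be sketched as follows: the policy network selects the next action and the target network evaluates it. This is a minimal PyTorch-style illustration, not the repository's actual code; tensor shapes, names, and the discount factor (not specified above) are assumptions.

    import torch

    def ddqn_targets(policy_net, target_net, rewards, next_states, dones, gamma=0.99):
        # Double DQN: the policy network selects the next action,
        # the target network evaluates it, curbing Q-value overestimation.
        # gamma is an assumed discount factor; dones is a 0/1 float tensor.
        with torch.no_grad():
            next_actions = policy_net(next_states).argmax(dim=1, keepdim=True)
            next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
            return rewards + gamma * next_q * (1.0 - dones)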

Neural Network Architecture

  • A feedforward neural network with three hidden layers of 64 neurons each, using ReLU activation.
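
In PyTorch terms (assuming PyTorch as the framework, which the write-up does not state), the described network could look like the sketch below; state_dim and num_actions are placeholders for the environment's sizes.

    import torch.nn as nn

    class QNetwork(nn.Module):
        # Three hidden layers of 64 units each with ReLU activations, as described above.
        def __init__(self, state_dim, num_actions):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, 64), nn.ReLU(),
                nn.Linear(64, num_actions),
            )

        def forward(self, x):
            return self.net(x)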

Hyperparameter Tuning

  • Target Network Update Frequency: Every 100 episodes.
  • Replay Memory Size: 100,000.
  • Batch Size: 32.
  • Learning Rate: 0.0001.
  • Exploration Strategy: Epsilon-greedy approach (decay from 1.0 to 0.01 with a rate of 0.998).
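
Taken together, these settings can be collected into a small configuration alongside an epsilon-greedy helper. The sketch below only restates the values listed above; applying the 0.998 decay once per episode is an assumption about how that rate is used.

    import random

    LEARNING_RATE = 0.0001
    BATCH_SIZE = 32
    MEMORY_SIZE = 100_000
    TARGET_UPDATE_EVERY = 100            # episodes between target-network syncs
    EPS_START, EPS_END, EPS_DECAY = 1.0, 0.01, 0.998

    def select_action(q_values, epsilon, num_actions):
        # Epsilon-greedy: explore with probability epsilon, otherwise act greedily
        if random.random() < epsilon:
            return random.randrange(num_actions)
        return int(q_values.argmax())

    # Assumed per-episode decay toward the 0.01 floor (episode count is illustrative)
    epsilon = EPS_START
    for episode in range(1000):
        epsilon = max(EPS_END, epsilon * EPS_DECAY)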

Reward and Penalty System

  • Reward: Distance-based and success-based incentives (e.g., reaching the target yields a reward of 50.0).
  • Penalty: Distance and time-based penalties, with a harsh penalty (-5.0) for invalid moves.
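
A reward function matching this description might look like the sketch below. Only the +50.0 success reward and the -5.0 invalid-move penalty are stated above; the distance-shaping weight and per-step time penalty are illustrative assumptions.

    def compute_reward(prev_dist, new_dist, reached_target, invalid_move):
        if invalid_move:
            return -5.0                         # harsh penalty for invalid moves
        if reached_target:
            return 50.0                         # success reward for reaching the target
        reward = 0.1 * (prev_dist - new_dist)   # assumed weight: reward moving closer
        reward -= 0.01                          # assumed small per-step time penalty
        return reward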

Implementation Details

Technologies Used

  • Python with PyGame for simulation
  • NumPy for occupancy grid management
  • Priority Queues for pathfinding algorithms
  • Custom asset design with Ibis Paint X
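
As an illustration of the priority-queue-driven pathfinding mentioned above, here is a compact A* search over a NumPy occupancy grid using Python's heapq; the repository's own implementation may differ in details such as movement costs and neighbourhood.

    import heapq
    import itertools
    import numpy as np

    def astar(grid, start, goal):
        # grid: NumPy occupancy grid, 0 = free cell, 1 = obstacle
        def h(a, b):
            return abs(a[0] - b[0]) + abs(a[1] - b[1])  # Manhattan heuristic

        counter = itertools.count()          # tie-breaker so the heap never compares nodes
        frontier = [(h(start, goal), next(counter), start)]
        came_from = {start: None}
        cost = {start: 0}
        while frontier:
            _, _, node = heapq.heappop(frontier)
            if node == goal:
                break
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nxt = (node[0] + dr, node[1] + dc)
                if not (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]):
                    continue
                if grid[nxt]:                # skip occupied cells
                    continue
                new_cost = cost[node] + 1
                if nxt not in cost or new_cost < cost[nxt]:
                    cost[nxt] = new_cost
                    came_from[nxt] = node
                    heapq.heappush(frontier, (new_cost + h(nxt, goal), next(counter), nxt))
        if goal not in came_from:
            return []                        # goal unreachable
        path, node = [], goal
        while node is not None:
            path.append(node)
            node = came_from[node]
        return path[::-1]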

Key Components

  • DDQN class for agent behavior
  • Environment class to house reward-penalty logic
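
A hypothetical interface sketch of how these two components could fit together (method names here are placeholders, not taken from the source):

    class Environment:
        # Houses the occupancy grid, obstacle updates, and the reward-penalty logic
        def reset(self):
            ...
        def step(self, action):
            ...  # would return (next_state, reward, done)

    class DDQN:
        # Owns the policy/target networks, replay memory, and epsilon-greedy action selection
        def act(self, state, epsilon):
            ...
        def remember(self, state, action, reward, next_state, done):
            ...
        def learn(self):
            ...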

Challenges and Solutions

  • Complex Environment: Improved exploration strategy and reward system.
  • Unstable Convergence: Enhanced neural network architecture with additional layers.
  • Overfitting: Adopted DDQN, whose separation of action selection and action evaluation yields more stable value estimates.

Testing and Results

  • Comparison of Algorithms:

        Algorithm    Time Elapsed (s)
        A*           5.47
        Dijkstra     5.54
        DDQN         5.37

  • The DDQN agent completed the navigation task slightly faster than the classical search algorithms (A* and Dijkstra).
