This project designs a simulation in which a robotic agent, "Mr. Krabs," uses a Double Deep Q-Network (DDQN) to locate and collect a forgotten stash of money inside the Krusty Krabs restaurant. The agent learns to navigate a dynamic environment containing static and dynamic obstacles using reinforcement learning.
- Installation
- Objectives
- Environment Design
- Key Features
- Implementation Details
- Challenges and Solutions
- Testing and Results
Make sure you have Python installed, then follow these steps to set up the environment and run the application:

- Clone the Repository:

  ```bash
  git clone https://github.com/Sambonic/krusty-krabs-navigator
  cd krusty-krabs-navigator
  ```

- Create a Python Virtual Environment:

  ```bash
  python -m venv env
  ```

- Activate the Virtual Environment:

  - On Windows:

    ```bash
    env\Scripts\activate
    ```

  - On macOS and Linux:

    ```bash
    source env/bin/activate
    ```

- Ensure Pip is Up-to-Date:

  ```bash
  python -m pip install --upgrade pip
  ```

- Install Dependencies:

  ```bash
  pip install .
  ```

  Or simply run the `pip install` line in the notebook.
- Implement a DDQN algorithm to train the agent to navigate the environment efficiently.
- Develop a reward system that incentivizes the agent to reach the target quickly while avoiding obstacles.
- Create a lore-accurate environment based on the Krusty Krabs restaurant for simulation purposes.
- The environment and assets were designed using Ibis Paint X to resemble the Krusty Krabs restaurant.
- Static and dynamic obstacles, including customers, were added to create realistic challenges for the agent.
- Utilized Double Deep Q-Network (DDQN) for stability and to avoid Q-value overestimation.
- Employed a target and policy network with periodic updates for stable training.
- Network Architecture: A feedforward neural network with three hidden layers of 64 neurons each, using ReLU activation (see the sketch after this list).
- Target Network Update Frequency: Every 100 episodes.
- Replay Memory Size: 100,000.
- Batch Size: 32.
- Learning Rate: 0.0001.
- Exploration Strategy: Epsilon-greedy approach (decay from 1.0 to 0.01 with a rate of 0.998).
- Reward: Distance-based and success-based incentives (e.g., reaching the target yields a reward of 50.0).
- Penalty: Distance and time-based penalties, with a harsh penalty (-5.0) for invalid moves.
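To make the setup above concrete, here is a minimal sketch of the network and the double-Q target computation. It assumes PyTorch (the deep-learning framework is not named in this README), and the names `QNetwork`, `select_action`, and `ddqn_targets`, as well as the discount factor `gamma`, are illustrative rather than the project's actual identifiers; only the layer sizes, activation, and epsilon-greedy scheme come from the list above.

```python
import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Feedforward Q-network: three hidden layers of 64 units with ReLU activations."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def select_action(policy_net, state, epsilon, n_actions):
    """Epsilon-greedy exploration: random action with probability epsilon."""
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return policy_net(state.unsqueeze(0)).argmax(dim=1).item()

def ddqn_targets(policy_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: the policy net selects the next action, the target net evaluates it.

    Decoupling selection from evaluation is what curbs Q-value overestimation.
    `dones` is a float tensor of 0/1 flags; `gamma` is an assumed discount factor.
    """
    with torch.no_grad():
        next_actions = policy_net(next_states).argmax(dim=1, keepdim=True)
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        return rewards + gamma * next_q * (1.0 - dones)
```

In a training loop matching the parameters above, transitions would be sampled in batches of 32 from a 100,000-transition replay memory, the policy network optimized with a learning rate of 0.0001, epsilon decayed by a factor of 0.998 per episode from 1.0 down to 0.01, and the target network synchronized with the policy network every 100 episodes.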
- Python with PyGame for simulation
- NumPy for occupancy grid management
- Priority Queues for pathfinding algorithms
- Custom asset design with Ibis Paint X
- DDQN class for agent behavior
- Environment class to house reward-penalty logic
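As a rough illustration of the reward-penalty logic such an Environment class might house, the sketch below combines the incentives listed earlier. The method name `compute_reward`, the distance shaping, and the per-step penalty value are assumptions; only the +50.0 target reward and the -5.0 invalid-move penalty come from the figures above.

```python
def compute_reward(self, reached_target: bool, valid_move: bool,
                   prev_distance: float, new_distance: float) -> float:
    """Illustrative reward shaping for the environment (names and shaping terms are hypothetical)."""
    if not valid_move:
        return -5.0                          # harsh penalty for invalid moves
    if reached_target:
        return 50.0                          # success-based incentive for reaching the target
    reward = prev_distance - new_distance    # distance-based: reward progress toward the target
    reward -= 0.1                            # small time-based penalty per step (assumed value)
    return reward
```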
- Complex Environment: Improved exploration strategy and reward system.
- Unstable Convergence: Enhanced neural network architecture with additional layers.
- Overfitting: Adopted DDQN, whose decoupled action selection and evaluation curb Q-value overestimation and stabilize learning.
- Comparison of Algorithms:

| Algorithm | Time Elapsed (s) |
|-----------|------------------|
| A*        | 5.47             |
| Dijkstra  | 5.54             |
| DDQN      | 5.37             |
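For context on the two baselines, A* and Dijkstra are classic priority-queue searches over the occupancy grid. The sketch below is illustrative rather than the project's implementation: it assumes a NumPy occupancy grid with 0 marking free cells, the function name `astar` is hypothetical, and dropping the heuristic recovers Dijkstra.

```python
import heapq
import numpy as np

def astar(grid: np.ndarray, start: tuple, goal: tuple):
    """Minimal A* over a 4-connected occupancy grid (0 = free, nonzero = obstacle)."""
    def h(cell):  # Manhattan-distance heuristic; returning 0 here yields Dijkstra
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]          # priority queue of (f, g, cell)
    best_g = {start: 0}
    came_from = {}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = [cell]
            while cell in came_from:           # reconstruct the path back to start
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + dr, cell[1] + dc)
            if not (0 <= nxt[0] < grid.shape[0] and 0 <= nxt[1] < grid.shape[1]):
                continue
            if grid[nxt] != 0:                 # skip occupied cells
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                came_from[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return None                                 # no path found
```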
- The DDQN agent completed the task slightly faster than the traditional search algorithms (A* and Dijkstra).