
Real-time Driving Guide AI with Object Detection and Instance Segmentation

This repository contains the codebase for the AI Driving Guide project, which provides driver guidance through Object Detection and Instance Segmentation. The project offers a user-friendly and customizable interface designed to detect and track pedestrians, traffic signs, and lanes in real-time video streams from various sources. For demonstration purposes, we use Streamlit, a widely-used Python framework for developing interactive web applications.

🛠 Tech Stack 🛠

Python OpenCV Streamlit GitHub Notion



Overview

This README provides a general introduction to the project. For detailed information about each detection task, please refer to the README files linked below:


video_demo



Getting Started

Requirements

Clone this repository:

git clone https://github.com/lunash0/prometheus5_project_AIDrivingGuide.git

Create a virtual environment and install dependencies (this project was developed using CUDA 12.1):

conda create -n aicar python=3.8
conda activate aicar
conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia  
conda install -c conda-forge ffmpeg
pip install tqdm matplotlib streamlit opencv-python
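
To verify the install, a quick sanity check in Python (standard PyTorch attributes; nothing project-specific):

# Sanity check: confirm PyTorch sees the CUDA toolkit and a GPU.
import torch

print(torch.__version__)          # expected: 2.3.1
print(torch.version.cuda)         # expected: 12.1
print(torch.cuda.is_available())  # True if the GPU and driver are visible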

WebApp Demo on Streamlit

Once the app is running, you can view it at http://localhost:8501/ in your web browser.
NOTE: Before running the following command, prepare the model files and update the model paths in configs/model.yaml.

streamlit run app/app.py
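
As a rough illustration of the note above, the app is configured through configs/model.yaml. Below is a sketch of inspecting that file with PyYAML (assuming PyYAML is installed); the key names mentioned in the comments are assumptions, not the repository's actual schema.

# Inspect configs/model.yaml before launching the app (requires PyYAML).
# The exact keys are repository-specific; expect entries pointing to the
# fine-tuned pedestrian, traffic-light, and lane model weights.
import yaml

with open("configs/model.yaml") as f:
    cfg = yaml.safe_load(f)
print(cfg)  # verify every weight path exists on your machine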

🏠 Home page for simulation

You can see the following initial screen.

home_demo

Select and upload your source ➡️ Click the Process button and wait!

In this section, users upload the video they wish to analyze and watch as the model processes it. After selecting and uploading a source file, users click the Process button, which starts the AI analysis of the video. The output, whether an image or video, displays detected objects such as pedestrians, traffic lights, and lanes on the screen. This process is automated and designed for user convenience, providing real-time feedback.

Additionally, you can set your own score thresholds for the Pedestrian Detection and Traffic Lights Detection models.

image_select_demo
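
For readers new to Streamlit, below is a minimal sketch of such an upload-threshold-process flow. The widget labels and the process_video stub are hypothetical; they do not reproduce the actual app/app.py.

# Minimal Streamlit sketch of the upload -> threshold -> process flow.
# process_video is a hypothetical stub, not a function from this repository.
import streamlit as st

def process_video(file, ped_thr, tl_thr):
    # The real app would run the detection models here and return the
    # annotated output; this stub simply echoes the upload back.
    return file

uploaded = st.file_uploader("Upload an image or video", type=["jpg", "png", "mp4"])
ped_thr = st.slider("Pedestrian score threshold", 0.0, 1.0, 0.25, 0.05)
tl_thr = st.slider("Traffic lights score threshold", 0.0, 1.0, 0.40, 0.05)

if uploaded is not None and st.button("Process"):
    st.video(process_video(uploaded, ped_thr, tl_thr))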

💭 Feedback Page

Feel free to give feedback on our demo!

feedback_png

Training

For fine-tuning the backbone, follow the bash script in scripts/train.sh:

cd ./models/Pedestrian_Detection/
python train.py \
    --mode train \
    --config_file configs/noHue_0.50.5_large_re_3.yaml \
    --OUTPUT_DIR ${OUTPUT_DIR}

cd ../Lane_Detection/
python train.py \
    --dataset ${DATASET_DIR}  \
    --pretrained ./model_weight/existing/best_model.pth

cd ../TrafficLights_Detection
python train.py  # Modify config.py for your configurations

You can also download each fine-tuned model from here:

Inference

For inference without the web app, you can run the bash script scripts/inference.sh directly, which executes inference.py.

python inference.py \
    --task_type all \
    --CFG_DIR configs/model.yaml \
    --OUTPUT_DIR test_video/kaggle_clip_all.mp4 \
    --video videos/kaggle_clip.mp4 \
    --ped_score_threshold 0.25 \
    --tl_score_threshold 0.4 &

python inference.py \
    --task_type message \
    --CFG_DIR configs/model.yaml \
    --OUTPUT_DIR test_video/kaggle_clip_message.mp4 \
    --video videos/kaggle_clip.mp4 \
    --ped_score_threshold 0.25 \
    --tl_score_threshold 0.4 
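
The two --*_score_threshold flags discard low-confidence detections before anything is drawn. Below is a minimal sketch of that filtering on a torchvision-style detection output; the function name is illustrative, not taken from inference.py.

# Filtering detections by confidence score, as the threshold flags above do.
# Assumes torchvision's standard detection output: a dict with "boxes",
# "labels", and "scores" tensors. filter_by_score is an illustrative name.
import torch

def filter_by_score(output, score_threshold):
    keep = output["scores"] >= score_threshold
    return {k: v[keep] for k, v in output.items()}

dummy = {
    "boxes": torch.tensor([[10.0, 10.0, 50.0, 80.0], [0.0, 0.0, 5.0, 5.0]]),
    "labels": torch.tensor([1, 2]),
    "scores": torch.tensor([0.90, 0.10]),
}
print(filter_by_score(dummy, 0.25))  # the 0.10-score detection is dropped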

Results

There are two types of results you can review:

  • The first type shows the object detection results, using bounding boxes for pedestrians and traffic lights and lines for lanes.
  • The second type provides feedback, such as displaying 'Stop' on the screen when a stop sign is detected or 'Proceed with caution' when a pedestrian is nearby (a sketch of this logic follows below).

You can also see the results of applying both types simultaneously.
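
As referenced in the list above, here is a hypothetical sketch of how detections could be mapped to feedback messages. The label strings and the box-area heuristic are assumptions for illustration; the project's actual rules live in the app code (and presumably assets/feedback.json).

# Hypothetical mapping from detections to on-screen guidance messages.
# Labels and the area heuristic are illustrative assumptions only.
from typing import Dict, List, Optional

def guidance_message(detections: List[Dict]) -> Optional[str]:
    for det in detections:
        if det["label"] == "stop_sign":
            return "Stop"
        # A large normalized box area suggests the pedestrian is close.
        if det["label"] == "pedestrian" and det["box_area"] > 0.05:
            return "Proceed with caution"
    return None

print(guidance_message([{"label": "pedestrian", "box_area": 0.08}]))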

[Simulation Image] Left: Comments Only | Right: Total View

[Simulation Video 1] Left: Raw (nothing applied) | Right: Total View

The first simulation video shows the original video (left) and the total view (right).

combined_video

[Simulation Video 2] Left: Comments Only | Right: Total View

The second simulation shows the results with feedback only (left) and the total view (right).

output_combined
test_video11


Directory Structure

prometheus5_project_AIDrivingGuide/
│
├── README.md         
├── play.py   
├── inference.py           
├── __init__.py      
│
├── engine/           
│   ├── models.py    
│   ├── utils.py       
│   └── __init__.py   
│
├── models/           
│   ├── TrafficLights_Detection/
│   ├── Pedestrian_Detection/
│   └── Lane_Detection/
│
├── scripts/           
│   ├── train.sh   
│   └── inference.sh   
│
├── configs/           
│   └── model.yaml     
│
├── assets/           
│   ├── feedback.json
│   └── ... 
│
└── app/              
    ├── app.py      
    ├── home.py      
    ├── feedback.py      
    ├── helper.py      
    ├── settings.py   
    ├── images/   
    └── videos/