This repository contains the codebase for the AI Driving Guide project, which provides driver guidance through Object Detection and Instance Segmentation. The project offers a user-friendly, customizable interface designed to detect and track pedestrians, traffic signs, and lanes in real-time video streams from various sources. For demonstration purposes, we use Streamlit, a widely used Python framework for developing interactive web applications.
This README provides a general introduction to the project. For detailed information about each detection task, please refer to the README files linked below:
Clone this repository:
git clone https://github.com/lunash0/prometheus5_project_AIDrivingGuide.git
Create a virtual environment and install dependencies (this project was developed using CUDA 12.1):
conda create -n aicar python=3.8
conda activate aicar
conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
conda install -c conda-forge ffmpeg
pip install tqdm matplotlib streamlit opencv-python
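As an optional sanity check (not part of the original setup steps), you can confirm that the installed PyTorch build sees CUDA 12.1 and your GPU:

```python
# Quick environment check: prints the PyTorch and CUDA versions in use
# and whether a GPU is visible from this environment.
import torch

print(torch.__version__)          # expected: 2.3.1
print(torch.version.cuda)         # expected: 12.1
print(torch.cuda.is_available())  # True if the GPU and driver are set up correctly
```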
NOTE: Before you run the command below, prepare the model files and update the model paths in configs/model.yaml.

streamlit run app/app.py

Once the app is running, you can view the demo at http://localhost:8501/ in your web browser.
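For reference, here is a minimal sketch of how the model paths could be read from configs/model.yaml; the key names below are illustrative assumptions, not the actual schema used by this repository:

```python
# Hypothetical example of loading model paths from configs/model.yaml.
# The keys (pedestrian, traffic_lights, lane) are placeholders and must
# match whatever schema configs/model.yaml actually defines.
import yaml

with open("configs/model.yaml") as f:
    cfg = yaml.safe_load(f)

ped_weights = cfg["pedestrian"]["weights"]       # e.g. a path to a .pth file
tl_weights = cfg["traffic_lights"]["weights"]
lane_weights = cfg["lane"]["weights"]
```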
You can see the following initial screen.
In this section, users can upload the video they wish to analyze and watch as the model processes it. After selecting and uploading a source file, users click the Process button, which starts the AI model's analysis of the video. The output, whether an image or a video, displays detected objects such as pedestrians, traffic lights, and lanes on the screen. The process is automated and designed for user convenience, providing real-time feedback.
Additionally, you can set your own score thresholds for the Pedestrian Detection and Traffic Lights Detection models.
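As a rough illustration, controls like these can be wired up in Streamlit as in the sketch below; the widget labels and the process_video helper are hypothetical stand-ins, not the exact code in app/app.py:

```python
# Hypothetical Streamlit layout: upload a video, pick score thresholds,
# and trigger processing. process_video() stands in for the project's
# real inference pipeline.
import streamlit as st

def process_video(video_file, ped_thr, tl_thr):
    # Placeholder: the real pipeline would run the detection models here
    # and return the annotated output video.
    return video_file

uploaded = st.file_uploader("Upload a video", type=["mp4", "avi"])
ped_thr = st.sidebar.slider("Pedestrian score threshold", 0.0, 1.0, 0.25)
tl_thr = st.sidebar.slider("Traffic lights score threshold", 0.0, 1.0, 0.4)

if uploaded is not None and st.button("Process"):
    st.video(process_video(uploaded, ped_thr, tl_thr))
```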
Feel free to give feedback on our demo!
To fine-tune the backbones, follow the bash script in scripts/train.sh:
cd ./models/Pedestrian_Detection/
python train.py \
--mode train \
--config_file configs/noHue_0.50.5_large_re_3.yaml \
--OUTPUT_DIR ${OUTPUT_DIR}
cd ../Lane_Detection/
python train.py \
--dataset ${DATASET_DIR} \
--pretrained ./model_weight/existing/best_model.pth
cd ../TrafficLights_Detection
python train.py # Modify config.py for your configurations
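For the commands that resume from a pretrained checkpoint (such as the Lane Detection one above), the general load-and-resume pattern looks like the sketch below; the placeholder model class and the assumed checkpoint layout are illustrative, not this repository's actual code:

```python
# Hypothetical fine-tuning setup: load an existing checkpoint and continue
# training. LaneModelPlaceholder stands in for the project's real model class.
import torch
import torch.nn as nn

class LaneModelPlaceholder(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Conv2d(3, 16, kernel_size=3, padding=1)  # dummy layer

model = LaneModelPlaceholder()
checkpoint = torch.load("./model_weight/existing/best_model.pth", map_location="cpu")
# Checkpoints may hold the state dict directly or under a key such as "model".
state_dict = checkpoint.get("model", checkpoint)
model.load_state_dict(state_dict, strict=False)  # ignore keys the placeholder lacks

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
model.train()  # fine-tuning would continue from here
```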
You can also download each fine-tuned model from the links here:
For inference without the web app, you can run the bash script scripts/inference.sh, which executes inference.py directly:
python inference.py \
--task_type all \
--CFG_DIR configs/model.yaml \
--OUTPUT_DIR test_video/kaggle_clip_all.mp4 \
--video videos/kaggle_clip.mp4 \
--ped_score_threshold 0.25 \
--tl_score_threshold 0.4 &
python inference.py \
--task_type message \
--CFG_DIR configs/model.yaml \
--OUTPUT_DIR test_video/kaggle_clip_message.mp4 \
--video videos/kaggle_clip.mp4 \
--ped_score_threshold 0.25 \
--tl_score_threshold 0.4
There are two types of results you can review:
- The first type shows the raw detection results: bounding boxes for pedestrians and traffic lights, and lines for lanes.
- The second type provides feedback, such as displaying 'Stop' on the screen when a stop sign is detected or 'Proceed with caution' when a pedestrian is nearby.
You can also see the results of applying both types simultaneously.
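A minimal sketch of the kind of per-frame loop that produces these outputs is shown below; the run_detectors helper and the drawing details are illustrative assumptions, not the exact logic of inference.py:

```python
# Hypothetical per-frame loop: read a video, draw detection boxes
# (first output type) and a feedback message (second output type),
# and write the annotated frames to a new file.
import cv2

def run_detectors(frame):
    # Placeholder: the real pipeline runs the pedestrian, traffic-light,
    # and lane models here and builds a feedback message from the results.
    return {"boxes": [], "message": ""}

cap = cv2.VideoCapture("videos/kaggle_clip.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)
w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter("test_video/annotated.mp4",
                      cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))

while True:
    ok, frame = cap.read()
    if not ok:
        break
    det = run_detectors(frame)
    for x1, y1, x2, y2 in det["boxes"]:
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)    # detection view
    if det["message"]:
        cv2.putText(frame, det["message"], (30, 50),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 0, 255), 3)  # feedback view
    out.write(frame)

cap.release()
out.release()
```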
The first simulation video shows the original video (left) and the total view (right).
The second simulation shows the results with feedback only (left) and the total view (right).
prometheus5_project_AIDrivingGuide/
│
├── README.md
├── play.py
├── inference.py
├── __init__.py
│
├── engine/
│ ├── models.py
│ ├── utils.py
│ └── __init__.py
│
├── models/
│ ├── TrafficLights_Detection/
│ ├── Pedestrian_Detection/
│ └── Lane_Detection/
│
├── scripts/
│ ├── train.sh
│ └── inference.sh
│
├── configs/
│ └── model.yaml
│
├── assets/
│ ├── feedback.json
│ └── ...
│
└── app/
├── app.py
├── home.py
├── feedback.py
├── helper.py
├── settings.py
├── images/
└── videos/