Drone / Unmanned Aerial Vehicle (UAV) Detection is a safety-critical project. It takes in infrared (IR) video streams and detects drones in them with high accuracy.
Scope:
Real-time Unmanned Aerial Vehicle (UAV) detection system. The objective of the project is to build a real-time embedded drone detection system for a flying vehicle using infrared data. The model should detect UAVs across varying UAV sizes and types, altitudes, distances, and lighting conditions.
Work Done:
- Benchmark of YOLOv5s on Nvidia Jetson TX2 to understand the hardware requirements.
- The purpose of this activity was to measure how many FPS a YOLOv5s model can achieve on the Jetson TX2, an embedded device, and from that estimate how much more or less compute our embedded device needs.
- We achieved at most 37 FPS on the Jetson TX2, using 100% GPU, 20-30% RAM, and 70-90% CPU.
- We used TensorRT inference optimization to optimize the 640x640 YOLOv5s PyTorch model with a single class (UAV) and ran inference on a saved video.
- The following conclusions came out of it:
- The main bottleneck was the GPU, so we need either a better GPU or a way to balance the load across CPU and GPU.
- The machine got too hot (>90 °C) within 45 minutes of continuous operation, so industrial-grade or military-grade hardware is required.
- The best embedded hardware according to the benchmark is the Nvidia Jetson AGX Xavier: [Link]
- Following are the benchmarks done on Jetson TX2 (for a more detailed description of the hardware problem, see hardware_problem_description.pdf):
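
As an illustration of how a throughput figure such as the 37 FPS above can be measured, here is a minimal sketch in plain Python. It is not the benchmark script used in this project: `measure_fps`, `frame_budget_ms`, and `dummy_infer` are hypothetical names, and `dummy_infer` is only a stand-in for the real TensorRT inference call.

```python
import time


def measure_fps(infer, frames, warmup=5):
    """Return the average frames per second of `infer` over `frames`."""
    frames = list(frames)
    # Warm-up calls are excluded from timing (caches, lazy initialization).
    for frame in frames[:warmup]:
        infer(frame)
    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start
    return len(frames) / elapsed


def frame_budget_ms(fps):
    """Per-frame time budget in milliseconds for a given frame rate."""
    return 1000.0 / fps


if __name__ == "__main__":
    def dummy_infer(frame):
        # Hypothetical stand-in for the real model call (assumption).
        time.sleep(0.001)
        return []

    print(f"measured: {measure_fps(dummy_infer, range(50)):.1f} FPS")
    print(f"budget at 37 FPS: {frame_budget_ms(37):.1f} ms")  # 27.0 ms
    print(f"budget at 50 FPS: {frame_budget_ms(50):.1f} ms")  # 20.0 ms
```

Read as a time budget, 37 FPS corresponds to roughly 27 ms per frame, while a 50 FPS target leaves 20 ms per frame for capture, inference, and post-processing combined.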
Work in Progress:
- UAV Detection Model training:
- We trained a YOLOv5s model that achieves 99.91% mAP at an IoU threshold of 0.5.
- The dataset used is a publicly available one. Its details are as follows:
- The dataset is part of the Anti-UAV challenge hosted at ICCV, a highly regarded computer vision conference. The organizers have open-sourced the data.
- The data covers different lighting conditions, backgrounds, drones, and distances.
- Following is the link to data: [Link]
- We also tested the model after fine-tuning on our own dataset to evaluate how well it generalizes, and the results were very good. Following is the link to a demo video of inference with the latest model: [Link]
- Using data augmentation techniques designed specifically for small-object detection increases accuracy.
- Following are the evaluation metrics for object detection:
train/box_loss | train/obj_loss | train/cls_loss | metrics/precision | metrics/recall | metrics/mAP_0.5 | metrics/mAP_0.5:0.95 | val/box_loss | val/obj_loss | val/cls_loss | x/lr0 | x/lr1 | x/lr2 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.027355 | 0.0039657 | 0 | 0.989 | 0.98299 | 0.98849 | 0.66212 | 0.021445 | 0.0023965 | 0 | 0.0091406 | 0.0091406 | 0.0091406 |
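
To make the "0.5 IoU threshold" behind `metrics/mAP_0.5` concrete, here is a minimal sketch of the intersection-over-union test that decides whether a predicted box counts as a correct detection. The function names and the `(x1, y1, x2, y2)` pixel box format are assumptions for illustration; this is not the YOLOv5 evaluation code.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, truth, threshold=0.5):
    """A prediction matches a ground-truth box if IoU >= threshold."""
    return iou(pred, truth) >= threshold


if __name__ == "__main__":
    # 20x20 box shifted by 2 px in x and y: IoU = 324 / 476, about 0.68.
    print(is_true_positive((12, 12, 32, 32), (10, 10, 30, 30)))  # True
    # 10x10 box shifted by 4 px in x: IoU = 60 / 140, about 0.43.
    print(is_true_positive((14, 10, 24, 20), (10, 10, 20, 20)))  # False
```

The second case shows why small objects are hard: a shift of only a few pixels on a 10x10 box is enough to drop below the threshold.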
- Drone vs Bird Classification:
- A classification model based on the trajectory of the detected object.
- The current model consistently labels both drones and birds as drones, so adding a trajectory-based classification step makes drone detection more robust.
- After brainstorming we came up with the following features:
- X-coordinate in pixels
- Y-coordinate in pixels
- Z-coordinate, estimated from the UAV bounding-box size relative to the absolute frame size
- Curvature of the drone's trajectory at a particular frame
- Euclidean distance from the position in the previous frame
- We trained an ensemble of methods and have yet to identify the best model.
- PS: The training dataset does not contain a bird class, so we tested this on a custom dataset. Labelling birds as a separate class might also make the model more robust. However, since the objects can become extremely small, it is better to go with this trajectory-based classification approach.
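
The trajectory features listed above could be computed per frame roughly as follows. This is a sketch under stated assumptions, not the project's feature code: the track format `(cx, cy, w, h)`, the use of box area over frame area as the z proxy, and Menger curvature over three consecutive box centres are all illustrative choices.

```python
import math


def menger_curvature(p0, p1, p2):
    """Curvature of the circle through three points (0 if degenerate)."""
    a = math.dist(p0, p1)
    b = math.dist(p1, p2)
    c = math.dist(p0, p2)
    if a == 0 or b == 0 or c == 0:
        return 0.0
    # The cross product gives twice the signed triangle area.
    cross = (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p1[1] - p0[1]) * (p2[0] - p0[0])
    return 2.0 * abs(cross) / (a * b * c)


def trajectory_features(track, frame_w, frame_h):
    """track: list of (cx, cy, w, h) boxes in pixels, one per frame."""
    feats = []
    for i, (cx, cy, w, h) in enumerate(track):
        # Assumed z proxy: bounding-box area relative to frame area.
        z_proxy = (w * h) / (frame_w * frame_h)
        # Euclidean distance from the box centre in the previous frame.
        step = math.dist((cx, cy), track[i - 1][:2]) if i >= 1 else 0.0
        # Curvature needs three points, so it is 0 for the first two frames.
        curv = (
            menger_curvature(track[i - 2][:2], track[i - 1][:2], (cx, cy))
            if i >= 2
            else 0.0
        )
        feats.append(
            {"x": cx, "y": cy, "z": z_proxy, "curvature": curv, "step": step}
        )
    return feats


if __name__ == "__main__":
    # Hypothetical three-frame track in a 640x512 frame.
    track = [(5, 0, 10, 10), (0, 5, 10, 10), (-5, 0, 10, 10)]
    for row in trajectory_features(track, 640, 512):
        print(row)
```

The resulting per-frame rows can be fed to any tabular classifier; which classifier works best is still open, as noted above.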
- Model Optimization:
- This is one of the most critical parts of the project because a full-blown deep learning model cannot run on an embedded device at 50 FPS.
- So model quantization as well as sparsification need to be done in order to make maximum use of the hardware resources and get the best results.
- We have tried several optimization techniques as follows:
- DeepSparse by Neural Magic: This is probably one of the best methods for our needs, but we ran into issues setting it up on the Jetson TX2.
- AIMET by Qualcomm: Another very promising toolkit, but we did not get much chance to explore it due to time constraints.
- TAO by Nvidia: A toolkit that could be very fruitful, but again we did not explore it much due to time constraints.
- TensorRT: This is the tool we worked with, and it is already fully set up on the Nvidia Jetson TX2.
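
For readers new to the two ideas named above, here is a toy sketch of sparsification (magnitude pruning) and quantization (symmetric int8) on a plain list of weights. It only illustrates the concepts: it does not use or reproduce the APIs of DeepSparse, AIMET, TAO, or TensorRT, and all function names are hypothetical.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices sorted by |w|; the first k are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization; returns (ints, scale)."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Map int8 values back to approximate float weights."""
    return [v * scale for v in q]


if __name__ == "__main__":
    w = [0.02, -1.27, 0.5, -0.01, 0.9, 0.003]
    sparse = prune_by_magnitude(w, 0.5)
    q, scale = quantize_int8(sparse)
    print(sparse)  # [0.0, -1.27, 0.5, 0.0, 0.9, 0.0]
    print(q)       # [0, -127, 50, 0, 90, 0]
    print(dequantize(q, scale))
```

Pruning reduces the number of non-zero weights and quantization shrinks each remaining weight to 8 bits; real toolkits add calibration, per-channel scales, and retraining on top of this basic idea.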
PRs are welcome!