This codebase was created as a submission for CSCE-642 at Texas A&M University by Dineth Gunawardena and Wahib Kapdi.
DISCLAIMER: This codebase is based on Kim Minji's SLTtrack GitHub repository. That repository provides an excellent framework for improving the performance of the TransT tracker with different RL techniques; we stripped it down to the necessary files and built our A2C-based SLT implementation on top of it.
Traditional visual object trackers often rely on frame-level tracking, which struggles with challenges like occlusion, motion blur, or changing ambient conditions. This project addresses these issues by treating tracking as a sequential decision problem. By incorporating past frames and bounding boxes as inputs to a reinforcement learning (RL) network, we aim to reduce random perturbations and improve tracking robustness in difficult scenarios.
This project seeks to enhance visual object tracking performance by integrating RL algorithms with existing tracking frameworks. Specifically, we use Transformer Tracking (TransT) with sequence-level reinforcement learning. We trained and tested our models on the GOT10K dataset, drawing on the implementation from Kim et al. (2022).
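Conceptually, sequence-level training scores a whole tracked episode rather than individual frames. The sketch below is purely illustrative and is not the repository's actual code: `iou`, `sequence_reward`, and the placeholder critic value are our stand-ins for the real network components.

```python
import numpy as np

def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x, y, w, h)."""
    ax1, ay1, ax2, ay2 = box_a[0], box_a[1], box_a[0] + box_a[2], box_a[1] + box_a[3]
    bx1, by1, bx2, by2 = box_b[0], box_b[1], box_b[0] + box_b[2], box_b[1] + box_b[3]
    inter = max(0.0, min(ax2, bx2) - max(ax1, bx1)) * max(0.0, min(ay2, by2) - max(ay1, by1))
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def sequence_reward(pred_boxes, gt_boxes):
    """Sequence-level reward: mean per-frame IoU over the whole episode."""
    return float(np.mean([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]))

# A2C-style update signal: the advantage is the sequence reward minus the
# critic's value estimate (the critic network itself is omitted here).
reward = sequence_reward([(0, 0, 10, 10), (6, 6, 10, 10)],
                         [(0, 0, 10, 10), (5, 5, 10, 10)])
value_estimate = 0.5                 # placeholder for a learned critic output
advantage = reward - value_estimate  # scales the policy-gradient term in A2C
```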
We tested the code in the following environment; other versions may also be compatible.
- CUDA 11.3
- Python 3.9
- PyTorch 1.10.1
- Torchvision 0.11.2
# Create and activate a conda environment
conda create -y --name slt python=3.9
conda activate slt
# Install PyTorch
conda install pytorch==1.10.1 torchvision==0.11.2 cudatoolkit=11.3 -c pytorch -c conda-forge
# Install requirements
pip install -r requirements.txt
sudo apt-get install libturbojpeg
Add your workspace path and your local GOT-10k dataset path to the two local.py files (ltr/admin/local.py and pytracking/evaluation/local.py).
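Both files follow the PyTracking convention of a plain settings object. The attribute names below (`workspace_dir`, `got10k_dir`, `got10k_path`, `results_path`) are the usual PyTracking ones, but verify them against the local.py stubs generated in your own checkout; the paths are placeholders.

```python
# ltr/admin/local.py -- training-side paths (sketch; check the generated stub)
class EnvironmentSettings:
    def __init__(self):
        self.workspace_dir = '/path/to/your/workspace'  # checkpoints and logs
        self.got10k_dir = '/path/to/got10k/train'       # local GOT-10k training split

# pytracking/evaluation/local.py -- evaluation-side paths (sketch)
def local_env_settings():
    settings = EnvironmentSettings()
    settings.got10k_path = '/path/to/got10k'            # GOT-10k root for evaluation
    settings.results_path = '/path/to/your/workspace/tracking_results'
    return settings
```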
- Store one of the two models above on your local system.
- Add the model's path in slt_transt.py.
- Then run:
python pytracking/run_tracker.py slt_transt slt_transt --dataset_name got10k_test
- Then pack the results and submit the generated .zip to the GOT-10k evaluation server:
python pytracking/util_scripts/pack_got10k_results.py slt_transt slt_transt
- Store the baseline model above on your local system.
- Add its path as the pretrained model in slt_transt.py.
- Then run:
python ltr/run_training.py slt_transt slt_transt
- Store this model on your local system.
- Add the model's path in ac_slt_transt.py.
- Then run:
python pytracking/run_tracker.py ac_slt_transt ac_slt_transt --dataset_name got10k_test
- Then pack the results and submit the generated .zip to the GOT-10k evaluation server:
python pytracking/util_scripts/pack_got10k_results.py ac_slt_transt ac_slt_transt
- Store the baseline model above on your local system.
- Add its path as the pretrained model in ac_slt_transt.py.
- Then run:
python ltr/run_training.py ac_slt_transt ac_slt_transt
Model | Data Used | AO | SR50 | SR75
---|---|---|---|---
TransT (Baseline)** [1] | Multiple | 66.2 | 75.5 | 58.5
SLT + TransT [1] | Multiple | 72.0 | 81.6 | 68.3
SLT + TransT (Ours) | GOT-10k | 72.5 | 82.2 | 68.8
A2C SLT + TransT (Ours) | GOT-10k | 70.5 | 79.9 | 66.3
Notes:
- "Multiple" denotes a superset of the following publicly available datasets: TrackingNet, GOT-10k, LaSOT, ImageNet-VID, DAVIS, YouTube-VOS, MS-COCO, SBD, LVIS, ECSSD, MSRA10k, and HKU-IS.
- ** Numbers we did not verify ourselves, taken directly from Kim et al. (2022) (see Acknowledgments).
The table compares Average Overlap (AO), Success Rate at overlap 0.5 (SR50), and Success Rate at overlap 0.75 (SR75) on the GOT-10k test set for the baseline, the SLT-enhanced versions (from the referenced paper and this work), and the proposed A2C SLT + TransT method.
The results show that SLT clearly improves on the baseline. Although we were unable to train our A2C SLT for long enough, it still surpasses the baseline on all metrics and comes close to the SCST-based SLT tracker.
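For reference, the metrics in the table follow the standard GOT-10k definitions, computed from per-frame overlaps. The helper below is our illustrative implementation, not part of the GOT-10k toolkit:

```python
import numpy as np

def got10k_metrics(overlaps):
    """AO / SR50 / SR75 from per-frame overlap (IoU) values in [0, 1]."""
    overlaps = np.asarray(overlaps, dtype=float)
    ao = float(overlaps.mean())             # Average Overlap: mean IoU over frames
    sr50 = float((overlaps > 0.50).mean())  # fraction of frames with IoU > 0.50
    sr75 = float((overlaps > 0.75).mean())  # fraction of frames with IoU > 0.75
    return ao, sr50, sr75

ao, sr50, sr75 = got10k_metrics([0.9, 0.6, 0.4, 0.8])  # sr50 = 0.75, sr75 = 0.5
```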
- Explore other base trackers (e.g., Siamese trackers, TransDiMP).
- Compare performance with different RL techniques (e.g., PPO, DDPG).
- Add attention mechanisms for smoother tracking.
- Implement real-time tracking capabilities.
- Generate more visual results and user-friendly frontends.
SLTtrack was not developed by us; it comes from the 2022 ECCV paper by Kim, Minji, et al.:
@inproceedings{SLTtrack,
title={Towards Sequence-Level Training for Visual Tracking},
author={Kim, Minji and Lee, Seungkwan and Ok, Jungseul and Han, Bohyung and Cho, Minsu},
booktitle={ECCV},
year={2022}
}
We used its codebase as our starting point. SLTtrack is built upon the PyTracking library and also borrows from TransT. We would like to thank the authors for providing these great frameworks and toolkits.
Dineth Gunawardena: pgunawardena@tamu.edu
Wahib Kapdi: wahibkapdi@tamu.edu