This is a re-implementation of the UAVSal model.
Related Project
- Kao Zhang, Zhenzhong Chen, Shan Liu. A Spatial-Temporal Recurrent Neural Network for Video Saliency Prediction. IEEE Transactions on Image Processing (TIP), vol. 30, pp. 572-587, 2021.
  Github: https://github.com/zhangkao/IIP_STRNN_Saliency
- Kao Zhang, Zhenzhong Chen. Video Saliency Prediction Based on Spatial-Temporal Two-Stream Network. IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), vol. 29, no. 12, pp. 3544-3557, 2019.
  Github: https://github.com/zhangkao/IIP_TwoS_Saliency
The code was developed with Python 3.6+, PyTorch 1.4+, and CUDA 10.0+. Other software versions may cause compatibility problems.
- Windows10/11 or Ubuntu20.04
- Anaconda (latest), Python
- CUDA, CUDNN
You can create a new environment in Anaconda as follows:
*For GeForce GTX 10 series, such as the GTX 1080, Xp, etc. (PyTorch 1.4.0~1.7.1, Python 3.6~3.8)

```
conda create -n uavsal python=3.8
conda activate uavsal
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
pip install numpy hdf5storage h5py==2.10.0 scipy matplotlib opencv-python scikit-image torchsummary
```
*For GeForce RTX 30 series, such as the RTX 3060, RTX 3080, etc. (PyTorch 1.7.1, Python 3.7)

```
conda create -n uavsal python=3.7
conda activate uavsal
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=11.0 -c pytorch
pip install numpy hdf5storage h5py==2.10.0 scipy matplotlib opencv-python scikit-image torchsummary
```
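After creating either environment, you can quickly verify that the packages are installed and that PyTorch sees the GPU. This is a minimal check, not part of the repository:

```python
# Minimal environment check: print package versions and CUDA visibility.
import torch
import torchvision
import cv2

print('PyTorch:', torch.__version__)            # e.g. 1.4.0 or 1.7.1
print('torchvision:', torchvision.__version__)
print('OpenCV:', cv2.__version__)
print('CUDA available:', torch.cuda.is_available())
if torch.cuda.is_available():
    print('GPU:', torch.cuda.get_device_name(0))
```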
Download the pre-trained models and put them into the "weights" folder.
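A minimal sketch of loading such weights in PyTorch; the model class `UAVSal`, its import path, and the file name `uavsal.pth` are placeholders here, not the repository's actual names:

```python
import torch
# Placeholder import: substitute the actual model class from this repository.
from model import UAVSal

model = UAVSal()
# 'weights/uavsal.pth' is a hypothetical file name; use the downloaded one.
state_dict = torch.load('weights/uavsal.pth', map_location='cpu')
model.load_state_dict(state_dict)
model.eval()  # switch to inference mode
```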
The parameters
- Change the working directory "dataDir" in the "Demo_Test.py" and "Demo_Train_Test.py" files to your own path, e.g.:
  dataDir = '/home/name/DataSet/'
- More parameters are in the "train" and "test" functions.
- Run the demos "Demo_Test.py" and "Demo_Train_Test.py" to test or train the model.
The full training process:
- We initialize the SRF-Net with a pretrained MobileNet V2 and fine-tune the model on the SALICON dataset. Then we train the whole model on EyeTrackUAV2 and AVS1K, respectively, as sketched below.
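A sketch of the backbone initialization step, assuming the standard torchvision API; the exact way SRF-Net wraps the backbone is defined in this repository's code:

```python
import torchvision

# ImageNet-pretrained MobileNet V2 from torchvision
# (the pretrained=True form matches the torchvision 0.5~0.8 versions above).
backbone = torchvision.models.mobilenet_v2(pretrained=True)
features = backbone.features  # convolutional feature extractor reused for SRF-Net
```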
The training and testing datasets:
- Training datasets: SALICON (2015), UAV2, and AVS1K
- Testing datasets: UAV2-TE and AVS1K-TE
The training and test data examples:
It is easy to change the output format in our code.
- The results of the video task are saved in ".mat" (uint8) format; see the sketch after this list.
- You can get color visualization results with the "Visualization Tools".
- You can evaluate the performance with the "EvalScores Tools".
- You can get the parameter size of each component with the "Getmodelsize Tools".
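For example, a minimal sketch of reading one saved ".mat" result and dumping a frame as an image; the file name and the variable key 'salmap' are placeholders that depend on the saving code:

```python
import hdf5storage
import cv2

# Placeholder file name; point this at one of the saved results.
data = hdf5storage.loadmat('results/video001.mat')
salmaps = data['salmap']  # hypothetical key; a uint8 array, e.g. (H, W, num_frames)

# Write the first frame as a grayscale image for a quick look.
cv2.imwrite('frame000.png', salmaps[:, :, 0])
```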
Results: ALL (5.8G)
- The model is trained using the Adam optimizer with lr=0.0001 and weight_decay=0.00005.
- The model is trained using the Adam optimizer with lr=0.00001 and weight_decay=0.000005 (see the sketch below).
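In PyTorch, the two settings above correspond to the following (a minimal sketch; the placeholder layer stands in for the actual network):

```python
import torch

model = torch.nn.Conv2d(3, 1, kernel_size=3)  # placeholder for the real model

# Setting 1: lr=0.0001, weight_decay=0.00005
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=5e-5)

# Setting 2: lr=0.00001, weight_decay=0.000005
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5, weight_decay=5e-6)
```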
The model can achieve faster speed (85 FPS) with similar performance by slightly reducing the input size from the original 360 x 640 pixels to 288 x 512 pixels.
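A sketch of that preprocessing with OpenCV; note that cv2.resize takes the target size as (width, height):

```python
import cv2
import numpy as np

# Stand-in for a decoded 360 x 640 video frame.
frame = np.zeros((360, 640, 3), dtype=np.uint8)

# Faster setting: resize to 288 x 512 (H x W).
small = cv2.resize(frame, (512, 288))  # dsize is in (width, height) order
print(small.shape)  # (288, 512, 3)
```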
A video demo is provided for comparison with state-of-the-art methods, including: OneDrive (596M)
- DL-based models: STRNN*, TwoS*, TASED*, UNISAL*
- Non-DL-based models: GBVSm, AWSD
- The models fine-tuned on the corresponding dataset (UAV2 and AVS1K) are marked with *.
- After fine-tuning, the performance of these models improved significantly.
- The first four scenes are from the UAV2-TE dataset; the rest are from AVS1K-TE.
If you use the UAVSal video saliency model, please cite the following paper:
@article{zhang2022an,
title={An Efficient Saliency Prediction Model for Unmanned Aerial Vehicle Video},
author={Zhang, Kao and Chen, Zhenzhong and Li, Songnan and Liu, Shan},
journal={ISPRS Journal of Photogrammetry and Remote Sensing},
volume={xxxx},
pages={xxxx},
year={2022}
}
Kao ZHANG
Laboratory of Intelligent Information Processing (LabIIP)
Wuhan University, Wuhan, China.
Email: zhangkao@whu.edu.cn
Zhenzhong CHEN (Professor and Director)
Laboratory of Intelligent Information Processing (LabIIP)
Wuhan University, Wuhan, China.
Email: zzchen@whu.edu.cn
Web: http://iip.whu.edu.cn/~zzchen/