Stego-Favicons-Dataset

This repository contains 2 datasets of 32x32 pixels favicons/images modified via the LSB steganographic technique to contain different malicious payloads, i.e., PHP scripts and URLs. The payloads are selected to fit in the first bit of each color channel, i.e., max 32x32x3 bits.

The original images are borrowed from the CIFAR-10 public available dataset. The tool used for the LSB steganographic technique is the LSB-Steganography. The PHP scripts are borrowed from the PHP-Malware-Collection (28 different scripts). The malicious URLs are borrowed from the URLhaus collection.

In particular:

Dataset1: 360.000 images.
- clean: 60.000 legitimate images from the CIFAR-10 dataset;
- LSB_stego_php: 60.000 images embedded with suitable PHP scripts (obtained by combining each legitimate image with 1 PHP script);
- LSB_stego_url: 240.000 images embedded with URLs (obtained by combining each legitimate image with 4 different URLs).

The PHP scripts pool is composed of 28 different scripts whereas the URLs are 1.000, split into 500 IP addresses and 500 DNS entries.

Dataset2: 90.000 images.
- clean: 15.000 legitimate images from the CIFAR-10 dataset;
- LSB_stego_php_b64: 15.000 images embedded with suitable base-64 PHP scripts (obtained by combining each legitimate image with 1 base-64 PHP script);
- LSB_stego_url: 60.000 images embedded with unseen URLs (obtained by combining each legitimate image with 4 different URLs).

The base-64 PHP scripts pool is composed of 21 different scripts whereas the URLs are 1.000, split into 500 IP addresses and 500 DNS entries.

Detection

A Deep Learning approach has been implemented for the detection and the classification of the Stego-Favicons-Dataset. You can find more information in the respective Github repository.

References

If you use this dataset, please cite it as:

@inproceedings{fav,
title={Revealing MageCart-like threats in favicons via artificial intelligence},
author={Guarascio, Massimo and Zuppelli, Marco and Cassavia, Nunziato and Caviglione, Luca and Manco, Giuseppe},
booktitle={Proceedings of the 17th International Conference on Availability, Reliability and Security},
pages={1--7},
year={2022}
}
(https://dl.acm.org/doi/pdf/10.1145/3538969.3544437)

Acknowledgements

This work was partially supported by the Horizon 2020 Program through SIMARGL H2020-SU-ICT-01-2018, Grant Agreement No. 833042, and CyberSec4Europe, EU H2020-SU-ICT-03-2018, Grant Agreement No. 830929.

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
Dataset1		Dataset1
Dataset2		Dataset2
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Stego-Favicons-Dataset

Detection

References

Acknowledgements

About

License

Ocram95/Stego-Favicons-Dataset

Folders and files

Latest commit

History

Repository files navigation

Stego-Favicons-Dataset

Detection

References

Acknowledgements

About

Topics

Resources

License

Stars

Watchers

Forks