[Paper] [Appendix] [Video-presentation]
This is the artifact to accompany the ASE technical track submission "Distribution Models for Falsification and Verification of DNNs".
The artifact is split into several directories.
- Directory `rq1` contains the code and data from our experiments for research question 1. The environment models used to evaluate DFV, the verifier and falsifier outputs, and the processed data can be downloaded by running `python download.py` from within the `rq1` directory (see the example commands after this list).
- Directory `rq2` contains the code and data from our experiments for research question 2. The environment models used to evaluate DFV, the falsifier outputs, and the processed data can be downloaded by running `python download.py` from within the `rq2` directory.
- Directory `rq3` contains the code and data from our experiments for research question 3. The environment models used to evaluate DFV are available in the `RQ3 Models` directory at https://drive.google.com/drive/folders/1E1-keCoZ8bvKFkud_jofXtYApKT-nTJz. The trained models and the data needed to use them can be downloaded by running `python download.py` from within the `rq3` directory.
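For example, starting from the artifact root, the per-directory data can be fetched as follows (a minimal sketch; each `download.py` is run without arguments, as described above):

```sh
# Download the pre-computed models, outputs, and processed data for each research question.
cd rq1 && python download.py && cd ..
cd rq2 && python download.py && cd ..
cd rq3 && python download.py && cd ..
```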
The instructions for reproducing our experiments are in the `INSTALL.md` file under the 'Reproducing the study' section.
This directory contains the code to replicate the evaluation of our first research question.
Use `train_vae.py` to train the Fashion MNIST network and the VAE. Use `train_vae_mrs.py` to train the VAE MRS. Models will be output to the directory `./saved_models/`.
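For example (a sketch; we assume both scripts are run without arguments from within `rq1`):

```sh
# Train the Fashion MNIST network and the VAE; weights go to ./saved_models/
python train_vae.py

# Train the VAE MRS
python train_vae_mrs.py
```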
Use `verify.sh` to verify all the properties on the Fashion MNIST network with and without DFV. By default the script runs all the tools used in the experiment; a specific tool can be selected by passing `neurify`, `nnenum`, or `verinet` as the first argument. Logs and counter-examples will be output to the directory `./output/Verifiers/`.
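For example, to run the full sweep or a single verifier (the tool name is the only documented argument):

```sh
# Run every verifier with and without DFV; logs and counter-examples go to ./output/Verifiers/
./verify.sh

# Run only nnenum
./verify.sh nnenum
```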
Use `falsify.sh` to falsify all the properties on the Fashion MNIST network with and without DFV. By default the script runs all the tools used in the experiment; a specific tool can be selected by passing `deepfool`, `bim`, `fgsm`, or `pgd` as the first argument. Logs and counter-examples will be output to the directory `./output/Falsifiers/`.
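Similarly, for falsification (a sketch using the documented tool names):

```sh
# Run every falsifier with and without DFV; logs and counter-examples go to ./output/Falsifiers/
./falsify.sh

# Run only PGD
./falsify.sh pgd
```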
The data generated by `verify.sh` in `./output/Verifiers` and by `falsify.sh` in `./output/Falsifiers` need to be processed. To do so, run `process_data.py`. The processed data will be output to the directory `./processed_data/`.
The images used in the paper can be generated by running `print_images.py`. Images will be output to the directory `./images/`.
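As a sketch, the two post-processing steps chain together as follows (we assume both scripts take no arguments):

```sh
# Summarize the raw verifier and falsifier outputs into ./processed_data/
python process_data.py

# Generate the paper figures into ./images/
python print_images.py
```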
Running all the above scripts can be time-consuming. To skip them, run `download.py`, which downloads three folders: `processed_data`, `saved_models`, and `output`. These folders contain all the data needed to execute `print_images.py` without running the other scripts.
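For example, the figure-only shortcut looks like this (a sketch):

```sh
# Fetch processed_data, saved_models, and output, then regenerate the figures without re-running the experiments
python download.py
python print_images.py
```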
- `./common` contains files needed to run `verify.sh`, `falsify.sh`, and `process_data.py`.
- `./images` contains the PNG images generated for the paper.
- `./properties` contains the properties used for the verification and falsification of our model with and without DFV.
- `./data` will be created when running `train_vae.py` or `train_vae_mrs.py` to store the Fashion MNIST dataset.
- `./saved_models` will be created when running `train_vae.py` or `train_vae_mrs.py` to store the state dictionaries of the models.
- `./output` will be created when running `verify.sh` or `falsify.sh` to store the results.
- `./processed_data` will be created when running `process_data.py` to store the calculations made on the obtained data.
This directory contains the code to replicate the evaluation of our second research question.
Use `train_models.py` to train the 90 Fashion MNIST VAEs. Use `train_vae_mrs.py` to train the VAE MRS. Models will be output to the directory `./saved_models/`.
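For example (a sketch; we assume both scripts are run without arguments from within `rq2`):

```sh
# Train the 90 Fashion MNIST VAEs; weights go to ./saved_models/
python train_models.py

# Train the VAE MRS
python train_vae_mrs.py
```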
Use `falsify_multidim.sh` to falsify all the properties on the 90 Fashion MNIST models with DFV. Use `falsify_ls.sh` to falsify all the properties on the Fashion MNIST DFV 8-2-256 model while varying the radius of the latent space. Logs and counter-examples from `falsify_multidim.sh` will be output to the directory `./output/multidim_study/`, and the results from `falsify_ls.sh` will be output to the directory `./output/ls_study/`.
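For example (a sketch; neither script is documented as taking arguments):

```sh
# Falsify the properties on the 90 DFV models; results go to ./output/multidim_study/
./falsify_multidim.sh

# Falsify the properties on the DFV 8-2-256 model over varying latent-space radii; results go to ./output/ls_study/
./falsify_ls.sh
```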
The data generated by `falsify_multidim.sh` and `falsify_ls.sh` need to be processed. To do so, run `process_data.py`. The processed data will be output to the directory `./processed_data/`.
The images used in the paper can be generated by running `print_images.py`. Images will be output to the directory `./images/`.
Running all the above scripts can be time-consuming. To skip them, run `download.py`, which downloads three folders: `processed_data`, `saved_models`, and `output`. These folders contain all the data needed to execute `print_images.py` without running the other scripts.
- `./common` contains files needed to run `falsify_multidim.sh`, `falsify_ls.sh`, and `process_data.py`.
- `./images` contains the PNG images generated for the paper.
- `./properties/multidim_study` contains the properties used for the falsification of the 90 DFV models.
- `./properties/ls_study` contains the properties used for varying the latent space radius of the DFV 8-2-256 model.
- `./data` will be created when running `train_models.py` or `train_vae_mrs.py` to store the Fashion MNIST dataset.
- `./saved_models` will be created when running `train_models.py` or `train_vae_mrs.py` to store the state dictionaries of the models.
- `./output` will be created when running `falsify_multidim.sh` or `falsify_ls.sh` to store the results.
- `./processed_data` will be created when running `process_data.py` to store the calculations made on the obtained data.
This directory contains the code to replicate the evaluation of our third research question.
To train environment models, first download the pre-processed DroNet dataset by running `python download_data.py`.
Environment models can be trained by running `./train_fc.py` to train FC-VAE_{DroNet}, `./train_vae.py` to train Conv-VAE_{DroNet}, and `./train_dcgan.py` to train GAN_{DroNet}. Models will be output to the directory `./models/`.
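For example, the data download and training steps look like this (a sketch, run from within `rq3`; we assume the training scripts take no arguments):

```sh
# Download the pre-processed DroNet dataset into ./Data
python download_data.py

# Train the three environment models; weights go to ./models/
./train_fc.py      # FC-VAE_{DroNet}
./train_vae.py     # Conv-VAE_{DroNet}
./train_dcgan.py   # GAN_{DroNet}
```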
The PGD falsifier can then be run on the DroNet network, both without DFV and with DFV using the VAE and GAN models, by running `./falsify.sh`. By default this script runs all three methods; a specific treatment can be selected by passing `dnn`, `vae`, or `gan` as the first argument, and if the first argument is `all` then all treatments are run. The script saves all counter-examples to the directory `cex/`; the second argument to `./falsify.sh` can be used to specify a different name for this directory. The third argument specifies a timeout for each job in seconds; by default this timeout is 1 hour (3600 seconds). The fourth argument specifies where to save the logs; by default this is the directory `logs/falsification_logs`.
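For example (a sketch of the four positional arguments described above; the directory names `my_cex` and `logs/vae_logs` are placeholders chosen for illustration):

```sh
# Run all three treatments with the default output directory, timeout (3600 s), and log directory
./falsify.sh all

# Run only the VAE treatment, saving counter-examples to my_cex/,
# with a 30-minute timeout per job and logs under logs/vae_logs
./falsify.sh vae my_cex 1800 logs/vae_logs
```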
After running the falsifier, the counter-examples can be converted to PNG images and the plots for the paper can be generated by running `python npy_to_png.py cex logs/falsification_logs`, where `cex` is the directory containing the counter-examples and `logs/falsification_logs` is the directory containing the logs from running `falsify.sh`. This script converts all counter-examples from the NumPy `.npy` format to PNG images, creates CSV files containing the times and MRS values for each counter-example, and generates the plots shown in the paper.
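For example, with the default locations used by `falsify.sh`:

```sh
# Convert the counter-examples in cex/ to PNGs, write the timing/MRS CSV files, and build the paper plots
python npy_to_png.py cex logs/falsification_logs
```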
Running all of the above scripts can be time-consuming. We provide a script, `download.py`, to download our data, allowing you to start from any stage of this process. This script downloads our trained environment models and the logs from running the falsifiers.
- `./benchmark` contains the GHPR-DroNet benchmark. Within this directory, `onnx/` contains the DroNet model, `properties/` contains the standard GHPR-DroNet properties, and `vae_properties/` contains the properties modified to accept an environment model prefix.
- `./cex` contains the counter-examples found, with the sub-directory `npy/` containing counter-examples in `.npy` format and `png/` containing counter-examples in `.png` format.
- `./models` will be created when running any of the training scripts (or downloading the models).
- `./logs/falsification_logs` will be created when running `falsify.sh`.
- `./Data` will be created when running `python download_data.py`, and contains the pre-processed DroNet dataset.