Skip to content

Implementation of the paper "Improved DeepFake Detection Using Whisper Features"

License

Notifications You must be signed in to change notification settings

piotrkawa/deepfake-whisper-features

Repository files navigation

Improved DeepFake Detection Using Whisper Features

The following repository contains code for our paper called "Improved DeepFake Detection Using Whisper Features".

The paper is available here.

Before you start

Whisper

To download Whisper encoder used in training run download_whisper.py.

Datasets

Download appropriate datasets:

Dependencies

Install required dependencies using (we assume you're using conda and the target env is active):

bash install.sh

List of requirements:

python=3.8
pytorch==1.11.0
torchaudio==0.11
asteroid-filterbanks==0.4.0
librosa==0.9.2
openai whisper (git+https://github.com/openai/whisper.git@7858aa9c08d98f75575035ecd6481f462d66ca27)

Supported models

The following list concerns models and its names to select it supported by this repository:

  • SpecRNet - specrnet,
  • (Whisper) SpecRNet - whisper_specrnet,
  • (Whisper + LFCC/MFCC) SpecRNet - whisper_frontend_specrnet,
  • LCNN - lcnn,
  • (Whisper) LCNN - whisper_lcnn,
  • (Whisper + LFCC/MFCC) LCNN -whisper_frontend_lcnn,
  • MesoNet - mesonet,
  • (Whisper) MesoNet - whisper_mesonet,
  • (Whisper + LFCC/MFCC) MesoNet - whisper_frontend_mesonet,
  • RawNet3 - rawnet3.

To select appropriate front-end please specify it in the config file.

Pretrained models

All models reported in paper are available here.

Configs

Both training and evaluation scripts are configured with the use of CLI and .yaml configuration files. e.g.:

data:
  seed: 42

checkpoint: 
  path: "trained_models/lcnn/ckpt.pth",

model:
  name: "lcnn"
  parameters:
    input_channels: 1
    frontend_algorithm: ["lfcc"]
  optimizer:
    lr: 0.0001
    weight_decay: 0.0001

Other example configs are available under configs/training/.

Full train and test pipeline

To perform full pipeline of training and testing please use train_and_test.py script.

usage: train_and_test.py [-h] [--asv_path ASV_PATH] [--in_the_wild_path IN_THE_WILD_PATH] [--config CONFIG] [--train_amount TRAIN_AMOUNT] [--test_amount TEST_AMOUNT] [--batch_size BATCH_SIZE] [--epochs EPOCHS] [--ckpt CKPT] [--cpu]

Arguments: 
    --asv_path          Path to the ASVSpoof2021 DF root dir
    --in_the_wild_path  Path to the In-The-Wild root dir
    --config            Path to the config file
    --train_amount      Number of samples to train on (default: 100000)
    --valid_amount      Number of samples to validate on (default: 25000)
    --test_amount       Number of samples to test on (default: None - all)
    --batch_size        Batch size (default: 8)
    --epochs            Number of epochs (default: 10)
    --ckpt              Path to saved models (default: 'trained_models')
    --cpu               Force using CPU

e.g.:

python train_and_test.py --asv_path ../datasets/deep_fakes/ASVspoof2021/DF --in_the_wild_path ../datasets/release_in_the_wild --config configs/training/whisper_specrnet.yaml --batch_size 8 --epochs 10 --train_amount 100000 --valid_amount 25000

Finetune and test pipeline

To perform finetuning as presented in paper please use train_and_test.py script.

e.g.:

python train_and_test.py --asv_path ../datasets/deep_fakes/ASVspoof2021/DF --in_the_wild_path ../datasets/release_in_the_wild --config configs/finetuning/whisper_specrnet.yaml --batch_size 8 --epochs 5  --train_amount 100000 --valid_amount 25000

Please remember about decreasing the learning rate!

Other scripts

To use separate scripts for training and evaluation please refer to respectively train_models.py and evaluate_models.py.

Acknowledgments

We base our codebase on Attack Agnostic Dataset repo. Apart from the dependencies mentioned in Attack Agnostic Dataset repository we also include:

Citation

If you use this code in your research please use the following citation:

@inproceedings{kawa23b_interspeech,
  author={Piotr Kawa and Marcin Plata and Michał Czuba and Piotr Szymański and Piotr Syga},
  title={{Improved DeepFake Detection Using Whisper Features}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={4009--4013},
  doi={10.21437/Interspeech.2023-1537}
}

About

Implementation of the paper "Improved DeepFake Detection Using Whisper Features"

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published