Improved DeepFake Detection Using Whisper Features

This repository contains the code for our paper "Improved DeepFake Detection Using Whisper Features".

The paper is available here.

Before you start

Whisper

To download the Whisper encoder used in training, run download_whisper.py.

Datasets

Download appropriate datasets:

ASVspoof2021 DF subset (please note: we use this keys & metadata file; the directory structure is explained here),
In-The-Wild dataset.

Dependencies

Install the required dependencies with the command below (we assume you are using conda and that the target environment is active):

bash install.sh

List of requirements:

python=3.8
pytorch==1.11.0
torchaudio==0.11
asteroid-filterbanks==0.4.0
librosa==0.9.2
openai whisper (git+https://github.com/openai/whisper.git@7858aa9c08d98f75575035ecd6481f462d66ca27)
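
After installation, the pinned versions can be checked from Python. The snippet below is a minimal sketch and not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. Versions are compared by prefix, which is deliberately loose so that a pin of `0.11` accepts `0.11.0` and builds such as `1.11.0+cu113`.

```python
from importlib import metadata

# Pins copied from the list of requirements above. "torch" is the
# distribution name under which PyTorch is installed.
EXPECTED = {
    "torch": "1.11.0",
    "torchaudio": "0.11",
    "asteroid-filterbanks": "0.4.0",
    "librosa": "0.9.2",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package that is missing or off-pin to the version found."""
    mismatches = {}
    for name, pin in expected.items():
        found = lookup(name)
        # Loose prefix comparison: "0.11" accepts "0.11.0", and
        # "1.11.0" accepts local builds such as "1.11.0+cu113".
        if found is None or not found.startswith(pin):
            mismatches[name] = found
    return mismatches


if __name__ == "__main__":
    for name, found in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {EXPECTED[name]}, found {found}")
```

Running the file prints one line per missing or mismatched package and nothing when every pin is satisfied.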

Supported models

This repository supports the following models. Each entry gives the model followed by the name used to select it:

SpecRNet - specrnet,
(Whisper) SpecRNet - whisper_specrnet,
(Whisper + LFCC/MFCC) SpecRNet - whisper_frontend_specrnet,
LCNN - lcnn,
(Whisper) LCNN - whisper_lcnn,
(Whisper + LFCC/MFCC) LCNN - whisper_frontend_lcnn,
MesoNet - mesonet,
(Whisper) MesoNet - whisper_mesonet,
(Whisper + LFCC/MFCC) MesoNet - whisper_frontend_mesonet,
RawNet3 - rawnet3.

To select the appropriate front-end, specify it in the config file.
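
The model names above follow a visible pattern: an optional `whisper` prefix, an optional `frontend` marker, and the backbone name last. The sketch below reads a name according to that pattern and rejects names that are not in the list. It is an illustration only: `SUPPORTED_MODELS` and `describe_model` are hypothetical names, and the repository may resolve model names differently.

```python
# Names copied from the list of supported models above.
SUPPORTED_MODELS = (
    "specrnet", "whisper_specrnet", "whisper_frontend_specrnet",
    "lcnn", "whisper_lcnn", "whisper_frontend_lcnn",
    "mesonet", "whisper_mesonet", "whisper_frontend_mesonet",
    "rawnet3",
)


def describe_model(name):
    """Split a supported model name into its backbone and feature flags."""
    if name not in SUPPORTED_MODELS:
        raise ValueError(
            f"Unknown model {name!r}; expected one of {', '.join(SUPPORTED_MODELS)}"
        )
    parts = name.split("_")
    return {
        # The backbone architecture is always the last component.
        "backbone": parts[-1],
        # A leading "whisper" marks models fed with Whisper features.
        "uses_whisper": parts[0] == "whisper",
        # "frontend" marks the Whisper + LFCC/MFCC variants.
        "uses_extra_frontend": "frontend" in parts[:-1],
    }
```

For instance, `describe_model("whisper_frontend_lcnn")` reports the `lcnn` backbone with both flags set, while `describe_model("rawnet3")` reports neither.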

Pretrained models

All models reported in the paper are available here.

Configs

Both the training and evaluation scripts are configured through CLI arguments and .yaml configuration files, e.g.:

data:
  seed: 42

checkpoint:
  path: "trained_models/lcnn/ckpt.pth"

model:
  name: "lcnn"
  parameters:
    input_channels: 1
    frontend_algorithm: ["lfcc"]
  optimizer:
    lr: 0.0001
    weight_decay: 0.0001

Other example configs are available under configs/training/.
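
Before a long run it can help to sanity-check a config. The sketch below loads a .yaml file with PyYAML and checks the fields shown in the example config. It is not the repository's own loading code: `load_config` and `validate_config` are hypothetical helpers, and the checks are assumptions drawn from that example only.

```python
def load_config(path):
    """Read a .yaml config file into a dict (requires PyYAML)."""
    # Imported here so that validate_config can be used without PyYAML.
    import yaml

    with open(path) as handle:
        return yaml.safe_load(handle)


def validate_config(config):
    """Return a list of problems found in a config shaped like the example."""
    problems = []
    model = config.get("model") or {}

    if not isinstance(model.get("name"), str):
        problems.append("model.name must be a string")

    optimizer = model.get("optimizer") or {}
    for key in ("lr", "weight_decay"):
        value = optimizer.get(key)
        if not isinstance(value, (int, float)) or value < 0:
            problems.append(f"model.optimizer.{key} must be a non-negative number")

    # Assumption: the front-end entry is optional, but a list when present.
    frontends = (model.get("parameters") or {}).get("frontend_algorithm", [])
    if not isinstance(frontends, list):
        problems.append("model.parameters.frontend_algorithm must be a list")

    return problems
```

An empty list means no problems were found, e.g. `validate_config(load_config("configs/training/whisper_specrnet.yaml"))`.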

Full train and test pipeline

To run the full training and testing pipeline, use the train_and_test.py script.

usage: train_and_test.py [-h] [--asv_path ASV_PATH] [--in_the_wild_path IN_THE_WILD_PATH] [--config CONFIG] [--train_amount TRAIN_AMOUNT] [--valid_amount VALID_AMOUNT] [--test_amount TEST_AMOUNT] [--batch_size BATCH_SIZE] [--epochs EPOCHS] [--ckpt CKPT] [--cpu]

Arguments: 
    --asv_path          Path to the ASVSpoof2021 DF root dir
    --in_the_wild_path  Path to the In-The-Wild root dir
    --config            Path to the config file
    --train_amount      Number of samples to train on (default: 100000)
    --valid_amount      Number of samples to validate on (default: 25000)
    --test_amount       Number of samples to test on (default: None - all)
    --batch_size        Batch size (default: 8)
    --epochs            Number of epochs (default: 10)
    --ckpt              Path to saved models (default: 'trained_models')
    --cpu               Force using CPU

e.g.:

python train_and_test.py --asv_path ../datasets/deep_fakes/ASVspoof2021/DF --in_the_wild_path ../datasets/release_in_the_wild --config configs/training/whisper_specrnet.yaml --batch_size 8 --epochs 10 --train_amount 100000 --valid_amount 25000
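
For reference, the documented arguments and defaults can be mirrored with `argparse` as in the sketch below. This is a reconstruction from the usage text above, not the script's actual source: `build_parser` is a hypothetical name, and the path arguments are left without defaults because none are documented.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented train_and_test.py options."""
    parser = argparse.ArgumentParser(prog="train_and_test.py")
    parser.add_argument("--asv_path", type=str, help="Path to the ASVspoof2021 DF root dir")
    parser.add_argument("--in_the_wild_path", type=str, help="Path to the In-The-Wild root dir")
    parser.add_argument("--config", type=str, help="Path to the config file")
    parser.add_argument("--train_amount", type=int, default=100000, help="Number of samples to train on")
    parser.add_argument("--valid_amount", type=int, default=25000, help="Number of samples to validate on")
    # None means "use all available test samples".
    parser.add_argument("--test_amount", type=int, default=None, help="Number of samples to test on")
    parser.add_argument("--batch_size", type=int, default=8, help="Batch size")
    parser.add_argument("--epochs", type=int, default=10, help="Number of epochs")
    parser.add_argument("--ckpt", type=str, default="trained_models", help="Path to saved models")
    parser.add_argument("--cpu", action="store_true", help="Force using CPU")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Options that are omitted fall back to the defaults listed above, so a call that passes only `--config` and `--epochs 5` still gets a batch size of 8.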

Finetune and test pipeline

To perform fine-tuning as presented in the paper, use the train_and_test.py script with a fine-tuning config.

e.g.:

python train_and_test.py --asv_path ../datasets/deep_fakes/ASVspoof2021/DF --in_the_wild_path ../datasets/release_in_the_wild --config configs/finetuning/whisper_specrnet.yaml --batch_size 8 --epochs 5  --train_amount 100000 --valid_amount 25000

Remember to decrease the learning rate when fine-tuning!
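
One way to derive a fine-tuning config from a training config is sketched below. It rests on two assumptions: that `checkpoint.path` points at the weights to start from, and that a scaled-down `lr` is what is wanted. The helper `make_finetune_config` and the default factor of 0.1 are hypothetical illustrations, not values taken from the paper or the repository.

```python
import copy


def make_finetune_config(base_config, checkpoint_path, lr_scale=0.1):
    """Return a copy of base_config with a checkpoint set and a smaller lr."""
    if not 0 < lr_scale <= 1:
        raise ValueError("lr_scale must be in (0, 1] to decrease the learning rate")
    # Deep copy so the original training config is left untouched.
    config = copy.deepcopy(base_config)
    # Assumption: this path is where pretrained weights are loaded from.
    config["checkpoint"] = {"path": checkpoint_path}
    config["model"]["optimizer"]["lr"] *= lr_scale
    return config
```

The returned dict can then be written back to a .yaml file and passed to the script through `--config`.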

Other scripts

To run training and evaluation separately, use train_models.py and evaluate_models.py, respectively.

Acknowledgments

Our codebase is based on the Attack Agnostic Dataset repo. Apart from the dependencies mentioned in that repository, we also include:

RawNet3 implementation.

Citation

If you use this code in your research, please use the following citation:

@inproceedings{kawa23b_interspeech,
  author={Piotr Kawa and Marcin Plata and Michał Czuba and Piotr Szymański and Piotr Syga},
  title={{Improved DeepFake Detection Using Whisper Features}},
  year=2023,
  booktitle={Proc. INTERSPEECH 2023},
  pages={4009--4013},
  doi={10.21437/Interspeech.2023-1537}
}
