FIRC Anton, MALINKA Kamil, HANÁČEK Petr. Diffuse or Confuse: A Diffusion Deepfake Speech Dataset. In: 2024 International Conference of the Biometrics Special Interest Group (BIOSIG). Proceedings of the 23rd International Conference of the Biometrics Special Interest Group. Darmstadt: GI - Group for computer science, 2024
Advancements in artificial intelligence and machine learning have significantly improved synthetic speech generation. This paper explores diffusion models, a novel method for creating realistic synthetic speech. We create a diffusion dataset using available tools and pretrained models. Additionally, this study assesses the quality of diffusion-generated deepfakes versus non-diffusion ones and their potential threat to current deepfake detection systems. Findings indicate that the detection of diffusion-based deepfakes is generally comparable to non-diffusion deepfakes, with some variability based on detector architecture. Re-vocoding with diffusion vocoders shows minimal impact, and the overall speech quality is comparable to non-diffusion methods.
Download and extract the following .zip files into the same directory:
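One way to script this step is sketched below. It assumes all downloaded archives already sit in a single folder. The function name `extract_all` and both directory arguments are hypothetical, chosen for illustration; they are not part of the dataset's tooling.

```python
import zipfile
from pathlib import Path


def extract_all(archive_dir, target_dir):
    """Extract every .zip archive found in archive_dir into target_dir.

    Both paths are placeholders (assumptions); point them at wherever
    the dataset archives were downloaded and where they should be unpacked.
    Returns the names of the archives that were extracted, in sorted order.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    extracted = []
    for archive in sorted(Path(archive_dir).glob("*.zip")):
        # All archives are unpacked into the same directory, as instructed above.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        extracted.append(archive.name)
    return extracted
```

Calling `extract_all("downloads", "dataset")` would unpack every archive from a `downloads` folder into `dataset`. If two archives contain a file with the same relative path, the one extracted later overwrites the earlier one.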
@INPROCEEDINGS{firc_diffusion_2024,
author={Firc, Anton and Malinka, Kamil and Hanáček, Petr},
booktitle={2024 International Conference of the Biometrics Special Interest Group (BIOSIG)},
title={Diffuse or Confuse: A Diffusion Deepfake Speech Dataset},
year={2024},
pages={1-5},
doi={XXXX}
}