This is a demo page for our paper: SoundMorpher: Perceptually-Uniform Sound Morphing with Diffusion Model
We present SoundMorpher, an open-world and generalist sound morphing method that generates perceptually uniform morphing trajectories. Traditional sound morphing methods model the intractable relationship between the morph factor and the perceived stimuli of the resulting sounds under a linear assumption, which oversimplifies the complex nature of sound perception and limits morph quality. In contrast, SoundMorpher explores an explicit proportional mapping between the morph factor and the perceptual stimuli of morphed sounds, based on the log Mel-spectrogram. This approach enables smoother transitions between intermediate sounds, ensures perceptually consistent transformations, and can be easily extended to diverse sound morphing tasks. Furthermore, to address the lack of a formal quantitative evaluation system for sound morphing methods, we adapt a set of quantitative metrics to comprehensively assess morphed results based on established objective criteria. This evaluation system may benefit future sound morphing studies by enabling direct comparison between methods. We provide extensive experiments to demonstrate the effectiveness and versatility of SoundMorpher in real-world scenarios, highlighting its potential impact on applications such as creative music composition, film post-production, and interactive audio technologies.
If you are interested in our work, please cite it as below:
@article{niu2024soundmorpher,
title={SoundMorpher: Perceptually-Uniform Sound Morphing with Diffusion Model},
author={Niu, Xinlei and Zhang, Jing and Martin, Charles Patrick},
journal={arXiv preprint arXiv:2410.02144},
year={2024}
}