ai4s_sed_demo

General purpose, real-time sound recognition demo:

The prediction is obtained by applying the audio tagging system on consecutive short audio segments. It is able to perform multiple updates per second on a moderate CPU. A sample video can be viewed at:

https://www.youtube.com/watch?v=7TEtDMzdLeY

This is a newer version. The original version can be found at this branch

Authors

This demo has been developed and trained during our previous AudioSet work, check the following links for more info and models:

Paper: https://arxiv.org/abs/1912.10211
Repository: https://github.com/qiuqiangkong/audioset_tagging_cnn

If you use our work, please consider citing us:

[1] Qiuqiang Kong, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, Mark D. Plumbley. "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition." arXiv preprint arXiv:1912.10211 (2019).

Yin Cao, Qiuqiang Kong, Andres Fernandez, Christian Kroos, Turab Iqbal, Wenwu Wang, Mark Plumbley

Installation

Repo:

At the moment, no pip installation is available. Clone this repo into <repo_root> via:

https://github.com/yinkalario/General-Purpose-Sound-Recognition-Demo

Software dependencies:

We recommend using Anaconda to install the dependencies as follows:

conda create -n panns python=3.7
conda activate panns
pip install -r requirements.txt
conda install -c anaconda pyaudio

A comprehensive list of working dependencies can be found in the full_dependencies.txt file.

pretrained model (CNN9, ~150MB):

Download the model into your preferred <model_location> via:

wget https://zenodo.org/record/3576599/files/Cnn9_GMP_64x64_300000_iterations_mAP%3D0.37.pth?download=1

Then specify the path when running the app using the MODEL_PATH flag (see sample command below).

More models can be found here and here.

RUN

Assuming our command line is on <repo_root>, the panns environment is active and the model has been downloaded into <repo_root>, the following command should run the GUI with default parameters (tested on Ubuntu20.04):

python -m sed_demo MODEL_PATH='Cnn9_GMP_64x64_300000_iterations_mAP=0.37.pth?download=1'

Note that the terminal will print all available parameters and their values upon start. The syntax to alter them is the same as with MODEL_PATH, e.g. to change the number of classes displayed to 10, add TOP_K=10.

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
assets		assets
sed_demo		sed_demo
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ai4s_sed_demo

Authors

Installation

Repo:

Software dependencies:

pretrained model (CNN9, ~150MB):

RUN

Related links:

About

Releases

Packages

Contributors 2

Languages

License

yinkalario/General-Purpose-Sound-Recognition-Demo

Folders and files

Latest commit

History

Repository files navigation

ai4s_sed_demo

Authors

Installation

Repo:

Software dependencies:

pretrained model (CNN9, ~150MB):

RUN

Related links:

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages