
PyTorch code for "Rethinking CNN Models for Audio Classification"


tomasic/esc-cnn

 
 


Rethinking CNN Models for Audio Classification

This repository contains the PyTorch code for our paper Rethinking CNN Models for Audio Classification. The experiments are conducted on the following three datasets, which can be downloaded from the links provided:

  1. ESC-50
  2. UrbanSound8K
  3. GTZAN

Preprocessing

Preprocessing is run once, before training, and its output is written to disk. This saves time during training because the spectrograms do not have to be recomputed.

For ESC-50:

python preprocessing/preprocessingESC.py --csv_file ~/Downloads/ESC-50-master/meta/esc50.csv --data_dir ~/Downloads/ESC-50-master/audio --store_dir ~/Downloads/ESC-50-master/spectrograms

For UrbanSound8K:

python preprocessing/preprocessingUSC.py --csv_file /path/to/UrbanSound8K.csv --data_dir /path/to/audio_data/ --store_dir /path/to/store_spectrograms/

For GTZAN:

python preprocessing/preprocessingGTZAN.py --data_dir /path/to/audio_data/ --store_dir /path/to/store_spectrograms/ --sampling_rate 22050
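
The three preprocessing commands share the same command-line shape. As a rough sketch of that interface, and not the repository's actual code, the flags shown above could be parsed with argparse as follows. The function name `build_parser`, the help strings, and the defaults (including treating `--csv_file` as optional) are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Precompute spectrograms (sketch)")
    # --csv_file appears only in the ESC-50 and UrbanSound8K commands,
    # so it is treated as optional here (an assumption).
    parser.add_argument("--csv_file", type=str, default=None, help="metadata CSV")
    parser.add_argument("--data_dir", type=str, required=True, help="folder with audio files")
    parser.add_argument("--store_dir", type=str, required=True, help="folder for the outputs")
    # --sampling_rate appears in the GTZAN command; the default value is an assumption.
    parser.add_argument("--sampling_rate", type=int, default=22050, help="target rate in Hz")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--data_dir", "/path/to/audio_data/", "--store_dir", "/path/to/store_spectrograms/"]
)
print(args.data_dir, args.store_dir, args.sampling_rate)
```

Check each script's own `--help` output for the flags it really accepts before relying on this shape.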

Training the Models

The configurations for training the models are provided in the config folder. The file sample_config.json documents every variable used in these configurations. The command for training is:

python train.py --config_path config/esc_resnet.json
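
For readers unfamiliar with this pattern, the following is a minimal sketch of how a script might read a JSON file passed through `--config_path`. It is not taken from train.py: the helper names `load_config` and `parse_args` are hypothetical, and the key `example_key` is a placeholder rather than a variable from the repository's configurations.

```python
import argparse
import json
import os
import tempfile


def load_config(config_path):
    """Read a JSON configuration file into a plain dict."""
    with open(config_path, "r", encoding="utf-8") as f:
        return json.load(f)


def parse_args(argv=None):
    """Parse the single --config_path flag used by the training command."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config_path", type=str, required=True)
    return parser.parse_args(argv)


# Demonstrate with a throwaway file; "example_key" is a placeholder,
# not a real setting from the config folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump({"example_key": 1}, f)
    args = parse_args(["--config_path", demo_path])
    config = load_config(args.config_path)

print(config)
```

The actual variable names and their meanings are the ones described in sample_config.json.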
