This folder contains the final project for the course "Human Data Analytics". It explores the performance of three deep learning models for environmental sound classification on the ESC-50 dataset: a CNN, a CNN-LSTM, and a CNN-LSTM-Attention network. The models were tested across multiple data augmentation scenarios, and performance was evaluated with a cross-validation approach to ensure robustness. The results show that audio augmentation consistently improves model performance, leading to significant accuracy gains across all models.
The repo includes the following files and directories:
└── 📁docs
└── categories.json
└── demo.ipynb
└── esc50.csv
└── frog.wav # example wav for the demo
└── preprocessing_dataset.ipynb
└── results.csv # summary of the best validation accuracy results.
└── results.ipynb # Generation of figures for the paper
└── 📁imgs # pictures generated with the results notebook and the architecture sketches
└── comparison.png
└── ...
└── 📁logs # all the training logs
└── cnn_lstm_attention_training_16k_audio.log
└── ...
└── 📁models # The best model for each fold was saved during training; only the single best model is retained here for file size reasons.
└── cnn_lstm_attention_model_audio_4.keras
└── ...
└── 📁src
└── data_utils.py # functions for dataset handling, preprocessing, and building the data loader.
└── demo.py
└── models.py
└── demo.gif
└── main.py
└── paper.pdf # Summary of the methods and results.
└── README.md
└── requirements.txt
└── test_architectures.py # trains all three models on the three dataset types and tracks training progress
To use the files and resources in this folder, follow these steps:
- Clone or download this folder to your local machine.
- Open the project in your preferred development environment.
- Create a conda environment and install the required packages via `pip install -r requirements.txt`.
- Explore the different directories to access the relevant files and resources.
- Download the ESC-50 dataset (a hedged download sketch is shown after this list).
- Generate the dataset augmentations with the `preprocessing_dataset.ipynb` notebook (an illustrative augmentation sketch also follows this list).
- Launch `test_architectures.py` to run the training for all three models and the three dataset types.
- Generate the results figures with the `results.ipynb` notebook.
- Run `streamlit run main.py` and open the app in your browser; load a WAV file or record one with your laptop's microphone and check whether the model identifies the right class (a programmatic inference sketch is included after this list).
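
The ESC-50 dataset is distributed through its GitHub repository. The sketch below is a minimal, hedged example of downloading and extracting the archive; the URL and destination folder are assumptions and are not part of this repo, so verify them against the official ESC-50 page before use.

```python
# download_esc50.py -- illustrative helper, not part of this repo.
# The archive URL and destination folder are assumptions; verify them against
# the official ESC-50 repository before use.
import io
import zipfile
import urllib.request

ESC50_ZIP_URL = "https://github.com/karolpiczak/ESC-50/archive/master.zip"  # assumed URL

def download_esc50(dest_dir: str = "data") -> None:
    """Download the ESC-50 archive and extract it into dest_dir."""
    with urllib.request.urlopen(ESC50_ZIP_URL) as response:
        archive = zipfile.ZipFile(io.BytesIO(response.read()))
    archive.extractall(dest_dir)

if __name__ == "__main__":
    download_esc50()
```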
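
The actual augmentation pipeline is defined in `preprocessing_dataset.ipynb`; the following is only an illustrative sketch of common waveform augmentations (noise injection, time shift, pitch shift). The noise level, shift range, pitch steps, and 16 kHz sample rate are assumptions, not the project's exact settings.

```python
# augmentation_sketch.py -- illustrative only; the real pipeline is in
# preprocessing_dataset.ipynb. Noise level, shift range, pitch steps and the
# 16 kHz sample rate are assumptions.
import numpy as np
import librosa

def add_white_noise(y: np.ndarray, noise_factor: float = 0.005) -> np.ndarray:
    """Mix white Gaussian noise into the waveform."""
    return y + noise_factor * np.random.randn(len(y))

def time_shift(y: np.ndarray, max_shift: int = 4000) -> np.ndarray:
    """Circularly shift the waveform by a random number of samples."""
    return np.roll(y, np.random.randint(-max_shift, max_shift))

def pitch_shift(y: np.ndarray, sr: int, n_steps: float = 2.0) -> np.ndarray:
    """Shift the pitch by n_steps semitones without changing duration."""
    return librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)

if __name__ == "__main__":
    y, sr = librosa.load("docs/frog.wav", sr=16000)  # example clip from this repo
    augmented = pitch_shift(time_shift(add_white_noise(y)), sr)
```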
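
For a programmatic alternative to the Streamlit demo, the sketch below shows how a saved model could be loaded and run on `docs/frog.wav`. It is a minimal sketch under assumptions: the real feature extraction lives in `src/data_utils.py` and must be reproduced exactly; the sample rate, mel-spectrogram parameters, and chosen model file are guesses, and a custom attention layer may require `custom_objects` when loading.

```python
# inference_sketch.py -- illustrative only. The exact preprocessing lives in
# src/data_utils.py and must match what the model was trained on; the sample
# rate, mel-spectrogram parameters and model filename below are assumptions.
import numpy as np
import librosa
import tensorflow as tf

MODEL_PATH = "models/cnn_lstm_attention_model_audio_4.keras"  # one of the saved models
SR = 16000  # assumed from the "16k" naming in the training logs

def extract_features(wav_path: str) -> np.ndarray:
    """Compute a log-mel spectrogram with batch and channel axes (parameters assumed)."""
    y, _ = librosa.load(wav_path, sr=SR, duration=5.0)  # ESC-50 clips are 5 s long
    mel = librosa.feature.melspectrogram(y=y, sr=SR, n_mels=64)
    log_mel = librosa.power_to_db(mel)
    return log_mel[np.newaxis, ..., np.newaxis]

# If the attention model uses a custom Keras layer, pass it via custom_objects.
model = tf.keras.models.load_model(MODEL_PATH)
probs = model.predict(extract_features("docs/frog.wav"))[0]
print("Predicted class index:", int(np.argmax(probs)))
# docs/categories.json presumably maps class indices to labels (format not shown here).
```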