Lyrics-Based Song Genre Classification

Lyrics-Based Song Genre Classification (Project @ NTNU). Have a look at the full report.

Abstract

Musical genres are essential for organizing songs into collections and for providing well-functioning music recommendation and retrieval. To support these applications, songs need to be tagged with their appropriate genre(s). Annotation of genres by humans is time-consuming and costly, while reliable automatic song genre classification is difficult, especially because the boundaries between musical genres are not clearly defined. Song genre classification therefore remains a challenging topic. We address this task using only song lyrics. To do so, we implement both traditional and machine learning text classification methods. Furthermore, we investigate how the classification performance of all methods depends on the number of genres considered in the dataset. Our experiments show that the performance of all text classification methods degrades as the number of considered genres increases. The best results were consistently achieved using the Bernoulli Naive Bayes classifier.
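For readers unfamiliar with the best-performing method, the following is a minimal, dependency-free sketch of a Bernoulli Naive Bayes text classifier. It is not the code used in this project: the class name `TinyBernoulliNB`, the whitespace tokenizer, and the toy lyrics are all illustrative assumptions. scikit-learn, which appears in the installation section, provides a full implementation as `sklearn.naive_bayes.BernoulliNB`.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenization; only word presence is kept."""
    return set(text.lower().split())


class TinyBernoulliNB:
    """Bernoulli Naive Bayes over binary word-presence features (illustrative only)."""

    def fit(self, texts, labels):
        docs = [tokenize(t) for t in texts]
        self.vocab = sorted(set().union(*docs))
        self.classes = sorted(set(labels))
        n_docs = Counter(labels)
        doc_freq = defaultdict(Counter)  # class -> word -> number of documents containing it
        for words, label in zip(docs, labels):
            doc_freq[label].update(words)
        self.log_prior = {c: math.log(n_docs[c] / len(labels)) for c in self.classes}
        # Laplace-smoothed estimate of P(word present | class).
        self.p_present = {
            c: {w: (doc_freq[c][w] + 1) / (n_docs[c] + 2) for w in self.vocab}
            for c in self.classes
        }
        return self

    def predict(self, text):
        words = tokenize(text)

        def log_score(c):
            total = self.log_prior[c]
            for w in self.vocab:
                p = self.p_present[c][w]
                # Absent words also contribute, which distinguishes the
                # Bernoulli model from the multinomial one.
                total += math.log(p if w in words else 1.0 - p)
            return total

        return max(self.classes, key=log_score)


# Made-up toy data, not taken from the project's dataset.
toy_lyrics = ["guitar riff loud drums", "loud guitar solo",
              "love heart dance", "dance love tonight"]
toy_genres = ["rock", "rock", "pop", "pop"]
model = TinyBernoulliNB().fit(toy_lyrics, toy_genres)
print(model.predict("loud guitar"))
```

Words outside the training vocabulary are ignored at prediction time, which is one simple way to handle unseen tokens.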

Confusion Matrix using Naive Bayes

Macro-averaged F1 score per number of considered genres

Installation (probably outdated)

conda create -n song_classification python=3.8
conda activate song_classification

conda install -c anaconda pandas tensorflow scikit-learn nltk seaborn
conda install -c conda-forge tqdm spacy matplotlib

python -m spacy download en_core_web_sm
pip install spacy-langdetect more-itertools contractions

# fix errors that may occur
pip install numpy==1.19.5
python -m pip install PyQt5
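Since these instructions may be outdated, it can help to check which packages are actually importable before running any script. The sketch below is not part of the repository; the list of import names is an assumption derived from the packages installed above (scikit-learn is imported as `sklearn`, and hyphens in package names become underscores).

```python
from importlib.util import find_spec

# Import names assumed from the install commands above.
REQUIRED_MODULES = [
    "pandas", "tensorflow", "sklearn", "nltk", "seaborn", "tqdm",
    "spacy", "matplotlib", "spacy_langdetect", "more_itertools", "contractions",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("All modules found." if not missing else f"Missing: {', '.join(missing)}")
```

This only confirms that a module can be located, not that its version is compatible with the code.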

Dataset

This repository contains the processed dataset used in this work. To regenerate it, download the Kaggle dataset "Song lyrics from 79 musical genres" and put its two files into data/. Afterwards, run

python build_datasets.py

Simple data analysis can be performed by running

python analyze_dataset.py

Also make sure that the data/ folder contains the GloVe word vectors glove.6B.50d.txt and glove.6B.100d.txt, available at nlp.stanford.edu.
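The GloVe text files store one word per line, followed by its space-separated vector components. As a rough illustration of that format, the sketch below parses such a file into a dictionary. The function names are hypothetical, and how the repository itself reads these files is not shown here.

```python
def parse_glove_lines(lines):
    """Map each word to its vector, given lines of the form 'word v1 v2 ... vd'."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def load_glove(path="data/glove.6B.50d.txt"):
    """Read a GloVe text file from disk (path follows the layout described above)."""
    with open(path, encoding="utf-8") as f:
        return parse_glove_lines(f)
```

With the 50-dimensional file, every vector returned by `load_glove()` should have 50 components.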

Training

The six methods can be trained using the following commands. Each run trains on all 11 datasets, which contain lyrics from 2 up to 12 genres, and the logs and resulting metrics are saved to a text file. Note that no model weights are saved to disk at any time.

python train.py -m naive_bayes_bernoulli > results/results_naive_bayes_bernoulli.txt
python train.py -m naive_bayes_multinomial > results/results_naive_bayes_multinomial.txt
python train.py -m svm > results/results_svm.txt
python train.py -m mlp_glove -lr 1e-3 > results/results_mlp_glove_1e-3.txt
python train.py -m lstm_glove -lr 1e-3 > results/results_lstm_glove_1e-3.txt
python train.py -m lstm -lr 1e-3 > results/results_lstm_1e-3.txt
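To launch all six runs in sequence without retyping them, the commands above could be wrapped in a small script. This is a sketch rather than part of the repository: `build_run` and `run_all` are hypothetical helper names, and the script assumes it is started from the repository root so that train.py and results/ resolve correctly.

```python
import subprocess
import sys
from pathlib import Path

# (method, learning rate or None), copied from the commands above.
RUNS = [
    ("naive_bayes_bernoulli", None),
    ("naive_bayes_multinomial", None),
    ("svm", None),
    ("mlp_glove", "1e-3"),
    ("lstm_glove", "1e-3"),
    ("lstm", "1e-3"),
]


def build_run(method, lr=None):
    """Return (argv, log_path) mirroring one of the commands above."""
    argv = [sys.executable, "train.py", "-m", method]
    name = f"results_{method}"
    if lr is not None:
        argv += ["-lr", lr]
        name += f"_{lr}"
    return argv, Path("results") / f"{name}.txt"


def run_all():
    """Run every configuration, redirecting stdout to its results file."""
    for method, lr in RUNS:
        argv, log_path = build_run(method, lr)
        log_path.parent.mkdir(exist_ok=True)
        with open(log_path, "w") as log:
            # check=True stops the sequence at the first failing run.
            subprocess.run(argv, stdout=log, check=True)
```

Calling `run_all()` reproduces the six shell commands, including the output file names.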

Generate metric plots and tables

The metrics obtained from training were manually written to a JSON file (see ./evaluation/results_overview.json). Based on this file, the following command (1) generates plots and (2) prints tables in LaTeX format.

python show_results.py

Detailed evaluation

The following commands allow a more detailed model evaluation (metrics per musical genre and a confusion matrix):

python evaluate_model.py -m naive_bayes_bernoulli
python evaluate_model.py -m naive_bayes_multinomial
python evaluate_model.py -m svm
python evaluate_model.py -m mlp_glove -lr 1e-3
python evaluate_model.py -m lstm_glove -lr 1e-3
python evaluate_model.py -m lstm -lr 1e-3
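To make the reported quantities concrete, the sketch below computes confusion counts, per-genre F1 scores, and the macro-averaged F1 score from lists of true and predicted genres. It is a plain-Python illustration with hypothetical function names and made-up labels, not the code in evaluate_model.py.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true genre, predicted genre) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_genre_f1(y_true, y_pred):
    """F1 per genre, using F1 = 2*TP / (2*TP + FP + FN)."""
    counts = confusion_counts(y_true, y_pred)
    genres = sorted(set(y_true) | set(y_pred))
    scores = {}
    for g in genres:
        tp = counts[(g, g)]
        fp = sum(n for (t, p), n in counts.items() if p == g and t != g)
        fn = sum(n for (t, p), n in counts.items() if t == g and p != g)
        denom = 2 * tp + fp + fn
        scores[g] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-genre F1 scores."""
    scores = per_genre_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Made-up labels for illustration only.
y_true = ["rock", "rock", "pop", "pop"]
y_pred = ["rock", "pop", "pop", "pop"]
print(per_genre_f1(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Because the macro average weights every genre equally, rare genres influence the score as much as frequent ones.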

josch14/song-genre-classification-with-lyrics
