This repo contains the code for a comparative analysis of different audio representation models.
It uses the MagnaTagATune (MTT) dataset to evaluate the performance of different music representation models on the downstream task of music tagging.
The audio files for the MagnaTagATune dataset can be downloaded here. Extract them into the `audios` directory inside the `MTT` folder, so the directory structure looks as shown below:
.
├── MTT
│   ├── audios
│   │   ├── 0
│   │   ├── 1
│   │   └── ...
│   └── magnatagatune.json
├── evaluate_clap.py
├── evaluate_mert.py
└── ...
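As a rough illustration of how this layout is consumed, the annotation file can be read and joined with the audio directory before running the evaluation scripts. The exact schema of `magnatagatune.json` is an assumption here (a per-clip relative path plus a tag list), so adapt the field names to the actual file.

```python
import json
import os

MTT_ROOT = "MTT"

# Load the MagnaTagATune annotations. The field names "path" and "tags"
# are assumptions about the JSON schema, used only for illustration.
with open(os.path.join(MTT_ROOT, "magnatagatune.json")) as f:
    annotations = json.load(f)

for entry in annotations:
    audio_path = os.path.join(MTT_ROOT, "audios", entry["path"])
    tags = entry["tags"]
    # ... pass audio_path and tags to the evaluation scripts
```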
We use the same split as Jukebox.
We evaluate the following music representation models (an embedding-extraction sketch follows the list):
- MERT (https://arxiv.org/abs/2306.00107)
- CLAP (https://arxiv.org/abs/2211.06687)
- ImageBind (https://arxiv.org/abs/2305.05665)
- Wav2CLIP (https://arxiv.org/abs/2110.11499)
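As a minimal sketch, a frozen model such as MERT can be loaded through the Hugging Face `transformers` API and used to produce clip-level embeddings for the probe. The checkpoint name, example file path, and mean-pooling choice below are illustrative assumptions and may differ from what `evaluate_mert.py` actually does.

```python
import torch
import torchaudio
from transformers import AutoModel, Wav2Vec2FeatureExtractor

# Illustrative checkpoint; the repo may use a different MERT size.
model_id = "m-a-p/MERT-v1-95M"
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
processor = Wav2Vec2FeatureExtractor.from_pretrained(model_id, trust_remote_code=True)

# Hypothetical clip under the MTT layout shown above.
waveform, sr = torchaudio.load("MTT/audios/0/example.mp3")
mono = waveform.mean(dim=0)
mono = torchaudio.functional.resample(mono, sr, processor.sampling_rate)

inputs = processor(mono.numpy(), sampling_rate=processor.sampling_rate, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# Mean-pool over time to get one embedding per transformer layer;
# a downstream probe can select or average layers from this stack.
layer_embeddings = torch.stack(out.hidden_states).mean(dim=-2)  # (layers, batch, dim)
print(layer_embeddings.shape)
```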
A comparison of the models is shown below:
| Model     | MTT AUC | MTT AP |
|-----------|---------|--------|
| ImageBind | 88.55%  | 40.19% |
| JukeBox   | 91.50%  | 41.40% |
| OpenL3    | 89.35%  | 42.88% |
| CLAP      | 70.04%  | 27.95% |
| Wav2CLIP  | 90.15%  | 49.12% |
| MERT      | 93.91%  | 59.57% |
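AUC and AP here refer to the standard multi-label tagging metrics on MTT: macro-averaged ROC-AUC and average precision over the tag set. A minimal sketch of computing them with scikit-learn is shown below; the probe that produces the tag scores is not specified by this snippet, and the array shapes are illustrative.

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score


def tagging_metrics(y_true: np.ndarray, y_score: np.ndarray):
    """y_true: (n_clips, n_tags) binary tag matrix;
    y_score: (n_clips, n_tags) predicted tag probabilities."""
    auc = roc_auc_score(y_true, y_score, average="macro")
    ap = average_precision_score(y_true, y_score, average="macro")
    return auc, ap


# Random placeholder data, just to show the call shape.
rng = np.random.default_rng(0)
y_true = (rng.random((100, 50)) > 0.5).astype(int)
y_score = rng.random((100, 50))
print(tagging_metrics(y_true, y_score))
```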