gioypi/penthy

penthy

Neural network for classification of flac compression quality.

About penthy

penthy is an audiophile tool 🎶 to check whether a flac file contains truly lossless music, as it is supposed to, or comes from a lossy source, like an mp3 file.

The flac file format compresses music without altering it, in contrast to lossy compression formats (like mp3) that sacrifice sound quality to limit file size. Confirming that your flac discography is truly lossless is an open-ended problem. One approach to telling the difference is machine learning. Although penthy cannot evaluate every aspect of digital audio to guarantee that your file is exactly what came out of the studio, she tries to identify transcoding from mp3 sources. A song that passes penthy's challenge is not necessarily genuine, but it is unlikely to be an mp3 wearing a flac trenchcoat.

penthy is short for Penthesilea, a skilled queen of the Amazons who fought in the Trojan War, according to Greek mythology.

Working principle

A Convolutional Neural Network was trained with the highest frequencies of several songs in the form of spectrogram images, in order to recognize flac files that were once mp3s.

The songs are split into small segments to produce the spectrograms, which are given to the network as inputs. The training dataset contained the truly lossless versions of the songs and their fake counterparts: flac files transcoded from mp3 files that were generated from the originals. Various music genres and mp3 qualities were included. Each spectrogram is a 128x128 px RGB image depicting only the 16200-22000 Hz frequency range for 8 seconds of audio, saved as a numpy array.

The trained model accepts flac or wav tracks as input and outputs a float between 0 and 1. An output of 0 corresponds to audio that was transcoded to mp3 and back to a lossless format. An output of 1 classifies the song as not transcoded from an mp3 source, although it could still have been transcoded from a different format, upsampled or altered in other ways.
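The segment length and frequency band quoted above can be made concrete with a little arithmetic. The following is a minimal sketch, not the code in audio_manipulation.py: the helper names, the 44100 Hz sample rate, the FFT size of 2048 and the choice to drop a trailing partial segment are all assumptions made for illustration.

```python
import math

SEGMENT_SECONDS = 8        # segment length stated in the description
BAND_HZ = (16200, 22000)   # frequency range stated in the description


def segment_bounds(num_samples, sample_rate, seconds=SEGMENT_SECONDS):
    """Return (start, stop) sample indices of every full segment.

    Assumption: a trailing segment shorter than `seconds` is dropped.
    """
    size = seconds * sample_rate
    return [(i * size, (i + 1) * size) for i in range(num_samples // size)]


def band_bins(sample_rate, n_fft, band=BAND_HZ):
    """Return the first and last FFT bin whose centre frequency lies in the band.

    Assumption: a plain FFT of size `n_fft`, so bin k sits at k * sample_rate / n_fft Hz.
    """
    hz_per_bin = sample_rate / n_fft
    first = math.ceil(band[0] / hz_per_bin)
    last = min(math.floor(band[1] / hz_per_bin), n_fft // 2)
    return first, last


if __name__ == "__main__":
    # Hypothetical 3-minute track sampled at 44100 Hz.
    print(len(segment_bounds(180 * 44100, 44100)), "full segments")
    print("FFT bins covering the band:", band_bins(44100, 2048))
```

With these assumed settings only a few hundred FFT bins near the top of the spectrum are kept, which is the region where mp3 encoders typically discard content.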

The CNN is structured as follows: (figure: CNN architecture diagram)

The current trained model performs well in general, with an accuracy of approximately 90%. False negatives (genuine files classified as transcoded) are more common than false positives (transcoded files classified as truly lossless), especially for songs that lack higher frequencies.
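To make the terminology concrete: here "positive" means genuine (output near 1) and "negative" means transcoded (output near 0). The sketch below counts both kinds of error from a list of model outputs. The 0.5 decision threshold is an assumption, and the labels and scores are made-up toy values, not results from penthy.

```python
def confusion_counts(labels, scores, threshold=0.5):
    """Count outcomes, with 1 = genuine lossless and 0 = transcoded from mp3.

    Assumption: a score at or above `threshold` is read as "genuine".
    """
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for label, score in zip(labels, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1:
            counts["tp" if label == 1 else "fp"] += 1
        else:
            counts["tn" if label == 0 else "fn"] += 1
    return counts


def accuracy(counts):
    """Fraction of files classified correctly."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total if total else 0.0


if __name__ == "__main__":
    # Toy values for illustration only.
    labels = [1, 1, 1, 0, 0]
    scores = [0.9, 0.4, 0.8, 0.1, 0.7]
    counts = confusion_counts(labels, scores)
    print(counts, accuracy(counts))
```

In this toy run the genuine file scored 0.4 is a false negative and the transcoded file scored 0.7 is a false positive.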

Usage

You may use this code to evaluate your flac files with the pretrained model, or to train your own model if you have access to a truly lossless discography.

  • neural_net.py builds a dataset with flexible multiprocessing and trains a new model.
  • trained.py evaluates a single file.
  • trained_dir.py evaluates all applicable files in a directory with multiprocessing (recommended if you use penthy to scan your collection).
  • audio_manipulation.py is used by all modules to generate the spectrograms.
  • No dataset included.

For instance, running trained_dir.py on a directory that contains both genuine and transcoded files will output something like this: (figure: example output)
The example contains both versions of each file, but this is not required for an accurate evaluation. Each file is classified separately.
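Because each file is classified separately, a directory scan parallelizes naturally. The sketch below shows one way such a scan could be organized; it is not the code in trained_dir.py. The function names, the recursive search and the `scorer` callback (which stands in for the actual model evaluation) are assumptions.

```python
from multiprocessing import Pool
from pathlib import Path

AUDIO_SUFFIXES = {".flac", ".wav"}  # input formats accepted by the trained model


def find_audio_files(root):
    """Recursively collect flac and wav files under `root`, sorted by path.

    Assumption: subdirectories are searched as well.
    """
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in AUDIO_SUFFIXES
    )


def scan(root, scorer, workers=4):
    """Score every audio file under `root` in parallel.

    `scorer` is a hypothetical top-level function taking a Path and returning
    a float in [0, 1]; it must be picklable to be sent to worker processes.
    """
    files = find_audio_files(root)
    with Pool(processes=workers) as pool:
        scores = pool.map(scorer, files)
    return dict(zip(files, scores))
```

When using a process pool like this, call `scan` from under an `if __name__ == "__main__":` guard, which is required on Windows.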

There is also an online demo that scans your files without downloading or installing anything. Performance is significantly better when penthy is run locally, though.

Installation Requirements

  • Python (3.7 64-bit has been tested; on Windows, make sure to add Python to the PATH environment variable)
  • FFmpeg, including ffprobe (on Windows, make sure to add FFmpeg to the PATH environment variable)

and the Python packages listed in requirements.txt (generated by PyCharm), plus the dependencies of these packages that will come up during installation (e.g. pandas, scipy, matplotlib, scikit-learn).
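Before running penthy, it can help to confirm that the command-line prerequisites are reachable. The following is a small sketch using only the Python standard library; the function name is hypothetical and the script is not part of the repository.

```python
import shutil
import sys


def missing_prerequisites(tools=("ffmpeg", "ffprobe")):
    """Return the names of required command-line tools that are not on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = missing_prerequisites()
    print("Missing from PATH:", ", ".join(missing) if missing else "nothing")
```

If either tool is reported as missing on Windows, the usual cause is that the FFmpeg folder has not been added to the PATH environment variable.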

Credits and license

The license of this repository covers the code written by the author, not the libraries and functions it uses. For those, see the licenses of the respective original projects. Music in the usage example is courtesy of Dean Washburn (nvlachost@gmail.com). Proper attribution of penthy requires mentioning all parties credited below.

Achilleas Papastamatiou developed penthy as part of his undergraduate thesis at the Department Of Computer Science And Telecommunications at the University of Thessaly in Greece. The project was supervised by Professor Vaggelis Spyrou.
