# basic-pitch-torch

PyTorch version of Spotify's Basic Pitch, a lightweight audio-to-MIDI converter. The weights provided in Spotify's repo were converted using this script. Hopefully this helps researchers who are more accustomed to PyTorch reuse the pretrained model.

## Usage

To transcribe audio into MIDI, similar to Basic Pitch:

```python
from basic_pitch_torch.inference import predict

model_output, midi_data, note_events = predict(audio_path)
```

For loading the `nn.Module` directly:

```python
import torch

from basic_pitch_torch.model import BasicPitchTorch

pt_model = BasicPitchTorch()
pt_model.load_state_dict(torch.load('assets/basic_pitch_pytorch_icassp_2022.pth'))
pt_model.eval()

with torch.no_grad():
    output_pt = pt_model(y_torch)
    contour_pt, note_pt, onset_pt = output_pt['contour'], output_pt['note'], output_pt['onset']
```
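Here `y_torch` is assumed to be a mono waveform tensor already in memory. A minimal sketch of constructing one, using a synthetic sine wave in place of real audio (22050 Hz follows Basic Pitch's internal audio sample rate; in practice you would load a file with something like `librosa.load(audio_path, sr=22050)`):

```python
import torch

# Hypothetical stand-in for loaded audio: one second of a 440 Hz sine
# at 22050 Hz, the sample rate Basic Pitch processes audio at.
sr = 22050
t = torch.arange(sr, dtype=torch.float32) / sr
y_torch = torch.sin(2 * torch.pi * 440.0 * t)  # shape: (22050,)
```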

## Result Validation

In `tests/` we show two levels of validation using a test audio clip from GuitarSet:

- **On model output**: most of the discrepancies originate from floating-point division (e.g. `normalized_log`) and error propagation further down the network. The differences are small enough to be safely ignored during MIDI note creation.

  ```
  Contour abs diff - max: 0.0003006, min: 0.0, avg: 5.863e-06
  Onset abs diff   - max: 0.0002712, min: 0.0, avg: 1.431e-05
  Note abs diff    - max: 0.0002297, min: 0.0, avg: 6.6e-06
  ```

- **On MIDI transcription**: the MIDI files transcribed by the TF and PT models are identical (see `midi_data_pt.mid` and `midi_data_tf.mid`).
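The abs-diff statistics above can be reproduced with a small helper like the following (a sketch; `abs_diff_stats` is a hypothetical name, and `contour_tf` / `contour_pt` stand in for the two models' outputs):

```python
import numpy as np

def abs_diff_stats(a, b):
    """Return (max, min, mean) of the element-wise absolute difference."""
    d = np.abs(np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64))
    return d.max(), d.min(), d.mean()

# e.g. comparing the TF and PyTorch contour outputs:
# mx, mn, avg = abs_diff_stats(contour_tf, contour_pt)
```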

## References

Bittner, Rachel M., et al. "A lightweight instrument-agnostic model for polyphonic note transcription and multipitch estimation." ICASSP 2022-2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2022.