zacbakerr/bioacoustics-datasets

Bioacoustics Datasets

A collection of tools and 1,000,000+ unified annotations for bioacoustics datasets.

| Dataset | Species | # of annotated calls | Dataset size (GB) | Duration (hh:mm:ss) | License |
| --- | --- | --- | --- | --- | --- |
| Animal Sounds | Birds, cats, chickens, cows, dogs, donkeys, frogs, lions, monkeys, sheep | 809 | 0.13 | 0:57:47 | - |
| AnuraSet | Anurans | 16089 | 18.6 | 27:00:00 | cc-by-4.0 |
| BIRDeep | 38 avian species | 3749 | 13.41 | 8:50:00 | MIT |
| BirdVox | 25 avian species | 35402 | 0.79 | 7:23:00 | cc-by-4.0 |
| Domestic Canary | Canary | 23308 | 0.856 | 3:00:00 | cc-by-4.0 |
| Colombia/Costa Rica Coffee Farms | 89 avian species | 6952 | 3.8 | 34:00:00 | cc-by-4.0 |
| Darpa | Humans | 1718 | 0.67 | 4:00:00 | No license specified; the work may be protected by copyright |
| Avian Dawn | 58 avian species, 1 amphibian species | 41183 | 20.3 | 131:15:00 | cc-by-4.0 |
| DCASE | Birds | 7206 | 32.00 | 17:25:00 | cc-by-4.0 |
| Egyptian fruit bat | Egyptian fruit bat | 90000 | 91.00 | 37:45:00 | cc-by-4.0 |
| ENABirds | Birds | 16052 | 1.40 | 6:20:00 | cc-by-1.0 |
| Female Rook | Rook birds | 3417 | 54.37 | 10:45:36 | cc-by-nc-nd-4.0 |
| The Vocal Repertoire of Adult and Neonate Otters | Otter | 441 | 0.57 | 0:06:23 | cc |
| Hainan Gibbons | Hainan Gibbons | 1233 | 13.39 | 104:00:00 | cc-by-4.0 |
| Hawaii Birds | 27 avian species | 59583 | 5.8 | 51:00:00 | cc-by-4.0 |
| HICEAS | Whales, Dolphins | 796 | 3.10 | 12:40:00 | "Public dataset hosted in Google Cloud Storage" |
| Distributed acoustic cues for caller identity in macaque vocalization | Macaque monkeys | 7285 | 0.15 | 0:45:00 | cc-by-1.0 |
| InfantMarmosetVox | Marmoset monkeys | 169318 | 21.2 | 58:20:00 | cc-by-4.0 |
| Northeast US Sounds | 81 avian species | 50760 | 27.8 | 285:00:00 | cc-by-4.0 |
| Orcas Classifications | Orca whales | 398 | 0.26 | 0:26:30 | - |
| Pigs | Pig | 6887 | 0.2 | 0:40:26 | cc-by-4.0 |
| Rainforest | Birds, frogs | 1216 | 13.05 | 20:16:00 | "Free for personal or academic purposes" |
| Rodents | Rodents: mouse, gerbil | 4576 | 1.36 | 0:48:34 | cc-by-4.0 |
| Rook | Rook birds | 17662 | 23.49 | 17:21:17 | cc-by-4.0 |
| Sierra Nevada | 21 avian species | 10976 | 3.57 | 16:40:00 | cc-by-4.0 |
| Southwest Amazon | 132 avian species | 16482 | 4.51 | 21:00:00 | cc-by-4.0 |
| Watkins Marine Animal Sounds | 21 dolphin, 13 seal, 32 whale species | 15152 | 9.61 | 29:10:15 | "Sound files are free to download for personal or academic use" |
| Western US | 56 avian species | 20147 | 7.08 | 33:00:00 | cc-by-4.0 |

To download all of the data, you need about 372 GB of free space (the sum of the dataset sizes listed above). You can also pick and choose which datasets you'd like to download.
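If you only want a subset, a quick way to estimate the space you need is to add up the sizes from the table. The sketch below is only an illustration: `SIZES_GB` holds four values copied from the table above, and `required_space_gb` is a hypothetical helper, not part of this repository.

```python
# Sizes in GB for a few datasets, copied from the table above.
SIZES_GB = {
    "AnuraSet": 18.6,
    "BirdVox": 0.79,
    "Pigs": 0.2,
    "Rook": 23.49,
}


def required_space_gb(selected, sizes=SIZES_GB):
    """Return the total size in GB of the selected datasets."""
    missing = [name for name in selected if name not in sizes]
    if missing:
        raise KeyError(f"No size listed for: {', '.join(missing)}")
    return sum(sizes[name] for name in selected)


if __name__ == "__main__":
    chosen = ["AnuraSet", "BirdVox", "Pigs"]
    print(f"{required_space_gb(chosen):.2f} GB needed for {chosen}")
```

The figure is a lower bound on the audio itself; leave some headroom for annotation files and any temporary archives.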

Installation Instructions

  1. Run ./scripts/download_data.sh

After running the download script, your datasets folder should look like this:

└── datasets/
    ├── annotations.pkl
    ├── dataset1/
    │   ├── audio/
    │   │   ├── audio1.wav
    │   │   └── audio2.wav
    │   ├── annotations.pkl
    │   └── stats.txt
    └── dataset2/
        ├── audio/
        ├── annotations.pkl
        └── stats.txt

There is an individual annotations file for each dataset and one master annotations file located directly in the datasets folder. Each annotations.pkl is a pickled Python dictionary that maps a wav file path to a list of annotated calls, structured as follows:

{
   wav_file_path: [
                   {'start_time': 0, 'end_time': 1.7, 'species': 'bird', 'sub-species': 'serinus canaria'},
                   {'start_time': 2.3, 'end_time': 2.48, ...},
                   ...
                  ]
   ...
}
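As a minimal sketch of reading this structure, the snippet below loads an annotations file with the standard `pickle` module and tallies the calls. The function names `load_annotations` and `summarize` are hypothetical, and the code assumes every annotation carries `start_time`, `end_time`, and `species` keys as in the example above. Only unpickle files you trust, since `pickle` can execute arbitrary code on load.

```python
import pickle
from collections import Counter


def load_annotations(path):
    """Load an annotations.pkl file into a dict of wav path -> list of calls."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize(annotations):
    """Return (number of calls, total annotated seconds, calls per species)."""
    n_calls = 0
    total_seconds = 0.0
    per_species = Counter()
    for wav_path, calls in annotations.items():
        for call in calls:
            n_calls += 1
            total_seconds += call["end_time"] - call["start_time"]
            per_species[call["species"]] += 1
    return n_calls, total_seconds, per_species


if __name__ == "__main__":
    annotations = load_annotations("datasets/annotations.pkl")
    n_calls, total_seconds, per_species = summarize(annotations)
    print(f"{n_calls} calls, {total_seconds:.1f} s annotated")
    print(per_species.most_common(5))
```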

Tools

  1. ./scripts/generate_spectrograms.sh - Running this generates 100 high-quality mel spectrograms for each dataset and places them in the visualizations folder.

  2. scripts/moving_spectrogram.py example_audio.wav output.wav - This script takes a wav file (here example_audio.wav), generates a moving spectrogram with audio, and saves it to visualizations/output.wav.

  3. training/data_engine.py - This file takes in datasets and produces a PyTorch Dataset. The __getitem__() method returns [audio, is_vocalization, species, speaker]: audio is a tensor, is_vocalization is a boolean, species is the species of the vocalization, and speaker is the speaker of the vocalization. species and speaker are both "Noise" for a non-vocalization event, and speaker is 'no-speaker' if there is no speaker data. The data engine takes three required parameters: datasets_path, which should be the datasets folder; save_path, which is where the train and val splits will be stored; and datasets, which is a list of the datasets you would like to use. After the data is loaded into the data_engine, you can call data_engine.get_annotated_dataset(dataset_names=[]), which returns the PyTorch Dataset described above.
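To make the labelling convention of `__getitem__()` concrete, here is a small pure-Python sketch of how the three labels could be derived for one audio window. It is not the code in training/data_engine.py: `label_window` is a hypothetical helper, the overlap rule is an assumption, and so is the `speaker` key, which does not appear in the annotation example above.

```python
def label_window(calls, window_start, window_end):
    """Return (is_vocalization, species, speaker) for a window in seconds.

    `calls` is the list of annotation dicts for one wav file.
    """
    for call in calls:
        # Two intervals overlap when each one starts before the other ends.
        if call["start_time"] < window_end and call["end_time"] > window_start:
            # 'speaker' is an assumed key; fall back when it is absent.
            speaker = call.get("speaker", "no-speaker")
            return True, call["species"], speaker
    # No annotated call in this window: treat it as a non-vocalization event.
    return False, "Noise", "Noise"


if __name__ == "__main__":
    calls = [
        {"start_time": 0, "end_time": 1.7, "species": "bird"},
        {"start_time": 2.3, "end_time": 2.48, "species": "bird", "speaker": "canary_01"},
    ]
    print(label_window(calls, 0.5, 1.0))
    print(label_window(calls, 1.8, 2.2))
```

This sketch returns the first overlapping call; a window that spans several calls would need an explicit tie-breaking rule.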
