A collection of manually annotated audio files for acoustic event detection (AED).
The annotations are based on the AudioSet ontology (10-second segments from YouTube).
The following acoustic events are currently supported:
- 🐕 Bark (80 files)
- 🐦 Bird (289 files)
- 👏 Clapping (102 files)
- 💣 Explosion (259 files)
- 🔫 Gunshot (300 files)
- 🚆 Train horn (266 files)
Annotation files are named after their corresponding YouTube videos:
- Format: <youtube_id>_<start_ms>_<end_ms>.txt
- Example: k1swpimPFxY_30000_40000.txt
This corresponds to seconds 30 to 40 of the video youtu.be/k1swpimPFxY (the offsets in the file name are in milliseconds).
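As a minimal sketch of the naming convention above, the following Python snippet splits a file name into the video ID and the segment bounds. The names `Segment` and `parse_annotation_name` are hypothetical helpers, not part of this repository. The sketch assumes the two trailing fields are millisecond offsets, as the example suggests, and splits from the right because YouTube IDs can themselves contain underscores.

```python
from pathlib import Path
from typing import NamedTuple


class Segment(NamedTuple):
    """Hypothetical container for the fields encoded in a file name."""
    youtube_id: str
    start_s: float  # segment start, in seconds
    end_s: float    # segment end, in seconds


def parse_annotation_name(path: str) -> Segment:
    """Split '<youtube_id>_<start>_<end>.txt' into its three parts.

    Assumes the two trailing fields are integer millisecond offsets.
    """
    stem = Path(path).stem
    # Split from the right: the ID itself may contain underscores.
    youtube_id, start_ms, end_ms = stem.rsplit("_", 2)
    return Segment(youtube_id, int(start_ms) / 1000, int(end_ms) / 1000)


seg = parse_annotation_name("k1swpimPFxY_30000_40000.txt")
print(seg.youtube_id, seg.start_s, seg.end_s)
```

The same function accepts a full path such as data/annotation/Bark/0gkLHfHJSnI_80000_90000.txt, since only the file stem is parsed.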
Each line of an annotation file lists a start time, an end time, and an event label:
- Example:
>$ head -3 data/annotation/Bark/0gkLHfHJSnI_80000_90000.txt
0.490323 2.012903 speech
1.754839 2.012903 bark
2.012903 2.877419 speech
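The example above can be read with a few lines of Python. This is only a sketch: `Event` and `parse_annotations` are hypothetical names, and it assumes the fields are whitespace-separated (spaces or tabs) and that the times are in seconds relative to the start of the 10-second segment, which the example values suggest but the text does not state.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    """Hypothetical record for one annotation line."""
    start: float  # assumed to be seconds from the start of the segment
    end: float
    label: str


def parse_annotations(text: str) -> List[Event]:
    """Parse lines of the form '<start> <end> <label>'."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # maxsplit=2 keeps labels that contain spaces in one piece.
        start, end, label = line.split(maxsplit=2)
        events.append(Event(float(start), float(end), label))
    return events


sample = """0.490323 2.012903 speech
1.754839 2.012903 bark
2.012903 2.877419 speech"""

events = parse_annotations(sample)
barks = [e for e in events if e.label == "bark"]
print(len(events), barks)
```

Note that events may overlap in time, as the bark and the first speech event do in this sample.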
Planned work:
- Annotate more data.
- Add more classes.
- Develop an AED system.