- Collection:
- It has over 500 hours from 20,000 different people.
- It is mainly aligned by sentence.
- Created by volunteers reading requested phrases.
- Collected by a web application.
- License: CC0 (public domain)
- Language: multilingual
- Media:
- Link: https://voice.mozilla.org/zh-TW
- Collection:
- 1000 hours
- Aligned at the sentence level only (no word-level alignment).
- License: CC BY 4.0
- Media: FLAC-encoded audio
- Link: http://www.openslr.org/12/
- Summary: Hosts many separate speech and language resources.
- Link: http://www.openslr.org/
- Collection:
- 25,000 spoken digit sequences
- 300 speakers
- Recorded in a quiet room by paid contributors
- License: Commercial license from the Linguistic Data Consortium (LDC)
- Media: NIST SPHERE
- Link: https://catalog.ldc.upenn.edu/LDC93S10
- Collection:
- 50 hours
- Aligned at the sentence level
- Media: 16 kHz WAV files
- License: Restricted License
- Link: http://spandh.dcs.shef.ac.uk/chime_challenge/index.html
- Collection:
- 105,829 WAV files (16-bit, single-channel PCM, 16 kHz sample rate)
- 35 words
- About 1 second per clip, one word per clip
- Link: https://aiyprojects.withgoogle.com/open_speech_recording
- Collection:
- 8,732 labeled sound files (<= 4 s each)
- Environmental background sounds: 10 classes
- Link: https://urbansounddataset.weebly.com/urbansound8k.html
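Several entries above state audio properties (for example 16-bit single-channel PCM at 16 kHz, or clips of at most 4 s). The following is a minimal sketch of how a downloaded WAV file could be checked against such a listing, using only Python's standard `wave` module. The names `WavInfo`, `read_wav_info`, and `matches_spec` are hypothetical helpers introduced here, not part of any dataset's tooling, and the default values simply mirror the 16 kHz, 16-bit mono entry. The `wave` module reads uncompressed PCM WAV only, so FLAC or NIST SPHERE files would need a different reader.

```python
import wave
from dataclasses import dataclass


@dataclass
class WavInfo:
    channels: int
    sample_width_bytes: int
    sample_rate_hz: int
    duration_s: float


def read_wav_info(path: str) -> WavInfo:
    """Read basic header fields from an uncompressed PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return WavInfo(
            channels=wav.getnchannels(),
            sample_width_bytes=wav.getsampwidth(),
            sample_rate_hz=rate,
            duration_s=frames / rate,
        )


def matches_spec(
    info: WavInfo,
    max_duration_s: float,
    sample_rate_hz: int = 16000,   # assumption: mirrors the 16 kHz entries
    sample_width_bytes: int = 2,   # 2 bytes per sample = 16-bit
    channels: int = 1,             # single-channel (mono)
) -> bool:
    """Return True if the file agrees with the listed format and length limit."""
    return (
        info.channels == channels
        and info.sample_width_bytes == sample_width_bytes
        and info.sample_rate_hz == sample_rate_hz
        and info.duration_s <= max_duration_s
    )
```

For a one-word clip from the 35-word collection one might call `matches_spec(read_wav_info(path), max_duration_s=1.0)`. For corpora whose sample rate is not listed above, pass the values that apply rather than relying on the defaults.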