Audio Datasets

Mozilla’s Common Voice dataset

  • Collection:
    • Over 500 hours of speech from 20,000 different speakers.
    • Aligned mainly at the sentence level.
    • Recorded by volunteers reading prompted phrases.
    • Collected through a web application.
  • License: CC0 (public domain)
  • Language: multilingual
  • Media:
  • Link: https://voice.mozilla.org/zh-TW

LibriSpeech

  • Collection:
    • 1000 hours
    • Aligned at the sentence level only (no word-level alignment).
  • License: CC BY 4.0
  • Media: FLAC
  • Link: http://www.openslr.org/12/
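
Because the alignment is sentence-level, each utterance pairs with one line of text. A minimal sketch of reading those pairs, assuming the usual LibriSpeech layout in which each chapter directory carries a `*.trans.txt` file with one `utterance-id SENTENCE` entry per line. The function names and the sample ID below are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import Dict, Tuple


def parse_transcript_line(line: str) -> Tuple[str, str]:
    """Split one transcript line into (utterance_id, sentence)."""
    # The utterance ID is everything before the first space.
    utt_id, _, text = line.strip().partition(" ")
    return utt_id, text


def load_transcripts(trans_file: Path) -> Dict[str, str]:
    """Map each utterance ID in a *.trans.txt file to its sentence."""
    transcripts = {}
    with trans_file.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                utt_id, text = parse_transcript_line(line)
                transcripts[utt_id] = text
    return transcripts


# Illustrative line only; the ID and text are made up.
example = parse_transcript_line("1234-5678-0000 HELLO WORLD\n")
```

Each utterance ID would then match a FLAC file of the same name in that directory, under the layout assumed above.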

OpenSLR

TIDIGITS

  • Collection:
    • 25,000 spoken digit sequences
    • 300 speakers
    • Recorded in a quiet room by paid contributors
  • License: Commercial license from the Linguistic Data Consortium (LDC)
  • Media: NIST SPHERE
  • Link: https://catalog.ldc.upenn.edu/LDC93S10
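
NIST SPHERE files begin with a plain-text header: a `NIST_1A` line, a line giving the header size in bytes, then `name -type value` fields ending at `end_head`. The sketch below reads that header from raw bytes. The function name `parse_sphere_header` is hypothetical, and the sample header values are made up for illustration rather than taken from TIDIGITS itself.

```python
from typing import Dict, Union


def parse_sphere_header(data: bytes) -> Dict[str, Union[int, float, str]]:
    """Parse the text header at the start of a NIST SPHERE file."""
    first_lines = data.split(b"\n", 2)
    if len(first_lines) < 3 or first_lines[0].strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE file")
    # The second line holds the total header size in bytes.
    header_size = int(first_lines[1].decode("ascii").strip())
    header = data[:header_size].decode("ascii", errors="replace")

    fields = {}
    for line in header.splitlines()[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        name, type_tag, value = parts
        if type_tag == "-i":      # integer field
            fields[name] = int(value)
        elif type_tag == "-r":    # real-valued field
            fields[name] = float(value)
        else:                     # string field, e.g. "-s3"
            fields[name] = value
    return fields


# Made-up header padded to the declared size, for illustration only.
sample = (
    b"NIST_1A\n   1024\n"
    b"sample_rate -i 20000\n"
    b"channel_count -i 1\n"
    b"sample_coding -s3 pcm\n"
    b"end_head\n"
).ljust(1024, b" ")
info = parse_sphere_header(sample)
```

The audio samples follow immediately after the header, so `header_size` also gives the byte offset at which the waveform data starts.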

CHiME

Open Speech Recording (Google)

UrbanSound8k