Spoken Mandarin audio data recorded on mobile phones in noisy environments. The recordings come from 203 speakers from all over China, covering all major dialect regions, and span a variety of noise scenes such as subways, supermarkets, and restaurants, which makes the data well suited to real application scenarios. It can be used for automatic speech recognition, machine translation, and voiceprint recognition.
For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/191?source=Github
Format: 16 kHz, 16-bit, uncompressed WAV, mono channel
Recording environment: noisy, including subway, market, restaurant, street, airport, etc.
Recording content: common sentences; letters
Speakers: 203 people, 57% of whom are male
Device: Android mobile phone; iPhone
Language: Mandarin (without heavy local accent)
Transcription content: text, noise symbols
Accuracy rate: 95% (the accuracy rate of noise symbols is not included)
Application scenarios: speech recognition, voiceprint recognition
License: Commercial License
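
As a quick sanity check, a recording can be compared against the audio format listed above (16 kHz, 16-bit, uncompressed, mono WAV). The following is a minimal sketch using only Python's standard `wave` module; the function name `matches_listed_format` and the constants are hypothetical and not part of the dataset or any Nexdata tooling.

```python
import wave

# Values taken from the format line of this description.
EXPECTED_RATE_HZ = 16000          # 16 kHz sampling rate
EXPECTED_SAMPLE_WIDTH_BYTES = 2   # 16-bit samples
EXPECTED_CHANNELS = 1             # mono


def matches_listed_format(path):
    """Return True if the WAV file at `path` is 16 kHz, 16-bit, mono, uncompressed PCM.

    Hypothetical helper for illustration; it only inspects the WAV header.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getcomptype() == "NONE"  # "NONE" means uncompressed PCM
        )
```

Calling `matches_listed_format("sample.wav")` on a file (the file name here is a placeholder) returns `True` only when all four properties match, which can help catch files that were resampled or converted after download.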