A voice activity detection (VAD) library for Unity.
Records voice data from any source (IVoiceSource, e.g. recording by UnityEngine.Microphone), detects voice activity with any detection logic, and passes voice data to any buffer (IVoiceBuffer, e.g. buffering to a WAV file) while voice is active.
You can customize voice sources, voice buffers, and voice activity detection logic to fit your use case.
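The source, detection, and buffer flow described above can be sketched in plain C#. This is a minimal illustration under stated assumptions, not the library's API: ISampleSource, ISampleBuffer, ArraySource, ListBuffer, and VadPipeline are hypothetical names invented for this sketch, and the real IVoiceSource and IVoiceBuffer interfaces may expose different members.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a voice source (NOT the library's IVoiceSource).
public interface ISampleSource
{
    // Returns the next frame of samples, or null when no more data is available.
    float[] ReadFrame();
}

// Hypothetical stand-in for a voice buffer (NOT the library's IVoiceBuffer).
public interface ISampleBuffer
{
    void Write(float[] frame);
}

// A source that replays pre-recorded frames, useful for testing without a microphone.
public sealed class ArraySource : ISampleSource
{
    private readonly Queue<float[]> frames;

    public ArraySource(IEnumerable<float[]> frames)
    {
        this.frames = new Queue<float[]>(frames);
    }

    public float[] ReadFrame()
    {
        return frames.Count > 0 ? frames.Dequeue() : null;
    }
}

// A buffer that simply collects every sample it receives in memory.
public sealed class ListBuffer : ISampleBuffer
{
    public List<float> Samples { get; } = new List<float>();

    public void Write(float[] frame)
    {
        Samples.AddRange(frame);
    }
}

public static class VadPipeline
{
    // Pulls frames until the source is exhausted and forwards only the frames
    // that the detection predicate flags as voice. Returns the forwarded frame count.
    public static int Run(ISampleSource source, ISampleBuffer buffer, Func<float[], bool> isVoice)
    {
        var forwarded = 0;
        float[] frame;
        while ((frame = source.ReadFrame()) != null)
        {
            if (isVoice(frame))
            {
                buffer.Write(frame);
                forwarded++;
            }
        }
        return forwarded;
    }
}
```

Because the three roles only meet through small interfaces, any one of them can be swapped without touching the others, which is the customization point the description above refers to.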
- Sources
  - UnityEngine.Microphone -> UnityMicrophoneSource
  - UnityEngine.AudioSource (OnAudioFilterRead callback) -> UnityAudioSource
  - Native microphone
- Buffers
  - Null (Detection only) -> NullVoiceBuffer
  - Wave file (by NAudio) -> WaveFileVoiceBuffer
  - AudioClip -> AudioClipBuffer
- Voice activity detection logics
  - Queueing-based simple VAD logic -> QueueingVoiceActivityDetector
    - Less memory usage but less stability
  - Cumulative VAD logic -> CumulativeVoiceActivityDetector
    - More stability but more memory usage and less noise robustness
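To give a feel for what a queue-based detection logic can look like, here is a minimal sketch in plain C#. It is one plausible reading of the idea, not the implementation of QueueingVoiceActivityDetector: the class SlidingWindowVad, its parameters, and the RMS volume measure are all assumptions made for illustration. The sketch keeps one flag per recent frame in a fixed-size queue and switches state when the share of loud frames crosses separate activation and deactivation thresholds.

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: a hypothetical detector, NOT the library's QueueingVoiceActivityDetector.
public sealed class SlidingWindowVad
{
    private readonly Queue<bool> window = new Queue<bool>();
    private readonly int windowSize;
    private readonly float volumeThreshold;
    private readonly float activationRate;
    private readonly float deactivationRate;
    private int loudCount;

    public bool IsActive { get; private set; }

    public SlidingWindowVad(int windowSize, float volumeThreshold, float activationRate, float deactivationRate)
    {
        if (windowSize <= 0) throw new ArgumentOutOfRangeException(nameof(windowSize));
        if (deactivationRate > activationRate) throw new ArgumentException("deactivationRate must not exceed activationRate");
        this.windowSize = windowSize;
        this.volumeThreshold = volumeThreshold;
        this.activationRate = activationRate;
        this.deactivationRate = deactivationRate;
    }

    // Feeds one frame of samples and returns the updated activity state.
    public bool Process(float[] frame)
    {
        var loud = Rms(frame) >= volumeThreshold;
        window.Enqueue(loud);
        if (loud) loudCount++;

        // Drop the oldest flag once the queue exceeds the window size.
        if (window.Count > windowSize && window.Dequeue()) loudCount--;

        // Decide only on a full window, so a single loud frame cannot trigger activation.
        if (window.Count == windowSize)
        {
            var rate = (float)loudCount / windowSize;
            if (!IsActive && rate >= activationRate) IsActive = true;
            else if (IsActive && rate <= deactivationRate) IsActive = false;
        }
        return IsActive;
    }

    // Root mean square of the frame, used here as a simple volume measure.
    private static float Rms(float[] frame)
    {
        if (frame.Length == 0) return 0f;
        double sum = 0;
        foreach (var s in frame) sum += s * s;
        return (float)Math.Sqrt(sum / frame.Length);
    }
}
```

Using two thresholds gives the detector hysteresis: it takes a mostly loud window to start and a mostly quiet window to stop, so brief pauses between words do not end the active state. Memory use is bounded by the window size, since only one flag per frame is retained.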
Add the following dependencies to your /Packages/manifest.json.
{
  "dependencies": {
    "com.mochineko.voice-activity-detection": "https://github.com/mochi-neko/voice-activity-detection-unity.git?path=/Assets/Mochineko/VoiceActivityDetection#0.4.2",
    "com.cysharp.unitask": "https://github.com/Cysharp/UniTask.git?path=src/UniTask/Assets/Plugins/UniTask",
    "com.neuecc.unirx": "https://github.com/neuecc/UniRx.git?path=Assets/Plugins/UniRx/Scripts",
    "com.naudio.core": "https://github.com/mochi-neko/simple-audio-codec-unity.git?path=/Assets/NAudio/NAudio.Core#0.2.0",
    ...
  }
}
- VAD as component
- VAD with echo
- VAD by AudioSource
- VAD with OpenAI/Whisper API transcription
- VAD by cumulative logic
See also Samples.
See CHANGELOG.
See NOTICE.
Licensed under the MIT license.