Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
-
Updated
Sep 26, 2024 - Jupyter Notebook
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
Silero VAD: pre-trained enterprise-grade Voice Activity Detector
Automagically synchronize subtitles with video.
Command-line utility to transcribe/translate from video/audio/subtitles to subtitles
Python AI assistant 🧠
Voice activity detection (VAD) toolkit including DNN, bDNN, LSTM and ACAM based VAD. We also provide our directly recorded dataset.
Real-time microphone noise suppression on Linux.
🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).
Real-time speech recognition and voice activity detection (VAD) using next-gen Kaldi with ncnn without Internet connection. Support iOS, Android, Linux, macOS, Windows, Raspberry Pi, VisionFive2, LicheePi4A etc.
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
CNN-based audio segmentation toolkit. Allows to detect speech, music, noise and speaker gender. Has been designed for large scale gender equality studies based on speech time per gender.
An audio/acoustic activity detection and audio segmentation tool
A python package to build AI-powered real-time audio applications
🗣️ A book and repo to get you started programming voice computing applications in Python (10 chapters and 200+ scripts).
Voice Activity Detection based on Deep Learning & TensorFlow
Runtime Audio Importer plugin for Unreal Engine. Importing audio of various formats at runtime.
Implementation of Logistic Regression, MLP, CNN, RNN & LSTM from scratch in python. Training of deep learning models for image classification, object detection, and sequence processing (including transformers implementation) in TensorFlow.
Android Voice Activity Detection (VAD) library. Supports WebRTC VAD GMM, Silero VAD DNN, Yamnet VAD DNN models.
A statistical model-based Voice Activity Detection
Add a description, image, and links to the voice-activity-detection topic page so that developers can more easily learn about it.
To associate your repository with the voice-activity-detection topic, visit your repo's landing page and select "manage topics."