PaulSZH95/audio_processing

Voice Activity Detection (VAD) with Jupyter Notebooks

Overview

Welcome to the Voice Activity Detection (VAD) project. VAD identifies the segments of an audio signal that contain speech, a step that many audio processing applications depend on. This repository contains Jupyter Notebooks that demonstrate VAD techniques and show how to apply them.
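To make the idea of "segments containing speech" concrete, here is a minimal sketch of a frame-level energy detector. It is not taken from the notebooks: the function names, the 20 ms frame size, the threshold, and the synthetic test signal are all assumptions chosen for illustration, and a fixed energy threshold is the kind of hand-tuned filter this project aims to move beyond.

```python
import math
import random


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def speech_segments(samples, sample_rate, frame_ms=20, threshold=0.01):
    """Return (start_s, end_s) spans whose frame energy exceeds the threshold."""
    frame_len = int(sample_rate * frame_ms / 1000)
    flags = [e > threshold for e in frame_energies(samples, frame_len)]
    segments, start = [], None
    # The trailing False is a sentinel that closes a segment ending at the last frame.
    for i, active in enumerate(flags + [False]):
        if active and start is None:
            start = i
        elif not active and start is not None:
            segments.append(
                (start * frame_len / sample_rate, i * frame_len / sample_rate)
            )
            start = None
    return segments


# Synthetic signal (an assumption, not real speech): 0.5 s of faint noise,
# 0.5 s of a 220 Hz tone standing in for speech, then 0.5 s of faint noise.
rng = random.Random(0)
sr = 8000


def faint_noise(n):
    return [rng.uniform(-0.01, 0.01) for _ in range(n)]


tone = [0.5 * math.sin(2 * math.pi * 220 * t / sr) for t in range(sr // 2)]
signal = faint_noise(sr // 2) + tone + faint_noise(sr // 2)

segments = speech_segments(signal, sr)
print(segments)  # → [(0.5, 1.0)]
```

The output is a list of start and end times in seconds, which is the typical shape of a VAD result regardless of whether a filter or an ML model produces the per-frame decisions.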

The Modern Approach

Traditionally, VAD has relied on filters derived empirically, a process that is often time-consuming. This project instead uses Machine Learning (ML) together with hyperparameter optimization tools such as Optuna. Automating the search replaces much of the manual tuning, which saves time and makes VAD more accessible and practical.
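As a rough illustration of what automated tuning looks like, the sketch below uses Optuna to search for a single decision threshold. It is a stand-in, not the notebooks' method: the one-parameter detector replaces the ML model, the dataset is synthetic, and `make_dataset`, `accuracy`, and `tune_threshold` are hypothetical names. Optuna is a third-party dependency, so it is imported only inside the function that needs it.

```python
import random


def make_dataset(n_frames, seed):
    """Synthetic (energy, is_speech) pairs; purely illustrative data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_frames):
        is_speech = rng.random() < 0.5
        # Assumption: speech frames have higher energy on average.
        energy = rng.gauss(0.6, 0.1) if is_speech else rng.gauss(0.2, 0.1)
        data.append((energy, is_speech))
    return data


def accuracy(threshold, data):
    """Fraction of frames where (energy > threshold) matches the label."""
    hits = sum((energy > threshold) == label for energy, label in data)
    return hits / len(data)


def tune_threshold(data, n_trials=50, seed=0):
    """Let Optuna search for the threshold that maximizes accuracy."""
    import optuna  # lazy import: requires `pip install optuna`

    optuna.logging.set_verbosity(optuna.logging.WARNING)

    def objective(trial):
        # Each trial proposes one candidate value for the hyperparameter.
        threshold = trial.suggest_float("threshold", 0.0, 1.0)
        return accuracy(threshold, data)

    study = optuna.create_study(
        direction="maximize",
        sampler=optuna.samplers.TPESampler(seed=seed),
    )
    study.optimize(objective, n_trials=n_trials)
    return study.best_params["threshold"], study.best_value


train = make_dataset(500, seed=1)
```

With Optuna installed, `tune_threshold(train)` returns the best threshold found and its accuracy on `train`. For a real model the objective would train and score the model for each set of suggested hyperparameters, but the structure of the search stays the same.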

Enhanced Generalization

One advantage of this approach is that ML models tend to generalize better than narrowly tuned traditional filters, so they can handle a broader range of datasets. The result is a VAD system that can adapt to a wider range of audio sources and environments.


About

A repository of walkthroughs on audio-related ML models
