rishi-more-2003/Voice-Authentication

Voice Based Authentication

  • A voice authentication application built on a Siamese network.

  • Dataset used: VoxCeleb1 (Indian).

  • Trained three variants of the model:

    • 1.4M parameters, trained on 100K samples.
    • 3M parameters, trained on 10K samples.
    • 900K parameters, trained on 1M samples.
  • Preprocessed the audio dataset with Librosa, using the fast Fourier transform (FFT) to extract vocal features.

  • Used PyTorch (torch) to preprocess audio in real time.

  • Designed a novel Siamese model architecture that extracts audio features for speaker identification and verification while keeping overhead and computational delay low.

  • Improved the model accuracy to around 90%.

  • 📝 Currently writing a research paper that details the architectural optimizations achieved by globally caching Hann windows and embedding the mel spectrograms into 450×80 matrices.
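
The last point mentions two optimizations: caching Hann windows globally and fitting every mel spectrogram to a fixed 450 by 80 shape. The sketch below illustrates both ideas in plain Python and is not the repository's code. The names `hann_window` and `fit_to_shape` are hypothetical, the window length of 400 in the usage note is an assumed value, and reading the shape as 450 time frames by 80 mel bands is also an assumption.

```python
import math
from functools import lru_cache

N_FRAMES = 450  # assumed: time frames per example (first dimension of the matrix)
N_MELS = 80     # assumed: mel bands per frame (second dimension of the matrix)


@lru_cache(maxsize=None)
def hann_window(length: int) -> tuple:
    """Return a periodic Hann window, computed once per length and then reused."""
    return tuple(
        0.5 - 0.5 * math.cos(2.0 * math.pi * n / length) for n in range(length)
    )


def fit_to_shape(frames: list, n_frames: int = N_FRAMES, n_mels: int = N_MELS) -> list:
    """Truncate or zero-pad a list of mel frames to exactly n_frames rows."""
    if any(len(row) != n_mels for row in frames):
        raise ValueError(f"every frame must have {n_mels} mel bands")
    # Keep at most n_frames rows, copying each so the input is not mutated.
    fitted = [list(row) for row in frames[:n_frames]]
    # Append silent (all-zero) frames until the target length is reached.
    fitted.extend([0.0] * n_mels for _ in range(n_frames - len(fitted)))
    return fitted
```

Calling `hann_window(400)` twice returns the same cached tuple, so the window is not recomputed for every audio frame. `fit_to_shape` gives the network a constant input size regardless of clip duration.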

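A Siamese network passes two inputs through the same encoder and compares the resulting embeddings. The sketch below shows only that comparison step, assuming the embeddings are already available as lists of floats. The README does not say which similarity measure or threshold the project uses, so cosine similarity, the `0.8` threshold, and the function names `cosine_similarity` and `verify_speaker` are all illustrative assumptions.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector carries no direction, so treat it as no match
    return dot / (norm_a * norm_b)


def verify_speaker(enrolled: list, attempt: list, threshold: float = 0.8) -> bool:
    """Accept the attempt if its embedding is close enough to the enrolled one.

    The threshold is an assumed value; in practice it would be tuned on
    held-out speaker pairs.
    """
    if len(enrolled) != len(attempt):
        raise ValueError("embeddings must have the same length")
    return cosine_similarity(enrolled, attempt) >= threshold
```

Because both recordings go through the same encoder, enrolling a new speaker only requires storing one embedding rather than retraining the model.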