-
Voice authentication application built on a Siamese network approach.
-
Dataset used: VoxCeleb1 Indian
-
Trained three variants of the model:
- 1.4M parameters on 100K samples.
- 3M parameters on 10K samples.
- 900K parameters on 1M samples.
-
Preprocessed the audio dataset with Librosa, using the Fast Fourier Transform to extract vocal features.
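As an illustration of the FFT-based feature extraction described above, here is a minimal sketch of a short-time Fourier transform. It uses plain NumPy as a stand-in rather than Librosa, and the function name `stft_magnitude`, the frame size of 512, and the hop of 160 samples are assumptions made for this example, not values taken from the project.

```python
import numpy as np

def stft_magnitude(signal, n_fft=512, hop_length=160):
    """Return a magnitude spectrogram of shape (n_frames, n_fft // 2 + 1).

    Assumes len(signal) >= n_fft; no padding is applied at the edges.
    """
    window = np.hanning(n_fft)  # Hann window reduces spectral leakage
    n_frames = 1 + (len(signal) - n_fft) // hop_length
    # Slice the signal into overlapping frames, one row per frame.
    frames = np.stack([
        signal[i * hop_length : i * hop_length + n_fft]
        for i in range(n_frames)
    ])
    # Real-input FFT of each windowed frame; keep only the magnitudes.
    return np.abs(np.fft.rfft(frames * window, axis=1))

# Hypothetical input: one second of a 1 kHz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(tone)
peak_bin = int(spec.mean(axis=0).argmax())
peak_hz = peak_bin * sample_rate / 512
```

With these assumed settings each frequency bin spans 31.25 Hz, so the energy of the test tone concentrates in the bin centred on 1 kHz.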
-
Additionally used PyTorch (`torch`) to preprocess audio in real time.
-
Created a novel Siamese model architecture that extracts audio features for speaker identification and verification while keeping overhead and computational delay low.
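The core idea of Siamese verification can be sketched as follows: both inputs pass through the same encoder with the same weights, and the two embeddings are compared against a threshold. This is not the project's architecture. The one-layer encoder, the feature size of 64, the embedding size of 16, and the threshold of 0.8 are all hypothetical placeholders chosen to keep the example small.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for a trained encoder: one shared linear layer.
W = rng.standard_normal((64, 16)) * 0.1

def embed(features, weights):
    """Shared branch: every input is encoded with the same weights."""
    v = np.tanh(features @ weights)
    return v / np.linalg.norm(v)  # unit length, so dot product = cosine

def similarity(a, b, weights):
    """Cosine similarity between the embeddings of two inputs."""
    return float(embed(a, weights) @ embed(b, weights))

def verify(a, b, weights, threshold=0.8):
    """Accept the claim if the two embeddings are close enough."""
    return similarity(a, b, weights) >= threshold

# Hypothetical feature vectors: an enrolled sample and a near-duplicate.
enrolled = rng.standard_normal(64)
same_speaker = enrolled + 0.01 * rng.standard_normal(64)

accepted = verify(enrolled, same_speaker, W)
```

Because the weights are shared, the comparison is symmetric: swapping the two inputs yields the same similarity score. In a trained model the threshold would be tuned on held-out pairs rather than fixed in advance.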
-
Improved the model accuracy to around 90%.
-
📝 Currently working on a research paper detailing the architectural optimizations achieved by globally caching Hann windows and embedding the mel spectrograms into 450×80 matrices.
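The two optimizations named above can be illustrated with a rough sketch. The project's own implementation is not shown here, so everything below is an assumption: `functools.lru_cache` stands in for the global window cache, NumPy stands in for `torch`, and the 450×80 matrix is read as 450 time frames by 80 mel bins with zero-padding for short clips.

```python
from functools import lru_cache

import numpy as np

@lru_cache(maxsize=None)
def cached_hann(n_fft):
    """Build the Hann window once per size and reuse it afterwards."""
    window = np.hanning(n_fft)
    window.setflags(write=False)  # shared object, so keep it read-only
    return window

def fit_to_shape(mel, n_frames=450, n_mels=80):
    """Crop or zero-pad a (frames, n_mels) array to (n_frames, n_mels)."""
    if mel.shape[1] != n_mels:
        raise ValueError(f"expected {n_mels} mel bins, got {mel.shape[1]}")
    out = np.zeros((n_frames, n_mels), dtype=mel.dtype)
    keep = min(n_frames, mel.shape[0])
    out[:keep] = mel[:keep]  # longer clips are truncated at n_frames
    return out

# Hypothetical spectrograms: one shorter and one longer than 450 frames.
short_clip = fit_to_shape(np.ones((300, 80)))
long_clip = fit_to_shape(np.ones((600, 80)))
```

Fixing the input size lets every clip share one batch shape, and caching the window avoids recomputing it for each incoming audio frame, which matters most in a real-time path.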
rishi-more-2003/Voice-Authentication