This project aims to streamline dataset creation and labelling for Automatic Speech Recognition (ASR) systems. The project consists of three parts:
- AudioFile Ingestion
- Automatic Labelling with ASR (Whisper Model)
- Manual Labelling for improved dataset quality
The WebRTC-VAD implementation is taken from the py-webrtcvad example script: https://github.com/wiseman/py-webrtcvad/blob/master/example.py
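
WebRTC VAD works on short, fixed-length frames (10, 20, or 30 ms) of 16-bit mono PCM audio, so the audio has to be split into frames before it can be classified as speech or non-speech. The sketch below illustrates that framing step using only the standard library. It is modelled loosely on the referenced example script and is not necessarily the code this project uses; the `Frame` class, the `frame_generator` name, and the 30 ms frame length are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Frame:
    """A fixed-duration slice of 16-bit mono PCM audio (illustrative type)."""
    data: bytes       # raw PCM bytes for this frame
    timestamp: float  # start time of the frame in seconds
    duration: float   # length of the frame in seconds


def frame_generator(frame_duration_ms: int, audio: bytes, sample_rate: int) -> Iterator[Frame]:
    """Yield consecutive frames of the requested duration from raw PCM audio.

    Trailing bytes that do not fill a whole frame are dropped.
    """
    # 16-bit PCM uses 2 bytes per sample.
    bytes_per_frame = int(sample_rate * frame_duration_ms / 1000) * 2
    duration = frame_duration_ms / 1000.0
    timestamp = 0.0
    offset = 0
    while offset + bytes_per_frame <= len(audio):
        yield Frame(audio[offset:offset + bytes_per_frame], timestamp, duration)
        timestamp += duration
        offset += bytes_per_frame


# Example input (assumed): one second of silence at 16 kHz, split into 30 ms frames.
silence = b"\x00\x00" * 16000
frames = list(frame_generator(30, silence, 16000))
print(len(frames), len(frames[0].data))
```

With py-webrtcvad installed, each frame could then be passed to `webrtcvad.Vad(...).is_speech(frame.data, sample_rate)` to decide whether it contains speech.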