audio-classification

Downloading the audiosets from Google AudioSet

The code for downloading audio clips from Google AudioSet is in the audioset-preprocessing folder, adapted from https://github.com/aoifemcdonagh/audioset-processing. That folder has its own README file with detailed instructions.

Classification of Child and Adult Voice

The audio clips are converted to mel spectrograms, and a Vision Transformer (ViT) is then used to classify them, with the following parameters:

image_size = 224
patch_size = 16
num_classes = 2
dim = 768
depth = 12
heads = 12
mlp_dim = 3072
dropout = 0.0
emb_dropout = 0.1
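
These settings match the usual ViT-Base/16 layout (768-wide embeddings, 12 layers, 12 heads, 3072-wide MLP). As a rough illustration of what they imply, the sketch below works out the patch grid and token sequence length from the values above. The helper name `vit_shape_summary` is hypothetical and not part of this repository, and the calculation assumes a standard ViT that prepends one class token and splits `dim` evenly across heads.

```python
# Parameters copied from the list above.
config = {
    "image_size": 224,
    "patch_size": 16,
    "num_classes": 2,
    "dim": 768,
    "depth": 12,
    "heads": 12,
    "mlp_dim": 3072,
    "dropout": 0.0,
    "emb_dropout": 0.1,
}


def vit_shape_summary(cfg):
    """Derive basic shape figures for a standard ViT (hypothetical helper)."""
    if cfg["image_size"] % cfg["patch_size"] != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if cfg["dim"] % cfg["heads"] != 0:
        raise ValueError("dim must be divisible by heads")

    patches_per_side = cfg["image_size"] // cfg["patch_size"]
    num_patches = patches_per_side ** 2
    return {
        "patches_per_side": patches_per_side,
        "num_patches": num_patches,
        # Assumption: one extra class token is prepended to the patch tokens.
        "sequence_length": num_patches + 1,
        # Assumption: the embedding width is split evenly across heads.
        "head_dim": cfg["dim"] // cfg["heads"],
    }


summary = vit_shape_summary(config)
print(summary)
```

With a 224 x 224 spectrogram image and 16 x 16 patches, this gives a 14 x 14 grid, that is, 196 patch tokens (197 including the class token).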

The file inference.py performs live audio classification.
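
The internals of inference.py are not described here. As a generic illustration only, live classification usually needs the incoming microphone chunks regrouped into fixed-length windows before each one is turned into a spectrogram. The sketch below shows one way to do that; the function name `sliding_windows` and its parameters are hypothetical and may not correspond to anything in inference.py or microphone_stream.py.

```python
from collections import deque
from itertools import islice
from typing import Iterable, Iterator, List


def sliding_windows(
    chunks: Iterable[List[float]], window_size: int, hop_size: int
) -> Iterator[List[float]]:
    """Yield fixed-length, possibly overlapping windows from a chunked stream.

    Hypothetical helper: `chunks` stands in for whatever a microphone
    callback delivers; `hop_size` is how far the window advances each step.
    """
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")
    if hop_size > window_size:
        raise ValueError("hop_size must not exceed window_size")

    buffer = deque()
    for chunk in chunks:
        buffer.extend(chunk)
        # Emit every complete window currently available in the buffer.
        while len(buffer) >= window_size:
            yield list(islice(buffer, window_size))
            # Advance by hop_size samples; the remainder overlaps the next window.
            for _ in range(hop_size):
                buffer.popleft()


# Usage with small integer "samples" so the windows are easy to read.
stream = [[0, 1, 2], [3, 4], [5, 6, 7, 8]]
windows = list(sliding_windows(stream, window_size=4, hop_size=2))
print(windows)
```

Each yielded window would then go through the same mel spectrogram conversion used for training before being passed to the model.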
