Speech-Emotion-Recognition

The idea behind this project was to build a machine learning model that can detect emotions from speech.

Analyzing audio signals

Datasets:

RAVDESS: This dataset contains around 1,400 audio files recorded by 24 actors (12 male and 12 female). Each actor records short clips in 8 different emotions, coded as 1 = neutral, 2 = calm, 3 = happy, 4 = sad, 5 = angry, 6 = fearful, 7 = disgust, 8 = surprised.
Each audio file is named so that the third hyphen-separated field of the file name encodes the emotion; the emotion digit is the character at zero-based index 7 (for example, the 6 in 03-01-06-01-02-01-12.wav means fearful).
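
A minimal sketch of how the emotion label could be read from a file name, assuming the standard RAVDESS naming scheme of seven hyphen-separated two-digit fields. The helper name `emotion_from_filename` and the example path are hypothetical and are not taken from the project notebook:

```python
from pathlib import Path

# Emotion codes as listed in the dataset description above.
EMOTIONS = {
    1: "neutral", 2: "calm", 3: "happy", 4: "sad",
    5: "angry", 6: "fearful", 7: "disgust", 8: "surprised",
}


def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS-style file name."""
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        raise ValueError(f"unexpected file name: {path}")
    # The third field holds the emotion code, e.g. "06" -> 6 -> "fearful".
    return EMOTIONS[int(fields[2])]


print(emotion_from_filename("Actor_12/03-01-06-01-02-01-12.wav"))
```

Splitting on hyphens rather than indexing a fixed character position keeps the lookup working even if a directory prefix or a different extension is present.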

Feature Extraction

The next step is to extract features from the audio files that help the model distinguish between them. For feature extraction we use librosa, a Python library for audio analysis.

TIME DOMAIN FEATURES:
- Zero Crossing Rate
- Root Mean Square Energy
FREQUENCY DOMAIN FEATURES:
- Mel spectrogram features
- Chroma energy distribution normalised statistics (CENS)
- Mel frequency cepstral coefficients (MFCC)
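
To make the two time-domain features concrete, here is a standard-library sketch of what zero crossing rate and root mean square energy measure on each frame. It is an illustration of the definitions, not the project's extraction code: librosa provides these as `librosa.feature.zero_crossing_rate` and `librosa.feature.rms`, and the frequency-domain features as `librosa.feature.melspectrogram`, `librosa.feature.chroma_cens` and `librosa.feature.mfcc`. The frame length of 2048 and hop of 512 are assumptions, and the input is a synthetic sine wave rather than project data:

```python
import math


def frames(signal, frame_length, hop_length):
    """Split a list of samples into overlapping frames (no padding)."""
    return [
        signal[start:start + frame_length]
        for start in range(0, len(signal) - frame_length + 1, hop_length)
    ]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root mean square of the samples in one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


# Synthetic input: one second of a 440 Hz sine sampled at 8 kHz.
sr = 8000
signal = [0.5 * math.sin(2 * math.pi * 440 * n / sr) for n in range(sr)]

zcr = [zero_crossing_rate(f) for f in frames(signal, 2048, 512)]
rms = [rms_energy(f) for f in frames(signal, 2048, 512)]
print(len(zcr), zcr[0], rms[0])
```

Per-frame values like these are typically summarised (for example by their mean) before being passed to a classifier, so that every clip yields a feature vector of the same length.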

Building Models

Algorithms Used

- Decision Tree
- RandomForestClassifier
- GradientBoostingClassifier
- KNeighborsClassifier
- MLPClassifier
- SVC
- LightGBM
- Quadratic Discriminant Analysis (QDA)
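
The sketch below shows one way such a comparison could be set up with scikit-learn. It is not the project's training code: the data is a synthetic 8-class stand-in generated with `make_classification`, all hyperparameters are illustrative assumptions, and LightGBM is left out because it lives in a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the extracted audio features (8 classes, like the 8 emotions).
X, y = make_classification(
    n_samples=800, n_features=20, n_informative=10, n_redundant=0,
    n_classes=8, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative settings only; the tuned values used in the project are not known.
models = {
    "DecisionTreeClassifier": DecisionTreeClassifier(random_state=0),
    "RandomForestClassifier": RandomForestClassifier(n_estimators=100, random_state=0),
    "GradientBoostingClassifier": GradientBoostingClassifier(n_estimators=50, random_state=0),
    "KNeighborsClassifier": KNeighborsClassifier(n_neighbors=5),
    "MLPClassifier": MLPClassifier(hidden_layer_sizes=(64,), max_iter=300, random_state=0),
    "SVC": SVC(),
    "QuadraticDiscriminantAnalysis": QuadraticDiscriminantAnalysis(),
}

# Fit every model on the same split and record its held-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:30s} {score:.3f}")
```

Using one fixed, stratified split for all models keeps the accuracies comparable; the numbers printed for synthetic data say nothing about performance on RAVDESS.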

Predictions

After tuning the model, we tested it by predicting the emotions for the test data. For a model with the given accuracy, these are a sample of the actual vs. predicted values.
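
A small sketch of how actual and predicted labels could be compared. The function name `compare_predictions` is hypothetical, and the label lists are made-up placeholders for illustration, not results from the project:

```python
from collections import Counter


def compare_predictions(actual, predicted):
    """Return overall accuracy and a count of each (actual, predicted) mismatch."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(actual, predicted))
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    confusions = Counter((a, p) for a, p in pairs if a != p)
    return accuracy, confusions


# Hypothetical labels for illustration only; not output from the project's model.
actual = ["happy", "sad", "angry", "calm", "fearful", "sad"]
predicted = ["happy", "calm", "angry", "calm", "fearful", "sad"]

accuracy, confusions = compare_predictions(actual, predicted)
print(f"accuracy: {accuracy:.2f}")
for (a, p), n in confusions.most_common():
    print(f"actual={a} predicted={p} count={n}")
```

Listing the mismatched pairs alongside the accuracy shows which emotions the model tends to confuse, which a single accuracy figure hides.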

Testing out with live voices.

You can test your own voice on the deployed website: Speech Emotion Recognition Web

Conclusion

Building the models was a challenging task, as it involved a lot of trial and error, tuning, etc. The model was tuned to detect emotions with more than 70% accuracy. Accuracy could likely be increased further by including more audio files for training.

Aayush-Gangwar/Speech-Emotion-Recognition
