This project implements the speech emotion recognition strategy proposed in the paper "Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching".
CPU Host:
- Ubuntu 16.04
- Python 3.5
- TensorFlow 1.7.0
GPU Server:
- tensorflow-gpu 1.7.0
- NVIDIA driver version: 390
- CUDA 9.0
- cuDNN 7.0
Set the directory where the datasets should be saved in path.py.
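As an illustration only, a path module of this kind could look like the sketch below. Every name in it (`DATA_ROOT`, `EMODB_ROOT`, `ENTERFACE_ROOT`, `ensure_dirs`, the `SER_DATA_ROOT` environment variable and the folder names) is hypothetical; check the actual path.py for the variables it really defines.

```python
# Hypothetical sketch of a path.py-style module; the real names may differ.
import os
from pathlib import Path

# Root directory for downloaded corpora. SER_DATA_ROOT is an assumed
# environment variable; the fallback is a "datasets" folder in the home dir.
DATA_ROOT = Path(os.environ.get("SER_DATA_ROOT", Path.home() / "datasets"))

# Per-dataset locations derived from the root (folder names are assumptions).
EMODB_ROOT = DATA_ROOT / "emodb"
ENTERFACE_ROOT = DATA_ROOT / "enterface05"


def ensure_dirs(*dirs):
    """Create each directory if it is missing and return them as Path objects."""
    created = []
    for d in dirs:
        d = Path(d)
        d.mkdir(parents=True, exist_ok=True)
        created.append(d)
    return created
```

Deriving every dataset folder from a single root means only one value has to change when the data moves to another disk.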
Download the datasets
- Berlin Dataset
$ python load_emodb.py
- eNTERFACE Dataset
Download the eNTERFACE05 dataset and update the dataset root.
Run preprocessing
$ python melSpec.py
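The referenced paper works on segment-level features, so each utterance's spectrogram has to be cut into fixed-length, overlapping segments before it reaches the network. The sketch below shows one way to do that slicing. The 64-frame window and 30-frame shift are assumed values, and `segment_bounds` and `slice_segments` are hypothetical helpers, not functions taken from melSpec.py.

```python
def segment_bounds(num_frames, win=64, hop=30):
    """Return (start, end) frame index pairs of fixed-length windows.

    Utterances shorter than one window yield no segments in this sketch;
    padding them instead would be an alternative design.
    """
    if win <= 0 or hop <= 0:
        raise ValueError("win and hop must be positive")
    return [(s, s + win) for s in range(0, num_frames - win + 1, hop)]


def slice_segments(frames, win=64, hop=30):
    """Cut a list of per-frame feature vectors into overlapping segments."""
    return [frames[s:e] for s, e in segment_bounds(len(frames), win, hop)]
```

For example, `segment_bounds(130)` gives three windows starting at frames 0, 30 and 60. The trailing frames that do not fill a whole window are dropped.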
Fine-tune AlexNet with TensorFlow
$ python finetune.py
Discriminant Temporal Pyramid Matching
$ python dtpm.py -s
$ python dtpm.py -n
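According to the referenced paper, DTPM combines temporal pyramid matching with Lp-norm pooling to turn a sequence of segment-level features into one utterance-level vector. The sketch below illustrates only the pooling side of that idea. It assumes pyramid levels that split the sequence into 1, 2 and 4 equal parts and a fixed exponent `p`, and it leaves out the discriminant step that selects the pooling parameters. `lp_pool` and `temporal_pyramid` are hypothetical names, not functions from dtpm.py.

```python
def lp_pool(vectors, p=2.0):
    """Pool equal-length vectors dimension-wise with a normalized Lp norm.

    p = 1 gives the mean of absolute values; large p approaches max pooling.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    n = len(vectors)
    dim = len(vectors[0])
    return [
        (sum(abs(v[d]) ** p for v in vectors) / n) ** (1.0 / p)
        for d in range(dim)
    ]


def temporal_pyramid(features, levels=(1, 2, 4), p=2.0):
    """Concatenate Lp-pooled vectors over every temporal cell of every level."""
    t = len(features)
    if t == 0:
        raise ValueError("need at least one segment feature")
    pooled = []
    for parts in levels:
        for k in range(parts):
            start = k * t // parts
            end = (k + 1) * t // parts
            # Keep at least one segment per cell when the utterance is short.
            cell = features[start:max(end, start + 1)]
            pooled.extend(lp_pool(cell, p))
    return pooled
```

With the default levels the output has 7 times the dimension of one segment feature (1 + 2 + 4 cells), whatever the number of segments, which gives a classifier such as an SVM a fixed-size input.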
Support Vector Machine
$ python svm.py
Reference Models:
- AlexNet
- SVM
Reference Papers:
- ImageNet Classification with Deep Convolutional Neural Networks
- Speech Emotion Recognition Using Deep Convolutional Neural Network and Discriminant Temporal Pyramid Matching
- Geometric ℓp-norm feature pooling for image classification