- Implemented different computer vision techniques in order to detect emotion in images and videos. Utilised `SVM`, `MLP` and `CNN` models.
- For the `SVM` and `MLP` I also implemented three detection methods: Histograms of Oriented Gradients (HOG), Scale-Invariant Feature Transform (SIFT) and Oriented FAST and Rotated BRIEF (ORB) to compare performance (see the feature-extraction sketch after this list).
- Used Grid Search for `SVM` and `MLP` hyperparameter tuning (also shown in the sketch below).
- Implemented custom test functions to load and predict images and videos (see the test-function sketch below).
- Python Version: `3.8.11`
- Libraries and Packages: `numpy`, `pandas`, `seaborn`, `torch`, `torchvision`, `PIL`, `cv2`, `sklearn`, `skimage`
- The best performing detection method is HOG across both the MLP and SVM models, which achieved F1 scores of 0.64 and 0.65 respectively.
- I tried several different CNN architectures including `AlexNet`, `VGG16`, `GoogLeNet`, `ResNeXt` and a vanilla `MLP`. The best performing CNN was `VGG16`, which achieved an accuracy score of 0.81 after 5500 iterations (see the `VGG16` fine-tuning sketch below). When testing, it was observed that the `VGG16` model was incorrect in some instances and failed to adequately detect emotions on multiple faces in a given frame.
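
The snippet below is a minimal sketch, not the exact code in `SVM`/`MLP`, of how HOG features can be extracted and the two classifiers tuned with Grid Search. The 48x48 face size, the random placeholder data and the parameter grids are illustrative assumptions only.

```python
import numpy as np
import cv2
from skimage.feature import hog
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

def hog_features(gray_img):
    # HOG descriptor of a single grayscale face crop
    return hog(gray_img, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm="L2-Hys")

def orb_features(gray_img, n_keypoints=32):
    # ORB alternative: a fixed-size vector of binary descriptors
    # (cv2.SIFT_create() can be swapped in the same way)
    orb = cv2.ORB_create(nfeatures=n_keypoints)
    _, desc = orb.detectAndCompute(gray_img, None)
    if desc is None:
        desc = np.zeros((n_keypoints, 32), dtype=np.uint8)
    return np.resize(desc, (n_keypoints, 32)).flatten().astype(np.float32)

# Placeholder data: random 48x48 "faces" and labels stand in for the Dataset images
rng = np.random.default_rng(0)
X_imgs = rng.integers(0, 256, size=(200, 48, 48), dtype=np.uint8)
y = rng.integers(0, 7, size=200)

X = np.array([hog_features(img) for img in X_imgs])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Grid Search over SVM and MLP hyperparameters (illustrative grids only)
svm_grid = GridSearchCV(SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}, cv=3)
svm_grid.fit(X_train, y_train)
print("SVM", svm_grid.best_params_, svm_grid.score(X_test, y_test))

mlp_grid = GridSearchCV(MLPClassifier(max_iter=500),
                        {"hidden_layer_sizes": [(128,), (256, 128)],
                         "alpha": [1e-4, 1e-3]}, cv=3)
mlp_grid.fit(X_train, y_train)
print("MLP", mlp_grid.best_params_, mlp_grid.score(X_test, y_test))
```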
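
Below is a hedged sketch of what image and video test functions along these lines could look like; it is not the actual code in `Test_Functions`. It assumes a fitted scikit-learn classifier, the `hog_features` helper from the previous sketch, a `label_names` list mapping class indices to emotion names, and OpenCV's Haar cascade for face detection, which may differ from the repo's approach.

```python
import cv2

# OpenCV's bundled Haar cascade for frontal-face detection (an assumption;
# the repo's test functions may locate faces differently)
FACE_CASCADE = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def predict_frame(frame_bgr, clf, featurize, label_names):
    """Return [(bounding box, predicted emotion), ...] for every detected face."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    results = []
    for (x, y, w, h) in FACE_CASCADE.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
        face = cv2.resize(gray[y:y + h, x:x + w], (48, 48))
        pred = clf.predict([featurize(face)])[0]
        results.append(((x, y, w, h), label_names[pred]))
    return results

def predict_image(path, clf, featurize, label_names):
    return predict_frame(cv2.imread(path), clf, featurize, label_names)

def predict_video(path, clf, featurize, label_names, every_n=10):
    cap = cv2.VideoCapture(path)
    preds, i = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % every_n == 0:   # sample every Nth frame to keep prediction fast
            preds.append(predict_frame(frame, clf, featurize, label_names))
        i += 1
    cap.release()
    return preds

# Example usage (names are placeholders):
# predict_image("face.jpg", svm_grid.best_estimator_, hog_features, EMOTIONS)
```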
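
For the CNN, the following is a minimal sketch of fine-tuning torchvision's `VGG16` for emotion classification, not the training script in `CNN`; the seven-class assumption, the `Dataset/train` folder layout and the hyperparameters are illustrative only and are not the settings used to reach the 0.81 accuracy reported above.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

NUM_CLASSES = 7   # assumption: seven emotion categories
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # VGG16 expects 3-channel input
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Hypothetical folder layout: Dataset/train/<emotion>/<image>.jpg
train_set = datasets.ImageFolder("Dataset/train", transform=transform)
loader = torch.utils.data.DataLoader(train_set, batch_size=32, shuffle=True)

model = models.vgg16(pretrained=True)
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)  # replace the final 1000-way FC layer
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

model.train()
for epoch in range(5):   # illustrative; not the 5500 iterations reported above
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```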
Code, data and files for the above can be found in the following files:
- `CNN`, `MLP`, `SVM`: code for all models and outcomes
- `Dataset`: the image data used to train all models. Note: for copyright reasons the images are not included in this repo
- `Test_Functions`: contains the code for the test functions used for all models
- `Videos`: test videos for the CNN