Recognizing a person's facial expressions in a video takes two steps. The first step is face detection: the algorithm watches the video and locates any faces in each frame. This detection step is implemented with the OpenCV (Open Source Computer Vision) library in Python.
After the face is detected, the algorithm uses a CNN model to classify the person's facial expression.
The dataset comes from the Kaggle facial expression recognition competition. It contains over 35,000 images, each labeled with one of the following 7 emotions: (0=Angry, 1=Disgust, 2=Fear, 3=Happy, 4=Sad, 5=Surprise, 6=Neutral).
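The label encoding above maps directly to a lookup for turning a model's predicted class index back into an emotion name:

```python
# Index-to-name mapping for the competition's 7 emotion labels.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Sad", "Surprise", "Neutral"]

def label_name(idx):
    """Map a predicted class index (0-6) to its emotion name."""
    return EMOTIONS[idx]
```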
The PyTorch model uses a ResNet-50 architecture with transfer learning, while the Keras model uses a simple 4-layer convolutional neural network (CNN). Both models reach roughly the same accuracy (around 40%), but once the OpenCV library is used to detect and crop the faces first, the predictions become much more accurate, as you can see from the video-capture images below.
The following sources were used to help create the models in this repo:
- https://github.com/omar178/Emotion-recognition
- https://github.com/MauryaRitesh/Facial-Expression-Detection-V2