facial expression recognitionhji/cs519_slides/facial... · what is facial expression? one or more...

Post on 28-Jun-2020


Facial Expression Recognition: Convolutional Neural Network

Yekun Yang


What is Facial Expression?

● One or more motions or positions of the muscles beneath the skin of the face

● Babies can already tell the difference between happy and sad at just 14 months old

● Even some animals, such as dogs, can sense human facial expressions, but reading them requires the animal to have experience with humans


7 Basic Emotions

● Happy

● Sad

● Surprise

● Fear

● Anger

● Disgust

● Neutral


Why is facial emotion recognition important?

● Retailers may use emotion metrics to evaluate customers’ interest.

● Healthcare providers can provide better service by using additional information about patients’ emotional state during treatment.

● Entertainment producers can monitor audience engagement at events to consistently create the desired content.


Project Model

● Data

● CNN model

● Analysis


Data

● Kaggle Facial Expression Recognition Challenge (FER2013)

● 35,887 pre-cropped, 48-by-48-pixel gray-scale images


Data

● 28,709 labeled faces for training

● The remaining images form two test sets (3,589 images per set)
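A minimal sketch of how one row of the dataset could be decoded, assuming the column layout of the Kaggle `fer2013.csv` file: an `emotion` label, a `pixels` string of 2,304 space-separated gray values, and a `Usage` split name. The helper name `parse_fer_row` and the sample row are made up for illustration and are not part of the project.

```python
IMAGE_SIDE = 48  # FER2013 images are 48 by 48 pixels


def parse_fer_row(row):
    """Turn one CSV row (a dict of strings) into (label, 48x48 pixel grid, usage)."""
    values = [int(v) for v in row["pixels"].split()]
    expected = IMAGE_SIDE * IMAGE_SIDE
    if len(values) != expected:
        raise ValueError("expected %d pixels, got %d" % (expected, len(values)))
    # Cut the flat pixel list into 48 rows of 48 values each.
    image = [values[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE] for r in range(IMAGE_SIDE)]
    return int(row["emotion"]), image, row["Usage"]


# Hypothetical row: a uniformly gray image with label 3.
sample = {"emotion": "3", "pixels": " ".join(["128"] * 2304), "Usage": "Training"}
label, image, usage = parse_fer_row(sample)

# The split sizes quoted on the slides: one training set plus two test sets.
total_images = 28709 + 2 * 3589
```

In practice the rows would come from `csv.DictReader`, and the `Usage` value would decide whether an image goes to the training set or to one of the two test sets.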


CNN Structure: Input Layer

The input layer has predetermined, fixed dimensions, so each image must be pre-processed to match them before it can be fed into the layer.
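As a rough illustration of that pre-processing step, the sketch below checks the image size, scales gray values to the range 0 to 1, and adds a channel axis so the result has shape 48 x 48 x 1. The scaling choice and the names `preprocess` and `one_hot` are assumptions; the slides do not say which steps the project actually used.

```python
def preprocess(image, side=48):
    """Validate the size, scale 0..255 gray values to 0..1, add a channel axis."""
    if len(image) != side or any(len(row) != side for row in image):
        raise ValueError("image must be %d by %d pixels" % (side, side))
    return [[[pixel / 255.0] for pixel in row] for row in image]


def one_hot(label, num_classes=7):
    """Encode an integer class label as a one-hot vector (7 basic emotions)."""
    if not 0 <= label < num_classes:
        raise ValueError("label out of range")
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


white = [[255] * 48 for _ in range(48)]
tensor = preprocess(white)   # nested lists with shape 48 x 48 x 1
target = one_hot(3)
```

One-hot targets are what a `categorical_crossentropy` loss expects, which is why the label is encoded alongside the image.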


Convolutional Layer

● Consists of a set of learnable filters (or kernels)

● Every filter has a small receptive field but extends through the full depth of the input volume

● Generates feature maps, one per filter, that show which pixel patterns the filter enhances
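To make the filter idea concrete, here is a small sketch of a single filter sliding over a single-channel image (stride 1, no padding). The function name `conv2d_single`, the toy image, and the hand-picked edge kernel are illustrative assumptions; in a trained network the kernel values are learned.

```python
def conv2d_single(image, kernel):
    """Slide one 2-D kernel over a 2-D image (stride 1, no padding).

    Like most deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before it is applied.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kh):
                for j in range(kw):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy image: dark on the left, bright on the right.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Kernel that responds to a dark-to-bright transition from left to right.
vertical_edge = [[-1, 1], [-1, 1]]
feature_map = conv2d_single(image, vertical_edge)
# feature_map is [[0.0, 2.0, 0.0]] * 3: the response peaks on the edge column.
```

The feature map is large only where the window straddles the edge, which is the sense in which a filter "enhances" certain pixel patterns.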


Case 1


Case 2


Case 3


Case 4: weight values of (1, 0.3) and (0.1, 5)


Case 5


Pooling Layer, Padding & Stride

● Pooling is a dimension reduction technique usually applied after one or several convolutional layers.

● Max pooling keeps only the largest value in each window.

● Padding adds zeros around the edge of the image to preserve a certain output size.

● With higher stride values, the filter moves a larger number of pixels at a time and hence produces smaller output volumes.
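The effect of padding and stride on the output size can be sketched with the usual formula, floor((n + 2 * padding - kernel) / stride) + 1, together with a plain max-pooling routine. The function names are hypothetical, and the 48-pixel examples simply reuse the FER2013 image size.

```python
def output_size(n, kernel, stride=1, padding=0):
    """Output width (or height) of a convolution or pooling window."""
    return (n + 2 * padding - kernel) // stride + 1


def max_pool(feature_map, size=2, stride=2):
    """Keep the largest value in each size-by-size window."""
    out_h = output_size(len(feature_map), size, stride)
    out_w = output_size(len(feature_map[0]), size, stride)
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


same_size = output_size(48, kernel=3, stride=1, padding=1)  # zero padding keeps 48
no_padding = output_size(48, kernel=3, stride=1)             # shrinks to 46
strided = output_size(48, kernel=3, stride=2)                # larger stride gives 23
pooled = max_pool([
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
])
```

A 2-by-2 pool with stride 2 halves each spatial dimension, so a 48-by-48 feature map becomes 24-by-24.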


Dense Layer

● Fully connected network

● The more layers/nodes added to the network, the better it can pick up signals.

● On the other hand, the model also becomes increasingly prone to overfitting the training data.

● Dropout is used to counter this overfitting.
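As a sketch of what dropout does to a dense layer's activations, the function below implements "inverted" dropout: during training each value is zeroed with probability `rate` and the survivors are scaled by 1 / (1 - rate), and at inference the values pass through unchanged. The function name and the fixed seed are assumptions for illustration only.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: drop each value with probability `rate` while training."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training:
        return list(activations)  # inference: nothing is dropped or rescaled
    keep = 1.0 - rate
    # Scaling survivors by 1/keep preserves the expected sum of the layer output.
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hidden = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(hidden, rate=0.5, rng=rng)
test_out = dropout(hidden, rate=0.5, rng=rng, training=False)
```

Because a different random subset of nodes is silenced on every training step, the network cannot rely on any single node, which is how dropout reduces overfitting.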


Output Layer

● Softmax

● How many layers are good for the model?

● As a rule of thumb, the deeper the better
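For reference, softmax turns the raw scores of the last layer into probabilities that sum to 1. The sketch below is a plain-Python version; the class order in `EMOTIONS` and the score values are made up for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class order and scores for one face image.
EMOTIONS = ["Happy", "Sad", "Surprise", "Fear", "Anger", "Disgust", "Neutral"]
scores = [2.0, 0.5, 1.0, -1.0, 0.0, -2.0, 0.3]
probabilities = softmax(scores)
predicted = EMOTIONS[probabilities.index(max(probabilities))]
```

The class with the highest score always gets the highest probability, so the prediction is simply the largest entry.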


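Keras Sample Code

import keras
from keras.models import Sequential

# input_shape, filters, filtersize, epochs, batchsize, images and label are
# supplied by the caller, e.g. input_shape=(48, 48, 1) for FER2013 images.
model = Sequential()
model.add(keras.layers.InputLayer(input_shape=input_shape))
model.add(keras.layers.Conv2D(filters, filtersize, strides=(1, 1), padding='same', activation='relu'))
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Flatten())
# One softmax unit per class: 2 in this sample, 7 for the basic emotions.
model.add(keras.layers.Dense(units=2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# label must be one-hot encoded to match categorical_crossentropy.
model.fit(images, label, epochs=epochs, batch_size=batchsize, validation_split=0.3)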


Related Work

● AlexNet (2012)

● ZF Net (2013)

● VGG Net (2014)

● GoogLeNet (2015)

● Microsoft ResNet (2015)


AlexNet (2012)

● Top 5 test error rate of 15.4% (top 5 error is the rate at which, given an image, the model does not include the correct label among its top 5 predictions)

● Trained on ImageNet data, which contains over 15 million annotated images from a total of over 22,000 categories

● Used ReLU for the nonlinearity functions and trained the model using mini-batch stochastic gradient descent.
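The top 5 error metric quoted for these networks can be computed as in the sketch below. The function name `top_k_error` and the tiny three-class score lists are hypothetical; they only show the mechanics.

```python
def top_k_error(scores_per_image, true_labels, k=5):
    """Fraction of images whose true label is not among the k highest scores."""
    misses = 0
    for scores, truth in zip(scores_per_image, true_labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if truth not in ranked[:k]:
            misses += 1
    return misses / len(true_labels)


# Made-up scores for two images over three classes.
scores = [
    [0.1, 0.5, 0.4],  # true class 2 is the second-best guess
    [0.7, 0.2, 0.1],  # true class 1 is the second-best guess
]
labels = [2, 1]
top1 = top_k_error(scores, labels, k=1)  # both best guesses are wrong
top2 = top_k_error(scores, labels, k=2)  # both true labels are in the top 2
```

Top 5 error is always less than or equal to top 1 error, which is why it is the more forgiving figure for a 1,000-class problem.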


ZF Net (2013)

● 11.2% error rate

● Trained on only 1.3 million images

● Instead of the 11x11 filters that AlexNet used in its first layer, ZF Net used 7x7 filters and a smaller stride.

● Used ReLUs for the activation functions and cross-entropy loss for the error function, and trained using mini-batch stochastic gradient descent.


VGG Net (2014)

● 7.3% error rate

● The use of only 3x3 filters is quite different from AlexNet’s 11x11 filters in the first layer and ZF Net’s 7x7 filters. The authors’ reasoning is that two stacked 3x3 conv layers have an effective receptive field of 5x5.

● Used ReLU layers after each conv layer and trained with mini-batch gradient descent.
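The receptive-field argument can be checked with a few lines of arithmetic. The sketch assumes stride-1 convolutions and ignores biases; the function names and the choice of 64 channels are illustrative.

```python
def stacked_receptive_field(kernel_sizes):
    """Effective receptive field of stacked stride-1 conv layers."""
    field = 1
    for k in kernel_sizes:
        field += k - 1  # each layer widens the field by (kernel size - 1)
    return field


def conv_weights(kernel, channels_in, channels_out):
    """Number of weights in one conv layer, ignoring biases."""
    return kernel * kernel * channels_in * channels_out


two_3x3 = stacked_receptive_field([3, 3])     # 5, the same as one 5x5 filter
three_3x3 = stacked_receptive_field([3, 3, 3])  # 7, the same as one 7x7 filter

channels = 64  # assumed channel count, kept the same in and out
weights_two_3x3 = 2 * conv_weights(3, channels, channels)
weights_one_5x5 = conv_weights(5, channels, channels)
```

With equal channel counts, two 3x3 layers need 18 weights per channel pair against 25 for one 5x5 layer, and they add an extra nonlinearity in between.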


GoogLeNet (2015)

● Top 5 error rate of 6.7%

● Used 9 Inception modules in the whole architecture, with over 100 layers in total when every building block is counted.

● No use of fully connected layers! An average pool takes a 7x7x1024 volume down to a 1x1x1024 volume instead, which saves a huge number of parameters.

● Utilized concepts from R-CNN for its detection model.
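To see where the parameter saving comes from, the sketch below averages each channel of a small volume and compares the weight count of a classifier fed by the flattened 7x7x1024 volume with one fed by the pooled 1x1x1024 volume. The 1,000-class output is an assumption (the ImageNet class count), and the function name is hypothetical.

```python
def global_average_pool(volume):
    """Average each channel of an H x W x C volume (nested lists) to one value."""
    height, width, channels = len(volume), len(volume[0]), len(volume[0][0])
    return [
        sum(volume[r][c][ch] for r in range(height) for c in range(width))
        / (height * width)
        for ch in range(channels)
    ]


# Tiny 2 x 2 x 2 volume standing in for the real 7 x 7 x 1024 one.
tiny = [
    [[1, 10], [2, 20]],
    [[3, 30], [4, 40]],
]
pooled = global_average_pool(tiny)  # one mean per channel

num_classes = 1000                             # assumed classifier width
weights_flattened = 7 * 7 * 1024 * num_classes  # classifier on the raw volume
weights_pooled = 1 * 1 * 1024 * num_classes     # classifier on the pooled volume
```

Under these assumptions the pooled version needs 49 times fewer weights in that final step.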


Microsoft ResNet (2015)

● 3.6% error rate!

● The idea behind a residual block is that the input x goes through a conv-relu-conv series, which gives some F(x). That result is then added to the original input x.

● 152 layers...

● The group tried a 1202-layer network but got lower test accuracy, presumably due to overfitting.
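The residual idea, output = F(x) + x, can be sketched on plain vectors. To keep the example short, small matrix multiplications stand in for the two convolutions; this substitution and all names below are assumptions, not the ResNet implementation.

```python
def relu(vector):
    return [max(0.0, v) for v in vector]


def linear(vector, weights):
    """Matrix-vector product; a stand-in for a convolution in this sketch."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute F(x) with a conv-relu-conv style series, then add the input x."""
    fx = linear(relu(linear(x, w1)), w2)
    return [f + xi for f, xi in zip(fx, x)]


x = [1.0, -2.0]
zeros = [[0.0, 0.0], [0.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]

unchanged = residual_block(x, zeros, zeros)      # F(x) = 0, so the block returns x
shifted = residual_block(x, identity, identity)  # F(x) = relu(x), so x + relu(x)
```

When the weights are zero the block reduces to the identity, which hints at why very deep stacks of such blocks remain trainable: each block only has to learn a correction to its input.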


Larsson, Gustav, Michael Maire, and Gregory Shakhnarovich. “FractalNet: Ultra-Deep Neural Networks without Residuals”


Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. “Deep Residual Learning for Image Recognition”


Result Analysis

● The final CNN had a validation accuracy of around 55%.

● The model performs well on positive emotions but is weaker, on average, on negative emotions.

● Happy: 75%

● Surprise: 70%

● Sad: 40%

● It tends to misclassify angry, fear and neutral faces as sad.
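Per-class figures and confusions like the ones above can be derived from the true and predicted labels. The sketch below shows one way to do it; the label lists are tiny made-up examples, not the project's predictions, and the function names are hypothetical.

```python
from collections import Counter


def per_class_accuracy(true_labels, predicted_labels):
    """Accuracy for each true class: correct predictions / images of that class."""
    totals = Counter(true_labels)
    correct = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: correct[label] / totals[label] for label in totals}


def confusions(true_labels, predicted_labels):
    """Count (true, predicted) pairs for every misclassified image."""
    return Counter((t, p) for t, p in zip(true_labels, predicted_labels) if t != p)


# Illustrative labels only.
truth = ["happy", "happy", "sad", "angry", "angry", "neutral"]
predicted = ["happy", "happy", "sad", "sad", "angry", "sad"]

accuracy = per_class_accuracy(truth, predicted)
mistakes = confusions(truth, predicted)
```

Sorting `mistakes` by count shows which classes are most often confused, for example faces of one emotion being labelled as sad.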


References

● Jostine Ho, https://github.com/JostineHo/mememoji

● Dan Duncan, https://github.com/danduncan/HappyNet

● Adit Deshpande, https://adeshpande3.github.io/adeshpande3.github.io/The-9-Deep-Learning-Papers-You-Need-To-Know-About.html

● http://cs231n.stanford.edu/