deep face recognition - nvidiaon-demand.gputechconf.com/gtc/2018/presentation/s... · herta deep...
TRANSCRIPT
Deep Face Recognition Challenges and Tips for Real-life Deployment
1 Deep Face Recognition
2 Public DBs
3 Public models
4 Managing imbalance
5 Embeddings
6 Conclusions
HERTA
www.hertasecurity.com
Deep Face Recognition
GPU-powered face recognition
Offices in Barcelona, Madrid, London, Los Angeles
Crowds, unconstrained
Deep Face Recognition
Large training DBs, >100K images, >1K subjects (Public DBs)
Public models (Inception, VGG, ResNet, SENet…), close to state-of-the-art
Typically, embedding layer (yielding facial descriptor) feeds one-hot encoding
Unconstrained (in-the-wild) environments
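A minimal sketch of the setup described above, in which an embedding layer yields the facial descriptor and feeds a one-hot identity classifier during training. The dimensions, the random weights, and all function names are illustrative assumptions, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, N_SUBJECTS = 512, 1000  # assumed sizes, not from the slides

# Hypothetical classification head: maps the descriptor to one score per subject.
W = rng.normal(scale=0.01, size=(EMB_DIM, N_SUBJECTS))

def identity_probs(embedding):
    """Softmax over subjects; trained against one-hot identity labels."""
    logits = embedding @ W
    logits = logits - logits.max()  # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def cosine_similarity(a, b):
    """At deployment, descriptors are compared directly, without the head."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

emb_a = rng.normal(size=EMB_DIM)                 # stand-in for a CNN output
emb_b = emb_a + 0.1 * rng.normal(size=EMB_DIM)   # slightly perturbed copy
```

The classifier is only needed for training on known subjects; comparing two faces uses the descriptors alone, which is why the embedding generalizes to unseen identities.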
Public DBs
CWF: 10.6K subjects
LFW: 5.7K subjects
VGG Face: 2.6K subjects
VGG Face 2: 9.1K subjects
IJB-B: 1.8K subjects
• Mostly celebrities: subjects overlap
Public DBs
• Highly imbalanced
[Figure: panels LFW and CWF; axes: demographic group, images / subject]
Public models
• Public models trained on public DBs (DIY):
FaceNet (2015): CWF / MS-1MC
VGGFace (2015): VGG
SphereFace (2017): CWF
VGGFace2 (2017): MS-1MC + VGG2
• Validate with demographically-balanced DB (50% same ID, 50% different ID):
Asian female: 1M pairs
Asian male: 1M pairs
Black female: 1M pairs
Black male: 1M pairs
White female: 1M pairs
White male: 1M pairs
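The validation protocol above, with pairs balanced per demographic group and between same-ID and different-ID, can be sketched as a per-group verification accuracy. The function name, the fixed threshold, and the toy data are assumptions for illustration; the slides do not specify the metric or the threshold.

```python
import numpy as np

def per_group_accuracy(similarity, same_id, group, threshold):
    """Verification accuracy per demographic group.

    similarity: score per face pair (higher means more alike)
    same_id:    True if the pair shows the same subject
    group:      demographic group label of each pair
    """
    similarity = np.asarray(similarity, dtype=float)
    same_id = np.asarray(same_id, dtype=bool)
    group = np.asarray(group)
    predicted_same = similarity >= threshold
    results = {}
    for g in np.unique(group):
        mask = group == g
        results[str(g)] = float(np.mean(predicted_same[mask] == same_id[mask]))
    return results

# Toy example: two pairs per group, half same ID and half different ID.
scores = [0.9, 0.2, 0.6, 0.7]
truth = [True, False, True, False]
groups = ["white male", "white male", "black female", "black female"]
report = per_group_accuracy(scores, truth, groups, threshold=0.5)
```

Reporting one number per group, rather than a single pooled accuracy, is what exposes a model that performs well on average but poorly on an under-represented group.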
Public models: examples of failures
False positives False negatives
Public models: evaluation
[Figure: results for FaceNet (2015), SphereFace (2017), VGGFace (2015), and VGGFace2 (2017), labeled by training DB (1MC, CWF, VGG, VG2), on the White male, Black male, and Asian female groups]
Managing imbalance
SAMPLING (DATA-ORIENTED)
• Undersampling
• Oversampling
TRAINING LOSS (MODEL-ORIENTED)
• Cost-sensitive learning
• Multi-task learning (id, gender, ethnicity)
“Features get better at understanding faces, improving performances of individual tasks”
R. Ranjan, V. M. Patel, R. Chellappa. "Hyperface: A deep multi-task learning framework for face detection, landmark localization, pose estimation, and gender recognition." TPAMI 2017.
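As an illustration of the model-oriented route, cost-sensitive learning can be sketched as a cross-entropy loss weighted by inverse class frequency. This weighting scheme and the function names are assumptions; the slides do not say which cost-sensitive loss was used.

```python
import numpy as np

def inverse_frequency_weights(labels):
    """Weight each class by total / (n_classes * count): rare classes cost more."""
    classes, counts = np.unique(labels, return_counts=True)
    weights = counts.sum() / (len(classes) * counts)
    return dict(zip(classes.tolist(), weights.tolist()))

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log p(true class) over a batch.

    probs:  (batch, n_classes) predicted probabilities
    labels: (batch,) integer class ids
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    w = np.array([class_weights[int(y)] for y in labels])
    picked = probs[np.arange(len(labels)), labels]
    return float(np.sum(w * -np.log(picked)) / np.sum(w))

labels = np.array([0, 0, 0, 1])            # class 1 is under-represented
weights = inverse_frequency_weights(labels)
uniform = np.full((4, 2), 0.5)             # a model that knows nothing
loss = weighted_cross_entropy(uniform, labels, weights)
```

With these weights, one error on the rare class costs as much as three errors on the frequent one, so the optimizer cannot minimize the loss by ignoring the minority.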
Managing imbalance – Data augmentation
• Data augmentation: makes imbalance mitigation much more effective
Pipeline: Database → Oversampled DB → Stochastic data augmentation → DNN
I Masi et al. "Do we really need to collect millions of faces for effective face recognition?" ECCV 2016.
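A minimal sketch of why stochastic augmentation helps oversampling: replicated samples are perturbed independently each time they are drawn, so the network never sees exact copies. The specific transforms (horizontal flip, brightness gain) and all names are illustrative assumptions, not the augmentations used in the talk.

```python
import numpy as np

def augment(image, rng):
    """Random horizontal flip plus brightness jitter (illustrative choices)."""
    if rng.random() < 0.5:
        image = image[:, ::-1]
    gain = rng.uniform(0.8, 1.2)
    return np.clip(image * gain, 0.0, 1.0)

def augmented_batch(images, indices, rng):
    """Build a batch from (possibly repeated) indices, augmenting each draw."""
    return np.stack([augment(images[i], rng) for i in indices])

rng = np.random.default_rng(0)
images = rng.random((3, 8, 8))      # toy database: 3 grayscale images in [0, 1]
oversampled = [0, 1, 2, 2, 2, 2]    # image 2 was replicated by oversampling
batch = augmented_batch(images, oversampled, rng)
```

Because augmentation is applied at batch time rather than stored, the oversampled database only needs to hold repeated indices, not repeated pixels.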
Managing imbalance – Proposal
Traditional imbalance: $\max(X) / \min(X)$
Proposal: IDR (robust to outliers): $D_9(X) / D_1(X)$
Iterative multi-label oversampling:
1. Find most imbalanced label L
2. Find most imbalanced category C within L
3. Draw random sample from C, replicate
[Figure: $\max(X) / \min(X)$ and $D_9(X) / D_1(X)$ versus #samples added, with $D_1$ and $D_9$ marked]
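The two imbalance measures and the three-step loop on this slide can be sketched as follows. Several points are assumptions: X is taken to be the vector of per-category sample counts, D1 and D9 its first and ninth deciles, and the "most imbalanced category" the least represented one. The function names are hypothetical.

```python
import numpy as np

def max_min_ratio(counts):
    """Traditional imbalance: max(X) / min(X)."""
    counts = np.asarray(counts, dtype=float)
    return counts.max() / counts.min()

def idr(counts):
    """Inter-decile ratio D9(X) / D1(X); a single extreme count barely moves it."""
    d1, d9 = np.percentile(np.asarray(counts, dtype=float), [10, 90])
    return d9 / d1

def iterative_oversample(labels, n_added, measure=idr, seed=0):
    """labels: dict mapping label name -> array of category ids, one per sample.

    Returns sample indices: the originals followed by n_added replicated ones.
    """
    rng = np.random.default_rng(seed)
    labels = {name: np.asarray(col) for name, col in labels.items()}
    n = len(next(iter(labels.values())))
    selected = list(range(n))
    for _ in range(n_added):
        worst_name, worst_score, worst_cat = None, -1.0, None
        for name, col in labels.items():
            cats, counts = np.unique(col[selected], return_counts=True)
            score = measure(counts)
            if score > worst_score:                      # 1. most imbalanced label L
                worst_name, worst_score = name, score
                worst_cat = cats[np.argmin(counts)]      # 2. rarest category C in L
        candidates = np.flatnonzero(labels[worst_name] == worst_cat)
        selected.append(int(rng.choice(candidates)))     # 3. draw from C, replicate
    return np.array(selected)

gender = np.array([0] * 90 + [1] * 10)
picked = iterative_oversample({"gender": gender}, n_added=80)
```

Recomputing the counts after every added sample matters with several labels, because replicating a sample to fix one label (say, gender) also shifts the distribution of the others (say, ethnicity).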
Managing imbalance – Sample training batch
Before oversampling… …and after
Managing imbalance
• Results with ResNet 20 (tiny network, for comparison only)
• Better with almost 6X fewer subjects, 2X fewer images!
10.6K subjects, 494K images
1.8K subjects, 295K images
Sparse embedding
• Typically, in deep face recognition: image → CNN → embedding layer → one-hot encoding
• What about ReLU + embedding + one-hot encoding? (e.g. VGGFace) Why more dimensions, if 90% zero?
• Larger representation subspace, at expense of computational efficiency
• But can gain it back! ~200M comp/s
Sparse 4096-d | Dense 512-d | Dict + Dense 256-d
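One generic way to recover efficiency from a mostly-zero descriptor is to store only the nonzero entries and compare two descriptors over their shared indices. This is an illustrative index/value scheme, not the "Dict + Dense" approach named on the slide, whose details the transcript does not give; the toy descriptor generator and all names are assumptions.

```python
import numpy as np

def to_sparse(vec):
    """Keep only nonzero entries as (sorted indices, values)."""
    idx = np.flatnonzero(vec)
    return idx, vec[idx]

def sparse_dot(a, b):
    """Dot product over the indices that are nonzero in both descriptors."""
    (ia, va), (ib, vb) = a, b
    _, pa, pb = np.intersect1d(ia, ib, assume_unique=True, return_indices=True)
    return float(va[pa] @ vb[pb])

rng = np.random.default_rng(0)

def toy_relu_embedding(dim=4096, density=0.1):
    """Stand-in for a post-ReLU descriptor with roughly 90% zeros."""
    return rng.random(dim) * (rng.random(dim) < density)

x, y = toy_relu_embedding(), toy_relu_embedding()
sx, sy = to_sparse(x), to_sparse(y)
score = sparse_dot(sx, sy)
```

With about 10% of entries active, each descriptor stores roughly 410 values instead of 4096, and the comparison touches only the positions both descriptors share.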
Conclusions
• Public training / validation DBs: heavily biased at multiple levels
• Without balancing, trained models will be biased, too!
• Prefer “better data” over “more data”
• Machine Learning vs Machine Teaching
Explainable ML
Machine Learning: designing algorithms to passively train models
Machine Teaching: choosing which examples to show a learner
Zhu, Xiaojin, et al. "An Overview of Machine Teaching." arXiv preprint arXiv:1801.05927 (2018).
Questions?