
Page 1

Multi-Class Incremental Learning

Yaoyao Liu ([email protected])

Page 2

Outline

- Background
- Methods
- Experiments
- Takeaways

Page 3

Background

Page 4

Motivation

- Thousands of new users and items arrive every day
- The model needs to be updated with the incremental data
- Memory is limited
- Retraining the model from scratch takes too long

(Images from the Internet)

Page 5

Problem Definitions

Incremental learning (also lifelong learning, continual learning)

(Diagram: phase 1, a DNN classifier is trained on the first 10-class group and tested on those 10 classes.)

Page 6

Problem Definitions

Incremental learning (also lifelong learning, continual learning)

(Diagram: phase 2 added, the classifier is trained and tested on a second 10-class group.)

Page 7

Problem Definitions

Incremental learning (also lifelong learning, continual learning)

(Diagram: at each phase, the DNN classifier is trained on the next 10-class group and then tested.)

Page 8

Problem Definitions

Incremental learning (also lifelong learning, continual learning)

(Diagram: the sequence of 10-class training and testing phases continues.)

Page 9

Problem Definitions

Incremental learning (also lifelong learning, continual learning)

(Diagram: multi-task setting, separate classifiers are trained on ImageNet and on CUB, and each task keeps its own test data; multi-class setting, a single classifier is trained on ImageNet Classes 1-10, 11-20, 21-30 and evaluated on one test set covering all seen classes.)

Page 10

Problem Definitions

Incremental learning (also lifelong learning, continual learning)

(Same diagram as before; this talk focuses on the multi-class setting.)
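To make the multi-class protocol concrete, here is a minimal sketch of the phase-by-phase loop; `train_on_group` and `evaluate` are hypothetical placeholders for the per-phase training and evaluation steps that the later slides discuss.

```python
# Minimal sketch of the multi-class incremental protocol (hypothetical helpers).
def make_class_groups(num_classes=100, group_size=10):
    """Split class IDs into successive groups: [0..9], [10..19], ..."""
    classes = list(range(num_classes))
    return [classes[i:i + group_size] for i in range(0, num_classes, group_size)]

def run_incremental(groups):
    seen = []
    for phase, group in enumerate(groups):
        seen = seen + group
        # train_on_group(group)   # hypothetical: train on the new class group only
        # evaluate(seen)          # multi-class setting: test on ALL classes seen so far
        print(f"phase {phase}: +{len(group)} classes, evaluate on {len(seen)} classes")

if __name__ == "__main__":
    run_incremental(make_class_groups(num_classes=30, group_size=10))
```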

Page 11

Related Learning Methods

                  Transfer learning   Multi-task learning   Multi-task incremental learning   Multi-class incremental learning
Target task(s)    Single              Multiple              Multiple                          Single
Source task(s)    Multiple            Multiple              Multiple                          Single
Data arrival      Constantly / Once   Once                  Constantly                        Constantly

Page 12

Challenges

(Diagram: the DNN classifier consists of a feature extractor and an FC classifier.)

1. Catastrophic forgetting: the model becomes biased towards the latest class group
2. The FC classifier is not extendable, e.g. from 10 classes to 20 classes (see the sketch below)
3. Memory resources may be limited, so not all previous samples can be retained
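For the second challenge, a common workaround in practice (an assumption here, not something the slide specifies) is to swap in a larger FC head and copy over the old class weights:

```python
# Sketch: growing an FC classifier from 10 to 20 outputs by copying old weights.
import torch
import torch.nn as nn

def extend_fc(old_fc: nn.Linear, num_new_classes: int) -> nn.Linear:
    new_fc = nn.Linear(old_fc.in_features, old_fc.out_features + num_new_classes)
    with torch.no_grad():
        new_fc.weight[: old_fc.out_features] = old_fc.weight  # keep old class weights
        new_fc.bias[: old_fc.out_features] = old_fc.bias
    return new_fc

fc = nn.Linear(64, 10)   # 10-class head
fc = extend_fc(fc, 10)   # now a 20-class head; new rows are freshly initialized
print(fc.weight.shape)   # torch.Size([20, 64])
```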

Page 13

Catastrophic Forgetting

(Diagram: the classifier is trained on ImageNet Classes 1-10, 11-20, 21-30 in sequence; the distribution of DNN parameters after each phase is compared with the joint distribution over all classes.)

Page 14

Catastrophic Forgetting

(Same diagram; after each phase the distribution of DNN parameters overfits to the latest class group.)

Page 15

Literature Review

- To improve the feature extractor: LwF [1], ...
- To improve the classifier: iCaRL [4], BiC [3], ...
- To improve both: Hou et al. [5], EEIL [2], ...

(Diagram: the DNN classifier consists of a feature extractor and an FC classifier.)

References
[1] Li, Zhizhong, and Derek Hoiem. "Learning without forgetting." T-PAMI 2017.
[2] Castro, Francisco M., et al. "End-to-end incremental learning." ECCV 2018.
[3] Wu, Yue, et al. "Large scale incremental learning." CVPR 2019.
[4] Rebuffi, Sylvestre-Alvise, et al. "iCaRL: Incremental classifier and representation learning." CVPR 2017.
[5] Hou, Saihui, et al. "Learning a Unified Classifier Incrementally via Rebalancing." CVPR 2019.

Page 16

Methods

Page 17

Learning without Forgetting (LwF)

Reference
[1] Li, Zhizhong, and Derek Hoiem. "Learning without forgetting." T-PAMI 2017.

(Diagram: joint training uses all images with ground truth for both the old and the new classes; LwF instead uses only new-class images, supervised by the new classes' ground truth and the previous model's responses for the old classes.)

Page 18

Learning without Forgetting (LwF)

Reference
[1] Li, Zhizhong, and Derek Hoiem. "Learning without forgetting." T-PAMI 2017.
[2] Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv 2015.

Idea: discourage the outputs for the old classes from changing [1], using knowledge distillation, which was proposed by Hinton et al. [2] for distilling an ensemble of models into a single model.

(Equations: the classification loss on the new classes, the distillation loss matching the old-class outputs to the previous model's responses, and the full objective combining the two.)
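A minimal sketch of such an LwF-style objective in PyTorch; the temperature `T`, the weight `lam`, and the KL-based formulation of the distillation term are illustrative choices, not the paper's exact setup.

```python
# Sketch of the LwF objective: classification loss on the current classes plus a
# knowledge-distillation loss that keeps the old-class outputs close to the
# previous model's responses. T and lam are illustrative hyperparameters.
import torch
import torch.nn.functional as F

def lwf_loss(new_logits, old_logits_prev_model, targets, num_old, T=2.0, lam=1.0):
    # Standard cross-entropy over all current classes (new-class ground truth).
    ce = F.cross_entropy(new_logits, targets)
    # Distillation: soften both the current model's old-class logits and the
    # previous model's logits with temperature T, then match the distributions.
    p_old = F.softmax(old_logits_prev_model / T, dim=1)
    log_p_new = F.log_softmax(new_logits[:, :num_old] / T, dim=1)
    kd = F.kl_div(log_p_new, p_old, reduction="batchmean") * (T * T)
    return ce + lam * kd

# Toy usage: 8 samples, 10 old classes + 10 new classes.
logits = torch.randn(8, 20)
prev = torch.randn(8, 10)            # previous model's outputs for the old classes
targets = torch.randint(0, 20, (8,))
print(lwf_loss(logits, prev, targets, num_old=10))
```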

Page 19

Learning without Forgetting (LwF)

Reference
[1] Li, Zhizhong, and Derek Hoiem. "Learning without forgetting." T-PAMI 2017.

Summary

+ The distillation loss improves the learning of the feature extractor
+ No need to retain data for the old classes

- Handles the FC classifier in a simple way, without solving the bias problem

* This method was proposed for the multi-task setting, but it is commonly used as a baseline in multi-class incremental learning papers.

Page 20

iCaRL: Incremental Classifier and Representation Learning

Reference
[1] Rebuffi, Sylvestre-Alvise, et al. "iCaRL: Incremental classifier and representation learning." CVPR 2017.

Idea: replace the FC classifier with a nearest-mean-of-exemplars (NME) classifier on top of the feature extractor (*NME is used only in the test phase).
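A minimal sketch of NME classification at test time; the L2 normalization of features and class means follows common practice, and the feature extractor itself is omitted.

```python
# Sketch of nearest-mean-of-exemplars (NME) classification at test time:
# each class is represented by the mean of its exemplars' (L2-normalized)
# features, and a test sample is assigned to the class with the closest mean.
import torch
import torch.nn.functional as F

def class_means(exemplar_feats):
    """exemplar_feats: dict {class_id: [n_c, d] feature tensor} -> [C, d] means."""
    means = [F.normalize(feats, dim=1).mean(0) for _, feats in sorted(exemplar_feats.items())]
    return F.normalize(torch.stack(means), dim=1)

def nme_predict(query_feats, means):
    query = F.normalize(query_feats, dim=1)
    dists = torch.cdist(query, means)   # Euclidean distance to each class mean
    return dists.argmin(dim=1)

# Toy usage with random features (d = 16, 3 classes).
exemplars = {c: torch.randn(20, 16) for c in range(3)}
means = class_means(exemplars)
print(nme_predict(torch.randn(5, 16), means))
```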

Page 21

iCaRL: Incremental Classifier and Representation Learning

Reference
[1] Rebuffi, Sylvestre-Alvise, et al. "iCaRL: Incremental classifier and representation learning." CVPR 2017. (Images from Ramon Morros)

Page 22

iCaRL: Incremental Classifier and Representation Learning

Reference
[1] Rebuffi, Sylvestre-Alvise, et al. "iCaRL: Incremental classifier and representation learning." CVPR 2017.

Summary

+ Solves the bias problem for the classifier

- Needs to retain part of the old data (see the herding sketch below)
- The non-parametric classifier may fail on similar novel classes
- Training and testing use different types of classifier (train: FC, test: NME)
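As noted above, iCaRL retains a small exemplar set per class, selected by herding; a minimal sketch, with feature extraction omitted and details simplified.

```python
# Sketch of herding-based exemplar selection (how iCaRL picks which old samples
# to retain): greedily choose samples whose running mean best approximates the
# class mean in feature space.
import torch
import torch.nn.functional as F

def herding_select(features, m):
    """features: [n, d] features of one class; returns indices of m exemplars."""
    feats = F.normalize(features, dim=1)
    target_mean = feats.mean(0)
    selected, running_sum = [], torch.zeros_like(target_mean)
    for _ in range(m):
        # Pick the sample that moves the exemplar mean closest to the class mean.
        candidate_means = (running_sum + feats) / (len(selected) + 1)
        dists = (candidate_means - target_mean).norm(dim=1)
        if selected:
            dists[selected] = float("inf")   # do not pick the same sample twice
        idx = int(dists.argmin())
        selected.append(idx)
        running_sum += feats[idx]
    return selected

print(herding_select(torch.randn(100, 16), m=5))
```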

Page 23

End-to-end Incremental Learning (EEIL)

Reference
[1] Castro, Francisco M., et al. "End-to-end incremental learning." ECCV 2018.

Data augmentation

Page 24

End-to-end Incremental Learning (EEIL)

Reference
[1] Castro, Francisco M., et al. "End-to-end incremental learning." ECCV 2018.

Summary

+ End-to-end: improves both the feature extractor and the classifier
+ Introduces a series of data augmentation techniques

- The improvements may come from the tricks

Page 25

Large Scale Incremental Learning (BiC)

Reference
[1] Wu, Yue, et al. "Large scale incremental learning." CVPR 2019.

BiC: Bias Correction. Problem: the classifier is biased towards the novel classes.
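A minimal sketch of this kind of bias correction: two scalars rescale and shift only the new-class logits and are fit on a small held-out validation set; the optimizer settings and the toy data are illustrative, not the paper's configuration.

```python
# Sketch of BiC-style bias correction: keep the trained network fixed and learn
# two scalars (alpha, beta) applied only to the NEW-class logits.
import torch
import torch.nn as nn

class BiasCorrection(nn.Module):
    def __init__(self):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(1))
        self.beta = nn.Parameter(torch.zeros(1))

    def forward(self, logits, num_old):
        old, new = logits[:, :num_old], logits[:, num_old:]
        # Old-class logits pass through unchanged; new-class logits are corrected.
        return torch.cat([old, self.alpha * new + self.beta], dim=1)

# Toy usage: fit alpha/beta on validation logits (20 classes, 10 of them new).
bic = BiasCorrection()
opt = torch.optim.SGD(bic.parameters(), lr=0.1)
val_logits, val_targets = torch.randn(64, 20), torch.randint(0, 20, (64,))
for _ in range(100):
    loss = nn.functional.cross_entropy(bic(val_logits, num_old=10), val_targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
print(float(bic.alpha), float(bic.beta))
```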

Page 26

Large Scale Incremental Learning (BiC)

Reference
[1] Wu, Yue, et al. "Large scale incremental learning." CVPR 2019.

Summary

+ Solves the bias problem of the classifier

- The correction function only works on large-scale datasets

Page 27

Learning a Unified Classifier Incrementally via Rebalancing (Hou et al.)

Reference
[1] Hou, Saihui, et al. "Learning a Unified Classifier Incrementally via Rebalancing." CVPR 2019.

Improving the classifier: the standard FC classifier is replaced by cosine normalization, so classification is based on cosine distance.
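A minimal sketch of a cosine-normalized classifier head; the learnable scale parameter is an assumption here (papers typically use a learned or fixed scaling factor).

```python
# Sketch of a cosine-normalized classifier head: logits are scaled cosine
# similarities between L2-normalized features and L2-normalized class weights.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CosineClassifier(nn.Module):
    def __init__(self, feat_dim, num_classes, scale=10.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(num_classes, feat_dim) * 0.01)
        self.scale = nn.Parameter(torch.tensor(scale))  # assumed learnable scale

    def forward(self, features):
        w = F.normalize(self.weight, dim=1)
        f = F.normalize(features, dim=1)
        return self.scale * f @ w.t()   # cosine similarity per class, rescaled

head = CosineClassifier(feat_dim=64, num_classes=20)
print(head(torch.randn(8, 64)).shape)   # torch.Size([8, 20])
```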

Page 28

Learning a Unified Classifier Incrementally via Rebalancing (Hou et al.)

Reference
[1] Hou, Saihui, et al. "Learning a Unified Classifier Incrementally via Rebalancing." CVPR 2019.

Improving the feature extractor: a less-forget constraint, i.e. a distillation loss on the cosine distance between the features of the current and the previous model.
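A minimal sketch of a feature-level less-forget term: penalize the cosine distance between the current extractor's features and those of the frozen previous extractor; the loss weight and scheduling used in the paper are omitted.

```python
# Sketch of the less-forget constraint: instead of distilling logits, keep the
# (L2-normalized) features of the current model close, in cosine distance, to
# the previous model's features on the same inputs.
import torch
import torch.nn.functional as F

def less_forget_loss(feat_new, feat_old):
    cos = F.cosine_similarity(feat_new, feat_old.detach(), dim=1)
    return (1.0 - cos).mean()   # 0 when the features are unchanged

# Toy usage: features from the current and the frozen previous extractor.
print(less_forget_loss(torch.randn(8, 64), torch.randn(8, 64)))
```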

Page 29

Learning a Unified Classifier Incrementally via Rebalancing (Hou et al.)

Reference
[1] Hou, Saihui, et al. "Learning a Unified Classifier Incrementally via Rebalancing." CVPR 2019.

Improving the classifier: inter-class separation, adding a margin threshold against the top-K responding classes.
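A minimal sketch of such a margin-based separation term; `K`, the margin, and the restriction to old-class samples are illustrative choices.

```python
# Sketch of the inter-class separation term: for an old-class sample, the
# cosine score of its ground-truth class should exceed the K highest-scoring
# new-class responses by at least a margin.
import torch
import torch.nn.functional as F

def inter_class_separation(cos_scores, targets, num_old, K=2, margin=0.5):
    """cos_scores: [B, C] cosine similarities; only old-class samples contribute."""
    old_mask = targets < num_old
    if not old_mask.any():
        return cos_scores.new_zeros(())
    scores, tgt = cos_scores[old_mask], targets[old_mask]
    anchor = scores.gather(1, tgt.unsqueeze(1))            # ground-truth score [b, 1]
    hard_new = scores[:, num_old:].topk(K, dim=1).values   # top-K new-class scores
    ones = torch.ones_like(hard_new)                       # anchor should be larger
    return F.margin_ranking_loss(anchor.expand_as(hard_new), hard_new, ones, margin=margin)

# Toy usage: 20 classes, the first 10 are old.
print(inter_class_separation(torch.rand(8, 20), torch.randint(0, 20, (8,)), num_old=10))
```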

Page 30

Learning a Unified Classifier Incrementally via Rebalancing (Hou et al.)

Reference
[1] Hou, Saihui, et al. "Learning a Unified Classifier Incrementally via Rebalancing." CVPR 2019.

Summary

+ Improves both the classifier (cosine distance, inter-class separation) and the feature extractor (less-forget constraint)

- The first class group needs to contain more classes than the other groups (a good initialization of the CONV network is required)
- Extremely slow with the inter-class separation strategy

Page 31

Comparison

LwF [1]:        Feature extractor: Distillation Loss.  Classifier: FC.
iCaRL [2]:      Feature extractor: Distillation Loss, Exemplar.  Classifier: NME.
EEIL [3]:       Feature extractor: Distillation Loss, Exemplar, Balanced Fine-tuning.  Classifier: FC.
BiC [4]:        Feature extractor: Distillation Loss, Exemplar.  Classifier: FC, Bias Correction.
Hou et al. [5]: Feature extractor: Less-Forget Constraint, Exemplar.  Classifier: FC, Cosine Normalization, Inter-Class Separation.

References
[1] Li, Zhizhong, and Derek Hoiem. "Learning without forgetting." T-PAMI 2017.
[2] Rebuffi, Sylvestre-Alvise, et al. "iCaRL: Incremental classifier and representation learning." CVPR 2017.
[3] Castro, Francisco M., et al. "End-to-end incremental learning." ECCV 2018.
[4] Wu, Yue, et al. "Large scale incremental learning." CVPR 2019.
[5] Hou, Saihui, et al. "Learning a Unified Classifier Incrementally via Rebalancing." CVPR 2019.

Page 32

Experiments

Page 33

Datasets and Benchmark

Datasets: CIFAR-100, ImageNet-Sub (100-class subset), ImageNet

Benchmark splits (e.g. CIFAR-100), varying the number of class groups:
- 2 groups: 50 classes + 50 classes
- 10 groups: ten groups of 10 classes
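A minimal sketch of how such class-group splits can be built for CIFAR-100 with torchvision; the random class order (and its seed) is an illustrative choice, and the first run downloads the dataset.

```python
# Sketch: building CIFAR-100 class-group splits for the benchmark, e.g. 10
# groups of 10 classes each (num_groups=2 would give the 50 + 50 split).
import numpy as np
from torchvision.datasets import CIFAR100

def build_groups(num_groups=10, seed=0):
    order = np.random.RandomState(seed).permutation(100)   # illustrative class order
    return np.array_split(order, num_groups)                # equal-sized class groups

def indices_for_group(dataset, group):
    targets = np.array(dataset.targets)
    return np.where(np.isin(targets, group))[0]             # sample indices of this group

if __name__ == "__main__":
    train_set = CIFAR100(root="./data", train=True, download=True)
    groups = build_groups(num_groups=10)
    print([len(indices_for_group(train_set, g)) for g in groups[:2]])  # 5000 samples each
```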

Page 34

Experiments on CIFAR-100

(Result curves for 20, 10, 5, and 2 class groups.)

Page 35

Experiments on ImageNet

(Result curves for 10 class groups.)

Page 36

Confusion Matrices

(Confusion matrices for iCaRL, LwF, fixed representation, and fine-tuning.)

Page 37

Takeaways

Important techniques:
1. Distillation loss -> retains knowledge of the old classes
2. Nearest-mean-of-exemplars classifier -> an unbiased classifier

Future work:
1. Other strategies for retaining knowledge of the old classes
2. A shareable parametric classifier -> meta-learning?

Page 38

Thanks! Any questions?

[email protected]