ClusterFit: Improving Generalization of Visual Representations
TRANSCRIPT
ClusterFit: Improving Generalization of Visual Representations
Xueting Yan*, Ishan Misra*, Abhinav Gupta, Deepti Ghadiyaram†, Dhruv Mahajan†
CVPR 2020
STRUCT Group Seminar Presenter: Wenjing Wang
2020.05.17
OUTLINE
➤ Authorship
➤ Background
➤ Proposed Method
➤ Experimental Results
➤ Conclusion
BACKGROUND
➤ Background
➤ Overview of the proposed method
➤ Compared with existing methods
➤ Weak or self-supervision pre-training
➤ Weakly supervised learning
• Define the proxy task from the associated metadata
• Hashtag prediction
• Search-query prediction
• GPS prediction
• Word or n-gram prediction
![Page 8: ClusterFit: Improving Generalization of Visual Representations39.96.165.147/Seminar/WenjingWang_200517.pdf · 2020. 5. 18. · 15. BACKGROUND. Weak or self-supervision pre-training](https://reader036.vdocuments.us/reader036/viewer/2022071417/6114e15e4ec5b37eb31bcd4a/html5/thumbnails/8.jpg)
14
BACKGROUND
➤ Self-supervised Learning
• Define the proxy task without extra data
• Domain agnostic
• Domain-specific information, e.g. spatial structure
• Color and illumination
• Temporal structure
➤ Weak or self-supervision pre-training
• Pre-training proxy not well-aligned with the transfer tasks
• Label noise: polysemy (apple the fruit vs. Apple Inc.), linguistic ambiguity, lack of visualness of tags (#love)
• The last layer is more “aligned” with the proxy objective
➤ This paper: avoid overfitting to the proxy objective
• Smoothing the feature space learned via proxy objectives
➤ Proposed method: ClusterFit
• Step 1. Cluster: feature clustering
• Step 2. Fit: predict cluster assignments
➤ Cluster-based self-supervised learning
• DeepCluster [1]
• DeeperCluster [2]
[1] Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze: Deep Clustering for Unsupervised Learning of Visual Features. ECCV 2018
[2] Mathilde Caron, Piotr Bojanowski, Julien Mairal, Armand Joulin: Unsupervised Pre-Training of Image Features on Non-Curated Data. ICCV 2019
➤ Cluster-based self-supervised learning
• DeepCluster [1], DeeperCluster [2]
• Require alternate optimization
➤ This paper
• No alternate optimization
• More stable and computationally efficient
➤ Model Distillation
• Transferring knowledge from a teacher to a student
➤ This paper
• Distilling knowledge from a higher capacity teacher model Npre to a lower-capacity student model Ncf
PROPOSED METHOD
➤ ClusterFit
• Use the second-last layer of Npre to extract features
• Cluster features using k-means into K groups
• Train a new network from scratch with the K-labels
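The two steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: `kmeans` is a toy helper, and `feats` is random data standing in for features extracted from the second-last layer of Npre.

```python
import numpy as np

def kmeans(features, k, iters=20, seed=0):
    """Plain k-means on (N, D) features; returns hard cluster assignments."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        # Assign each feature to its nearest center.
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center for any empty cluster.
        for j in range(k):
            if (labels == j).any():
                centers[j] = features[labels == j].mean(axis=0)
    return labels

# Step 1 (Cluster): features from the second-last layer of Npre (random here).
rng = np.random.default_rng(1)
feats = rng.normal(size=(200, 16))
pseudo_labels = kmeans(feats, k=8)

# Step 2 (Fit): the K cluster ids become classification targets for a new
# network Ncf trained from scratch (training loop omitted here).
assert pseudo_labels.shape == (200,) and pseudo_labels.max() < 8
```

In practice K is large (e.g. thousands of clusters) and the fit step is ordinary supervised training on the pseudo-labels.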
➤ Why?
• ClusterFit: a lossy compression scheme
• Captures the essential visual invariances in the feature space
• Gives the ‘re-learned’ network an opportunity to learn features that are less sensitive to the original pre-training objective → making them more transferable.
➤ Notes
• Npre is trained on Dpre
• Ncf is trained on Dcf
• Dtar is the target dataset
➤ Control Experiment using Synthetic Noise
• Adding varying amounts (p%) of uniform random label noise
• Npre: pre-train on the noisy labels, then freeze it and train linear classifiers
• Dpre = Dcf = ImageNet-1K
• Npre = Ncf = ResNet-50
• Dtar = ImageNet-1K, ImageNet-9K, iNaturalist
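The noise-injection step can be sketched as follows. `corrupt_labels` is a hypothetical helper; the paper's exact corruption procedure may differ in detail.

```python
import numpy as np

def corrupt_labels(labels, p, num_classes, seed=0):
    """Replace a fraction p of labels with uniformly random classes."""
    rng = np.random.default_rng(seed)
    labels = labels.copy()
    noisy = rng.random(len(labels)) < p
    labels[noisy] = rng.integers(0, num_classes, size=noisy.sum())
    return labels

clean = np.arange(1000) % 10          # 1000 examples, 10 classes
noisy = corrupt_labels(clean, p=0.5, num_classes=10)
# Roughly half the labels are resampled (some resamples coincide with the
# original class, so the observed flip rate is a bit below p).
assert 0.3 < (noisy != clean).mean() < 0.55
```

The corrupted labels feed the pre-training of Npre; the frozen features are then evaluated with linear classifiers on the clean targets.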
EXPERIMENTAL RESULTS
➤ Benchmarking
➤ Analysis of ClusterFit
➤ Compared Methods
• Distillation
• A weighted average of two loss functions:
• (a) cross-entropy with soft targets computed using Npre
• (b) cross-entropy with labels in weakly-supervised setup
• Prototype
• Instead of random cluster initialization
• Use label information in Dcf to initialize cluster centers
• Longer pre-training (Npre trained for 2× as long)
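The distillation baseline's objective can be sketched as below. This is a minimal NumPy version; `distill_loss`, `alpha`, and the temperature `t` are illustrative names, not the paper's code.

```python
import numpy as np

def softmax(z, t=1.0):
    z = z / t
    z = z - z.max(axis=1, keepdims=True)  # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distill_loss(student_logits, teacher_logits, hard_labels, alpha=0.5, t=2.0):
    """Weighted average of (a) cross-entropy against the teacher's soft
    targets and (b) cross-entropy against the weak (hard) labels."""
    p_s = softmax(student_logits, t)
    p_t = softmax(teacher_logits, t)
    soft_ce = -(p_t * np.log(p_s + 1e-12)).sum(axis=1).mean()        # (a)
    p = softmax(student_logits)
    hard_ce = -np.log(p[np.arange(len(hard_labels)), hard_labels] + 1e-12).mean()  # (b)
    return alpha * soft_ce + (1 - alpha) * hard_ce

rng = np.random.default_rng(0)
loss = distill_loss(rng.normal(size=(4, 5)), rng.normal(size=(4, 5)),
                    np.array([0, 1, 2, 3]))
assert np.isfinite(loss) and loss > 0
```

Here the teacher is Npre and the student plays the role Ncf plays in ClusterFit, with soft targets replacing the cluster pseudo-labels.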
➤ Benchmarking
• Weakly-supervised images
• Weakly-supervised videos
• Self-supervised images
➤ Weakly-Supervised Images
• Dpre = Dcf = IG-ImageNet-1B
• Npre = Ncf = ResNet-50
• Dtar = ImageNet-1K, ImageNet-9K, Places365, iNaturalist
➤ Weakly-Supervised Images
• Results
• ImageNet-1K: the baseline benefits from the hand-crafted alignment between IG-ImageNet-1B labels and ImageNet-1K classes
➤ Weakly-Supervised Videos
• Dpre = Dcf = IG-Verb-19M
• Npre = Ncf = R(2+1)D-34 [1]
• Dtar = Kinetics, Sports1M, Something-Something V1
[1] Du Tran, Heng Wang, Lorenzo Torresani, Jamie Ray, Yann LeCun, Manohar Paluri: A Closer Look at Spatiotemporal Convolutions for Action Recognition. CVPR 2018
➤ Weakly-Supervised Videos
• Results
➤ Self-Supervised Images
• Dpre = Dcf = ImageNet-1K, JigSaw & Rotation
• Npre = Ncf = ResNet-50
• Dtar = VOC07, ImageNet-1K, Places205, iNaturalist
➤ Self-Supervised Images
• Results
➤ Self-Supervised Images
• Layer-wise results
➤ Relative model capacity of Npre and Ncf
• Dpre = IG-Verb-19M, Dcf = IG-Verb-62M
• Ncf = R(2+1)D-18 (33M parameters)
• Npre = R(2+1)D-18, R(2+1)D-34 (64M parameters)
➤ Relative model capacity of Npre and Ncf
• Results
➤ Unsupervised vs. Per-Label Clustering
• Per-label clustering:
• Cluster the videos belonging to each label into k_l clusters
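Per-label clustering can be sketched as follows. `kmeans_assign` and `per_label_clusters` are toy helpers, and the `label * k_l + sub-cluster` offset is one assumed way to form the pseudo-classes.

```python
import numpy as np

def kmeans_assign(x, k, iters=10, seed=0):
    """Tiny k-means; returns the cluster id of each row of x."""
    rng = np.random.default_rng(seed)
    c = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        a = ((x[:, None] - c[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if (a == j).any():
                c[j] = x[a == j].mean(0)
    return a

def per_label_clusters(feats, labels, k_l, seed=0):
    """Unlike unsupervised clustering over the whole set, cluster each
    label's examples separately into k_l sub-clusters; the pseudo-label
    is label * k_l + sub-cluster."""
    pseudo = np.empty(len(feats), dtype=int)
    for lbl in np.unique(labels):
        idx = np.where(labels == lbl)[0]
        pseudo[idx] = lbl * k_l + kmeans_assign(feats[idx], k_l, seed=seed)
    return pseudo

rng = np.random.default_rng(1)
feats = rng.normal(size=(60, 8))
labels = np.repeat(np.arange(3), 20)   # 3 weak labels, 20 videos each
pseudo = per_label_clusters(feats, labels, k_l=2)
assert (pseudo // 2 == labels).all()   # sub-clusters stay within their label
```

This keeps the weak-label partition intact, whereas the unsupervised variant lets clusters cut across labels.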
➤ Properties of Dpre
• Number of labels
• IG-Verb-62M (438 weak verb labels)
• Number of labels: 10, 30, 100, 438
• Reducing the number of labels implies reduced content diversity
➤ Properties of Dpre
• Results
CONCLUSION
➤ ClusterFit first clusters the original feature space, then re-learns a new model on the cluster assignments
➤ This improves the generalization of both weakly- and self-supervised representations