iccv2013 reading: learning to rank using privileged information
DESCRIPTION
Brief description of the ICCV2013 paper entitled "Learning to rank using privileged information" by Sharmanska et al.
TRANSCRIPT
![Page 1: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/1.jpg)
ICCV2013 reading 2014.3.28
Akisato Kimura (@_akisato)
![Page 2: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/2.jpg)
Paper to read
(Presented at ICCV2013)
![Page 3: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/3.jpg)
Problem addressed in this paper
• Learning using privileged information (LUPI)
– Training
• Feature vectors: $X = \{x_1, \dots, x_N\}$, $x_i \in \mathbb{R}^d$
• Label annotation: $Y = \{y_1, \dots, y_N\}$, $y_i \in \mathbb{N}$
• Additional information: $X^* = \{x_1^*, \dots, x_N^*\}$, $x_i^* \in \mathbb{R}^{d^*}$
– Testing
• Prediction function: $f: \mathbb{R}^d \to \mathbb{N}$
• No additional information required
![Page 4: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/4.jpg)
Privileged information??
• Applicable to several scenarios in CV
![Page 5: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/5.jpg)
Formulation
• Generic supervised binary classification
– Training
• Feature vectors: $X = \{x_1, \dots, x_N\}$, $x_i \in \mathbb{R}^d$
• Label annotation: $Y = \{y_1, \dots, y_N\}$, $y_i \in \{+1, -1\}$
• Additional information: $X^* = \{x_1^*, \dots, x_N^*\}$, $x_i^* \in \mathbb{R}^{d^*}$
– Testing
• Prediction function: $f: \mathbb{R}^d \to \mathbb{R}$
• No additional information required
![Page 6: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/6.jpg)
Key idea
• Privileged information allows us to distinguish between easy and hard examples
– If the privileged data is easy to classify, then the original data would also be easy to classify.
– … under the assumption that the privileged data is similarly informative about the problem at hand.
![Page 7: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/7.jpg)
Linear SVM
• Ordinary convergence rate = $O(N^{-1/2})$
• It improves to $O(N^{-1})$
– if we knew the optimal slack values $\xi_i$ in advance (OracleSVM [Vapnik+ 2009])
$\min_{w \in \mathbb{R}^d,\ b \in \mathbb{R},\ \xi_i \in \mathbb{R}}$ …
![Page 8: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/8.jpg)
Slack variables in SVM
• Slack variables tell us which training examples are easy / hard to classify
– $\xi_i = 0$ → easy
– $\xi_i \gg 0$ → hard
$\min_{w \in \mathbb{R}^d,\ b \in \mathbb{R},\ \xi_i \in \mathbb{R}}$ …
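The objective under the min operator is missing from the extracted text. As a reference point, and assuming the slide shows the standard soft-margin linear SVM (the regularization constant $C$ is a symbol assumed here, not taken from the surviving text), the problem reads:

```latex
\min_{w \in \mathbb{R}^d,\; b \in \mathbb{R},\; \xi_i \in \mathbb{R}}
  \;\frac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{N} \xi_i
\quad \text{s.t.} \quad
  y_i \bigl( \langle w, x_i \rangle + b \bigr) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0, \qquad i = 1, \dots, N.
```

At the optimum $\xi_i = \max\bigl(0,\ 1 - y_i(\langle w, x_i \rangle + b)\bigr)$, so $\xi_i = 0$ means the sample lies on the correct side with full margin, and a large $\xi_i$ means it is far on the wrong side.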
![Page 9: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/9.jpg)
SVM+
• A 1st model for LUPI
– Use privileged data as a proxy to the oracle
– Parameterize $\xi_i = \langle w^*, x_i^* \rangle + b^*$
[Vapnik+ NN2009, NIPS2010]
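Plugging this parameterization of the slack into the soft-margin SVM gives the SVM+ problem. The following is a sketch in the form usually quoted from [Vapnik+ NN2009]; the trade-off constants $\gamma$ and $C$ are assumed symbols that do not appear in the surviving slide text:

```latex
\min_{w,\, b,\, w^*,\, b^*}
  \;\frac{1}{2}\Bigl( \lVert w \rVert^2 + \gamma \lVert w^* \rVert^2 \Bigr)
  + C \sum_{i=1}^{N} \bigl( \langle w^*, x_i^* \rangle + b^* \bigr)
\quad \text{s.t.} \quad
  y_i \bigl( \langle w, x_i \rangle + b \bigr)
    \ge 1 - \bigl( \langle w^*, x_i^* \rangle + b^* \bigr),
  \qquad \langle w^*, x_i^* \rangle + b^* \ge 0 .
```

The slack is no longer a free variable per sample but a linear function of the privileged features, which couples the two spaces in a single optimization problem.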
![Page 10: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/10.jpg)
Why should SVM+ be improved?
• Cannot be solved by popular SVM packages
– Although good optimization algorithms were derived [Pechyony+ 2011], they work only with the dual.
![Page 11: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/11.jpg)
Learning to rank setup instead
• Underlying idea is the same
• Using the privileged data to identify easy / hard-to-separate sample pairs
– Instead of using it to identify easy / hard-to-classify samples
![Page 12: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/12.jpg)
SVMrank
• Slack variables tell us which training example pairs are easy / hard / impossible to separate
[Joachims KDD2002]
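For reference, a sketch of the pairwise ranking SVM of [Joachims KDD2002] in the notation used so far. The slide's own equation is not in the extracted text, and the regularization constant $C$ and pairwise slacks $\xi_{ij}$ are assumed symbols:

```latex
\min_{w \in \mathbb{R}^d,\; \xi_{ij} \in \mathbb{R}}
  \;\frac{1}{2}\lVert w \rVert^2 + C \sum_{(i,j):\, y_i > y_j} \xi_{ij}
\quad \text{s.t.} \quad
  \langle w, x_i \rangle - \langle w, x_j \rangle \ge 1 - \xi_{ij},
  \qquad \xi_{ij} \ge 0 .
```

At the optimum, $\xi_{ij} = 0$ for a pair separated with full margin and $\xi_{ij} > 1$ for a pair ranked in the wrong order.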
![Page 13: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/13.jpg)
Proposed method: Rank transfer
• The strategy is similar to SVM+, but indirect.
1. SVMrank on $X^*$ (the ranking function $f^*$)
2. Margins $\rho_{ij} = f^*(x_i^*) - f^*(x_j^*)$ for all $(i, j)$ with $y_i > y_j$
• $\rho_{ij} \gg 0$: easy, $\rho_{ij} \approx 0$: hard, $\rho_{ij} < 0$: impossible
3. SVMrank on $X$ with data-dependent margins
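The three steps can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: the privileged scores `scores_star` are taken as given (step 1 is assumed to have been run already), the function names are hypothetical, pairs with non-positive margin are simply dropped, and plain subgradient descent stands in for a proper SVM solver.

```python
import numpy as np

def pairwise_margins(scores_star, y):
    """Step 2: index pairs (i, j) with y[i] > y[j] and their margins
    rho_ij = f*(x_i*) - f*(x_j*), computed from privileged scores."""
    i, j = np.where(y[:, None] > y[None, :])
    rho = scores_star[i] - scores_star[j]
    return i, j, rho

def rank_transfer_fit(X, i, j, rho, C=1.0, lr=0.01, epochs=200):
    """Step 3: minimize 0.5*||w||^2 + C * sum_ij max(0, rho_ij - <w, x_i - x_j>)
    by subgradient descent (an illustrative optimizer, assumed here)."""
    keep = rho > 0                   # ignore pairs the privileged ranker got wrong
    D = X[i[keep]] - X[j[keep]]      # pairwise difference vectors x_i - x_j
    r = rho[keep]
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        violated = D @ w < r         # pairs whose required margin is not yet met
        grad = w - C * D[violated].sum(axis=0)
        w -= lr * grad
    return w

# Toy usage: three samples with labels 2 > 1 > 0.
y = np.array([2, 1, 0])
scores_star = np.array([3.0, 1.0, 1.5])   # privileged ranker misorders samples 1 and 2
X = np.array([[2.0, 0.0], [1.0, 0.0], [0.0, 0.0]])
i, j, rho = pairwise_margins(scores_star, y)
w = rank_transfer_fit(X, i, j, rho)
```

Pairs with a large $\rho_{ij}$ demand a large separation in the original space, pairs with a small positive $\rho_{ij}$ demand little, and the misordered pair contributes nothing.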
![Page 14: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/14.jpg)
Intuition
• If it was difficult to correctly rank a pair on $X^*$, it will also be difficult on $X$
1. Pairs $(i, j)$ with small margins $\rho_{ij}$ have more limited influence on $w$
2. Incorrectly ranked pairs are ignored.
![Page 15: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/15.jpg)
Why not Rank transfer?
• We can use standard SVM packages!
– For the SVMrank on $X^*$ this is clear.
– For the SVMrank on $X$ we need variable transformations
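One way to see the variable transformation, as a sketch: assume the data-dependent-margin constraint has the form $\langle w, x_i - x_j \rangle \ge \rho_{ij} - \xi_{ij}$ with $\rho_{ij} > 0$ (the slide's equation is not in the extracted text, and $C$, $\tilde{\xi}_{ij}$ are assumed symbols). Dividing by $\rho_{ij}$ turns it into a unit-margin constraint on rescaled inputs:

```latex
\langle w, x_i - x_j \rangle \ge \rho_{ij} - \xi_{ij}
\;\Longleftrightarrow\;
\Bigl\langle w, \frac{x_i - x_j}{\rho_{ij}} \Bigr\rangle \ge 1 - \tilde{\xi}_{ij},
\qquad
\tilde{\xi}_{ij} = \frac{\xi_{ij}}{\rho_{ij}} \ge 0,
\qquad
C \sum_{ij} \xi_{ij} = \sum_{ij} \bigl( C \rho_{ij} \bigr)\, \tilde{\xi}_{ij} .
```

Under these assumptions, a standard solver that accepts per-sample cost weights can be run on the inputs $(x_i - x_j)/\rho_{ij}$ with weights $C \rho_{ij}$ and no bias term.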
![Page 16: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/16.jpg)
Experiments
• 4 different types of privileged information – All of those can be handled in a unified framework.
• 4 different methods to be compared – SVM, SVMrank, SVM+, Rank transfer
• Evaluation metric = Average Precision
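Average Precision exists in several variants, and the slides do not say which one the paper uses. A minimal sketch of the common non-interpolated form, with hypothetical function and argument names:

```python
import numpy as np

def average_precision(y_true, scores):
    """Non-interpolated AP: mean of precision@k over the ranks k
    at which a positive (label 1) sample appears."""
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    hits = np.asarray(y_true)[order] == 1
    if not hits.any():
        return 0.0                   # convention assumed here for "no positives"
    precision_at_k = np.cumsum(hits) / np.arange(1, len(hits) + 1)
    return float(precision_at_k[hits].mean())

# Positives at ranks 1 and 3: AP = (1/1 + 2/3) / 2
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

Because AP depends only on the ordering of the scores, it can be applied to classifiers and ranking functions alike, which fits a comparison of SVM, SVMrank, SVM+ and Rank transfer.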
![Page 17: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/17.jpg)
(1) Attributes as privileged info
• Animals with Attributes Dataset – 10 species ( = classes), 85 properties ( = attributes)
• Features: 2000-dim SURF • Privileged: 85-dim predicted attributes
[Lampert+ PAMI2014]
• Learn 1-vs-1 classifiers with 100 training samples
![Page 18: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/18.jpg)
(1) Results
• Rank transfer is the best.
![Page 19: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/19.jpg)
![Page 20: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/20.jpg)
(2) Bounding box as privileged info
• Fine-grained setup on ILSVRC2012
– 17 classes covering a variety of snakes
• Features: 4096-dim Fisher vector from the whole images
• Privileged: 4096-dim Fisher vector from the bounding box regions
• Learn 1-vs-rest classifiers
![Page 21: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/21.jpg)
(2) Results
• SVM+ is the best; ranking strategies do not seem suitable for this setup.
![Page 22: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/22.jpg)
(3) Texts as privileged info
• IsraelImages dataset [Bekkerman+ CVPR2007]
– 11 classes, 1800 images, each with a textual description of up to 18 words
• Features: 4096-dim Fisher vectors • Privileged: BoWs from the texts • Learn 1-vs-1 classifiers
(Example images shown for classes: Desert, Trees)
![Page 23: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/23.jpg)
(3) Results
• Reference (privileged only) is the best
• All the other methods produce almost the same results.
– Note that high accuracy in the privileged space does not necessarily mean that the privileged information is helpful for the target task.
![Page 24: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/24.jpg)
(4) Rationales as privileged info
• Hot or Not dataset [Donahue+ ICCV2011]
• Features: 500-dim densely sampled SIFT from the whole image
• Privileged: 500-dim densely sampled SIFT from the rationales
![Page 25: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/25.jpg)
(4) Results
• Reference is the best.
• Rank transfer performs better for the male class.
• Hard to draw a conclusion.
![Page 26: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/26.jpg)
Appendix: Margin transfer
• One possible alternative to Rank transfer
![Page 27: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/27.jpg)
But not so good…
![Page 28: ICCV2013 reading: Learning to rank using privileged information](https://reader034.vdocuments.us/reader034/viewer/2022051514/54b6de624a7959703e8b4828/html5/thumbnails/28.jpg)
Last words
• The idea is nice, easy to use.
• More privileged information, better performance? --- needs discussion
• Which types of privileged information are suitable? --- unknown