Introduction The Basics Adding Prior Knowledge Conclusions
Sparse Coding: An Overview
Brian Booth
SFU Machine Learning Reading Group
November 12, 2013
The aim of sparse coding
Every column of D is a prototype
Similar to, but more general than, PCA
Example: Sparse Coding of Images
Sparse Coding in V1
Example: Image Denoising
Example: Image Restoration
Sparse Coding and Acoustics
Inner ear (cochlea) also does sparse coding of frequencies
Sparse Coding and Natural Language Processing
Outline
1 Introduction: Why Sparse Coding?
2 Sparse Coding: The Basics
3 Adding Prior Knowledge
4 Conclusions
The aim of sparse coding, revisited
We assume our data x satisfies
$$x \approx \sum_{i=1}^{n} \alpha_i d_i = \alpha D$$
Learning: Given training data $x^j$, $j \in \{1, \dots, m\}$, learn dictionary D and sparse code α
Encoding: Given test data x and dictionary D, learn sparse code α
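
The linear model above can be sketched in a few lines of NumPy. This is a minimal illustration rather than code from the talk: the sizes `k` and `n`, the random dictionary, and the choice to store atoms as columns of `D` (so that the sum over atoms becomes `D @ alpha`) are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

k, n = 8, 20                      # signal dimension and number of atoms (assumed sizes)
D = rng.standard_normal((k, n))   # hypothetical dictionary, one atom per column
D /= np.linalg.norm(D, axis=0)    # give every atom unit norm

alpha = np.zeros(n)               # sparse code: only 3 of the 20 entries are non-zero
alpha[[2, 7, 11]] = [1.5, -0.8, 0.3]

# x = sum_i alpha_i * d_i, written once as a loop and once as a matrix product
x_loop = sum(alpha[i] * D[:, i] for i in range(n))
x = D @ alpha

print(np.allclose(x, x_loop))     # the two forms agree
print(np.count_nonzero(alpha), "of", n, "atoms used")
```

Learning would estimate both `D` and `alpha` from data, while encoding keeps `D` fixed and only looks for `alpha`.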
Learning: The Objective Function
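Dictionary learning involves optimizing:
$$\arg\min_{d_i,\,\alpha^j} \sum_{j=1}^{m} \Big\| x^j - \sum_{i=1}^{n} \alpha^j_i d_i \Big\|^2 + \beta \sum_{j=1}^{m} \sum_{i=1}^{n} |\alpha^j_i|$$
subject to $\|d_i\|^2 \le c, \ \forall i = 1, \dots, n$.
In matrix notation:
$$\arg\min_{D,A} \|X - AD\|_F^2 + \beta \sum_{i,j} |\alpha_{i,j}|$$
subject to $\sum_i D_{i,j}^2 \le c, \ \forall j = 1, \dots, n$.
Split the optimization in two: one step over D with A fixed, one step over A with D fixed.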
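
To make the objective concrete, here is a small sketch that evaluates it and checks the norm constraint. The function names and sizes are hypothetical. The layout is also an assumption: samples, atoms, and codes are stored as columns, so the reconstruction is written `D @ A`; transposing all three matrices gives the `AD` ordering with the same objective value.

```python
import numpy as np

def sparse_coding_objective(X, D, A, beta):
    """Squared Frobenius reconstruction error plus beta times the L1 norm of the codes."""
    residual = X - D @ A
    return np.sum(residual ** 2) + beta * np.sum(np.abs(A))

def atoms_feasible(D, c):
    """True if every atom (column of D) has squared norm at most c."""
    return bool(np.all(np.sum(D ** 2, axis=0) <= c + 1e-12))

rng = np.random.default_rng(0)
k, n, m = 6, 10, 30                      # assumed sizes: signal dim, atoms, samples
D = rng.standard_normal((k, n))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms, so c = 1 is satisfied
A = rng.standard_normal((n, m)) * (rng.random((n, m)) < 0.2)   # mostly-zero codes
X = D @ A                                # data that the model reproduces exactly

print(sparse_coding_objective(X, D, A, beta=0.1))   # only the L1 penalty remains
print(atoms_feasible(D, c=1.0))
```

The constraint matters because without it the penalty could be made arbitrarily small by scaling `D` up and `A` down.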
Step 1: Learning the Dictionary
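Reduced optimization problem:
$$\arg\min_{D} \|X - AD\|_F^2$$
subject to $\sum_i D_{i,j}^2 \le c, \ \forall j = 1, \dots, n$.
Introduce Lagrange multipliers:
$$\mathcal{L}(D, \lambda) = \operatorname{tr}\big((X - AD)^T (X - AD)\big) + \sum_{j=1}^{n} \lambda_j \Big(\sum_i D_{i,j}^2 - c\Big)$$
where each $\lambda_j \ge 0$ is a dual variable...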
Step 1: Moving to the dual
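From the Lagrangian
$$\mathcal{L}(D, \lambda) = \operatorname{tr}\big((X - AD)^T (X - AD)\big) + \sum_{j=1}^{n} \lambda_j \Big(\sum_i D_{i,j}^2 - c\Big)$$
minimize over D to obtain the Lagrange dual
$$\mathcal{D}(\lambda) = \min_{D} \mathcal{L}(D, \lambda) = \operatorname{tr}\big(X^T X - X A^T (A A^T + \Lambda)^{-1} (X A^T)^T - c\,\Lambda\big)$$
where $\Lambda = \operatorname{diag}(\lambda)$.
The dual can be optimized using conjugate gradient
Only n values of λ to optimize, compared to D being n × k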
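
The dual can be evaluated directly from its closed form. The sketch below is an illustration under stated assumptions, not the talk's implementation: atoms are stored as columns of `D` and codes as columns of `A`, so that `A @ A.T + diag(lam)` is n × n, and the function name and test sizes are made up. The gradient with respect to each multiplier is the squared norm of the corresponding optimal atom minus `c`, which is what a gradient-based maximizer over λ ≥ 0 would consume.

```python
import numpy as np

def dual_value_and_grad(lam, X, A, c):
    """Lagrange dual D(lambda) and its gradient, for atoms stored as columns of D."""
    XAt = X @ A.T                               # k x n
    M = A @ A.T + np.diag(lam)                  # n x n, the matrix A A^T + Lambda
    D_opt = np.linalg.solve(M, XAt.T).T         # minimiser of the Lagrangian over D
    value = np.sum(X ** 2) - np.trace(D_opt @ XAt.T) - c * lam.sum()
    grad = np.sum(D_opt ** 2, axis=0) - c       # squared atom norms minus c
    return value, grad

rng = np.random.default_rng(0)
k, n, m = 5, 4, 50                              # assumed sizes: signal dim, atoms, samples
X = rng.standard_normal((k, m))
A = rng.standard_normal((n, m))
lam = np.ones(n)                                # any non-negative multipliers

value, grad = dual_value_and_grad(lam, X, A, c=1.0)
print(value)
print(grad)
```

For any λ ≥ 0 the dual value is a lower bound on the constrained reconstruction error, so maximizing it tightens the bound.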
Step 1: Dual to the Dictionary
With the optimal Λ, our dictionary is
$$D^T = (A A^T + \Lambda)^{-1} (X A^T)^T$$
Key point: Moving to the dual reduces the number of optimization variables, speeding up the optimization.
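
The closed form for the dictionary is a single linear solve. In this sketch the function name, the sizes, and the placeholder multipliers are hypothetical, and atoms are assumed to be stored as columns of `D`; the multipliers are not optimized here, they only show how a given Λ maps to a dictionary.

```python
import numpy as np

def dictionary_from_dual(lam, X, A):
    """Evaluate D^T = (A A^T + Lambda)^{-1} (X A^T)^T and return D."""
    M = A @ A.T + np.diag(lam)                 # n x n
    return np.linalg.solve(M, (X @ A.T).T).T   # k x n, one atom per column

rng = np.random.default_rng(0)
k, n, m = 6, 4, 40                             # assumed sizes: signal dim, atoms, samples
X = rng.standard_normal((k, m))
A = rng.standard_normal((n, m))

lam = np.full(n, 2.0)                          # placeholder multipliers, not optimised
D = dictionary_from_dual(lam, X, A)

# Stationarity of the Lagrangian: D (A A^T + Lambda) = X A^T
print(np.allclose(D @ (A @ A.T + np.diag(lam)), X @ A.T))

# The dual search space has n numbers, the dictionary itself has k * n
print(lam.size, "dual variables vs", D.size, "dictionary entries")
```

With all multipliers at zero the formula reduces to the ordinary least-squares dictionary; positive multipliers shrink it toward the norm constraint.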
Step 2: Learning the Sparse Code
With D now fixed, optimize for A
$$\arg\min_{A} \|X - AD\|_F^2 + \beta \sum_{i,j} |\alpha_{i,j}|$$
Unconstrained, convex optimization: a quadratic term plus an ℓ1 penalty
Many solvers for this (e.g. interior point methods, in-crowd algorithm, fixed-point continuation)
Note:
Same problem as the encoding problem.
Runtime of optimization in the encoding stage?
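
As one concrete way to solve the encoding problem for a single signal, the sketch below uses ISTA (iterative soft-thresholding), a simple proximal-gradient solver. ISTA is chosen here for brevity and is an assumption of this example, not necessarily one of the solvers used in the talk. Atoms are assumed to be columns of `D`, and the sizes and `beta` are made up.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink every entry of v toward zero by t, clipping at zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(x, D, beta, n_iter=500):
    """Minimise ||x - D a||^2 + beta * ||a||_1 over a with a fixed step size."""
    L = 2.0 * np.linalg.norm(D, 2) ** 2     # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * D.T @ (D @ a - x)      # gradient of the quadratic term
        a = soft_threshold(a - grad / L, beta / L)
    return a

rng = np.random.default_rng(0)
k, n = 10, 25                               # assumed sizes: signal dim, atoms
D = rng.standard_normal((k, n))
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(n)
a_true[[3, 8]] = [1.0, -2.0]
x = D @ a_true

a_hat = ista(x, D, beta=0.1)
print(np.count_nonzero(a_hat), "non-zero entries out of", n)
```

Each iteration costs two matrix-vector products, which is why the number of iterations dominates the runtime of the encoding stage.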
Speeding up the testing phase
Fair amount of work on speeding up the encoding stage:
H. Lee et al., Efficient sparse coding algorithms, http://ai.stanford.edu/~hllee/nips06-sparsecoding.pdf
K. Gregor and Y. LeCun, Learning Fast Approximations of Sparse Coding, http://yann.lecun.com/exdb/publis/pdf/gregor-icml-10.pdf
S. Hawe et al., Separable Dictionary Learning, http://arxiv.org/pdf/1303.5244v1.pdf
Relationships between Dictionary atoms
Dictionaries are over-complete bases
Dictate relationships between atoms
Example: Hierarchical dictionaries
Example: Image Patches
Example: Document Topics
Problem Statement
Goal: Have sub-groups of the sparse code α all be non-zero (or zero).
Hierarchical:
If a node is non-zero, its parent must be non-zero
If a node's parent is zero, the node must be zero
Implementation:
Change the regularization
Enforce sparsity differently...
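
The hierarchical rule can be stated as a small check on a code vector. This is only an illustration: the function name, the `parent` array encoding of the tree (with `-1` marking a root), and the six-node example tree are assumptions, not something taken from the slides.

```python
import numpy as np

def respects_hierarchy(alpha, parent):
    """True if every non-zero entry of alpha has a non-zero parent (roots have parent -1)."""
    active = np.asarray(alpha) != 0
    for node, par in enumerate(parent):
        if active[node] and par >= 0 and not active[par]:
            return False
    return True

# Hypothetical tree: node 0 is the root, 1 and 2 are its children,
# 3 and 4 are children of 1, and 5 is a child of 2.
parent = [-1, 0, 0, 1, 1, 2]

print(respects_hierarchy([1.0, 0.5, 0.0, 0.2, 0.0, 0.0], parent))   # path 0 -> 1 -> 3 is active
print(respects_hierarchy([1.0, 0.0, 0.0, 0.2, 0.0, 0.0], parent))   # node 3 is active, its parent is not
```

The two bullet points of the rule are contrapositives of each other, so a single check covers both.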
Grouping Code Entries
Level k included in k + 1 groups
Add $|\alpha_i|$ to the objective function once for each group
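
One grouping that produces this count is to give every node a group made of itself and all of its descendants. The slides show the grouping only as a figure, so this construction, the helper names, and the example tree are assumptions. Under it, a node at depth k sits in its own group and in the groups of its k ancestors.

```python
def build_tree_groups(parent):
    """One group per node: the node together with all of its descendants."""
    n = len(parent)
    groups = [[node] for node in range(n)]
    for node in range(n):
        anc = parent[node]
        while anc >= 0:                 # walk up to the root, joining each ancestor's group
            groups[anc].append(node)
            anc = parent[anc]
    return groups

def depth(node, parent):
    """Number of edges between a node and the root of its tree."""
    d = 0
    while parent[node] >= 0:
        node = parent[node]
        d += 1
    return d

# Hypothetical tree: 0 is the root, 1 and 2 its children, 3 and 4 under 1, 5 under 2.
parent = [-1, 0, 0, 1, 1, 2]
groups = build_tree_groups(parent)

for node in range(len(parent)):
    memberships = sum(node in g for g in groups)
    print(node, "depth", depth(node, parent), "groups", memberships)
```

Deeper nodes are therefore penalized more often, which is what pushes them to zero before their ancestors.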
Group Regularization
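Updated objective function:
$$\arg\min_{D,\,\alpha^j} \sum_{j=1}^{m} \Big[ \|x^j - D\alpha^j\|^2 + \beta\,\Omega(\alpha^j) \Big]$$
where
$$\Omega(\alpha) = \sum_{g \in P} w_g \|\alpha_{|g}\|$$
$\alpha_{|g}$ are the code values for group g.
$w_g$ weights the enforcement of the hierarchy
Solve using proximal methods.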
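
The regularizer and one proximal-style step can be sketched as follows. Several choices here are assumptions: the norm is taken to be the Euclidean norm, the groups are the node-plus-descendants groups of a made-up six-node tree, all weights are 1, and the step applies group soft-thresholding from the smallest groups to the largest, which is one common way to build a proximal step for tree-structured groups. The function names are hypothetical.

```python
import numpy as np

def omega(alpha, groups, weights):
    """Omega(alpha) = sum over groups g of w_g * ||alpha restricted to g||_2."""
    return sum(w * np.linalg.norm(alpha[g]) for g, w in zip(groups, weights))

def tree_prox(alpha, groups, weights, t):
    """Group soft-thresholding applied from the smallest (deepest) groups to the largest."""
    out = np.array(alpha, dtype=float)
    order = sorted(range(len(groups)), key=lambda i: len(groups[i]))
    for i in order:
        g = groups[i]
        norm = np.linalg.norm(out[g])
        scale = max(0.0, 1.0 - t * weights[i] / norm) if norm > 0 else 0.0
        out[g] *= scale                 # shrink the whole group, or zero it out
    return out

# Groups for a hypothetical tree: 0 is the root, 1 and 2 its children,
# 3 and 4 under 1, 5 under 2.
groups = [[0, 1, 2, 3, 4, 5], [1, 3, 4], [2, 5], [3], [4], [5]]
weights = [1.0] * len(groups)
alpha = np.array([2.0, 1.0, 0.1, 0.5, 0.05, 0.2])

shrunk = tree_prox(alpha, groups, weights, t=0.3)
print(omega(alpha, groups, weights), omega(shrunk, groups, weights))
print(np.nonzero(shrunk)[0])            # indices of the entries that survive
```

Because a group is zeroed as a whole, a node can only stay active if every group containing it survives, which keeps the surviving entries connected to the root.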
Other Examples
Other examples of structured sparsity:
M. Stojnic et al., On the Reconstruction of Block-Sparse Signals With an Optimal Number of Measurements, http://dx.doi.org/10.1109/TSP.2009.2020754
J. Mairal et al., Convex and Network Flow Optimization for Structured Sparsity, http://jmlr.org/papers/volume12/mairal11a/mairal11a.pdf
Summary
Two interesting directions:
Increasing speed of the testing phase
Optimizing dictionary structure