Sparse coding for machine learning, image processing and computer vision


Sparse representations for image processing and computer vision

Julien Mairal

Department of Statistics, University of California, Berkeley
INRIA - Willow project, Paris

Journées ORASIS, Praz-sur-Arly, 7 June 2011

Acknowledgements

Francis Bach, Jean Ponce, Rodolphe Jenatton, Guillaume Obozinski, Guillermo Sapiro,
Miki Elad, Andrew Zisserman, Florent Couzinie-Devy, Y-Lan Boureau, and the Willow people.

What is this lecture about?

Why sparsity, what for and how?

Signal and image processing: Restoration, reconstruction.

Machine learning: Selecting relevant features.

Computer vision: Modelling image patches and image descriptors.

Optimization: Solving challenging problems.


1 Image Processing Applications

2 Sparse Linear Models and Dictionary Learning

3 Computer Vision Applications


1 Image Processing Applications
   Image Denoising
   Inpainting, Demosaicking
   Video Processing
   Other Applications

2 Sparse Linear Models and Dictionary Learning

3 Computer Vision Applications

The Image Denoising Problem

y = x_orig + w,

where y are the measurements, x_orig is the original image, and w is the noise.

Sparse representations for image restoration

y = x_orig + w,

where y are the measurements, x_orig is the original image, and w is the noise.

Energy minimization problem - MAP estimation

E(x) = (1/2)‖y − x‖_2^2  (relation to measurements)  +  Pr(x)  (image model, i.e. −log prior)

Some classical priors

Smoothness: λ‖Lx‖_2^2
Total variation: λ‖∇x‖_1
MRF priors
. . .

What is a Sparse Linear Model?

Let y in R^m be a signal.

Let D = [d_1, . . . , d_p] ∈ R^{m×p} be a set of normalized "basis vectors". We call it a dictionary.

D is "adapted" to y if it can represent it with a few basis vectors, that is, there exists a sparse vector α in R^p such that y ≈ Dα. We call α the sparse code.

y ≈ Dα,   with y ∈ R^m, D = [d_1 d_2 · · · d_p] ∈ R^{m×p}, and α = (α[1], . . . , α[p])^⊤ ∈ R^p sparse.

First Important Idea

Why Sparsity?

A dictionary can be good for representing a class of signals, but not for representing white Gaussian noise.

The Sparse Decomposition Problem

min_{α∈R^p} (1/2)‖y − Dα‖_2^2  (data-fitting term)  +  λψ(α)  (sparsity-inducing regularization)

ψ induces sparsity in α. It can be

the ℓ0 "pseudo-norm": ‖α‖_0 ≜ #{i s.t. α[i] ≠ 0}  (NP-hard),
the ℓ1 norm: ‖α‖_1 ≜ Σ_{i=1}^p |α[i]|  (convex),
. . .

This is a selection problem. When ψ is the ℓ1-norm, the problem is called the Lasso [Tibshirani, 1996] or basis pursuit [Chen et al., 1999].
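For the ℓ1 case, a standard way to compute the sparse code numerically is the iterative soft-thresholding algorithm (ISTA), a proximal gradient method. The following is a minimal NumPy sketch, not taken from the slides; the random dictionary, signal and parameter values are illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Entrywise soft-thresholding, the proximal operator of t * l1-norm."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def ista(D, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - D @ alpha||_2^2 + lam * ||alpha||_1 by proximal gradient (ISTA)."""
    step = 1.0 / np.linalg.norm(D, 2) ** 2     # 1 / Lipschitz constant of the gradient
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ alpha - y)           # gradient of the data-fitting term
        alpha = soft_threshold(alpha - step * grad, step * lam)
    return alpha

# Toy example: a random dictionary with unit-norm columns and a 5-sparse signal.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
y = D[:, :5] @ rng.standard_normal(5)
alpha = ista(D, y, lam=0.1)
print("nonzero coefficients:", np.count_nonzero(np.abs(alpha) > 1e-6))
```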


Sparse representations for image restoration

Designed dictionaries

[Haar, 1910], [Zweig, Morlet, Grossman ∼70s], [Meyer, Mallat, Daubechies, Coifman, Donoho, Candès ∼80s-today]. . . (see [Mallat, 1999]). Wavelets, Curvelets, Wedgelets, Bandlets, . . . lets

Learned dictionaries of patches

[Olshausen and Field, 1997], [Engan et al., 1999], [Lewicki and Sejnowski, 2000], [Aharon et al., 2006], [Roth and Black, 2005], [Lee et al., 2007]

min_{α_i, D∈D} Σ_i (1/2)‖y_i − Dα_i‖_2^2  (reconstruction)  +  λψ(α_i)  (sparsity)

ψ(α) = ‖α‖_0  ("ℓ0 pseudo-norm")
ψ(α) = ‖α‖_1  (ℓ1 norm)

Sparse representations for image restoration

Solving the denoising problem

[Elad and Aharon, 2006]

Extract all overlapping 8 × 8 patches y_i.

Solve a matrix factorization problem:

min_{α_i, D∈D} Σ_{i=1}^n (1/2)‖y_i − Dα_i‖_2^2  (reconstruction)  +  λψ(α_i)  (sparsity),

with n > 100,000.

Average the reconstruction of each patch.
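In code, this patch-based denoising procedure looks roughly as follows. This is a schematic NumPy sketch rather than the exact algorithm of Elad and Aharon (their formulation uses an ℓ0 constraint solved with a greedy pursuit and a K-SVD dictionary update); it reuses the `ista` routine sketched above, and the patch size and λ are illustrative assumptions.

```python
import numpy as np

def denoise(noisy, D, lam, size=8):
    """Sparse-code every overlapping size x size patch on D and average the reconstructions."""
    H, W = noisy.shape
    recon = np.zeros((H, W))
    weight = np.zeros((H, W))
    for i in range(H - size + 1):
        for j in range(W - size + 1):
            y = noisy[i:i + size, j:j + size].ravel()
            alpha = ista(D, y, lam)                        # sparse code of the patch (ISTA sketch above)
            recon[i:i + size, j:j + size] += (D @ alpha).reshape(size, size)
            weight[i:i + size, j:j + size] += 1.0
    return recon / weight                                  # average of the overlapping estimates
```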


Sparse representations for image restoration
K-SVD: [Elad and Aharon, 2006]

Figure: Dictionary trained on a noisy version of the image boat.

Sparse representations for image restoration
Grayscale vs color image patches

Sparse representations for image restoration

Inpainting, Demosaicking

min_{D∈D, α} Σ_i (1/2)‖β_i ⊙ (y_i − Dα_i)‖_2^2 + λ_i ψ(α_i),

where β_i is a binary mask selecting the observed pixels of patch y_i.

RAW Image Processing pipeline:

White balance → black subtraction → denoising → demosaicking → conversion to sRGB → gamma correction.

Sparse representations for image restoration
[Mairal, Sapiro, and Elad, 2008d]

Sparse representations for image restoration
Inpainting, [Mairal, Elad, and Sapiro, 2008b]

Sparse representations for video restoration

Key ideas for video processing

[Protter and Elad, 2009]

Using a 3D dictionary.

Processing of many frames at the same time.

Dictionary propagation.


Sparse representations for image restoration
Inpainting, [Mairal, Sapiro, and Elad, 2008d]

Figure: Inpainting results.

Sparse representations for image restoration
Color video denoising, [Mairal, Sapiro, and Elad, 2008d]

Figure: Denoising results, σ = 25.

Digital Zooming (Couzinie-Devy, 2010)

[Figures: original images, bicubic interpolation, and the proposed method.]

Inverse half-toning

[Figures: original and reconstructed images.]

Important messages

Patch-based approaches achieve state-of-the-art results for many image processing tasks.

Dictionary learning adapts to the data you want to restore.

Dictionary learning is well adapted to data that admit sparse representations. Sparsity is for sparse data only.

Next topics

A bit of machine learning.

Why does the ℓ1-norm induce sparsity?

Some properties of the Lasso.

Links between dictionary learning and matrix factorizationtechniques.

A simple algorithm for learning dictionaries.


1 Image Processing Applications

2 Sparse Linear Models and Dictionary Learning
   The machine learning point of view
   Why does the ℓ1-norm induce sparsity?
   Dictionary Learning and Matrix Factorization

3 Computer Vision Applications

Sparse Linear Model: Machine Learning Point of View

Let (y^i, x_i)_{i=1}^n be a training set, where the vectors x_i are in R^p and are called features. The scalars y^i are in

{−1, +1} for binary classification problems,
{1, . . . , N} for multiclass classification problems,
R for regression problems.

In a linear model, one assumes a relation y ≈ w^⊤x (or y ≈ sign(w^⊤x)), and solves

min_{w∈R^p} (1/n) Σ_{i=1}^n ℓ(y^i, w^⊤x_i)  (data-fitting)  +  λψ(w)  (regularization).

Sparse Linear Models: Machine Learning Point of View

A few examples:

Ridge regression: min_{w∈R^p} (1/2n) Σ_{i=1}^n (y^i − w^⊤x_i)^2 + λ‖w‖_2^2.

Linear SVM: min_{w∈R^p} (1/n) Σ_{i=1}^n max(0, 1 − y^i w^⊤x_i) + λ‖w‖_2^2.

Logistic regression: min_{w∈R^p} (1/n) Σ_{i=1}^n log(1 + e^{−y^i w^⊤x_i}) + λ‖w‖_2^2.

The squared ℓ2-norm induces smoothness in w. When one knows in advance that w should be sparse, one should use a sparsity-inducing regularization such as the ℓ1-norm [Chen et al., 1999, Tibshirani, 1996].

The purpose of the regularization is to encode additional a priori knowledge about w.

Sparse Linear Models: the Lasso

Signal processing: D is a dictionary in R^{n×p},

min_{α∈R^p} (1/2)‖y − Dα‖_2^2 + λ‖α‖_1.

Machine Learning:

min_{w∈R^p} (1/2n) Σ_{i=1}^n (y^i − x_i^⊤w)^2 + λ‖w‖_1 = min_{w∈R^p} (1/2n)‖y − X^⊤w‖_2^2 + λ‖w‖_1,

with X ≜ [x_1, . . . , x_n] and y ≜ [y^1, . . . , y^n]^⊤.

Useful tool in signal processing, machine learning, statistics, . . . as long as one wishes to select features.
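As a concrete illustration of the machine-learning formulation, the sketch below fits a Lasso on synthetic data in which only three features are relevant, and reads the selected features off the nonzero coefficients. scikit-learn is used here purely for convenience (its `Lasso` minimizes (1/2n)‖y − Xw‖_2^2 + α‖w‖_1, matching the formula above); the data are synthetic assumptions.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
w_true = np.zeros(p)
w_true[:3] = [2.0, -1.5, 1.0]               # only 3 relevant features
y = X @ w_true + 0.1 * rng.standard_normal(n)

model = Lasso(alpha=0.1).fit(X, y)           # l1-regularized least squares
selected = np.flatnonzero(model.coef_)       # indices of nonzero coefficients
print("selected features:", selected)        # typically the first three indices
```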


Why does the ℓ1-norm induce sparsity?
Example: quadratic problem in 1D

min_{α∈R} (1/2)(y − α)^2 + λ|α|

Piecewise quadratic function with a kink at zero.

Derivative at 0+: g+ = −y + λ, and at 0−: g− = −y − λ.

Optimality conditions. α is optimal iff:

|α| > 0 and −(y − α) + λ sign(α) = 0, or
α = 0 and g+ ≥ 0 and g− ≤ 0.

The solution is a soft-thresholding:

α⋆ = sign(y)(|y| − λ)+.
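A quick numerical check of this closed form against brute-force minimization of the 1D objective (a self-contained sketch; the values of y and λ are arbitrary):

```python
import numpy as np

def soft_threshold(y, lam):
    """Closed-form minimizer of 0.5 * (y - a)^2 + lam * |a|."""
    return np.sign(y) * max(abs(y) - lam, 0.0)

y, lam = 0.8, 0.5
grid = np.linspace(-2, 2, 400001)                       # brute-force search over a fine grid
objective = 0.5 * (y - grid) ** 2 + lam * np.abs(grid)
print(grid[objective.argmin()])                         # approximately 0.3
print(soft_threshold(y, lam))                           # 0.3 = sign(0.8) * (0.8 - 0.5)
```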


Why does the ℓ1-norm induce sparsity?

[Figures: (a) the soft-thresholding operator α⋆(y); (b) the hard-thresholding operator α⋆(y).]

Why does the ℓ1-norm induce sparsity?
Analysis of the norms in 1D

ψ(α) = α^2, with ψ′(α) = 2α.
ψ(α) = |α|, with ψ′−(α) = −1 and ψ′+(α) = 1.

The gradient of the ℓ2-norm vanishes when α gets close to 0. On its differentiable part, the norm of the gradient of the ℓ1-norm is constant.

Why does the ℓ1-norm induce sparsity?
Physical illustration

System 1 (two springs): E1 = (k1/2)(y0 − y)^2 and E2 = (k2/2)y^2.
System 2 (spring and weight under gravity): E1 = (k1/2)(y0 − y)^2 and E2 = mgy.

With the quadratic potential (analogous to the ℓ2 penalty), the equilibrium position y stays strictly positive; with the linear gravity potential (analogous to the ℓ1 penalty), the equilibrium is exactly y = 0 !!

Why does the ℓ1-norm induce sparsity?
Geometric explanation

[Figures: the ℓ1-ball and the level sets of the quadratic data-fitting term.]

min_{α∈R^p} (1/2)‖y − Dα‖_2^2 + λ‖α‖_1

min_{α∈R^p} ‖y − Dα‖_2^2  s.t.  ‖α‖_1 ≤ T.

Important property of the Lasso
Piecewise linearity of the regularization path

Figure: Regularization path of the Lasso, i.e. the coefficient values α1, . . . , α5 as a function of λ in

min_α (1/2)‖y − Dα‖_2^2 + λ‖α‖_1.

Optimization for Dictionary Learning

min_{α∈R^{p×n}, D∈D} Σ_{i=1}^n (1/2)‖y_i − Dα_i‖_2^2 + λ‖α_i‖_1,

with D ≜ {D ∈ R^{m×p} s.t. ∀j = 1, . . . , p, ‖d_j‖_2 ≤ 1}.

Classical optimization alternates between D and α.

Good results, but slow!

Instead use online learning [Mairal et al., 2009]
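The classical alternating scheme can be sketched as follows: with D fixed, sparse-code every signal (here with the `ista` routine sketched earlier); with the codes fixed, take a gradient step on D and project its columns back onto the unit ball. This is a schematic batch illustration, not the online algorithm of Mairal et al. [2009]; the step size and iteration counts are illustrative assumptions.

```python
import numpy as np

def dictionary_learning(Y, p, lam, n_outer=20, lr=0.1):
    """Alternating minimization for sum_i 0.5*||y_i - D a_i||_2^2 + lam*||a_i||_1.

    Y has one signal per column; D is constrained to have columns of norm <= 1.
    """
    m, n = Y.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((m, p))
    D /= np.linalg.norm(D, axis=0)
    A = np.zeros((p, n))
    for _ in range(n_outer):
        for i in range(n):                                  # sparse coding step: one Lasso per column
            A[:, i] = ista(D, Y[:, i], lam)
        grad_D = (D @ A - Y) @ A.T / n                      # averaged gradient of the quadratic fit
        D = D - lr * grad_D                                 # dictionary update step
        D /= np.maximum(np.linalg.norm(D, axis=0), 1.0)     # project columns onto the unit ball
    return D, A
```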


Optimization for Dictionary Learning
Inpainting a 12-Mpixel photograph

[Figures: inpainting results on a 12-Mpixel photograph.]

Matrix Factorization Problems and Dictionary Learning

min_{α∈R^{p×n}, D∈D} Σ_{i=1}^n (1/2)‖y_i − Dα_i‖_2^2 + λ‖α_i‖_1

can be rewritten

min_{α∈R^{p×n}, D∈D} (1/2)‖Y − Dα‖_F^2 + λ‖α‖_1,

where Y = [y1, . . . , yn] and α = [α1, . . . ,αn].
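A small numerical sanity check that the per-signal sum and the matrix (Frobenius-norm) formulation coincide, on random data (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.standard_normal((10, 5))            # 5 signals of dimension 10, one per column
D = rng.standard_normal((10, 8))
A = rng.standard_normal((8, 5))             # columns are the codes alpha_i
lam = 0.3

per_signal = sum(0.5 * np.linalg.norm(Y[:, i] - D @ A[:, i]) ** 2
                 + lam * np.abs(A[:, i]).sum() for i in range(Y.shape[1]))
matrix_form = 0.5 * np.linalg.norm(Y - D @ A, "fro") ** 2 + lam * np.abs(A).sum()
print(np.isclose(per_signal, matrix_form))  # True
```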


Matrix Factorization Problems and Dictionary Learning
PCA

min_{α∈R^{p×n}, D∈R^{m×p}} (1/2)‖Y − Dα‖_F^2  s.t.  D^⊤D = I and αα^⊤ is diagonal.

D = [d_1, . . . , d_p] are the principal components.

Matrix Factorization Problems and Dictionary Learning
Hard clustering

min_{α∈{0,1}^{p×n}, D∈R^{m×p}} (1/2)‖Y − Dα‖_F^2  s.t.  ∀i ∈ {1, . . . , n}, Σ_{j=1}^p α_i[j] = 1.

D = [d_1, . . . , d_p] are the centroids of the p clusters.

Matrix Factorization Problems and Dictionary Learning
Soft clustering

min_{α∈R_+^{p×n}, D∈R^{m×p}} (1/2)‖Y − Dα‖_F^2  s.t.  ∀i ∈ {1, . . . , n}, Σ_{j=1}^p α_i[j] = 1.

D = [d_1, . . . , d_p] are the centroids of the p clusters.

Matrix Factorization Problems and Dictionary Learning
Non-negative matrix factorization [Lee and Seung, 2001]

min_{α∈R_+^{p×n}, D∈R_+^{m×p}} (1/2)‖Y − Dα‖_F^2

Matrix Factorization Problems and Dictionary Learning
NMF + sparsity?

min_{α∈R_+^{p×n}, D∈R_+^{m×p}} (1/2)‖Y − Dα‖_F^2 + λ‖α‖_1.

Most of these formulations can be addressed with the same types of algorithms.

Matrix Factorization Problems and Dictionary Learning
Natural Patches

[Figures: (a) PCA, (b) NNMF, (c) DL.]

Matrix Factorization Problems and Dictionary Learning
Faces

[Figures: (d) PCA, (e) NNMF, (f) DL.]

Important messages

The ℓ1-norm induces sparsity and shrinks the coefficients (soft-thresholding).

The regularization path of the Lasso is piecewise linear.

Learning the dictionary is simple, fast and scalable.

Dictionary learning is related to several matrix factorizationproblems.

Software SPAMS is available for all of this:

www.di.ens.fr/willow/SPAMS/.
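For reference, here is a minimal sketch of how the Python bindings of SPAMS are typically called for dictionary learning and sparse coding. The parameter names and values below are assumptions to be checked against the SPAMS documentation rather than a verified usage of the toolbox.

```python
import numpy as np
import spams  # Python bindings of the SPAMS toolbox

# Training signals as columns, in Fortran order as SPAMS expects (assumption).
rng = np.random.default_rng(0)
X = np.asfortranarray(rng.standard_normal((64, 10000)))

# Learn a dictionary with 256 atoms under an l1 penalty (illustrative parameters).
D = spams.trainDL(X, K=256, lambda1=0.15, iter=100)

# Sparse-code signals on the learned dictionary (returns a sparse coefficient matrix).
alpha = spams.lasso(X, D=D, lambda1=0.15)
print(D.shape, alpha.shape)
```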


Next topics: Computer Vision

Intriguing results on the use of dictionary learning for bags of words.

Modelling the local appearance of image patches.


1 Image Processing Applications

2 Sparse Linear Models and Dictionary Learning

3 Computer Vision Applications
   Learning codebooks for image classification
   Modelling the local appearance of image patches

Learning Codebooks for Image Classification

Idea

Replacing Vector Quantization by Learned Dictionaries!

unsupervised: [Yang et al., 2009]

supervised: [Boureau et al., 2010, Yang et al., 2010]


Learning Codebooks for Image Classification

Let an image be represented by a set of low-level descriptors y_i at N locations identified with their indices i = 1, . . . , N.

hard-quantization:

y_i ≈ Dα_i,  α_i ∈ {0, 1}^p  and  Σ_{j=1}^p α_i[j] = 1

soft-quantization:

α_i[j] = e^{−β‖y_i − d_j‖_2^2} / Σ_{k=1}^p e^{−β‖y_i − d_k‖_2^2}

sparse coding:

y_i ≈ Dα_i,  α_i = argmin_α (1/2)‖y_i − Dα‖_2^2 + λ‖α‖_1
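The three encodings can be compared directly in code. Below is a small NumPy sketch; the dictionary and descriptors are random placeholders, and the sparse-coding step reuses the `ista` routine sketched earlier in this transcript.

```python
import numpy as np

def hard_quantization(Y, D):
    """One-hot code: each descriptor is assigned to its nearest atom."""
    dist = ((Y[:, None, :] - D.T[None, :, :]) ** 2).sum(-1)   # (N, p) squared distances
    codes = np.zeros_like(dist)
    codes[np.arange(len(Y)), dist.argmin(1)] = 1.0
    return codes

def soft_quantization(Y, D, beta=1.0):
    """Softmax weights based on negative squared distances to the atoms."""
    dist = ((Y[:, None, :] - D.T[None, :, :]) ** 2).sum(-1)
    w = np.exp(-beta * dist)
    return w / w.sum(axis=1, keepdims=True)

def sparse_coding(Y, D, lam=0.1):
    """One Lasso problem per descriptor (ISTA sketch from earlier)."""
    return np.stack([ista(D, y, lam) for y in Y])

rng = np.random.default_rng(0)
D = rng.standard_normal((128, 64))
D /= np.linalg.norm(D, axis=0)                 # dictionary with 64 unit-norm atoms
Y = rng.standard_normal((10, 128))             # 10 local descriptors (rows)
print(hard_quantization(Y, D).sum(1), soft_quantization(Y, D).sum(1))  # all ones
```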


Learning Codebooks for Image Classification
Table from Boureau et al. [2010]

Yang et al. [2009] won the PASCAL VOC'09 challenge using this kind of technique.

Learning dictionaries with a discriminative cost function

Idea:

Let us consider 2 sets S−, S+ of signals representing 2 different classes. Each set should admit a dictionary best adapted to its reconstruction.

Classification procedure for a signal y ∈ R^n:

min( R⋆(y, D−), R⋆(y, D+) ),

where R⋆(y, D) = min_{α∈R^p} ‖y − Dα‖_2^2  s.t.  ‖α‖_0 ≤ L.

"Reconstructive" training:

min_{D−} Σ_{i∈S−} R⋆(y_i, D−)
min_{D+} Σ_{i∈S+} R⋆(y_i, D+)

[Grosse et al., 2007], [Huang and Aviyente, 2006], [Sprechmann et al., 2010] for unsupervised clustering (CVPR '10)
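The ℓ0-constrained reconstruction error R⋆(y, D) is usually computed approximately with a greedy method such as orthogonal matching pursuit (OMP). The sketch below implements OMP from the definition above, together with the resulting two-class decision rule; it is an illustrative reconstruction, not the authors' implementation.

```python
import numpy as np

def omp_residual(y, D, L):
    """Approximate min_a ||y - D a||_2^2 s.t. ||a||_0 <= L with orthogonal matching pursuit."""
    support, residual = [], y.copy()
    for _ in range(L):
        j = int(np.argmax(np.abs(D.T @ residual)))      # atom most correlated with the residual
        if j not in support:
            support.append(j)
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs           # re-fit on the current support
    return float(residual @ residual)                   # R*(y, D)

def classify(y, D_minus, D_plus, L=5):
    """Assign y to the class whose dictionary reconstructs it best."""
    return -1 if omp_residual(y, D_minus, L) < omp_residual(y, D_plus, L) else +1
```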


Learning dictionaries with a discriminative cost function

“Discriminative” training

[Mairal, Bach, Ponce, Sapiro, and Zisserman, 2008a]

min_{D−,D+} Σ_i C( λ z_i (R⋆(y_i, D−) − R⋆(y_i, D+)) ),

where z_i ∈ {−1, +1} is the label of y_i and C is the logistic regression function.

Learning dictionaries with a discriminative cost function
Examples of dictionaries

Top: reconstructive. Bottom: discriminative. Left: bicycle. Right: background.

Learning dictionaries with a discriminative cost function

[Figures: texture segmentation, pixelwise classification, and weakly-supervised pixel classification results.]

Application to edge detection and classification
[Mairal, Leordeanu, Bach, Hebert, and Ponce, 2008c]

[Figures: examples of good edges and bad edges.]

Application to edge detection and classification
Berkeley segmentation benchmark

[Figures: benchmark images with raw edge detection on the right.]

Application to edge detection and classification
Contour-based classifier: [Leordeanu, Hebert, and Sukthankar, 2007]

Is there a bike, a motorbike, a car or a person in this image?

Application to edge detection and classification
Performance gain due to the prefiltering

Ours + [Leordeanu '07]    [Leordeanu '07]    [Winn '05]
96.8%                     89.4%              76.9%

Recognition rates for the same experiment as [Winn et al., 2005] on VOC 2005.

Category      Ours + [Leordeanu '07]    [Leordeanu '07]
Aeroplane     71.9%                     61.9%
Boat          67.1%                     56.4%
Cat           82.6%                     53.4%
Cow           68.7%                     59.2%
Horse         76.0%                     67.0%
Motorbike     80.6%                     73.6%
Sheep         72.9%                     58.4%
Tvmonitor     87.7%                     83.8%
Average       75.9%                     64.2%

Recognition performance at equal error rate for 8 classes on a subset of images from Pascal 07.

Important messages

Learned dictionaries are well adapted to model the local appearance of images and edges.

They can be used to learn dictionaries of SIFT features.

References I

M. Aharon, M. Elad, and A. M. Bruckstein. The K-SVD: An algorithm for designing of overcomplete dictionaries for sparse representations. IEEE Transactions on Signal Processing, 54(11):4311–4322, November 2006.

Y-L. Boureau, F. Bach, Y. LeCun, and J. Ponce. Learning mid-level features for recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.

S. S. Chen, D. L. Donoho, and M. A. Saunders. Atomic decomposition by basis pursuit. SIAM Journal on Scientific Computing, 20:33–61, 1999.

M. Elad and M. Aharon. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Transactions on Image Processing, 15(12):3736–3745, December 2006.

K. Engan, S. O. Aase, and J. H. Husoy. Frame based signal compression using method of optimal directions (MOD). In Proceedings of the 1999 IEEE International Symposium on Circuits and Systems, volume 4, 1999.

R. Grosse, R. Raina, H. Kwong, and A. Y. Ng. Shift-invariant sparse coding for audio classification. In Proceedings of the Twenty-third Conference on Uncertainty in Artificial Intelligence, 2007.

References II

A. Haar. Zur Theorie der orthogonalen Funktionensysteme. Mathematische Annalen, 69:331–371, 1910.

K. Huang and S. Aviyente. Sparse representation for signal classification. In Advances in Neural Information Processing Systems, Vancouver, Canada, December 2006.

D. D. Lee and H. S. Seung. Algorithms for non-negative matrix factorization. In Advances in Neural Information Processing Systems, 2001.

H. Lee, A. Battle, R. Raina, and A. Y. Ng. Efficient sparse coding algorithms. In B. Schölkopf, J. Platt, and T. Hoffman, editors, Advances in Neural Information Processing Systems, volume 19, pages 801–808. MIT Press, Cambridge, MA, 2007.

M. Leordeanu, M. Hebert, and R. Sukthankar. Beyond local appearance: Category recognition from pairwise interactions of simple features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2007.

M. S. Lewicki and T. J. Sejnowski. Learning overcomplete representations. Neural Computation, 12(2):337–365, 2000.

J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman. Discriminative learned dictionaries for local image analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2008a.

References III

J. Mairal, M. Elad, and G. Sapiro. Sparse representation for color image restoration. IEEE Transactions on Image Processing, 17(1):53–69, January 2008b.

J. Mairal, M. Leordeanu, F. Bach, M. Hebert, and J. Ponce. Discriminative sparse image models for class-specific edge detection and image interpretation. In Proceedings of the European Conference on Computer Vision (ECCV), 2008c.

J. Mairal, G. Sapiro, and M. Elad. Learning multiscale sparse representations for image and video restoration. SIAM Multiscale Modeling and Simulation, 7(1):214–241, April 2008d.

J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online dictionary learning for sparse coding. In Proceedings of the International Conference on Machine Learning (ICML), 2009.

S. Mallat. A Wavelet Tour of Signal Processing, Second Edition. Academic Press, New York, September 1999.

B. A. Olshausen and D. J. Field. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research, 37:3311–3325, 1997.

M. Protter and M. Elad. Image sequence denoising via sparse and redundant representations. IEEE Transactions on Image Processing, 18(1):27–36, 2009.

References IV

S. Roth and M. J. Black. Fields of experts: A framework for learning image priors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2005.

P. Sprechmann, I. Ramirez, G. Sapiro, and Y. C. Eldar. Collaborative hierarchical sparse modeling. Technical report, 2010. Preprint arXiv:1003.0400v1.

R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58(1):267–288, 1996.

J. Winn, A. Criminisi, and T. Minka. Object categorization by learned universal visual dictionary. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2005.

J. Yang, K. Yu, Y. Gong, and T. Huang. Linear spatial pyramid matching using sparse coding for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009.

J. Yang, K. Yu, and T. Huang. Supervised translation-invariant sparse coding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
