learning with structured sparsity
DESCRIPTION
Learning with Structured Sparsity. Authors: Junzhou Huang, Tong Zhang, Dimitris Metaxas. Slide transcript.
Zhennan Yan 1
Learning with Structured Sparsity
Authors: Junzhou Huang, Tong Zhang, Dimitris Metaxas
Introduction
Fixed set of p basis vectors \{x_1, \ldots, x_p\}, where x_j \in R^n for each j, collected into a matrix X \in R^{n \times p}.
Given a random observation y = [y_1, \ldots, y_n] \in R^n, which depends on an underlying coefficient vector \bar{\beta} \in R^p: y \approx X\bar{\beta}.
Assume the target coefficient \bar{\beta} is sparse.
Throughout the paper, assume X is fixed, and randomization is w.r.t. the noise in observation y.
Introduction
Define the support of a vector \beta \in R^p as supp(\beta) = \{j : \beta_j \neq 0\}, so \|\beta\|_0 = |supp(\beta)|.
A natural method for sparse learning is L0 regularization for desired sparsity s:
\hat{\beta}_{L_0} = \arg\min_\beta \hat{Q}(\beta) subject to \|\beta\|_0 \le s
Here, only consider the least squares loss \hat{Q}(\beta) = \|X\beta - y\|_2^2.
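As a sanity check on the definitions above, here is a minimal NumPy sketch (names are mine, not the paper's) of the least squares loss and a brute-force solver for the L0-constrained problem; the exhaustive search is only feasible for tiny p, which is exactly why the relaxations discussed next matter.

```python
import itertools

import numpy as np

def q_hat(X, y, beta):
    """Least squares loss Q_hat(beta) = ||X beta - y||_2^2."""
    return float(np.sum((X @ beta - y) ** 2))

def l0_brute_force(X, y, s):
    """Exhaustive L0-constrained least squares: for every support F with
    |F| <= s, fit least squares on the columns X_F and keep the best."""
    n, p = X.shape
    best_beta = np.zeros(p)
    best_loss = q_hat(X, y, best_beta)
    for size in range(1, s + 1):
        for F in itertools.combinations(range(p), size):
            coef, *_ = np.linalg.lstsq(X[:, list(F)], y, rcond=None)
            beta = np.zeros(p)
            beta[list(F)] = coef
            loss = q_hat(X, y, beta)
            if loss < best_loss:
                best_beta, best_loss = beta, loss
    return best_beta

# Tiny noiseless demo: the true 2-sparse coefficient vector is recovered.
rng = np.random.default_rng(0)
X = rng.standard_normal((10, 6))
beta_true = np.array([1.0, 0.0, 0.0, -1.0, 0.0, 0.0])
y = X @ beta_true
beta_hat = l0_brute_force(X, y, s=2)
```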
Introduction
This L0 problem is NP-hard! Standard approaches:
- Relaxation of L0 to L1 (Lasso)
- Greedy algorithms (such as OMP)
In practical applications, we often know a structure on \beta in addition to sparsity. Examples:
- Group sparsity: variables in the same group tend to be zero or nonzero together
- Tonal and transient structures: sparse decomposition of audio signals
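Since OMP serves as a baseline in the experiments later, here is a minimal sketch of plain (unstructured) OMP under my own simplifying setup; the orthonormal design below makes exact recovery easy to verify, and none of this is the paper's implementation.

```python
import numpy as np

def omp(X, y, s):
    """Orthogonal Matching Pursuit: greedily add the column most
    correlated with the current residual, then refit least squares
    on the selected support."""
    p = X.shape[1]
    support = []
    beta = np.zeros(p)
    for _ in range(s):
        residual = y - X @ beta
        j = int(np.argmax(np.abs(X.T @ residual)))
        if j in support:
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
        beta = np.zeros(p)
        beta[support] = coef
    return beta

# Demo with orthonormal columns (via QR), where OMP recovers exactly.
rng = np.random.default_rng(1)
X, _ = np.linalg.qr(rng.standard_normal((64, 32)))  # 64 x 32, orthonormal cols
beta_true = np.zeros(32)
beta_true[[3, 10, 25]] = [1.0, -1.0, 1.0]
y = X @ beta_true
beta_hat = omp(X, y, s=3)
```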
Structured Sparsity
Denote the index set of coefficients by I = \{1, \ldots, p\}. For any sparse subset F \subseteq I, the coding complexity of F is defined as:
c(F) = |F| + cl(F)
where cl(F) is the code length assigned to F by a chosen coding scheme.

Structured Sparsity
If a coefficient vector has a small coding complexity, it can be efficiently learned. Why?
- Number of bits to encode F is cl(F)
- Number of bits to encode the nonzero coefficients in F is O(|F|)
General Coding Scheme
Block Coding: Consider a small set B of base blocks (each element of B is a subset of I); every subset F \subseteq I can be expressed as a union of blocks in B.
Define a code length cl_0 on B such that
\sum_{b \in B} 2^{-cl_0(b)} \le 1
So the induced code length of F is
cl(F) = \min \{ g + \sum_{l=1}^{g} cl_0(b_l) : F = b_1 \cup \cdots \cup b_g, \ b_l \in B \}
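A brute-force sketch of this induced code length cl(F) = min{ g + sum of cl_0(b_l) : F is a union of g base blocks } (the function name and tiny example are mine; the search is exponential in the number of base blocks, so this is illustration only):

```python
import itertools

def block_coding_length(F, blocks):
    """cl(F) = min over covers { g + sum of cl_0(b_l) : F = b_1 u ... u b_g }.

    `blocks` maps each base block (a frozenset of indices) to its base
    code length cl_0(b). Returns inf if F is not a union of base blocks.
    """
    F = frozenset(F)
    items = list(blocks.items())
    best = float("inf")
    for g in range(1, len(items) + 1):
        for combo in itertools.combinations(items, g):
            union = frozenset().union(*(b for b, _ in combo))
            if union == F:
                best = min(best, g + sum(c for _, c in combo))
    return best

# m = 4 disjoint groups, each with cl_0 = log2(m) = 2 bits.
blocks = {frozenset({0, 1}): 2.0, frozenset({2, 3}): 2.0,
          frozenset({4, 5}): 2.0, frozenset({6, 7}): 2.0}
cl_one = block_coding_length({0, 1}, blocks)        # g = 1: 1 + 2 = 3
cl_two = block_coding_length({0, 1, 2, 3}, blocks)  # g = 2: 2 + 4 = 6
```

Note that cl_two = 6 = g * log2(2m) with g = 2 and m = 4, matching the group sparsity coding length discussed below.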
General Coding Scheme
A structured greedy algorithm that takes advantage of block structure is efficient: instead of searching over all subsets of I up to a fixed coding complexity s (exponentially many), we greedily add blocks from B one at a time.
B is supposed to contain only a manageable number of base blocks.
General Coding Scheme
Standard Sparsity: B consists only of single-element sets, and each base block has coding length cl_0 = \log_2 p. This uses O(k \log_2 p) bits to code each subset of cardinality k.
General Coding Scheme
Group Sparsity: Consider m pre-defined groups. Let B_1 contain the m groups and B_0 contain the p single-element blocks. Each element of B_0 has cl_0 = \infty, and each element of B_1 has cl_0 = \log_2 m, so the code only looks for signals consisting of unions of the groups.
The resulting coding length is cl(F) = g \log_2(2m) if F can be represented as a union of g disjoint groups.
General Coding Scheme
Graph Sparsity: Generalization of Group Sparsity. Employs a directed graph structure G on I: each element of I is a node of G, but G may contain additional nodes.
At each node v, we define a coding length cl_v(S) on subsets S of the neighborhood N_v of v, as well as cl_v(u) for any other single node u, such that
\sum_{S \subseteq N_v} 2^{-cl_v(S)} + \sum_{u} 2^{-cl_v(u)} \le 1
General Coding Scheme
Example for Graph Sparsity: an image processing problem.
Each pixel has 4 adjacent pixels, so the number of subsets of its neighborhood is 2^4 = 16, which can be coded with a constant coding length of \log_2 16 = 4 bits. All other pixels are encoded by random jumping, with a coding length of O(\log_2 p).
If a connected region F is composed of g sub-regions, then the coding length is O(g \log_2 p + |F|), while the standard sparse coding length is O(|F| \log_2 p).
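The comparison can be made concrete with back-of-the-envelope arithmetic (the per-pixel constant below is an illustrative assumption, not the paper's exact value):

```python
import math

def graph_coding_length(region_sizes, p, bits_per_pixel=4.0):
    """Connected-region coding on the image graph: roughly one random
    jump (about log2 p bits) to start each region, plus a constant
    number of bits per pixel to encode neighborhood growth.
    O(g log p + |F|) overall; the per-pixel constant is an assumption."""
    jump_bits = math.log2(p)
    return sum(jump_bits + bits_per_pixel * size for size in region_sizes)

def standard_coding_length(k, p):
    """Standard sparsity: about log2(2p) bits per nonzero index,
    i.e. O(k log p) overall."""
    return k * math.log2(2 * p)

p = 48 * 48          # pixels, as in the 2D experiment below
regions = [40] * 4   # g = 4 connected regions, |F| = 160 total
structured = graph_coding_length(regions, p)
standard = standard_coding_length(160, p)
```

For the same 160 nonzeros the structured code is several times shorter, which is the whole point: lower coding complexity means fewer samples needed.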
Algorithms for Structured Sparsity
Recall the constrained problem:
\hat{\beta}_{L_0} = \arg\min_\beta \hat{Q}(\beta) subject to \|\beta\|_0 \le s
For structured sparsity, the sparsity constraint is replaced by a coding complexity constraint c(\beta) \le s.
Algorithms for Structured Sparsity
Extend forward greedy algorithms by using the block structure, which is used only to limit the search space.
Algorithms for Structured Sparsity
Maximize the gain ratio:
\phi(b) = \frac{\hat{Q}(\beta^{(k-1)}) - \hat{Q}(\beta)}{c(b \cup F^{(k-1)}) - c(F^{(k-1)})}
Using least squares regression,
\hat{Q}(\beta^{(k-1)}) - \hat{Q}(\beta) = \|P_{b \cup F^{(k-1)}} y\|_2^2 - \|P_{F^{(k-1)}} y\|_2^2
where P_F = X_F (X_F^T X_F)^{-1} X_F^T is the projection matrix onto the subspace generated by the columns of X_F.
Select the next block by b^{(k)} = \arg\max_{b \in B} \phi(b).
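A sketch of this selection step under my own naming (pseudo-inverses for the projections; cl values are passed in precomputed; this is not the paper's StructOMP code):

```python
import numpy as np

def proj(XF):
    """Projection matrix onto the column space of X_F."""
    return XF @ np.linalg.pinv(XF)

def gain_ratio(X, y, F, b, cl_F, cl_Fb):
    """phi(b): squared-error reduction from adding block b, divided by
    the coding complexity increase c(F u b) - c(F). For nested least
    squares fits the reduction equals ||P_{F u b} y - P_F y||^2."""
    Fb = sorted(set(F) | set(b))
    old = proj(X[:, sorted(F)]) @ y if F else np.zeros_like(y)
    new = proj(X[:, Fb]) @ y
    gain = float(np.sum((new - old) ** 2))
    delta_c = (len(Fb) - len(F)) + (cl_Fb - cl_F)  # |F u b| - |F| + cl change
    return gain / delta_c

# Two candidate blocks with equal complexity cost; the block that
# actually spans y should win the gain ratio.
rng = np.random.default_rng(2)
X = rng.standard_normal((32, 8))
y = X[:, [0, 1]] @ np.array([1.0, -1.0])
phi_good = gain_ratio(X, y, [], [0, 1], 0.0, 3.0)
phi_bad = gain_ratio(X, y, [], [2, 3], 0.0, 3.0)
```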
Experiments-1D
1D structured sparse signal with values in \{+1, -1\}, p = 512, k = 32, g = 2.
Zero-mean Gaussian noise with standard deviation \sigma is added to the measurements.
n = 4k = 128.
Recovery results by Lasso, OMP and StructOMP:
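A sketch of this experimental setup (the placement of the nonzero runs and the noise level are my assumptions; the paper's exact generation may differ):

```python
import numpy as np

def make_1d_structured_signal(p=512, k=32, g=2, rng=None):
    """k nonzeros with values in {-1, +1}, arranged as g non-overlapping
    contiguous runs of k // g coefficients at random locations."""
    rng = np.random.default_rng(0) if rng is None else rng
    beta = np.zeros(p)
    run = k // g
    starts = []
    while len(starts) < g:
        s = int(rng.integers(0, p - run))
        if all(abs(s - t) >= run for t in starts):
            starts.append(s)
    for s in starts:
        beta[s:s + run] = rng.choice([-1.0, 1.0], size=run)
    return beta

rng = np.random.default_rng(3)
beta = make_1d_structured_signal(rng=rng)
n = 4 * 32                                    # n = 4k = 128 measurements
X = rng.standard_normal((n, 512)) / np.sqrt(n)
y = X @ beta + 0.01 * rng.standard_normal(n)  # noise std is illustrative
```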
Experiments-1D
Experiments-2D
Generate a 2D structured sparse image by putting four letters in random locations.
p = H \times W = 48 \times 48, k = 160, g = 4, n = 4k = 640.
Strongly sparse signal: Lasso is better than OMP!
Experiments-2D
Experiments for sample size
Experiment on Tree-structured Sparsity
2D wavelet coefficients
Weakly sparse signal
Experiments-Background Subtracted Images
Experiments for sample size