
Post on 19-Oct-2020


Cédric Richard, Lagrange Lab., University Nice Sophia-Antipolis, France. Email: cedric.richard@unice.fr. Web: www.cedric-richard.fr


Geometric methods in NMF for hyperspectral data unmixing

Acknowledgments

The author acknowledges Nicolas Gillis (University of Mons, BE) and Mathieu Fauvel (ENSAT, F) for providing access to some pictures.

Hyperspectral data acquisition

1. source (active or passive)
2. electromagnetic radiation
3. interaction
4. sensor
5. transmission
6. processing
7. analysis

Reflectance: fraction of incident radiation that is reflected at an interface.

www.hsaj.org

Nontrivial problems: ❖ spectral mixture analysis ❖ detection/classification ❖ characterization ❖ fusion ❖ etc.

Information processing in remote sensing

ultraspectral (1000’s of bands)

hyperspectral (100’s of bands)

multispectral (10’s of bands)

panchromatic

www.higp.hawaii.edu www.corista.eu

Information processing in remote sensing

agriculture

forestry

urbanism

© Mathieu Fauvel, INPT ENSAT

❖ Some particularities:
  ❖ mixed pixels: due to insufficient spatial resolution and mixing effects
  ❖ sub-pixel targets: crucial in many hyperspectral applications

❖ Increasing the spatial resolution is not necessarily a solution:
  ❖ mixed pixels can still be observed at very high spatial resolutions
  ❖ intimate mixtures may take place regardless of the spatial resolution

Hyperspectral data unmixing

Macroscopic mixture: 10% grass, 80% soil, 10% tree

Intimate mixture: minerals intimately mixed

❖ The linear mixture model assumes that endmember substances are sitting side-by-side within the FOV.

❖ Nonlinear mixture models assume intimate mixture of endmember components, multiple scattering effects, etc.

Linear vs. nonlinear mixing model

linear model (Keshava’02): $y = M\alpha + z$ s.t. $\alpha \succeq 0$, $1^\top \alpha = 1$

nonlinear model: $y = \psi(M, \alpha) + z$

Blind spectral data unmixing

[Figure labels: road, grass]

Urban hyperspectral image with 162 spectral bands and 307-by-307 pixels

© Nicolas Gillis, University of Mons, Belgium

Blind spectral data unmixing with NMF

❖ Basis elements allow recovery of the endmember spectra: $M \succeq 0$
❖ Abundances of the endmembers in each pixel: $A \succeq 0$ ($1^\top A = 1^\top$)

© Nicolas Gillis, University of Mons, Belgium

[Figure: $Y \approx M \times A$, with $Y$ (wavelengths by pixels), $M$ (spectral signatures) and $A$ (abundances).]

Blind spectral data unmixing with NMF

© Nicolas Gillis, University of Mons, Belgium

Decomposition of the urban data set

$$Y(\cdot, j) \approx \sum_k M(\cdot, k)\, A(k, j)$$

Blind spectral data unmixing with NMF

❖ NMF problem: Given an $m \times n$ matrix $Y$ and a factorization rank $r$, determine:
  ❖ the $m \times r$ mixing matrix $M$ (endmember spectra)
  ❖ the $r \times n$ fractional abundance matrix $A$

such that

$$\min_{M, A} \|Y - MA\|_F^2 = \sum_{i,j} (Y - MA)_{ij}^2 \quad \text{s.t.} \quad M \succeq 0,\ A \succeq 0,\ 1^\top A = 1^\top$$

Blind source separation problem

Blind spectral data unmixing with NMF

❖ Can NMF problems be solved at all?
  ❖ NMF is an NP-hard problem (Vavasis’09)
  ❖ NMF is ill-posed (Gillis’09)
  ❖ Under the pure-pixel assumption, the problem becomes tractable.

❖ Pure-pixel assumption, a.k.a. separability: there exist $M, A \succeq 0$ such that $Y = MA$, where each column of $M$ is a column of $Y$.

Pure-pixel assumption: the columns of $M$ are the spectral signatures of the endmembers in the hyperspectral image $Y$.

www.trimble.com

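Geometric interpretation

Under the pure-pixel assumption, the columns of $M$ are the vertices of the convex hull of the columns of $Y$:

$$y_j = \sum_k a_{kj}\, m_k, \qquad a_{kj} \ge 0 \ \ \forall k, j, \qquad \sum_k a_{kj} = 1 \ \ \forall j$$

[Figures: simplex with vertices m1, m2, m3 enclosing the data, shown noise-free and in the presence of noise.]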
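The convex-combination model above can be sketched in a few lines of NumPy. The toy endmember matrix `M` and abundance matrix `A` are illustrative assumptions (they do not come from the slides), and the helper name `mix_pixels` is hypothetical.

```python
import numpy as np

def mix_pixels(M, A):
    """Noise-free linear mixing: column j of Y equals sum_k A[k, j] * M[:, k]."""
    M = np.asarray(M, dtype=float)
    A = np.asarray(A, dtype=float)
    # Abundances must be nonnegative and sum to one for every pixel.
    if np.any(A < 0) or not np.allclose(A.sum(axis=0), 1.0):
        raise ValueError("abundances must be nonnegative and sum to one per pixel")
    return M @ A

# Assumed toy scene: 3 endmembers seen in 2 bands (e.g. after a projection).
M = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
# Columns 0..2 are pure pixels, column 3 is a mixed pixel.
A = np.array([[1.0, 0.0, 0.0, 0.2],
              [0.0, 1.0, 0.0, 0.3],
              [0.0, 0.0, 1.0, 0.5]])
Y = mix_pixels(M, A)
```

Under the pure-pixel assumption the first three columns of `Y` reproduce the columns of `M`, and the mixed pixel lies inside the triangle they span.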

Blind spectral unmixing frameworks

❖ Geometric methods exploit properties of convex hulls to address the linear unmixing problem.

❖ NMF-based methods optimize a regularized regression function.

❖ Statistical modeling methods address the unmixing problem as a statistical inference problem: Bayesian framework, essentially.

www.newenergyconnections.com

Geometric methods

Endmember extraction: pure-pixel based algorithms

❖ Minimum volume simplex: find the simplex of minimum volume enclosing all the data (Craig’90)

Determine $M = [m_1 \dots m_r]$ such that

$$\mathrm{volume}(M) = \frac{1}{(r-1)!} \left| \det \begin{pmatrix} 1^\top \\ M \end{pmatrix} \right|$$

is maximized and encloses all the data.

Data projection onto an $(r-1)$-dimensional space is required.

[Figure: simplex with vertices m1, m2, m3.]
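The volume formula can be evaluated directly once the data live in an $(r-1)$-dimensional space. The following is a minimal sketch; the function name `simplex_volume` and the toy triangle are assumptions made for illustration.

```python
import math
import numpy as np

def simplex_volume(M):
    """Volume of the simplex whose vertices are the columns of M.

    M has shape (r - 1, r): r vertices in an (r - 1)-dimensional space.
    """
    M = np.asarray(M, dtype=float)
    d, r = M.shape
    if d != r - 1:
        raise ValueError("expected r vertices in dimension r - 1")
    augmented = np.vstack([np.ones((1, r)), M])  # the r-by-r matrix [1^T; M]
    return abs(np.linalg.det(augmented)) / math.factorial(r - 1)

# Right triangle with unit legs: its area is 1/2.
triangle = np.array([[0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
area = simplex_volume(triangle)
```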

Endmember extraction: pure-pixel based algorithms

❖ N-FINDR (Winter’99)
  • randomly select $M = [m_{1_0} \dots m_{r_0}]$ in $Y$
  • iteratively increase $\mathrm{volume}(M)$ by substituting $m_i$ by $m_j$ if $\mathrm{volume}(M) < \mathrm{volume}(M - \{m_i\} \cup \{m_j\})$

❖ SGA (Chang et al.’06): greedy counterpart of N-FINDR
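The substitution rule above can be sketched as a simple exhaustive swap loop. This is an illustrative reading of the two bullets, not the reference N-FINDR implementation: the function names, the `max_sweeps` stopping rule and the toy data are assumptions, and the data are taken to be already projected onto an $(r-1)$-dimensional space.

```python
import math
import numpy as np

def volume(V):
    """Simplex volume for vertices stored as the columns of V, shape (r - 1, r)."""
    r = V.shape[1]
    return abs(np.linalg.det(np.vstack([np.ones((1, r)), V]))) / math.factorial(r - 1)

def nfindr(Y, r, seed=0, max_sweeps=10):
    """Return the indices of r pixels (columns of Y) spanning a large simplex."""
    rng = np.random.default_rng(seed)
    n = Y.shape[1]
    idx = rng.choice(n, size=r, replace=False)   # random initial vertices
    best = volume(Y[:, idx])
    for _ in range(max_sweeps):
        improved = False
        for i in range(r):                       # vertex m_i to replace
            for j in range(n):                   # candidate pixel m_j
                trial = idx.copy()
                trial[i] = j
                v = volume(Y[:, trial])
                if v > best + 1e-12:             # keep the swap only if the volume grows
                    idx, best, improved = trial, v, True
        if not improved:
            break
    return np.sort(idx)

# Assumed toy data: pure pixels in columns 0..2, strictly mixed pixels elsewhere.
M = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
A = np.array([[1.0, 0.0, 0.0, 0.2, 0.6, 0.1, 1 / 3],
              [0.0, 1.0, 0.0, 0.3, 0.2, 0.8, 1 / 3],
              [0.0, 0.0, 1.0, 0.5, 0.2, 0.1, 1 / 3]])
Y = M @ A
endmember_idx = nfindr(Y, r=3)
```

Each candidate swap recomputes a determinant here for clarity; practical implementations avoid that cost.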

Endmember extraction: pure-pixel based algorithms

❖ Pixel purity index (PPI) (Boardman’93)
  • project the spectral vectors onto skewers
  • store the extreme points of the projection onto each skewer
  • the points with the highest scores are the endmembers

Remarks
  • Parameters: number of skewers and cut-off threshold
  • No estimation of the number of endmembers

[Figure: data projected onto skewer 1, skewer 2 and skewer 3.]
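The three PPI steps can be sketched as follows. Drawing the skewers as random Gaussian directions is an assumption of this sketch, as are the function name, the default number of skewers and the toy data.

```python
import numpy as np

def pixel_purity_index(Y, n_skewers=500, seed=0):
    """Count how often each pixel (column of Y) is an extreme point along a skewer."""
    rng = np.random.default_rng(seed)
    m, n = Y.shape
    skewers = rng.standard_normal((n_skewers, m))   # assumed: random directions
    projections = skewers @ Y                       # shape (n_skewers, n)
    scores = np.zeros(n, dtype=int)
    # Each skewer votes for the pixel at each end of the projected data.
    np.add.at(scores, projections.argmax(axis=1), 1)
    np.add.at(scores, projections.argmin(axis=1), 1)
    return scores

# Assumed toy data: pure pixels in columns 0..2, strictly mixed pixels elsewhere.
M = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
A = np.array([[1.0, 0.0, 0.0, 0.2, 0.6, 0.1],
              [0.0, 1.0, 0.0, 0.3, 0.2, 0.8],
              [0.0, 0.0, 1.0, 0.5, 0.2, 0.1]])
Y = M @ A
scores = pixel_purity_index(Y, n_skewers=200)
```

Pixels whose score exceeds the chosen cut-off threshold would then be kept as endmember candidates; strictly mixed pixels never receive a vote in this noise-free example.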

Endmember extraction: pure-pixel based algorithms

❖ Orthogonal subspace projection (OSP) (Harsanyi & Chang’94)

For $i = 1 : r$
  • find $j^* = \arg\max_j \|y_j\|$
  • $M = [M \ \ y_{j^*}]$
  • $Y \leftarrow (I - uu^\top)\, Y$ with $u = y_{j^*} / \|y_{j^*}\|_2$

Remarks
  • Convergence analysis in (Gillis and Vavasis’14)
  • Extremely fast
  • No parameter
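The loop above translates almost line by line into NumPy. In this sketch the function name is hypothetical, the selected pixels are returned as column indices into the original data (an assumption about how $M$ is assembled), and the toy endmembers are chosen to be linearly independent.

```python
import numpy as np

def successive_projection(Y, r):
    """Pick r columns of Y by repeated max-norm selection and orthogonal projection."""
    R = np.array(Y, dtype=float)          # working copy that gets projected
    selected = []
    for _ in range(r):
        norms = np.linalg.norm(R, axis=0)
        j = int(np.argmax(norms))         # j* = argmax_j ||y_j||
        selected.append(j)
        u = R[:, j] / norms[j]            # unit vector along the selected column
        R = R - np.outer(u, u @ R)        # Y <- (I - u u^T) Y
    return selected

# Assumed toy data: pure pixels in columns 0..2, mixed pixels elsewhere.
M = np.diag([1.0, 2.0, 3.0])
A = np.array([[1.0, 0.0, 0.0, 0.2, 0.6],
              [0.0, 1.0, 0.0, 0.3, 0.2],
              [0.0, 0.0, 1.0, 0.5, 0.2]])
Y = M @ A
selected = successive_projection(Y, r=3)
endmembers = Y[:, selected]               # spectra taken from the original data
```

The pure pixels are picked in decreasing order of norm because a convex combination can never have a larger norm than the largest vertex.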

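Abundance estimation

❖ Noise-free data with N-FINDR, OSP, … (Honeine, Richard’11)

Let $a_{ji}$ be the “abundance” of $m_j$ in $m_i$. By Cramer’s rule:

$$a_{ji} = \frac{\mathrm{volume}(M)}{\mathrm{volume}(M - \{m_i\} \cup \{m_j\})} = \frac{\Delta_i}{\Delta_j}$$

where volume = algebraic volume. No extra calculation is needed.

[Figure: simplex with vertices m1, m2, m3, with $m_i$, $m_j$, $\Delta_i$ and $\Delta_j$ marked.]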
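The same Cramer identity gives the abundances of any pixel $y$ with respect to a fixed simplex $M$: the $i$-th abundance is the algebraic volume obtained by substituting $y$ for $m_i$, divided by the volume of $M$. The sketch below illustrates this reading; the function names and the toy triangle are assumptions.

```python
import numpy as np

def algebraic_volume(V):
    """Signed determinant of [1^T; V]; the 1/(r-1)! factor cancels in ratios."""
    r = V.shape[1]
    return np.linalg.det(np.vstack([np.ones((1, r)), V]))

def abundances_by_volume_ratio(M, y):
    """Barycentric coordinates of y in the simplex with vertices M[:, i]."""
    M = np.asarray(M, dtype=float)
    base = algebraic_volume(M)
    a = np.empty(M.shape[1])
    for i in range(M.shape[1]):
        Mi = M.copy()
        Mi[:, i] = y                       # substitute vertex m_i by the pixel y
        a[i] = algebraic_volume(Mi) / base
    return a

M = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
y = np.array([0.3, 0.5])
a = abundances_by_volume_ratio(M, y)
```

Signed volumes are kept on purpose: a negative ratio flags a pixel lying outside the simplex.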

Application

Cuprite mining district (Nevada) with AVIRIS spectrometer

1: alunite, 2: kaolinite, 3: sphene

[Figure: data in 2D space given by PCA, with the endmembers m1, m2, m3 and outliers marked.]

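Abundance estimation

❖ Noisy data: a standard QP problem

$$a^* = \arg\min_a \ \tfrac{1}{2} \|y - Ma\|^2 \quad \text{subject to } a \succeq 0 \text{ and } 1^\top a = 1$$

[Figure: in the $(a_1, a_2)$ plane, the solutions $a^*_{\text{ls}}$, $a^*_{\text{nneg}}$ and $a^*_{\text{sto-nneg}}$; simplex with vertices m1, m2, m3.]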

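Abundance estimation

❖ Noisy data: a standard QP problem

❖ historical FCLS:

$$a^* = \arg\min_a \ \tfrac{1}{2} \|y' - M'a\|^2 \quad \text{subject to } a \succeq 0$$

with

$$M' = \begin{pmatrix} M \\ 1^\top \end{pmatrix} = \begin{pmatrix} m_1 & \dots & m_r \\ 1 & \dots & 1 \end{pmatrix}, \qquad y' = \begin{pmatrix} y \\ 1 \end{pmatrix}$$

❖ Prefer standard solvers:
  • active set methods
  • projected-gradient methods
  • interior point methods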
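As one example of the projected-gradient option, the constrained problem $\min_a \tfrac12\|y - Ma\|^2$ s.t. $a \succeq 0$, $1^\top a = 1$ can be handled by alternating a gradient step with a Euclidean projection onto the unit simplex. This is a minimal sketch, not the historical FCLS algorithm; the function names, the fixed step size $1/\|M\|_2^2$, the iteration count and the toy data are assumptions.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1} (sort-based)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(v - tau, 0.0)

def unmix_projected_gradient(M, y, n_iter=500):
    """Minimise 0.5 * ||y - M a||^2 over the unit simplex by projected gradient."""
    r = M.shape[1]
    a = np.full(r, 1.0 / r)                      # start at the simplex centre
    step = 1.0 / np.linalg.norm(M, 2) ** 2       # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = M.T @ (M @ a - y)
        a = project_simplex(a - step * grad)
    return a

M = np.diag([1.0, 2.0, 3.0])
a_true = np.array([0.2, 0.3, 0.5])
a_clean = unmix_projected_gradient(M, M @ a_true)

# A perturbed pixel still yields a feasible abundance vector.
a_noisy = unmix_projected_gradient(M, M @ a_true + np.array([0.05, -0.4, 0.1]))
```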

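NMF-based algorithms

$$(M, A) = \arg\min_{M, A} \ \tfrac{1}{2} \|Y - MA\|_F^2 + \lambda_1\, \phi_1(M) + \lambda_2\, \phi_2(A) \quad \text{s.t. } M \succeq 0,\ A \succeq 0,\ 1^\top A = 1^\top$$

where $\phi_1$ is the volume regularization and $\phi_2$ is the abundance regularization.

❖ Minimum volume constrained NMF

❖ Literature
  • ICE (Berman et al.’04): $\phi_1(M) \equiv$ quadratic, $\phi_2(A) = 0$
  • MVC-NMF (Miao’07): $\phi_1(M) = |\det(MM^\top)|$, $\phi_2(A) = 0$
  • SPICE (Zare and Gader’07): $\phi_1(M) \equiv$ quadratic, $\phi_2(A) \equiv$ weighted $\ell_1$
  • L1/2-NMF (Qian et al.’11): $\phi_1(M) = 0$, $\phi_2(A) = \sum_{ij} |a_{ij}|^{1/2}$
  • CoNMF (Li et al.’12): $\phi_1(M) \equiv$ quadratic, $\phi_2(A) = \|A\|_{2,1}$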

Separable LMM

❖ Consider that endmembers are known and noise-free. The LMM is given by

$$Y = MA + E$$

❖ In the presence of noisy endmembers in the scene, the LMM is (only) approximated by:

$$Y \approx YX + E$$

❖ In the presence of noisy endmembers in the scene, the LMM is exactly given by

$$Y = (Y - E)X + E$$

$M$: noise-free endmembers. $Y$: noisy observations. $Y - E$: noise-free observations.

Approximate model

Consider the model:

$$Y \approx YX + E$$

where $Y$ are noisy observations, with dimensions $(L \times N) = (L \times N)(N \times N) + E$.

$Y$ ($L \times N$; columns y1, y3 and y4 are the noisy endmembers, column y6 is highlighted on the slide):

      y1    y2    y3    y4    y5    y6    y7    y8    y9
     0.40  0.35  0.30  0.30  0.34  0.31  0.33  0.31  0.33
     0.70  0.55  0.00  1.00  0.68  0.77  0.61  0.57  0.61
     0.20  0.13  0.10  0.00  0.10  0.04  0.09  0.06  0.09
     0.70  0.67  0.80  0.40  0.60  0.51  0.61  0.59  0.61
     0.00  0.29  0.70  0.40  0.30  0.42  0.37  0.48  0.37

$X$ ($N \times N$): only the rows associated with the noisy endmembers (labelled a1, a2, a3) are non-zero.

     a1   1.00  0.50  0.00  0.00  0.00  0.10  0.30  0.10  0.00
     a2   0.00  0.30  1.00  0.00  0.20  0.20  0.30  0.40  0.30
     a3   0.00  0.20  0.00  1.00  0.40  0.70  0.40  0.50  0.40
     all other rows: 0 0 0 0 0 0 0 0 0

Group-lasso NMF

$$\min_X \ \tfrac{1}{2} \|Y - YX\|_F^2 + \mu \sum_{k=1}^{n} \|X(k, \cdot)\|_2 \quad \text{subject to } X \succeq 0,\ 1^\top X = 1^\top$$

GLUP: Group-Lasso with Unit sum and Positivity constraints

The GLUP optimization problem ensures that
  • $YX$ matches $Y$
  • $X$ has only a few non-zero rows
  • the positivity and sum-to-one constraints are enforced on $X$
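To see why the row-wise penalty favours few non-zero rows, the GLUP cost can be evaluated for two feasible candidates that both reproduce the data exactly: the identity (every pixel explains itself) and a row-sparse $X$ that uses only the pure pixels. The function names, the value of `mu` and the toy data below are assumptions made for illustration.

```python
import numpy as np

def glup_objective(Y, X, mu):
    """0.5 * ||Y - Y X||_F^2 + mu * sum of the l2 norms of the rows of X."""
    residual = Y - Y @ X
    data_fit = 0.5 * np.sum(residual ** 2)
    group_penalty = mu * np.sum(np.linalg.norm(X, axis=1))
    return data_fit + group_penalty

def is_feasible(X, tol=1e-9):
    """Check positivity and the sum-to-one constraint on every column of X."""
    return bool(np.all(X >= -tol) and np.allclose(X.sum(axis=0), 1.0, atol=tol))

# Assumed toy data: columns 0..2 are pure pixels, column 3 is a mixture of them.
Y = np.array([[0.0, 1.0, 0.0, 0.3],
              [0.0, 0.0, 1.0, 0.5]])
X_identity = np.eye(4)
X_sparse = np.array([[1.0, 0.0, 0.0, 0.2],
                     [0.0, 1.0, 0.0, 0.3],
                     [0.0, 0.0, 1.0, 0.5],
                     [0.0, 0.0, 0.0, 0.0]])   # the row of the mixed pixel is zero
mu = 0.1
cost_identity = glup_objective(Y, X_identity, mu)
cost_sparse = glup_objective(Y, X_sparse, mu)
```

Both candidates fit the data perfectly, so the difference in cost comes only from the group penalty, which is smaller for the row-sparse matrix.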

Alternating Direction Method of Multipliers (ADMM)

❖ The ADMM solves problems of the form

$$\min_{X, Z} \ f(X) + g(Z) \quad \text{subject to } AX + BZ = C$$

❖ The augmented Lagrangian is given by

$$L_\rho(X, Z, \Lambda) = f(X) + g(Z) + \mathrm{trace}\big(\Lambda^\top (AX + BZ - C)\big) + \frac{\rho}{2} \|AX + BZ - C\|_F^2$$

❖ The ADMM consists of iterating the following steps
  1. $X^{k+1} = \arg\min_X L_\rho(X, Z^k, \Lambda^k)$
  2. $Z^{k+1} = \arg\min_Z L_\rho(X^{k+1}, Z, \Lambda^k)$
  3. $\Lambda^{k+1} = \Lambda^k + \rho\,(AX^{k+1} + BZ^{k+1} - C)$
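The three ADMM steps are easiest to follow on a small instance. The sketch below applies them to nonnegative least squares, $f(x) = \tfrac12\|y - Mx\|^2$ and $g(z)$ the indicator of $z \succeq 0$, with $A = I$, $B = -I$, $C = 0$. It is a simpler problem than GLUP, chosen here only for illustration; the function name, $\rho = 1$, the iteration count and the toy data are assumptions.

```python
import numpy as np

def admm_nnls(M, y, rho=1.0, n_iter=500):
    """ADMM for min 0.5 * ||y - M x||^2 + indicator(z >= 0) subject to x - z = 0."""
    r = M.shape[1]
    x = np.zeros(r)
    z = np.zeros(r)
    lam = np.zeros(r)                       # Lagrange multiplier
    lhs = M.T @ M + rho * np.eye(r)         # constant system matrix of the x-update
    Mty = M.T @ y
    for _ in range(n_iter):
        # Step 1: minimise the augmented Lagrangian over x (a linear system).
        x = np.linalg.solve(lhs, Mty + rho * z - lam)
        # Step 2: minimise over z (projection onto the nonnegative orthant).
        z = np.maximum(x + lam / rho, 0.0)
        # Step 3: multiplier update with the constraint residual x - z.
        lam = lam + rho * (x - z)
    return z

M = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x_interior = admm_nnls(M, M @ np.array([0.3, 0.7]))
# Unconstrained least squares would give (-0.5, 1.0); positivity changes the answer.
x_clipped = admm_nnls(M, M @ np.array([-0.5, 1.0]))
```

In the second call the constrained minimiser is $(0, 0.75)$, which can be checked by setting the first coordinate to zero and solving the remaining one-dimensional least-squares problem.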

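GLUP with ADMM

❖ In order to apply the ADMM, we consider the canonical form

$$\min_{X, Z} \ \tfrac{1}{2} \|Y - YX\|_F^2 + \mu \sum_{k=1}^{n} \|Z(k, \cdot)\|_2 + \mathcal{I}(Z)$$

$$\text{subject to } \begin{pmatrix} I \\ 1^\top \end{pmatrix} X + \begin{pmatrix} -I \\ 0^\top \end{pmatrix} Z = \begin{pmatrix} 0 \\ 1^\top \end{pmatrix}$$

where $\mathcal{I}(Z)$ is an indicator function, the first block row of the constraint is the consensus constraint and the second block row is the sum-to-one constraint.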

Exact model

$$Y = (Y - E)X + E \ \rightarrow\ Y = YX + E(I - X)$$

with $Y$ the noisy observations.

❖ The model is heteroscedastic: the noise variance depends on $X$

❖ The Maximum Likelihood estimate with Group-lasso yields:

NGLUP: Reduced noise GLUP

$$\min_X \ \frac{m}{2} \log|\sigma^2 C(X)| + \frac{1}{2} \|Y - YX\|^2_{(\sigma^2 C(X))^{-1}} + \mu \sum_{k=1}^{n} \|X(k, \cdot)\|_2 \quad \text{subject to } X \succeq 0,\ 1^\top X = 1^\top$$

where $C(X) = (I - X)^\top (I - X)$

NGLUP iterative solution

❖ The noise variance $\sigma^2$ that minimizes the loss function for fixed $X$ is

$$\hat{\sigma}^2(X) = \frac{1}{nm}\, \mathrm{trace}\big((Y - YX)\, C(X)^{-1}\, (Y - YX)^\top\big)$$

with $Y$ the noisy observations.

❖ At iteration $k + 1$, as in Iteratively Reweighted Least Squares (IRLS):
  • Inject the weight estimate: $W^k = \big(\hat{\sigma}^2(X^k)\, C(X^k)\big)^{-1}$
  • Solve the optimization problem for fixed weight (same ADMM steps):

$$\min_X \ \tfrac{1}{2} \|Y - YX\|^2_{W^k} + \mu \sum_{k=1}^{n} \|X(k, \cdot)\|_2 \quad \text{subject to } X \succeq 0,\ 1^\top X = 1^\top$$

GLUP experiments: synthetic data

Synthetic data set: • 200 pixels • 8 endmembers • SNR: 40 dB

[Figure: grayscale image of the estimated abundance matrix X, SNR = 40 dB. Horizontal axis: Num. of column (0 to 200); vertical axis: Num. of line (0 to 200); gray scale from 0 to 0.4.]

Percentage of identified endmembers (100 realizations), with times in seconds in parentheses:

     Algorithm   SNR 40 dB            SNR 20 dB
     SDSOMP      100 % (0.023 sec)    72.37 % (0.023 sec)
     NFINDR      100 % (0.069 sec)    89.75 % (0.068 sec)
     GLUP        100 % (1.490 sec)    94.12 % (3.737 sec)

GLUP experiments: real data

[Figure: abundance maps obtained with SDSOMP (0.1103) and N-FINDR (0.0290). Panels: water, roof tops 1, roof tops 2, meadow, tree, shadow. Image axes run from 50 to 450 and from 100 to 1000.]

GLUP experiments: real data

[Figure: abundance maps obtained with GLUP (0.0198) and N-FINDR (0.0290). Panels: water, roof tops 1, roof tops 2, meadow, tree, shadow. Image axes run from 50 to 450 and from 100 to 1000.]

NGLUP experiments: synthetic data

[Figure: two panels titled "Mean of lines in X, SNR=20dB". Horizontal axis: Num. of line (0 to 100); vertical axis from 0 to 0.2.]

Mean value of each row of the abundance matrix, obtained with 100 pixels and SNR = 20 dB: GLUP (left), NGLUP (right).

Remark: NGLUP was initialized by the GLUP solution

NGLUP experiments: real data

Abundance maps determined by NGLUP for Pavia University data set

[Abundance map panels: shadow, roof metal, meadow, tree]

     Algorithm   RMSE     max angle (rad)   avg angle (rad)
     N-FINDR     0.0641   1.0592            0.1549
     NGLUP       0.0287   0.7468            0.074

Concluding remarks
