
Page 1:

DIMENSION REDUCTION FOR HYPERSPECTRAL DATA USING RANDOMIZED PCA AND LAPLACIAN EIGENMAPS

YIRAN LI

APPLIED MATHEMATICS, STATISTICS AND SCIENTIFIC COMPUTING

ADVISORS: DR. WOJTEK CZAJA, DR. JOHN BENEDETTO

DEPARTMENT OF MATHEMATICS

UNIVERSITY OF MARYLAND, COLLEGE PARK

Page 2:

BACKGROUND: HYPERSPECTRAL IMAGING

• Light is described in terms of its wavelength

• A reflectance spectrum shows the reflectance of a material measured across a range of wavelengths. It helps identify certain materials uniquely

• We measure reflectance at many narrow, closely spaced wavelength bands

• When a spectrometer is used in an imaging sensor, the resulting images record a reflectance spectrum for each pixel

(Shippert, 2003)

Page 3:

SPECTRUM AND HYPERSPECTRAL IMAGERY

• Left: Reflectance spectra measured by laboratory spectrometers for three materials: a green bay laurel leaf, the mineral talc, and a silty loam soil.

• Right: The concept of hyperspectral imagery. (Shippert, 2003)

Page 4:

MULTISPECTRAL VS HYPERSPECTRAL

• Multispectral imaging measures reflectance at discrete and somewhat narrow bands. Multispectral images do not produce the "spectrum" of an object

• Hyperspectral imaging measures narrow spectral bands over a continuous spectral range, producing the spectra of all pixels in the scene

• So even a sensor with only 20 bands can be hyperspectral when it covers the range from 500 to 700 nm with 20 bands, each 10 nm wide

(Wikipedia: hyperspectral imaging)

Page 5:

AN EXAMPLE: SALINAS VALLEY, CALIFORNIA

• Left: a sample band collected by a 224-band sensor. The scene includes vegetables, bare soils, and vineyard fields.

• Right: Ground truth of the Salinas dataset (16 classes)

(IC: Hyperspectral Remote Sensing Scenes)

Page 6:

PROBLEM

• Hyperspectral images are three dimensional (x-coordinate, y-coordinate, and spectral band)

• Each pixel has a different spectrum that represents different materials

• There are sometimes over 100 bands, and a large number of pixels

• Dimension reduction reduces the number of bands of a hyperspectral image

• It maps high-dimensional data into a lower-dimensional space while preserving the main features of the original data (see the sketch below)

(hyperspectral imaging, Wikipedia)
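A hyperspectral image is naturally a cube, while both algorithms in this proposal operate on a two-dimensional data matrix. A minimal Matlab sketch of the flattening step, assuming the image is stored as an nx-by-ny-by-nb array named cube (all variable names here are hypothetical):

    % Flatten a hyperspectral cube into a pixel-by-band matrix.
    [nx, ny, nb] = size(cube);
    X = reshape(cube, nx*ny, nb);           % each row is one pixel's spectrum
    % ... reduce the nb columns to m << nb with either algorithm ...
    % Y = reduce(X, m);                     % hypothetical reduction call
    % reducedCube = reshape(Y, nx, ny, m);  % back to image layout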

Page 7:

PROJECT GOAL

• Reduce the dimensionality of hyperspectral images

• Compare the two algorithms to be implemented

Page 8:

METHODS

Existing methods (partial):

• Principal Component Analysis (PCA)

• Local Linear Embedding

• Neighborhood Preserving Embedding

• Classical multidimensional scaling

• Isomap

• Stochastic Proximity Embedding

My methods:

• Randomized PCA

• Laplacian Eigenmaps

(Delft University)

Page 9:

COMPARISON BETWEEN TWO ALGORITHMS

Compare the two algorithms, Randomized PCA and Laplacian Eigenmaps, in terms of:

• Implementation

• Running time

• Results

• Difficulties during implementation

Page 10:

ALGORITHM 1: LAPLACIAN EIGENMAPS

• Consider the problem of mapping the weighted graph G to a line so that connected points stay as close together as possible. Let $y = (y_1, y_2, \ldots, y_n)^T$ be such a map. Our goal is to minimize

$$\sum_{i,j} (y_i - y_j)^2 W_{ij}$$

Since $\sum_{i,j} (y_i - y_j)^2 W_{ij} = 2 y^T L y$, the problem of finding $\arg\min_y y^T L y$ subject to $y^T D y = 1$ and $y^T D \mathbf{1} = 0$ becomes the minimum eigenvalue problem:

$$L f = \lambda D f$$

(Belkin, Niyogi, 2002)

Page 11:

ALGORITHM 1: THE ALGORITHM

• Step 1: Constructing the Adjacency Graph

• Construct a weighted graph with n nodes (n = number of data points), and a set of edges connecting neighboring points.

• A) $\varepsilon$-neighborhood: connected if $\|x_i - x_j\|^2 < \varepsilon$

• B) n nearest neighbors

• Step 2: Choosing the weights (see the sketch after this list)

• A) Heat kernel: $W_{ij} = e^{-\|x_i - x_j\|^2 / t}$

• B) Simple-minded: $W_{ij} = 1$ if connected and $W_{ij} = 0$ otherwise
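A minimal Matlab sketch of Steps 1 and 2, assuming the nearest-neighbor graph (option B) and the heat kernel; X, nn, and t are hypothetical names for the n-by-d data matrix, the neighborhood size, and the kernel parameter:

    % Step 1: nearest-neighbor adjacency; Step 2: heat kernel weights.
    n = size(X, 1);
    sq = sum(X.^2, 2);
    D2 = max(bsxfun(@plus, sq, sq') - 2*(X*X'), 0);  % squared pairwise distances
    [~, idx] = sort(D2, 2);                          % neighbors sorted by distance
    W = zeros(n);
    for a = 1:n
        for b = idx(a, 2:nn+1)                       % skip idx(a,1), the point itself
            w = exp(-D2(a, b) / t);                  % heat kernel weight
            W(a, b) = w;  W(b, a) = w;               % symmetrize the adjacency
        end
    end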

Page 12:

• Step 3: Compute eigenvalues and eigenvectors for the generalized eigenvector problem:

$$L f = \lambda D f \quad (1)$$

where $W$ is the weight matrix defined earlier, $D$ is the diagonal weight matrix with $D_{ii} = \sum_j W_{ji}$, and $L = D - W$

• Let $f_0, f_1, \ldots, f_{n-1}$ be the solutions of equation (1), ordered such that $0 = \lambda_0 \le \lambda_1 \le \ldots \le \lambda_{n-1}$

• Then the first m eigenvectors (excluding $f_0$), $\{f_1, f_2, \ldots, f_m\}$, are the desired vectors for embedding in m-dimensional Euclidean space (a sketch of this step follows below)

(Belkin, Niyogi, 2002)
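A minimal Matlab sketch of Step 3, assuming W from the previous sketch, a target dimension m, and that every point has at least one neighbor (so D is positive definite); the 'smallestabs' option of eigs assumes a recent Matlab release:

    % Step 3: generalized eigenproblem L*f = lambda*D*f.
    D = diag(sum(W, 2));                     % degree matrix, D_ii = sum_j W_ji
    L = D - W;                               % graph Laplacian
    [F, E] = eigs(sparse(L), sparse(D), m+1, 'smallestabs');
    [~, order] = sort(diag(E));              % ascending eigenvalues
    F = F(:, order);
    Y = F(:, 2:m+1);                         % drop f_0; rows of Y are the embedding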

Page 13:

ALGORITHM 2: RANDOMIZED PCA INTRODUCTION

• The canonical construction of the best possible rank-k approximation to a real $m \times n$ matrix $A$ uses the singular value decomposition (SVD) of $A$,

$$A = U \Sigma V^T,$$

where $U$ is a real unitary $m \times m$ matrix, $V$ is a real unitary $n \times n$ matrix, and $\Sigma$ is a real $m \times n$ diagonal matrix with nonnegative, nonincreasing diagonal entries

• Best approximation of $A$ (a sketch follows below):

$$A \approx \tilde{U} \tilde{\Sigma} \tilde{V}^T,$$

where $\tilde{U}$ is the leftmost $m \times k$ block of $U$, $\tilde{\Sigma}$ is the $k \times k$ upper left block of $\Sigma$, and $\tilde{V}$ is the leftmost $n \times k$ block of $V$

(Rokhlin, Szlam, Tygert, 2009)
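As a point of reference, this best rank-k approximation via a full SVD is a few lines of Matlab; a sketch only, and the expensive baseline that randomized PCA is designed to approximate cheaply:

    % Truncated SVD: the canonical best rank-k approximation.
    [U, S, V] = svd(A, 'econ');
    Uk = U(:, 1:k);  Sk = S(1:k, 1:k);  Vk = V(:, 1:k);
    Ak = Uk * Sk * Vk';                 % spectral-norm error is sigma_{k+1}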

Page 14: Dimension Reduction for Hyperspectral data using

β€’ Best because it minimizes the spectral norm 𝐴 βˆ’ 𝐡 for a rank-k matrix 𝐡 = π‘ˆ Ξ£ 𝑉𝑇. In fact ,

𝐴 βˆ’ π‘ˆ Ξ£ 𝑉𝑇 = πœŽπ‘˜+1,

Where πœŽπ‘˜+1 is the π‘˜ + 1 π‘‘β„Žgreatest singular value

β€’ Randomized PCA generates 𝐡 such that

𝐴 βˆ’ 𝐡 ≀ πΆπ‘š1

4𝑖+2πœŽπ‘˜+1

with high probability (1 βˆ’ 10βˆ’15) , where 𝑖 is specified by user, and C depends

on parameters of algorithm

(Rokhlin, Szlam, Tygert, 2009)

Page 15:

ALGORITHM 2: THE ALGORITHM

• Choose $l > k$ such that $l \le m - k$

• Step 1: Generate a real $l \times m$ matrix $G$ whose entries are i.i.d. standard Gaussian random variables, and compute

$$R = G (A A^T)^i A$$

• Step 2: Using an SVD, form a real $n \times k$ matrix $Q$ whose columns are orthonormal, such that

$$\|Q S - R^T\| \le \rho_{k+1}$$

for some $k \times l$ matrix $S$, where $\rho_{k+1}$ is the $(k+1)$st greatest singular value of $R$

Page 16:

• Step 3: Compute

$$T = A Q$$

• Step 4: Form an SVD of $T$:

$$T = U \Sigma W^T,$$

where $U$ is a real $m \times k$ matrix whose columns are orthonormal, $W$ is a real $k \times k$ matrix whose columns are orthonormal, and $\Sigma$ is a real $k \times k$ diagonal matrix with nonnegative diagonal entries

• Step 5: Compute

$$V = Q W$$

• In this way, we get $U$, $\Sigma$, $V$ as desired, and $B = U \Sigma V^T$ (a sketch of Steps 1-5 follows below)

(Rokhlin, Szlam, Tygert, 2009)
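A minimal Matlab sketch of Steps 1-5 under my reading of the above; A is m-by-n, k is the target rank, l > k is the sketch width, and iPow stands in for the exponent i:

    % Step 1: R = G*(A*A')^iPow*A for a Gaussian test matrix G.
    G = randn(l, m);
    R = G * A;
    for p = 1:iPow
        R = (R * A') * A;               % applies (A*A') once more per pass
    end
    % Step 2: orthonormal Q from the k leading left singular vectors of R'.
    [Q, ~, ~] = svd(R', 'econ');
    Q = Q(:, 1:k);                      % real n-by-k, orthonormal columns
    % Steps 3-5: SVD of the small matrix T, then assemble U, Sigma, V.
    T = A * Q;                          % Step 3: m-by-k
    [U, Sigma, W] = svd(T, 'econ');     % Step 4
    V = Q * W;                          % Step 5
    B = U * Sigma * V';                 % rank-k approximation of A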

Page 17:

IMPLEMENTATION

• Hardware: personal laptop / computers in the math computer lab

• Software: Matlab

• Database: 12-band moderate-dimension image: June 1966 aircraft scanner Flightline C1 (portion of Southern Tippecanoe County, Indiana)

• 220-band hyperspectral image: June 12, 1992 AVIRIS image, Indian Pine Test Site 3 (2 x 2 mile portion of Northwest Tippecanoe County, Indiana)

• 220-band hyperspectral image: June 12, 1992 AVIRIS image, North-South flight line (25 x 6 mile portion of Northwest Tippecanoe County, Indiana)

• Hyperspectral data from the Norbert Wiener Center

• Data can be large (10,000^2 pixels and 200 bands, for example)

Page 18:

VALIDATION METHODS

• Delft University has developed a Matlab toolbox for dimension reduction, which includes many methods and is publicly available

• Run algorithms from the DR Matlab toolbox on the same data and compare results

• For randomized PCA, check the error bound (a sketch of this check follows below):

$$\|A - B\| \le C\, m^{1/(4i+2)}\, \sigma_{k+1}$$

(Rokhlin, 2009)

• Compare with ground truth images for the test cases
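A sketch of that check in Matlab, assuming A and the rank-k output B are in memory; the threshold $C\, m^{1/(4i+2)}$ comes from the parameters chosen in the algorithm:

    % Compare the achieved spectral-norm error with sigma_{k+1} of A.
    errB = svds(A - B, 1);              % spectral norm ||A - B||
    s = svds(A, k+1);                   % k+1 largest singular values of A
    fprintf('||A-B|| = %.3e, sigma_{k+1} = %.3e, ratio = %.2f\n', ...
            errB, s(end), errB / s(end));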

Page 19:

TEST PROBLEMS FOR VERIFICATION

• Test on the known data sets (provided earlier), and compare results with ground truth classifications and images

• Test on smaller scales at first, and then move to the large data sets

Page 20:

EXPECTED RESULTS / CONCLUDING REMARKS

• Laplacian Eigenmaps should be easier to implement, but may take longer to run because it requires solving an eigenvalue problem for large matrices

• Randomized PCA will be more difficult to implement, but should give the desired results with reasonable speed even under unfavorable conditions, and it should perform better than Laplacian Eigenmaps when dealing with very large matrices

Page 21:

TIMELINE / MILESTONES

• October 17th: Project proposal

• Now to November 2014: Implement and test Laplacian Eigenmaps, prepare for implementation of randomized PCA

• December 2014: Midyear report and presentation

• January to March: Implement and test randomized PCA, compare the two methods in various situations

• April to May: Final presentation and final report

Page 22:

DELIVERABLES

• Presentation of data sets with reduced dimensions from both algorithms

• Comparison charts of running time and accuracy for the two methods

• Comparison charts against other methods available in the DR Matlab toolbox

• Data sets, Matlab code, presentations, proposals, mid-year report, final report

Page 23:

BIBLIOGRAPHY

• Shippert, Peg. Introduction to Hyperspectral Image Analysis. Online Journal of Space Communication, issue no. 3: Remote Sensing of Earth via Satellite. Winter 2003. http://spacejournal.ohio.edu/pdf/shippert.pdf

• Hyperspectral Imaging. Wikipedia. Accessed Oct. 6, 2014. http://en.wikipedia.org/wiki/Hyperspectral_imaging

• Belkin, Mikhail; Niyogi, Partha. Laplacian Eigenmaps for Dimensionality Reduction and Data Representation. Neural Computation, vol. 15. Dec. 8, 2002. Web. http://web.cse.ohio-state.edu/~mbelkin/papers/LEM_NC_03.pdf

Page 24:

• Rokhlin, Vladimir; Szlam, Arthur; Tygert, Mark. A Randomized Algorithm for Principal Component Analysis. SIAM Journal on Matrix Analysis and Applications, volume 31, issue 3. August 2009. Web. ftp://ftp.math.ucla.edu/pub/camreport/cam08-60.pdf

• Matlab Toolbox for Dimension Reduction. Delft University. Web. Accessed Oct. 6, 2014. http://homepage.tudelft.nl/19j49/Matlab_Toolbox_for_Dimensionality_Reduction.html

• IC: Hyperspectral Remote Sensing Scenes. Web. Accessed Oct. 6, 2014. http://www.ehu.es/ccwintco/index.php?title=Hyperspectral_Remote_Sensing_Scenes

Page 25:

• Hyperspectral Images. Web. Accessed Oct. 6, 2014. https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html