Compression and Analysis of Very Large Imagery Data Sets Using Spatial Statistics James A. Shine George Mason University and US Army Topographic Engineering Center Interface 2001 June 16, 2001


Compression and Analysis of Very Large Imagery Data Sets

Using Spatial Statistics

James A. Shine

George Mason University and

US Army Topographic Engineering Center

Interface 2001

June 16, 2001


ACKNOWLEDGMENTS

Dr. Margaret Oliver, University of Reading, UK

Dr. Richard Webster, Rothamsted Laboratory, UK

Dr. Daniel Carr, George Mason University


INTRODUCTION

Greater resolution in imagery data sets:

pixel resolution (1 meter; 3 x 10^6 data points/square mile)

more bands (up to 256 in hyperspectral sensors; +10^2)

more imagery over time

Compression becomes an important part of timely analysis.

How far can an image be compressed before information is lost?


PROFESSIONAL MOTIVATION:

Collecting imagery, climatic and other topographic data

Transforming the data into maps, surfaces, and other topographic products

Determination of sampling intervals using spatial statistics is an important tool for many of our applications:

collecting ground truth

choosing training points for classification


DATA SETS


CAMIS Data Collection

Computerized Airborne Multicamera Imaging System

Four-band sensor flown in Lear jet (blue, green, red, near infrared)

Each data frame 768x576 pixels

Each flight line has 30 frames

Each collect uses 10-15 flight lines

Order of 10^7 data points per collect


Data Preprocessing

Considerable overlap in flight lines

Bands registered to each other first

Overlap removed, forming mosaic

Radiometric correction

Map registration


Ft. Story, VA

Ft. A.P. Hill, VA


SPATIAL STATISTICS

Much spatial data (such as imagery) is spatially correlated; points close together have lower variance than those farther apart.

Variance can be divided into a background-noise (stochastic) component and a spatial component.

The variance can be modeled by plotting it against the distance between points (the variogram), and the model can be used for many applications.


STOCHASTIC AND SPATIAL VARIATION

STOCHASTIC VARIATION IS LOCAL, BACKGROUND NOISE (NUGGET EFFECT)

SPATIAL VARIATION IS GLOBAL (SILL AND RANGE)

THE SCALE OF SPATIAL VARIATION IS ESPECIALLY IMPORTANT

VARIOGRAMS DEMONSTRATE THESE TWO VARIATIONS


HOW TO COMPUTE A VARIOGRAM

We have sample locations x1, x2, … and values z at each location. The semivariance for a given distance h is:

$$\gamma(h) = \frac{1}{2\,n(h)} \sum_{i=1}^{n(h)} \left[ z(x_i + h) - z(x_i) \right]^2$$

where n(h) is the number of pairs of points a distance h apart. The semivariance is then plotted against h as shown on the next slide.
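The semivariance definition above can be sketched in a few lines of Python. This is a minimal illustration, not the author's FORTRAN program: it assumes the image band is held as a list of rows on a regular pixel grid, that lags are whole pixels, and that pairs are taken only along rows or along columns. The function names `semivariance` and `variogram` are hypothetical.

```python
def semivariance(grid, lag, direction):
    """Empirical semivariance of a 2-D grid at an integer pixel lag.

    direction is "rows" (pairs taken along each row) or
    "columns" (pairs taken along each column).
    """
    if direction not in ("rows", "columns"):
        raise ValueError("direction must be 'rows' or 'columns'")
    n_rows, n_cols = len(grid), len(grid[0])
    total = 0.0   # running sum of squared differences
    n_pairs = 0   # n(h): number of pairs separated by this lag
    for r in range(n_rows):
        for c in range(n_cols):
            if direction == "rows":
                r2, c2 = r, c + lag
            else:
                r2, c2 = r + lag, c
            if r2 < n_rows and c2 < n_cols:
                diff = grid[r2][c2] - grid[r][c]
                total += diff * diff
                n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no pairs of points at this lag")
    return total / (2.0 * n_pairs)


def variogram(grid, max_lag, direction):
    """List of (lag, semivariance) pairs for lags 1..max_lag."""
    return [(h, semivariance(grid, h, direction))
            for h in range(1, max_lag + 1)]


# A tiny ramp image: values change along rows but not along columns.
ramp = [[0, 1, 2, 3],
        [0, 1, 2, 3]]
print(variogram(ramp, 2, "rows"))
print(variogram(ramp, 1, "columns"))
```

Plotting the second element of each pair against the first gives the empirical variogram for that direction.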


MODELING THE VARIOGRAM

The variogram is then fitted with several different models:

exponential, nested exponential

spherical, nested spherical

circular

others

The best-fitting model (minimum squared error or a similar metric) is chosen.

The model is then used to determine the scale (or scales in nested models) of variation and for interpolation and estimation.
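The model-selection step above can be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how the fits were computed, so a coarse grid search stands in for whatever fitting routine was actually used, only the exponential and spherical models are tried, and the parameter names (`nugget`, `sill`, `a` for the range parameter) and function names are hypothetical.

```python
import math


def exponential(h, nugget, sill, a):
    """Exponential model: approaches nugget + sill asymptotically."""
    return nugget + sill * (1.0 - math.exp(-h / a))


def spherical(h, nugget, sill, a):
    """Spherical model: reaches nugget + sill exactly at h = a."""
    if h >= a:
        return nugget + sill
    ratio = h / a
    return nugget + sill * (1.5 * ratio - 0.5 * ratio ** 3)


def sse(model, params, lags, gammas):
    """Sum of squared errors between a model and the empirical values."""
    return sum((model(h, *params) - g) ** 2 for h, g in zip(lags, gammas))


def fit_by_grid_search(model, lags, gammas, steps=20):
    """Return (error, (nugget, sill, a)) with the lowest squared error."""
    max_gamma, max_lag = max(gammas), max(lags)
    best = None
    for i in range(steps + 1):
        nugget = max_gamma * i / steps
        for j in range(1, steps + 1):
            sill = max_gamma * j / steps
            for k in range(1, steps + 1):
                a = max_lag * k / steps
                err = sse(model, (nugget, sill, a), lags, gammas)
                if best is None or err < best[0]:
                    best = (err, (nugget, sill, a))
    return best


def choose_model(lags, gammas):
    """Fit each candidate model and keep the one with the smallest error."""
    candidates = {"exponential": exponential, "spherical": spherical}
    fits = {name: fit_by_grid_search(f, lags, gammas)
            for name, f in candidates.items()}
    best_name = min(fits, key=lambda name: fits[name][0])
    return best_name, fits[best_name][1], fits[best_name][0]


# Synthetic check: data generated from a spherical model with range 5.
lags = list(range(1, 11))
gammas = [spherical(h, 0.0, 10.0, 5.0) for h in lags]
name, params, err = choose_model(lags, gammas)
print(name, params, err)
```

The fitted range parameter `a` is what gives the scale of variation; in a nested model there would be one such parameter per component.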


COMPARISON EXPERIMENT

Compute variogram of complete image band

Compute variograms of subsampled image band (reduced by powers of 2)

Compare the variograms, determine when curve is lost

Use this as a compression threshold
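The four steps above might look like the sketch below. It rests on several assumptions not stated in the slides: subsampling keeps every `step`-th pixel in both directions, variograms are compared only along rows, and "how much of the curve is lost" is summarized as the largest relative difference from the full-image semivariance at matching ground distances. All function names are hypothetical.

```python
def row_semivariance(grid, lag):
    """Semivariance from pairs along rows at an integer pixel lag."""
    total, n_pairs = 0.0, 0
    for row in grid:
        for c in range(len(row) - lag):
            diff = row[c + lag] - row[c]
            total += diff * diff
            n_pairs += 1
    return total / (2.0 * n_pairs)


def subsample(grid, step):
    """Keep every step-th pixel in each direction."""
    return [row[::step] for row in grid[::step]]


def compare_variograms(grid, steps, distances):
    """Largest relative difference from the full variogram, per step.

    A lag of k pixels in an image subsampled by `step` corresponds to
    a distance of k * step pixels in the full image, so only distances
    divisible by `step` can be compared.
    """
    full = {d: row_semivariance(grid, d) for d in distances}
    report = {}
    for step in steps:
        small = subsample(grid, step)
        diffs = [abs(row_semivariance(small, d // step) - full[d]) / full[d]
                 for d in distances if d % step == 0]
        report[step] = max(diffs) if diffs else None
    return report


# A 16 x 32 ramp image: its variogram is unchanged by subsampling.
ramp = [[float(c) for c in range(32)] for _ in range(16)]
print(compare_variograms(ramp, steps=[2, 4], distances=[4, 8]))
```

A compression threshold could then be taken as the largest step whose reported difference stays below some tolerance chosen by the analyst.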


COMPUTING A FULL IMAGE VARIOGRAM

Data transferred from imagery to text file (ERDAS Imagine, Arc/Info)

Modified FORTRAN program

Running time: approx. 1 hour per 4 x 10^6 points

Only 2 directions (N-S and E-W)

Current algorithm O(n^2), may be reducible

Details: Shine, JSM 2000
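To give a feel for why an O(n^2) algorithm is costly at this size, and why it may be reducible, the arithmetic below compares two pair counts. It is a back-of-the-envelope sketch, not a description of the FORTRAN program: the 2000 x 2000 image shape and the maximum lag of 1000 pixels are illustrative assumptions, and the function names are hypothetical.

```python
def all_pairs(n):
    """Number of distinct pairs among n points: grows as n^2."""
    return n * (n - 1) // 2


def directional_pairs(n_rows, n_cols, max_lag):
    """Pairs along rows and columns only, for lags 1..max_lag."""
    total = 0
    for lag in range(1, max_lag + 1):
        total += n_rows * max(n_cols - lag, 0)  # pairs along rows
        total += n_cols * max(n_rows - lag, 0)  # pairs along columns
    return total


n_rows = n_cols = 2000  # assumed shape giving 4 x 10^6 points
print(all_pairs(n_rows * n_cols))                # about 8 x 10^12
print(directional_pairs(n_rows, n_cols, 1000))   # about 6 x 10^9
```

Under these assumptions, restricting the computation to two directions and a bounded lag makes the work grow with the number of points times the number of lags rather than with the square of the number of points.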


Ft. Story full image variograms


[Figure: FT. STORY BAND 1 variograms in three panels (ROWS, COLUMNS, AVERAGE); x-axis: DISTANCE, 0 to 1000; y-axis: GAMMA, 0 to 5000]


[Figure: FT. STORY BAND 2 variograms in three panels (ROWS, COLUMNS, AVERAGE); x-axis: DISTANCE, 0 to 1000; y-axis: GAMMA, 0 to 4000]


[Figure: FT. STORY BAND 3 variograms in three panels (ROWS, COLUMNS, AVERAGE); x-axis: DISTANCE, 0 to 1000; y-axis: GAMMA, 0 to 2500]


THEORETICAL VARIOGRAM MODELS

[Figure: four panels (NUGGET MODEL, LINEAR MODEL, SPHERICAL MODEL, EXPONENTIAL MODEL); x-axis: h, 0 to 30; y-axis: gamma]
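For reference, the four model families named on this slide are commonly written as below. The slide itself shows only plots, so the symbols are assumptions introduced here: $c_0$ for the nugget, $c$ for the spatial (partial) sill, $a$ for the range parameter, and $b$ for the slope of the linear model.

```latex
\begin{align*}
\text{Nugget:} \quad & \gamma(h) = c_0, \qquad h > 0 \\
\text{Linear:} \quad & \gamma(h) = c_0 + b\,h \\
\text{Spherical:} \quad & \gamma(h) =
  \begin{cases}
    c_0 + c\left[\dfrac{3h}{2a} - \dfrac{1}{2}\left(\dfrac{h}{a}\right)^{3}\right], & 0 < h \le a \\[2ex]
    c_0 + c, & h > a
  \end{cases} \\
\text{Exponential:} \quad & \gamma(h) = c_0 + c\left[1 - \exp\left(-\dfrac{h}{a}\right)\right]
\end{align*}
```

In every case $\gamma(0) = 0$. The spherical model reaches its sill exactly at $h = a$, whereas the exponential model only approaches it asymptotically.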


A NESTED VARIOGRAM MODEL

[Figure: DOUBLE EXPONENTIAL MODEL; x-axis: distance, 0 to 30; y-axis: gamma, 0.5 to 2.0; three plotted series marked +, o, and X]
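A nested model is a sum of simpler models, each with its own scale. The sketch below writes a double exponential model that way; the parameter names and the example values are hypothetical and are not read off the figure.

```python
import math


def exponential_component(h, sill, a):
    """One exponential structure with its own sill and range parameter."""
    return sill * (1.0 - math.exp(-h / a))


def double_exponential(h, nugget, sill_short, a_short, sill_long, a_long):
    """Nested model: nugget plus a short-range and a long-range structure."""
    return (nugget
            + exponential_component(h, sill_short, a_short)
            + exponential_component(h, sill_long, a_long))


# Illustrative parameters only: short scale of 2, long scale of 10.
params = dict(nugget=0.5, sill_short=1.0, a_short=2.0,
              sill_long=0.5, a_long=10.0)
for h in (0, 5, 30):
    print(h, double_exponential(h, **params))
```

The two range parameters are the two scales of variation that a nested fit is meant to recover.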


Ft. A.P. Hill full image variograms


BAND 1


COMPRESSION ANALYSIS

Start with full variogram

Reduce sample to 1/4 of its size successively (1/4, 1/16, 1/64, 1/256)

Compare resulting variograms
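One way to produce the successive reductions is sketched below. It assumes that each step keeps every second pixel in both directions, which leaves one quarter of the points; the slides do not say how the reduction was actually carried out, and the function names are hypothetical.

```python
def quarter(grid):
    """Keep every second pixel in each direction (1/4 of the points)."""
    return [row[::2] for row in grid[::2]]


def reduction_series(grid, levels):
    """The full grid followed by `levels` successive quarter reductions."""
    series = [grid]
    for _ in range(levels):
        series.append(quarter(series[-1]))
    return series


def point_count(grid):
    return sum(len(row) for row in grid)


# A small 32 x 32 stand-in image; four reductions reach 1/256.
image = [[r * 32 + c for c in range(32)] for r in range(32)]
counts = [point_count(g) for g in reduction_series(image, 4)]
print(counts)
print([counts[0] // n for n in counts])
```

Each grid in the series would then be passed to the same variogram routine so that the resulting curves can be overlaid.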


EXAMPLE RESULT: A.P. HILL, BAND 1


FULL


ADD 1/4


ADD 1/16


ADD 1/64


ADD 1/256


FULL (ORANGE) AND 1/256 (BLUE) IMAGES SUPERIMPOSED


CONCLUSIONS

Preliminary results show little degradation in variogram at 256 times reduction

This seems to indicate that an image can be compressed by a factor of ~10^2 without affecting the results of spatial statistical analysis

Computing time savings: hours to minutes


FUTURE WORK

Optimize variogram code

Finish tests on other Ft. A.P. Hill and Ft. Story imagery bands

Compare other available CAMIS imagery

Obtain general rule for achievable compression for obtaining a spatial correlation model from 1-meter imagery

Perform other image analysis operations on original and compressed images and compare.