gradient nearest neighbor (gnn) method for local-scale basal area mapping: fia 2005 symposium...

Gradient Nearest Neighbor (GNN) Method for Local-Scale Basal Area Mapping: FIA 2005 Symposium Interpolation Contest

Kenneth B. Pierce Jr., Matthew J. Gregory* and Janet L. Ohmann

Forestry Science Lab, 3200 SW Jefferson Way, Corvallis OR 97331

Why map? Why GNN? (Pacific Northwest perspective)

• Primary objective: supply missing data for analysis and modeling of forest ecosystems at the regional level

• Problem: basic information on current vegetation is needed to address a wide array of issues in forest management and policy. Increasingly, this information needs to be:

– spatially complete (spatial pattern, small geographic areas)

– consistent across large, multi-ownership regions

– rich in floristic and structural detail

– suitable for input to stand and landscape simulation models

– flexible in meeting a variety of analytical needs

• Differs from other objectives which are concerned primarily with estimation

GNN Mapping in West Coast States

Future GNN mapping:

• Wall-to-wall OR, CA, WA

– Start Oct. ’05 in eastern OR

– 5-year mapping cycle

– Coordinated with Region 6, Oregon Department of Forestry and other collaborators

– Funded by FIA and the Western Wildlands Environmental Threat and Analysis Center

• ‘Ecological Systems’ for Gap Analysis Program (MZs 8 & 9)

• Includes non-forest mapping

Current GNN efforts: COLA, CLAMS, GNNFire

The Gradient Nearest Neighbor (GNN) Method for Vegetation Mapping

• A tool for:

– Spatially explicit (wall-to-wall) vegetation data based on ‘interpolation’ of FIA plot data using an ecological (gradient) model

– Inference of plot data to smaller geographic areas (e.g., 6th-field HUCs)

• Imputation approach (as are kNN, MSN) provides:

– Data that are regional in extent, yet rich in detail

– Analytical flexibility for users

Components of GNN Imputation

• Statistical model = canonical correspondence analysis (CCA), with flexibility for redundancy analysis (RDA) and other methods:

– Multivariate

– Results in a weight for each of many spatial variables, based on its relationship with the multiple response variables

– Any multivariate method can be specified (e.g., PCA, CCorA)

• Distance measure (between map pixel and potential NN plots):

– Euclidean distance for first n axes (usually 8, specified by user)

– Axes weighted by their explanatory power (eigenvalues)

• Imputation method:

– Single nearest neighbor (k=1, MSN-like)

– Summary statistic of multiple neighbors (kNN-like)

– Measures of variation based on multiple imputation (k>1)
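As a worked form of the distance measure listed above, one plausible reading is a weighted Euclidean distance between a map pixel p and a candidate plot j over the first n axes. This is an assumption, not a formula given on the slides: s denotes an axis score, and the weight w_a is taken here as the share of the summed eigenvalues carried by axis a.

```latex
d(p, j) = \sqrt{\sum_{a=1}^{n} w_a \,\bigl(s_{p,a} - s_{j,a}\bigr)^{2}},
\qquad
w_a = \frac{\lambda_a}{\sum_{b=1}^{n} \lambda_b}
```

With k=1 the pixel receives the attributes of the plot that minimizes d(p, j); with k>1 a summary statistic, such as the mean, is taken over the k plots with the smallest distances.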

Environmental and Disturbance Gradients (Explanatory Variables)

• Landsat TM (1996): bands, transformations, texture

• Climate: means, seasonal variability

• Topography: elevation, slope, aspect, solar

• Disturbance: past fires, harvest, insects and disease

• Location: X, Y

• Ownership: FS, BLM, forest industry, other private

Gradient Nearest Neighbor Method

[Flow diagram. Boxes: plot data; plot locations; spatial data (climate, geology, topography, ownership, remote sensing); direct gradient analysis; statistical model; prediction; imputation; plot assigned to each pixel]

The imputation component of GNN

[Figure: gradient space, with Axis 1 (Landsat) and Axis 2 (climate), shown beside geographic space, with field plots and the study area]

(1) conduct gradient analysis of plot data

(2) calculate axis scores of pixel from mapped data layers

(3) find nearest-neighbor plot in gradient space

(4) impute nearest neighbor’s ground data to mapped pixel

Pixel   PSME (m2/ha)   CanCov (%)   Snags >50 cm (trees/ha)   Old-growth index   Etc.
1       11             3            7.4                       0.27               ...
2       79             97           2.1                       0.82               ...
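The numbered steps of the imputation component can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' software: step 1 (the ordination) is taken as already done, and the layer names, means, standard deviations, coefficients, eigenvalues, and plot records are all hypothetical. The attribute values are borrowed from the example table purely for illustration. Scoring a location as a linear combination of standardized layers, and weighting squared axis differences by the eigenvalue, are likewise assumptions.

```python
import math

# Hypothetical result of step 1 (gradient analysis): per-layer mean, standard
# deviation, and coefficients on two axes, plus one eigenvalue per axis.
VARIABLES = ["tm_band4", "elevation_m"]
MEANS = {"tm_band4": 60.0, "elevation_m": 500.0}
STDS = {"tm_band4": 20.0, "elevation_m": 250.0}
COEFS = {"tm_band4": (1.0, 0.2), "elevation_m": (0.1, 1.0)}
EIGENVALUES = (0.6, 0.3)

# Hypothetical field plots: mapped layer values at the plot and its ground data.
PLOTS = {
    "plot_A": {"layers": {"tm_band4": 80.0, "elevation_m": 400.0},
               "attrs": {"psme_m2_ha": 11, "cancov_pct": 3, "snags_per_ha": 7.4}},
    "plot_B": {"layers": {"tm_band4": 40.0, "elevation_m": 700.0},
               "attrs": {"psme_m2_ha": 79, "cancov_pct": 97, "snags_per_ha": 2.1}},
}


def axis_scores(layers):
    """Step 2: score a location as a linear combination of standardized layers."""
    scores = [0.0] * len(EIGENVALUES)
    for name in VARIABLES:
        z = (layers[name] - MEANS[name]) / STDS[name]
        for axis, coef in enumerate(COEFS[name]):
            scores[axis] += coef * z
    return scores


def nearest_plot(pixel_scores, plot_scores):
    """Step 3: ID of the plot closest to the pixel in weighted gradient space."""
    def distance(scores):
        return math.sqrt(sum(ev * (a - b) ** 2
                             for ev, a, b in zip(EIGENVALUES, scores, pixel_scores)))
    return min(plot_scores, key=lambda plot_id: distance(plot_scores[plot_id]))


def impute_pixel(pixel_layers):
    """Steps 2-4 for one pixel."""
    plot_scores = {pid: axis_scores(rec["layers"]) for pid, rec in PLOTS.items()}
    plot_id = nearest_plot(axis_scores(pixel_layers), plot_scores)
    # Step 4: the whole plot record is copied, so attributes stay mutually consistent.
    return plot_id, dict(PLOTS[plot_id]["attrs"])


plot_id, imputed = impute_pixel({"tm_band4": 75.0, "elevation_m": 450.0})
print(plot_id, imputed)
```

Because a plot identifier, not a single predicted number, is assigned to the pixel, every attribute measured on that plot can be mapped from the same run.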

Accuracy assessment (‘obsessive transparency’)

• Local-scale accuracy (at plot locations) via cross-validation:

– Confusion matrices

– Kappa statistics

– Correlation statistics

• Regional-scale accuracy:

– distribution of forest conditions in map vs. plot sample

– range of variation in map vs. plot sample

• Spatial depictions:

– Variation among k nearest neighbors

– Distance to nearest neighbor(s) (sampling sufficiency)

• Findings re. GNN map accuracy:

– Excellent for regional patterns and amounts, imperfect for local sites

– Mid-scales???

– Appropriate for regional planning and policy analysis
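Of the local-scale measures listed above, the kappa statistic is the one whose arithmetic is least obvious, so a small sketch may help. It computes Cohen's kappa from a confusion matrix of observed against cross-validated predicted classes. The function name and the counts are made up for illustration, and the slides do not say which variant of kappa was used.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: observed, columns: predicted)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of plots on the diagonal.
    p_observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Agreement expected by chance, from the row and column totals.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[r][c] for r in range(n_classes)) for c in range(n_classes)]
    p_expected = sum(row_totals[i] * col_totals[i] for i in range(n_classes)) / total ** 2
    return (p_observed - p_expected) / (1.0 - p_expected)


# Made-up counts for two classes (for example low vs. high basal area), 100 plots.
example = [[40, 10],
           [5, 45]]
kappa = cohens_kappa(example)
print(f"kappa = {kappa:.2f}")
```

Here 85 of 100 plots agree, chance agreement is 0.5, and kappa is therefore 0.7.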

Bartlett Interpolation Contest

• Comparison between ‘control’ methods and GNN methods

• Effect of footprint size

Interpolation Contestants

• Kriging

– best with intensive sampling and autocorrelated data

• Linear Model

– perhaps the best local predictions when a strong gradient / remote sensing link exists for the response

• Single neighbor GNN Imputation

– best for multivariate responses and regional data, recaptures variation and attribute covariance

• Mean of 5 nearest GNN neighbors

Model Comparisons

                 Observed   Kriged   Linear   GNN1    GNN5
Distributions
  Average        37         38       37       40      39
  Maximum        63         54       49       60      60
  Variance       173        60       64       125     90
Models
  RMSE                      11.07    12.48    14.41   13.29
  Slope                     0.31     0.23     0.27    0.25
  Y-intercept               25.93    29.00    29.91   29.62
  Corr. coeff.              0.53     0.37     0.32    0.34
  R-square                  0.28     0.14     0.10    0.12
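The model rows of this table can be cross-checked against the distribution rows with a little arithmetic. The slides do not say which variable was regressed on which. The sketch below assumes predicted basal area was regressed on observed basal area, in which case the least-squares slope equals the correlation coefficient times the ratio of standard deviations, and R-square is the squared correlation. All numbers are copied from the table; the variable names are illustrative.

```python
import math

observed_variance = 173.0

# (correlation coefficient, variance of predictions, reported slope) per method.
methods = {
    "Kriged": (0.53, 60.0, 0.31),
    "Linear": (0.37, 64.0, 0.23),
    "GNN1": (0.32, 125.0, 0.27),
    "GNN5": (0.34, 90.0, 0.25),
}

implied_slopes = {}
r_squared = {}
for name, (r, predicted_variance, reported_slope) in methods.items():
    # Slope of predicted on observed: r * sd(predicted) / sd(observed).
    implied_slopes[name] = r * math.sqrt(predicted_variance / observed_variance)
    r_squared[name] = r ** 2
    print(f"{name}: implied slope {implied_slopes[name]:.3f}, "
          f"reported {reported_slope}, r squared {r_squared[name]:.2f}")
```

Under that assumption the implied slopes match the reported ones to within rounding, which suggests the slopes well below 1 reflect both the modest correlations and the compressed variance of the predictions.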

Plot scale accuracy assessment

[Figure: predicted basal area (m2/ha) plotted against observed basal area (m2/ha), in four panels: a) Kriging, b) Linear Model, c) GNN1, d) GNN5]

Quantile distributions

• Overprediction at lower basal areas / underprediction at higher basal areas

• Accentuated for linear model

Bartlett Study Area

[Figure: TM Leaf On 4|5|3]

[Map figures: Kriged Spatial Prediction; Linear Model Spatial Prediction; GNN 1st Neighbor Spatial Prediction; GNN 5-Neighbor Mean Spatial Prediction. Each map carries two basal area (m2/ha) legends: one with classes 0.0, then 10-unit steps from 0.0 – 10.0 up to 60.0 – 70.0, then > 70.0; and one with 15-unit steps from 0.0 – 15.0 up to 45.0 – 60.0, then > 60.0]

Effect of plot footprint size

• Studied to account for possible misregistration between plots and TM imagery

• Used two footprints at 30 m cell resolution

– 1x1 and 2x2 (plot spacing is ~65 m, so 3x3 windows would overlap)

– Used for both extraction of spatial data and for mean basal area prediction at the cross-validation plots

• Imputation is still at a per-pixel level
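The footprint arithmetic above can be made concrete with a short sketch. It assumes square windows centered on plots spaced about 65 m apart on a 30 m raster, so that windows overlap once their width exceeds the spacing. It also assumes, for the averaging helper, that a window is addressed by its upper-left cell, since the slides do not say how the 2x2 window was positioned on the plot. The grid values and function names are made up.

```python
CELL_SIZE_M = 30.0      # raster resolution stated above
PLOT_SPACING_M = 65.0   # approximate plot spacing stated above


def windows_overlap(n_cells):
    """True if n x n-cell windows centered on neighboring plots would overlap."""
    return n_cells * CELL_SIZE_M > PLOT_SPACING_M


def footprint_mean(grid, row, col, size):
    """Mean of a size x size block of cells whose upper-left cell is (row, col)."""
    cells = [grid[r][c]
             for r in range(row, row + size)
             for c in range(col, col + size)]
    return sum(cells) / len(cells)


# Made-up raster of basal area predictions (m2/ha).
grid = [[10.0, 20.0, 30.0],
        [40.0, 50.0, 60.0],
        [70.0, 80.0, 90.0]]

mean_1x1 = footprint_mean(grid, 0, 0, 1)
mean_2x2 = footprint_mean(grid, 0, 0, 2)
overlap = {n: windows_overlap(n) for n in (1, 2, 3)}
print(mean_1x1, mean_2x2, overlap)
```

A 3x3 window is 90 m wide and exceeds the plot spacing, while a 2x2 window at 60 m does not, which is consistent with the choice of 1x1 and 2x2 footprints.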

                 Observed   GNN 1x1   GNN 2x2
Distributions
  Average        37.2       37.6      36.6
  Maximum        63.1       56.5      49.3
  Variance       172.2      172.1     69.4
Models
  RMSE                      16.15     12.65
  Slope                     0.229     0.230
  Y-intercept               29.1      28.0
  Corr. coeff.              0.23      0.36

[Figure panels: GNN 1x1 Window; GNN 2x2 Window]

Summary – Bartlett Interpolation

• Across methods, better model fit went with less of the sample variance being retained (an inverse relationship)

• While kriging gives the highest degree of local-scale agreement, it suffers from a lack of spatial pattern

• Linear model and GNN imputation methods seem to maintain spatial pattern

• Plot footprint size made a larger difference than anticipated

Strengths and limitations of GNN imputation

Advantages:

• Recaptures most of variation in plot data

• Maintains multi-attribute covariance at a location

• Analytical flexibility: detailed vegetation data for post-mapping classification, analysis, and modeling

• Ability to map variability and assess sampling sufficiency

• Where strong gradients exist, can use other spatial environmental data to describe pattern

Limitations:

• Map values are constrained to those at sampled locations

• Natural variability reduces local-scale prediction accuracy
