Kriging: An Introduction to Concepts and Applications · 2020-07-02
Kriging: An Introduction to Concepts and Applications Nicholas M. Giner – Esri




Agenda

• What is interpolation?

• Interpolation applications

• Spatial autocorrelation

• Deterministic vs. Geostatistical interpolators

• Building up interpolation

• Kriging theory

• Empirical Bayesian Kriging (EBK)

• EBK Regression

• EBK 3D

• Areal Interpolation

What is interpolation?

• Process of predicting values at unknown locations using values at known locations

• Transforms measurements of a continuous phenomenon into a continuous surface

• Interpolation predicts within region; Extrapolation predicts outside region

Interpolation applications

• Many continuous phenomena (z)

- Elevation

- Soil (pH, nutrient levels, porosity)

- Precipitation / Snowfall

- Temperature

- Windspeed

- Air pollution / Air quality

- Ozone

- Water quality

- Mining

- Heavy metal concentrations

- Environmental contaminants

- Noise

- Disease occurrence

Spatial autocorrelation

• Tobler’s First Law of Geography

- “…everything is related to everything else, but near things are more related than distant things”

• O’Sullivan and Unwin, 2003

- “If geography is worth studying at all, it must be because phenomena do not vary randomly across space”

Deterministic vs. Geostatistical interpolators

• Deterministic interpolators

- Based on mathematical functions, not statistical theory

- Model parameters are determined by the user

- Does not include randomness

- No estimates of prediction error (uncertainty/accuracy/confidence)

- Examples: Inverse Distance Weighting (IDW), Spline, Global Polynomial Interpolation

• Geostatistical interpolators

- Based on mathematical functions, AND statistical theory

- Model parameters are estimated based on the data (spatial autocorrelation)

- Includes randomness to approximate the variation present in geographic data

- Produces estimates of prediction error (uncertainty/accuracy/confidence)

- Example: Kriging

Two components of all interpolators

• Neighborhood definition – distance or number of points

• Estimation function – mathematics used to make the estimation (e.g. determine the weights)


Building up interpolation

Source: Geographic Information Analysis – O’Sullivan and Unwin

• Average of all data points: 49


Building up interpolation

• Local spatial average: 40.75

- All points in the local neighborhood are weighted equally


Building up interpolation

• Inverse Distance Weighted (IDW): 41.01

- Closer points have higher weights and more influence

Source: Geographic Information Analysis – O’Sullivan and Unwin
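
The step from a local spatial average to IDW can be sketched in a few lines. This is a minimal illustration, not the ArcGIS implementation: the function name `idw_predict`, the `power` and `radius` parameters, and the sample points are all assumptions made for the example. Setting `power=0` gives every neighbor the same weight, which reproduces the local spatial average.

```python
import math

def idw_predict(points, target, power=2.0, radius=None):
    """Inverse distance weighted estimate at `target`.

    points: list of (x, y, z) tuples; target: (x, y).
    radius: optional search neighborhood; None uses every point.
    """
    num = 0.0
    den = 0.0
    for x, y, z in points:
        d = math.hypot(x - target[0], y - target[1])
        if radius is not None and d > radius:
            continue  # outside the search neighborhood
        if d == 0.0:
            return z  # target coincides with a measured location
        w = 1.0 / d ** power  # closer points get larger weights
        num += w * z
        den += w
    if den == 0.0:
        raise ValueError("no points inside the search neighborhood")
    return num / den

# Hypothetical sample data: three nearby points and one distant point.
sample = [(0, 0, 10.0), (1, 0, 40.0), (0, 2, 30.0), (5, 5, 100.0)]
target = (0, 1)

local_average = idw_predict(sample, target, power=0, radius=3)  # equal weights
idw_estimate = idw_predict(sample, target, power=2, radius=3)   # distance weights
```

With these made-up values the local average is about 26.7 and the IDW estimate is about 24, because the two closest points (values 10 and 30) outweigh the farther point (value 40).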


Building up interpolation

• Inverse Distance Weighted (IDW): 49.8

- More influence from points below simply because they are within the neighborhood and closer in distance


Building up interpolation

• Kriging: 56.2

- Prediction is based on how correlated points are as a function of the distance between them

- There can be negative weights

Geostatistics and Kriging

• Geostatistics - statistics of spatially correlated data

• Quantify spatial autocorrelation and incorporate it into the interpolation

• Kriging – “optimal” interpolator given that data meets certain conditions (assumptions)

- Based on the foundational work by Daniel Krige and Georges Matheron in the 1950s-1960s, which grew out of predicting gold ore grades in South Africa

- Main idea is that spatial data can be decomposed into two main components

1) Deterministic variation (global trend)

• Can be constant mean or mathematical function

2) Spatially correlated, random variation (local autocorrelation)

Z(s) = µ + ε(s)

Prediction = mean + error
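
A small worked sketch of "Prediction = mean + error" for the constant-mean case (simple kriging) may help. Everything named here is an assumption for illustration: the exponential covariance function `exp_cov`, its `sill` and `range_` values, the hand-written solver `solve`, and the sample points. In practice the covariance model is fitted to the data rather than chosen by hand.

```python
import math

def exp_cov(h, sill=1.0, range_=3.0):
    """Assumed exponential covariance: strong for nearby points, decaying with distance."""
    return sill * math.exp(-h / range_)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def simple_kriging(points, target, mean, cov=exp_cov):
    """Return (prediction, weights) at `target`, assuming a known constant mean."""
    xy = [(x, y) for x, y, _ in points]

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    big_c = [[cov(dist(p, q)) for q in xy] for p in xy]  # data-to-data covariances
    c0 = [cov(dist(p, target)) for p in xy]              # data-to-target covariances
    weights = solve(big_c, c0)
    residuals = [z - mean for _, _, z in points]         # the ε(s) part at known points
    error_term = sum(w * r for w, r in zip(weights, residuals))
    return mean + error_term, weights

# Hypothetical sample data and an assumed mean of 25.
sample = [(0, 0, 10.0), (1, 0, 40.0), (0, 2, 30.0)]
at_data_point, _ = simple_kriging(sample, (1, 0), mean=25.0)
far_away, _ = simple_kriging(sample, (1000, 1000), mean=25.0)
```

Two properties are worth noticing: at a measured location the prediction reproduces the measured value (there is no nugget in this assumed model), and far from all data the error term vanishes so the prediction falls back to the mean.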

What makes it “optimal”?

• Estimates true value, on average (unbiased)

• Lowest expected prediction error

• Can use information about covariates

• Can be generalized to different geometries

• Estimates a prediction distribution at each location (not just one value)

• Kriging assumptions

- Normally distributed

- No trends

- Spatially autocorrelated

- Stationary

Kriging assumption: Normal distribution

• If your input data is normally distributed, you can guarantee that your predicted distribution will be normally distributed

• Many transformation options if not

[Figures: histogram and QQ plot of the input data]

Kriging assumption: No trends

• Systematic patterns and trends in an area might impact the interpolation

• Trade-off with spatial autocorrelation

Kriging assumption: Spatial autocorrelation

• Describes how correlated points are based on how far apart they are from one another

• Once you know the expected correlation between known values at a given distance, you can predict the value at unknown locations

Kriging assumption: Stationarity

• The correlation between points is defined only by the distance between them, not their location

- Mean stationarity

- Local stationarity


Kriging workflow

1) Map your data

2) Exploratory Spatial Data Analysis (ESDA), configure options

3) Variography – Describe spatial variation in the data

4) Fit model – Summarize spatial variation with a mathematical function

5) Use model to determine weights in search neighborhood

6) Interpolate

7) Evaluate (Cross-validation)

8) Repeat Steps 2-7

Demo #1: Map the data, Geostatistical Wizard, ESDA, Configure options

Variography (Modeling)

• Examining and modeling spatial autocorrelation


Variography (Modeling)

1) Calculate empirical semivariogram

- Calculate distance and difference between each pair of points

2) Bin the semivariogram

- Group the pairs of locations into a specified range of distances (lags)

3) Average the semivariogram

- Calculate the average distance and difference (semivariance) for each lag

4) Fit a model

- Find the best fit line for the average semivariances

Semivariogram (distance h) = 0.5 * average [(value at location i - value at location j)^2], taken over all pairs of locations separated by distance h
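
Steps 1 to 3 above (pair up, bin, average) can be sketched as follows. This is a bare-bones illustration under stated assumptions: the function name `empirical_semivariogram`, the `lag_size` and `n_lags` parameters, and the three sample points are made up for the example, and no directional binning or model fitting is attempted.

```python
import math
from collections import defaultdict

def empirical_semivariogram(points, lag_size, n_lags):
    """Return one (mean distance, semivariance, pair count) tuple per non-empty lag.

    points: list of (x, y, z) tuples.
    """
    # lag index -> [sum of distances, sum of squared differences, pair count]
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for i in range(len(points)):
        xi, yi, zi = points[i]
        for j in range(i + 1, len(points)):
            xj, yj, zj = points[j]
            h = math.hypot(xi - xj, yi - yj)   # step 1: distance for the pair
            lag = int(h // lag_size)           # step 2: assign the pair to a lag bin
            if lag >= n_lags:
                continue                       # pair is farther apart than the last lag
            entry = sums[lag]
            entry[0] += h
            entry[1] += (zi - zj) ** 2         # step 1: squared difference for the pair
            entry[2] += 1
    result = []
    for lag in sorted(sums):
        total_h, total_sq, n = sums[lag]
        # step 3: average distance and semivariance (half the mean squared difference)
        result.append((total_h / n, 0.5 * total_sq / n, n))
    return result

# Hypothetical sample: three points on a line, one unit apart.
sample = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0)]
result = empirical_semivariogram(sample, lag_size=1, n_lags=3)
```

For this toy data the two pairs at distance 1 give a semivariance of 0.5 * (1 + 4) / 2 = 1.25, and the single pair at distance 2 gives 0.5 * 9 = 4.5, so semivariance grows with distance as expected for autocorrelated data.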


Semivariogram

• Represents half the expected squared difference in data values for pairs of points that are a given distance apart, regardless of their spatial location

Nugget – semivariance at (near) 0 distance (measurement error and microscale variation)

Range – distance at which autocorrelation falls off, where semivariance becomes constant and there is no more spatial structure in the data. Points are uncorrelated beyond the range. (data correlation)

Sill – constant semivariance value beyond the range (data variance)
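
To make nugget, range, and sill concrete, here is a sketch of one commonly used semivariogram model, the spherical model. The choice of model, the function name `spherical`, and the parameter values are assumptions for illustration; the source does not say which model the demos use. The sill is written as nugget plus partial sill.

```python
def spherical(h, nugget, partial_sill, range_):
    """Spherical semivariogram model evaluated at distance h.

    Semivariance is 0 at h = 0, jumps to the nugget just above 0,
    rises until h reaches the range, then stays at the sill.
    """
    if h <= 0:
        return 0.0
    if h >= range_:
        return nugget + partial_sill  # the sill: no spatial structure beyond the range
    r = h / range_
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

# Assumed parameters: nugget 1, partial sill 4 (so sill 5), range 10.
at_half_range = spherical(5.0, nugget=1.0, partial_sill=4.0, range_=10.0)
at_range = spherical(10.0, nugget=1.0, partial_sill=4.0, range_=10.0)
beyond_range = spherical(25.0, nugget=1.0, partial_sill=4.0, range_=10.0)
```

Fitting a model in step 4 of variography amounts to choosing the nugget, partial sill, and range so that a curve like this passes as closely as possible through the averaged semivariances.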

Demo #2: Simple kriging

Validation

• Full validation

- Split data into ~80% training, ~20% testing

• Cross-validation (“Leave-one-out”)

- Remove a single known point, use all remaining points to interpolate at that location, then compare measured value to predicted value

• Diagnostics

- Predictions should be unbiased (e.g. over- and under-predictions should cancel each other out)

- Mean Error should be near zero (unbiased)

- Mean Standardized Error should be near zero

- Predictions should be close to known values

- Root Mean Square Error (RMSE) should be as small as possible

- Assessment of model stability and accuracy of standard errors

- Root Mean Square Standardized should be close to 1

- Average Standard Error close to RMSE
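
The leave-one-out procedure and the diagnostics listed above can be sketched as below. The names `cv_diagnostics`, `leave_one_out`, and `mean_predictor` are hypothetical, and the formulas are one common reading of these statistics (in particular, the average standard error is taken here as the root of the mean squared standard error); a given software package may define them slightly differently. The predictor is a deliberately trivial stand-in, not a kriging model.

```python
import math

def cv_diagnostics(measured, predicted, std_errors):
    """Summarize prediction errors and how well the standard errors match them."""
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    return {
        "mean_error": sum(errors) / n,                           # near 0 if unbiased
        "mean_standardized_error": sum(standardized) / n,        # near 0 if unbiased
        "rmse": math.sqrt(sum(e * e for e in errors) / n),       # as small as possible
        "rms_standardized": math.sqrt(sum(s * s for s in standardized) / n),  # near 1
        "average_standard_error": math.sqrt(sum(s * s for s in std_errors) / n),  # near RMSE
    }

def leave_one_out(points, predictor):
    """Hold out each point in turn and predict it from all the others.

    predictor(train_points, (x, y)) must return (prediction, standard_error).
    """
    measured, predicted, std_errors = [], [], []
    for i, (x, y, z) in enumerate(points):
        train = points[:i] + points[i + 1:]
        prediction, std_error = predictor(train, (x, y))
        measured.append(z)
        predicted.append(prediction)
        std_errors.append(std_error)
    return cv_diagnostics(measured, predicted, std_errors)

def mean_predictor(train, target):
    """Hypothetical stand-in: predict the training mean with a fixed standard error of 1."""
    values = [z for _, _, z in train]
    return sum(values) / len(values), 1.0

sample = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 3.0)]
report = leave_one_out(sample, mean_predictor)
```

In this toy run the over- and under-predictions cancel (mean error 0), but the RMSE exceeds the average standard error, which is the pattern that signals standard errors that are too optimistic.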

Empirical Bayesian Kriging (EBK)

• Automates the most difficult aspects of building a valid kriging model

• Not as many parameters

• Relaxes the stationarity assumption of kriging

• More accurate estimates of prediction standard errors

• Accounts for the uncertainty introduced by treating a single estimated semivariogram as the true one


How EBK works

1. Divide data into local subsets of a given size (can overlap)

2. For each subset, estimate the semivariogram

3. Use this semivariogram to simulate a new set of values for the points (sim #1)

4. Produce a semivariogram from the simulated points (semiv #1)

5. Repeat steps 3 and 4 many times, resulting in a distribution of semivariograms

6. Mix the local prediction surfaces together to get the final surface

Demo #3: EBK


EBK Regression Prediction

• Combines regression with kriging

• Allows covariates (explanatory variables) to improve predictions

• Both regression models and kriging models are estimated locally

• Uses Principal Components Analysis (PCA)

Kriging

Prediction = mean + error

• Mean is constant and error term is estimated from surrounding points

• Estimation focuses on the error terms, and does little with the mean

Regression (OLS)

Prediction (DV) = intercept + (v1 * coef1) + (v2 * coef2) + … + (vk * coefk) + error

• Error term is assumed to be random noise (unmodellable)

• Estimation focuses on the mean, and does little with the error terms

Regression Kriging

Prediction (DV) = intercept + (v1 * coef1) + (v2 * coef2) + … + (vk * coefk) + error

• Regression equation estimates the mean for kriging

• Error is modeled with the semivariogram, and kriging is performed

• If semivariogram is flat, you essentially have OLS

• If there are no explanatory variables, you essentially have simple kriging
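
The regression kriging idea (regression for the mean, kriging for the error) can be sketched with a single covariate. This is a simplified global illustration, not EBK Regression Prediction itself: it fits one OLS line to all the data, skips PCA and local subsetting, and uses covariance functions chosen by hand. The names `ols`, `solve`, `regression_kriging`, `exp_cov`, `flat_cov`, and the sample values are assumptions for the example.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(v, z):
    """Least-squares intercept and slope for one explanatory variable."""
    n = len(v)
    mv, mz = sum(v) / n, sum(z) / n
    slope = sum((a - mv) * (b - mz) for a, b in zip(v, z)) / sum((a - mv) ** 2 for a in v)
    return mz - slope * mv, slope

def regression_kriging(samples, target_xy, target_v, cov):
    """samples: list of (x, y, v, z) with covariate v and measured value z."""
    intercept, slope = ols([s[2] for s in samples], [s[3] for s in samples])
    residuals = [z - (intercept + slope * v) for _, _, v, z in samples]
    xy = [(s[0], s[1]) for s in samples]

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    big_c = [[cov(dist(p, q)) for q in xy] for p in xy]
    c0 = [cov(dist(p, target_xy)) for p in xy]
    weights = solve(big_c, c0)
    mean_part = intercept + slope * target_v                       # regression estimates the mean
    error_part = sum(w * r for w, r in zip(weights, residuals))    # kriging of the residuals
    return mean_part + error_part

def exp_cov(h):
    """Assumed exponential covariance with sill 1 and range parameter 3."""
    return math.exp(-h / 3.0)

def flat_cov(h):
    """Pure-nugget case (flat semivariogram): no correlation at any nonzero distance."""
    return 1.0 if h == 0 else 0.0

samples = [(0, 0, 1.0, 3.0), (1, 0, 2.0, 4.5), (0, 2, 3.0, 8.0), (2, 2, 4.0, 9.0)]
at_sample = regression_kriging(samples, (1, 0), 2.0, exp_cov)
flat_case = regression_kriging(samples, (1, 1), 2.5, flat_cov)
```

With the flat covariance the kriged error term is zero at any new location, so `flat_case` is just the OLS prediction, which matches the statement above that a flat semivariogram essentially gives OLS.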

Demo #4: EBK Regression


EBK 3D

• Applies the EBK model to 3D

- Distances are calculated using 3D Euclidean Distance

- Subsets are created in 3D

- Search neighborhoods are 3D

- Vertical trend can be removed

• Elevation Inflation Factor

- Vertical variation happens at a different rate than horizontal variation
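
One way to picture the elevation inflation factor is as a stretch applied to the vertical coordinate before 3D Euclidean distances are computed. The sketch below assumes exactly that; the function name `inflated_distance_3d` and the sample coordinates are hypothetical, and how the factor is chosen or estimated is not covered here.

```python
import math

def inflated_distance_3d(p, q, elevation_inflation_factor=1.0):
    """3D Euclidean distance after multiplying elevations by the inflation factor.

    p, q: (x, y, z) tuples. A factor above 1 makes vertical separation count
    for more than the same horizontal separation.
    """
    dx = p[0] - q[0]
    dy = p[1] - q[1]
    dz = (p[2] - q[2]) * elevation_inflation_factor
    return math.sqrt(dx * dx + dy * dy + dz * dz)

a = (0.0, 0.0, 0.0)
b = (3.0, 0.0, 4.0)

plain = inflated_distance_3d(a, b)                                    # ordinary 3D distance
stretched = inflated_distance_3d(a, b, elevation_inflation_factor=2.0)
```

Here the plain distance is 5, while doubling the vertical scale makes the same two points effectively farther apart, so they are treated as less correlated.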

Demo #5: EBK 3D


Areal Interpolation

• Applies kriging theory to polygon data

• Two main use cases

- Fill missing data

- Downscale from larger polygons to smaller polygons

• Three data inputs

- Average (Gaussian)

- Rate (Binomial)

- Count (Poisson)

Demo #5: Areal Interpolation
