Error & Uncertainty: II CE / ENVE 424/524

Upload: evelyn-roberts

Post on 05-Jan-2016


Page 1: Error & Uncertainty: II CE / ENVE 424/524. Handling Error Methods for measuring and visualizing error and uncertainty vary for nominal/ordinal and interval/ratio

Error & Uncertainty: II

CE / ENVE 424/524


Handling Error

Methods for measuring and visualizing error and uncertainty vary for nominal/ordinal and interval/ratio data types.

Uncertainty associated with ‘classification’ data types is usually expressed in terms of the probability of being correctly classified.

Uncertainty associated with quantitative values is usually expressed as a deviation from the true value.


Classification Uncertainty

Example: A satellite image or aerial photograph is classified, and some pixels are assigned to the wrong class.


Confusion Matrix

A confusion matrix contains information about actual and predicted classifications done by a classification system. Performance of such systems is commonly evaluated using the data in the matrix.

The entries in the confusion matrix have the following meaning:
• a is the number of correct predictions of class A,
• b is the number of incorrect predictions of class A,
• c is the number of incorrect predictions of class B, and
• d is the number of correct predictions of class B.

                         Actual
                         Class A    Class B
Predicted    Class A     a          b
             Class B     c          d


Confusion Matrix Example


Overall map accuracy = total on diagonal / grand total

                        Ground Classification
Map Classification      A      B      C      Total
A                       10     2      3      15
B                       0      20     0      20
C                       4      1      10     15
Total                   14     23     13     50

Overall accuracy (percent correctly classified): (10+20+10)/(10+2+3+0+20+0+4+1+10) = 40/50 = 80%

Error of commission for class A: (2+3)/(10+2+3) = 5/15 = 33% error

Error of omission for class A: (0+4)/(10+0+4) = 4/14 = 29% error
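The three figures for the example matrix can be reproduced with a few lines of Python. This is a minimal sketch; the function names are illustrative and not part of the course material. It assumes rows hold the map classification and columns hold the ground classification.

```python
# Example confusion matrix: rows = map classification, columns = ground classification.
matrix = [
    [10, 2, 3],   # mapped as A
    [0, 20, 0],   # mapped as B
    [4, 1, 10],   # mapped as C
]


def overall_accuracy(m):
    """Total on the diagonal divided by the grand total."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def commission_error(m, k):
    """Share of cells mapped as class k (row k) that belong to another ground class."""
    row_total = sum(m[k])
    return (row_total - m[k][k]) / row_total


def omission_error(m, k):
    """Share of ground-truth class k cells (column k) that were mapped as another class."""
    col_total = sum(row[k] for row in m)
    return (col_total - m[k][k]) / col_total


print(f"overall accuracy:   {overall_accuracy(matrix):.2f}")     # 0.80
print(f"commission error A: {commission_error(matrix, 0):.2f}")  # 0.33
print(f"omission error A:   {omission_error(matrix, 0):.2f}")    # 0.29
```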


User and Producer Perspective

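Rows are the map classification; columns are the ground classification.

                        A      B      C      Total   User's Accuracy
A                       10     2      3      15      0.67
B                       0      20     0      20      1.00
C                       4      1      10     15      0.67
Total                   14     23     13     50
Producer's Accuracy     0.71   0.87   0.77

Overall Accuracy: 0.80

User's accuracy is the diagonal entry divided by its row total (1 minus the error of commission); producer's accuracy is the diagonal entry divided by its column total (1 minus the error of omission).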
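The user's and producer's accuracies in the table can be checked per class with a short script. This is a sketch under the same layout assumption (rows = map classes, columns = ground classes); the helper names are hypothetical.

```python
# Example confusion matrix: rows = map classification, columns = ground classification.
classes = ["A", "B", "C"]
matrix = [
    [10, 2, 3],
    [0, 20, 0],
    [4, 1, 10],
]


def users_accuracy(m):
    """Diagonal entry divided by its row total, one value per map class."""
    return [m[i][i] / sum(m[i]) for i in range(len(m))]


def producers_accuracy(m):
    """Diagonal entry divided by its column total, one value per ground class."""
    return [m[i][i] / sum(row[i] for row in m) for i in range(len(m))]


for name, ua, pa in zip(classes, users_accuracy(matrix), producers_accuracy(matrix)):
    print(f"{name}: user's = {ua:.2f}, producer's = {pa:.2f}")
# A: user's = 0.67, producer's = 0.71
# B: user's = 1.00, producer's = 0.87
# C: user's = 0.67, producer's = 0.77
```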


Cohen’s Kappa

• A measure of agreement that compares the observed agreement with the agreement expected by chance if the observer ratings were independent.

• Expresses the proportionate reduction in error generated by a classification process, compared with the error of a completely random classification.

– For perfect agreement, kappa = 1.
– A value of 0.82 would imply that the classification process was avoiding 82% of the errors that a completely random classification would generate.

\[
\kappa = \frac{\sum_{i=1}^{n} c_{ii} - \sum_{i=1}^{n} c_{i\cdot}\, c_{\cdot i} / c_{\cdot\cdot}}
              {c_{\cdot\cdot} - \sum_{i=1}^{n} c_{i\cdot}\, c_{\cdot i} / c_{\cdot\cdot}}
\]

where
• $\sum_{i=1}^{n} c_{ii}$ = sum of diagonal entries
• $c_{i\cdot}$ = sum over all columns for row i
• $c_{\cdot i}$ = sum over all rows for column i
• $c_{\cdot\cdot}$ = grand total (sum over all rows and all columns)
• $q = \sum_{i=1}^{n} c_{i\cdot}\, c_{\cdot i} / c_{\cdot\cdot}$ = number of agreements between prediction and actual that would occur by chance
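As a worked illustration, kappa can be computed for the three-class example matrix used earlier in these slides (diagonal 10, 20, 10; grand total 50). The sketch below is one straightforward way to code the row-total, column-total, and chance-agreement terms; the function name is illustrative.

```python
def cohens_kappa(m):
    """Cohen's kappa for a square confusion matrix given as a list of rows."""
    n = len(m)
    grand_total = sum(sum(row) for row in m)
    row_totals = [sum(row) for row in m]
    col_totals = [sum(row[j] for row in m) for j in range(n)]
    observed = sum(m[i][i] for i in range(n))  # sum of diagonal entries
    # Number of agreements expected by chance (q).
    chance = sum(row_totals[i] * col_totals[i] for i in range(n)) / grand_total
    return (observed - chance) / (grand_total - chance)


matrix = [
    [10, 2, 3],
    [0, 20, 0],
    [4, 1, 10],
]
# observed = 40, chance = (15*14 + 20*23 + 15*13) / 50 = 17.3
print(f"kappa = {cohens_kappa(matrix):.3f}")  # kappa = 0.694
```

An overall accuracy of 80% therefore corresponds to a kappa of about 0.69 once chance agreement is removed.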


Kappa is 1 for perfectly accurate data (all N cases on the diagonal) and zero for accuracy no better than chance.


Interval/Ratio Data Type Error

Error = Estimated Value – True Value

These errors are often referred to as residuals.

For a set of values, the magnitude of errors is described by the root mean square error (RMSE):

\[
\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{n} x_i^{2}}{n}}
\]

x_i = error of the i-th observation

n = number of observations/values
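A small numeric example may make the RMSE definition concrete. The estimated and true values below are invented purely for illustration.

```python
import math


def rmse(estimated, true):
    """Root mean square error: square root of the mean squared residual."""
    errors = [e - t for e, t in zip(estimated, true)]  # Error = Estimated - True
    return math.sqrt(sum(x * x for x in errors) / len(errors))


# Hypothetical data: four estimates of a quantity whose true value is 100.
estimated = [102.0, 98.5, 101.0, 99.0]
true = [100.0, 100.0, 100.0, 100.0]

# Residuals are 2, -1.5, 1, -1; mean squared error = 8.25 / 4 = 2.0625.
print(f"RMSE = {rmse(estimated, true):.3f}")  # RMSE = 1.436
```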


Positional Accuracy Assessment: Summary Table


Error Scatterplots

The plot to the right is preferable, since its points generally fall closer to the diagonal on which perfect estimates would fall.


Error Distributions

[Figure panels: negative bias, positive bias, no bias]


Error Distribution Variance (Spread)
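One common way to summarize an error distribution is by its mean (the bias) and its standard deviation (the spread). The sketch below uses invented residuals and the standard `statistics` module; it also checks the identity RMSE² = bias² + variance, which holds when the population variance is used. These summary statistics are an assumption about how the slide's figures might be quantified, not something the slides state.

```python
import math
import statistics

# Hypothetical residuals (estimated minus true value).
errors = [2.0, -1.5, 1.0, -1.0, 3.0, 0.5]

bias = statistics.mean(errors)                    # > 0: positive bias, < 0: negative bias
variance = statistics.pvariance(errors, mu=bias)  # population variance around the bias
spread = math.sqrt(variance)                      # standard deviation of the errors
rmse = math.sqrt(sum(x * x for x in errors) / len(errors))

print(f"bias   = {bias:.3f}")
print(f"spread = {spread:.3f}")
print(f"RMSE   = {rmse:.3f}")

# RMSE combines both effects: RMSE^2 = bias^2 + variance.
assert math.isclose(rmse ** 2, bias ** 2 + variance)
```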


Error Propagation

No data stored in a GIS is truly error-free. When data that are stored in a GIS database are used as input to a GIS operation, then the errors in the input will propagate to the output of the operation. Moreover, the error propagation continues when the output from one operation is used as input to an ensuing operation. Consequently, when no record is kept of the accuracy of intermediate results, it becomes extremely difficult to evaluate the accuracy of the final result.

Although users may be aware that errors propagate through their analyses, in practice they rarely pay attention to this problem. No professional GIS currently in use can present the user with information about the confidence limits that should be associated with the results of an analysis.
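Propagation through a single operation can be illustrated with a Monte Carlo simulation. The example below is a sketch under stated assumptions: a hypothetical rectangular parcel whose length and width each carry independent, normally distributed measurement error, with area = length × width as the GIS operation. It compares the simulated spread of the output with the usual first-order approximation.

```python
import math
import random
import statistics


def simulate_area(length, width, sd_length, sd_width, runs=20000, seed=42):
    """Draw noisy length/width pairs and return the resulting areas."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        rng.gauss(length, sd_length) * rng.gauss(width, sd_width)
        for _ in range(runs)
    ]


# Hypothetical inputs: a 100 m x 50 m parcel, each side measured with 0.5 m error (1 sd).
areas = simulate_area(100.0, 50.0, 0.5, 0.5)
mc_sd = statistics.stdev(areas)

# First-order approximation for independent errors:
# sd_area ~ sqrt((width * sd_length)^2 + (length * sd_width)^2)
analytic_sd = math.sqrt((50.0 * 0.5) ** 2 + (100.0 * 0.5) ** 2)

print(f"Monte Carlo sd of area: {mc_sd:.1f} m^2")
print(f"First-order estimate:   {analytic_sd:.1f} m^2")
```

Half-metre errors in the inputs become an uncertainty of several tens of square metres in the output, and that uncertainty would feed into any later operation that uses the area.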


Living with It (Error)

• As with any inherent problem, the first step in dealing with it is to admit it’s there.

• Document the data quality (metadata)

• Conduct error propagation analysis (e.g., sensitivity analysis)

• Use multiple sources of data

• The more data sources tell you the same story, the more reliable your story (weight of evidence)
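A one-at-a-time sensitivity analysis is one simple form the error propagation analysis above could take: each input is shifted by its estimated error while the others stay at their baseline, and the change in the output is recorded. The runoff model, parameter names, and error sizes below are hypothetical and chosen only to illustrate the procedure.

```python
def one_at_a_time(model, baseline, deltas):
    """Perturb each input by -delta and +delta; return the change in model output."""
    base_out = model(**baseline)
    changes = {}
    for name, delta in deltas.items():
        low = dict(baseline)
        low[name] -= delta
        high = dict(baseline)
        high[name] += delta
        changes[name] = (model(**low) - base_out, model(**high) - base_out)
    return changes


def runoff_volume(area_m2, rainfall_m, runoff_coeff):
    """Hypothetical model: runoff volume (m^3) = coefficient * rainfall depth * area."""
    return runoff_coeff * rainfall_m * area_m2


baseline = {"area_m2": 10000.0, "rainfall_m": 0.05, "runoff_coeff": 0.4}
deltas = {"area_m2": 500.0, "rainfall_m": 0.005, "runoff_coeff": 0.1}  # assumed input errors

for name, (down, up) in one_at_a_time(runoff_volume, baseline, deltas).items():
    print(f"{name}: {down:+.1f} / {up:+.1f} m^3")
```

With these assumed errors the runoff coefficient dominates the output uncertainty (about ±50 m³ on a 200 m³ baseline), which suggests where better data would help most.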


Visualization


Overview

• The techniques of effective data display
• How mapping can mislead
• How displays are customised to the requirements of particular applications


Visualization Definitions

“It is a human ability to develop mental representations that allow us to identify patterns and create or impose order” (MacEachren, 1992)

“Visualization is the process of representing information synoptically for the purpose of recognizing, communicating and interpreting pattern and structure. Its domain encompasses the computational, cognitive, and mechanical aspects of generating, organizing, manipulating and comprehending such representations.” (Buttenfield and Mackaness, 1999)


Visualization Principles

• The role of visualization in spatial analysis is not limited to maps but extends to numeric and statistical analysis as well.

• The interpretation of a graph or chart is often more efficient than interpretation based on a string of numbers representing the same data.

• “It is abstraction, not realism that gives maps their unique power” (Muehrcke, 1990)

• Visualization is needed to:
  • access pertinent information from large volumes of data
  • communicate complex patterns effectively
  • formalize sound principles for data presentation
  • guide analysis, modeling and interpretation


Visualizing Continuous and Discrete Variation


Graphic Variables