Components of Spatial Data Quality in GIS

Lecture 5, Wednesday 17th September 2014 DEPARTMENT OF GEOGRAPHY AND ENVIRONMENT UNIVERSITY OF DHAKA

Upload: kaium-chowdhury

Post on 13-Jul-2015



Page 1: Components of Spatial Data Quality in GIS

Lecture 5, Wednesday 17th September 2014

DEPARTMENT OF GEOGRAPHY AND ENVIRONMENT

UNIVERSITY OF DHAKA

Page 2: Components of Spatial Data Quality in GIS

According to the NCDCDS (the US National Committee for Digital Cartographic Data Standards), geographic data quality has five dimensions. The ICA (International Cartographic Association) proposed two more, giving the seven listed below.

1. Lineage of Geographic data

2. Positional Accuracy of Geographic data

3. Attribute Accuracy of Geographic data

4. Logical consistency

5. Completeness of Geographic data

6. Temporal accuracy

7. Semantic accuracy

Page 3: Components of Spatial Data Quality in GIS

Lineage refers to the source materials from which a specific set of geographic data was derived

Lineage answers the following questions for a user of the data:

1. Who collected the data?

2. When were the data collected?

3. How were the data collected?

4. How were the data converted?

5. What algorithms were used to process the data?

6. What was the precision of computation?

Page 4: Components of Spatial Data Quality in GIS

Positional accuracy is the “closeness” of coordinate values to the “true” positions in the real world

Generally, maps are accurate to roughly one line width, or about 0.5 mm on the map sheet. The slides refer to this as the minimum mapping unit; the ground distance it represents depends on the map scale. A 0.5 mm resolution is equivalent to 5 m on 1:10000 scale maps and 125 m on 1:250000 scale maps.

Positional accuracy of data can be measured in two ways:

1. Planimetric accuracy

2. Height accuracy

Page 5: Components of Spatial Data Quality in GIS

Scale          Effective Resolution (m)
1:2500         1.25
1:10000        5
1:24000        12
1:50000        25
1:100000       50
1:250000       125
1:500000       250
1:1000000      500
1:10000000     5000
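
The values in this table follow from multiplying the 0.5 mm line width by the scale denominator. A minimal Python sketch of that arithmetic is given here; the function name and constant are illustrative and do not come from the lecture.

```python
LINE_WIDTH_MM = 0.5  # assumed map line width, as stated on the previous slide


def effective_resolution_m(scale_denominator):
    """Ground distance in metres covered by one 0.5 mm line at 1:scale_denominator."""
    return scale_denominator * LINE_WIDTH_MM / 1000.0  # mm on the ground -> m


for denominator in (2500, 10000, 24000, 50000, 100000,
                    250000, 500000, 1000000, 10000000):
    print(f"1:{denominator}  {effective_resolution_m(denominator):g} m")
```

Running the loop should reproduce the table row by row, for example 1.25 m at 1:2500 and 125 m at 1:250000.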

Page 6: Components of Spatial Data Quality in GIS

Attribute accuracy is defined as the “closeness” of the descriptive data in the geographic database to the true or assumed values of the real-world features that they represent

Attribute accuracy is measured in different ways:

For metric attributes (e.g. elevations in a DEM or TIN), accuracy can be expressed simply as measurement error

For categorical attributes (e.g. land use classification), accuracy is very difficult to measure directly. In such cases, attribute accuracy is usually evaluated in terms of other factors, such as:

1. The classification scheme

2. The amount of gross error

3. The degree of heterogeneity of the polygons

Page 7: Components of Spatial Data Quality in GIS

The error matrix, denoted C, is a square array of values that cross-tabulates the number of sample spatial data units assigned to a particular category against the actual category as verified by the reference data

It is constructed to show the frequency of discrepancies between encoded values and their corresponding reference values at the sample locations

In the error matrix, rows represent the categories of the classification of the database obtained by the user

The columns indicate the classification of the reference data obtained from source data or a field visit

Page 8: Components of Spatial Data Quality in GIS

Diagonal elements represent correctly classified spatial data

Off-diagonal elements represent the frequencies of misclassification of various categories

If in a particular error matrix, all the non-zero entries lie on the diagonal, it indicates that no misclassification at the sample locations has occurred and an overall accuracy of 100% is obtained

When misclassifications occur, each can be described either as an error of commission (error of inclusion), which is the complement of user's accuracy, or as an error of omission (error of exclusion), which is the complement of producer's accuracy

Page 9: Components of Spatial Data Quality in GIS

Overall Accuracy

Computed by dividing the total number of correctly classified pixels by the total number of reference pixels

The maximum value of the overall accuracy is 100%, reached when there is perfect agreement between the database and the reference data. The minimum value is 0%.

Page 10: Components of Spatial Data Quality in GIS

Overall accuracy (OA) is also termed PCC (Percent Correctly Classified). The following equation can be used:

PCC or OA = (Sd / n) × 100%

Where,

Sd = sum of values along the diagonal

n = total number of sample locations

Page 11: Components of Spatial Data Quality in GIS

Error matrix (rows: sample data; columns: reference data)

                 Exposed soil  Cropland  Range  Sparse woodland  Forest  Water  Total
Exposed soil                1         2      0                0       0      0      3
Cropland                    0         5      0                2       3      0     10
Range                       0         3      5                1       0      0      9
Sparse woodland             0         0      4                4       0      0      8
Forest                      0         0      0                0       4      0      4
Water                       0         0      0                0       0      1      1
Total                       1        10      9                7       7      1     35
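
As an illustration, the overall accuracy formula (PCC = Sd / n × 100%) can be applied to this matrix. The sketch below is only one way to write it in Python; the variable and function names are assumptions, and the matrix is copied from the slide with rows as sample data and columns as reference data.

```python
# Error matrix from the slide. Row and column order:
# Exposed soil, Cropland, Range, Sparse woodland, Forest, Water
ERROR_MATRIX = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def overall_accuracy(matrix):
    """Percent correctly classified: diagonal sum (Sd) over sample count (n)."""
    diagonal_sum = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * diagonal_sum / total


print(f"OA = {overall_accuracy(ERROR_MATRIX):.1f}%")
```

Here Sd = 1 + 5 + 5 + 4 + 4 + 1 = 20 and n = 35, so the overall accuracy is 20/35, or about 57.1%.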

Page 12: Components of Spatial Data Quality in GIS

Producer's accuracy (PA) is computed by dividing the number of correctly classified pixels in each category (on the major diagonal) by the number of reference (training set) pixels used for that category (the column total)

Producer's accuracy = (Ci / Ct) × 100%

Where,

Ci = correctly classified sample locations in the column

Ct = total number of sample locations in the column

Error of omission (EO) = 100% - producer's accuracy

Page 13: Components of Spatial Data Quality in GIS

Calculation of PA

Exposed soil =1/1 =100%

Cropland =5/10 =50%

Range =5/9 =55.6%

Sparse woodland =4/7 =57.1%

Forest =4/7 =57.1%

Water body =1/1 =100%
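
The same producer's accuracy figures can be checked programmatically. This is a hedged sketch rather than part of the lecture: the function name is hypothetical, and the matrix is the one from the error matrix slide (rows are sample data, columns are reference data).

```python
CLASSES = ["Exposed soil", "Cropland", "Range", "Sparse woodland", "Forest", "Water"]
ERROR_MATRIX = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def producers_accuracy(matrix):
    """Per-class Ci / Ct * 100, where Ct is the column (reference) total."""
    size = len(matrix)
    accuracies = []
    for j in range(size):
        column_total = sum(matrix[i][j] for i in range(size))
        accuracies.append(100.0 * matrix[j][j] / column_total)
    return accuracies


for name, pa in zip(CLASSES, producers_accuracy(ERROR_MATRIX)):
    # Error of omission is the complement of producer's accuracy.
    print(f"{name}: PA = {pa:.1f}%, EO = {100.0 - pa:.1f}%")
```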

Page 14: Components of Spatial Data Quality in GIS

User's accuracy (UA) is computed by dividing the number of correctly classified pixels in each category by the total number of pixels that were classified in that category (the row total)

This figure is a measure of commission error and indicates the probability that a pixel classified into a given category actually represents that category on the ground

UA = (Ri / Rt) × 100%

Where,

Ri = correctly classified sample locations in the row

Rt = total number of sample locations in the row

Error of commission = 100% - user's accuracy

Page 15: Components of Spatial Data Quality in GIS

Calculation of UA

Exposed soil =1/3 =33.3%

Cropland =5/10 =50%

Range =5/9 =55.6%

Sparse woodland =4/8 =50%

Forest =4/4 =100%

Water body =1/1 =100%
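
User's accuracy uses the row totals instead of the column totals. The sketch below mirrors the calculation above; the names are illustrative assumptions, and the matrix is again taken from the error matrix slide.

```python
CLASSES = ["Exposed soil", "Cropland", "Range", "Sparse woodland", "Forest", "Water"]
ERROR_MATRIX = [
    [1, 2, 0, 0, 0, 0],
    [0, 5, 0, 2, 3, 0],
    [0, 3, 5, 1, 0, 0],
    [0, 0, 4, 4, 0, 0],
    [0, 0, 0, 0, 4, 0],
    [0, 0, 0, 0, 0, 1],
]


def users_accuracy(matrix):
    """Per-class Ri / Rt * 100, where Rt is the row (classified) total."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


for name, ua in zip(CLASSES, users_accuracy(ERROR_MATRIX)):
    # Error of commission is the complement of user's accuracy.
    print(f"{name}: UA = {ua:.1f}%, EC = {100.0 - ua:.1f}%")
```

Comparing the two measures for one class is instructive: Forest has a user's accuracy of 100% (every sample labelled Forest is Forest) but a producer's accuracy of only 57.1% (three of the seven reference Forest samples were labelled Cropland).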

Page 16: Components of Spatial Data Quality in GIS

4. Logical consistency

Description of the fidelity of the relationships between the real world and the encoded geographic data

In GIS, a topological model is an example of enforcing logical consistency

>> consistency of the data model

>> consistency of the positional and attribute data

>> consistency between data files

Page 17: Components of Spatial Data Quality in GIS

5. Completeness of Geographic data

Are all possible objects included within the database?

A. Spatial completeness

B. Thematic completeness

Page 18: Components of Spatial Data Quality in GIS

6. Temporal accuracy

Measure of data quality with respect to the representation of time in a geographic database

A. World time

B. Database time

Page 19: Components of Spatial Data Quality in GIS

7. Semantic accuracy

>> how correctly spatial objects are labeled or named

>> correct encoding in accordance with a set of features

Page 20: Components of Spatial Data Quality in GIS

Datum

A geodetic datum (plural datums, not data) is a reference from which measurements are made.

In surveying and geodesy, a datum is a set of reference points on the Earth's surface against which position measurements are made.

Page 21: Components of Spatial Data Quality in GIS

Horizontal datums are used for describing a point on the earth's surface, in latitude and longitude or another coordinate system.

Vertical datums are used to measure elevations or underwater depths.

Page 23: Components of Spatial Data Quality in GIS

A coordinate system defines the location of a point on a planar or spherical surface.

Types of coordinate system

A. Based on Nature

B. Based on Extent

Page 24: Components of Spatial Data Quality in GIS

A. Based on Nature

1. Plane coordinate system

2. Geographic coordinate system

B. Based on Extent

1. Global coordinate system

2. Local coordinate system

Page 25: Components of Spatial Data Quality in GIS

Some coordinate systems

1. Cartesian coordinate system

Page 26: Components of Spatial Data Quality in GIS

2. Universal Transverse Mercator (UTM)

Page 27: Components of Spatial Data Quality in GIS

3. WGS 84

The World Geodetic System 1984 (WGS84) is the datum used by the Global Positioning System (GPS). The datum is defined and maintained by the United States National Geospatial-Intelligence Agency (NGA).

Coordinates computed by GPS receivers are likely to be provided in terms of the WGS84 datum, and heights in terms of the WGS84 ellipsoid.

Page 28: Components of Spatial Data Quality in GIS

4. Everest 1830

India and other countries each made measurements within their own territory and defined a reference surface to serve as the datum for mapping.

In India, the reference surface was defined by Sir George Everest, who was Surveyor General of India from 1830 to 1843.

It has served as the reference for all mapping in India. The Indian system can be called the Indian Geodetic System, as all coordinates are referred to it. The reference surface is called the Everest Spheroid.

Page 29: Components of Spatial Data Quality in GIS

Geoid

An imaginary surface that coincides with mean sea level in the ocean and its extension through the continents.

A hypothetical surface that corresponds to mean sea level and extends at the same level under the continents.

The geoid is used as a reference surface for astronomical measurements and for the accurate measurement of elevation on the Earth's surface.
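
Because GPS heights are given relative to the WGS84 ellipsoid while elevations are referred to the geoid, the two are commonly related by H ≈ h - N, where h is the ellipsoidal height, N is the geoid height above the ellipsoid (geoid undulation), and H is the height above the geoid. This relation is not stated in the slides; the sketch below is an illustration under that assumption, and the input values are hypothetical.

```python
def orthometric_height_m(ellipsoidal_height_m, geoid_undulation_m):
    """Approximate height above the geoid: H = h - N."""
    return ellipsoidal_height_m - geoid_undulation_m


h = 30.0   # hypothetical ellipsoidal height reported by a GPS receiver (m)
N = -50.0  # hypothetical geoid undulation at that location (m)
print(f"H = {orthometric_height_m(h, N):.1f} m")
```

In practice N would come from a geoid model for the location, not from a fixed constant.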

Ellipsoid

A geometric surface, symmetrical about the three coordinate axes, whose plane sections are ellipses or circles
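
A reference ellipsoid is usually specified by its semi-major axis and flattening. As a sketch, the WGS84 ellipsoid mentioned earlier can be described in Python as follows; the two defining constants are the published WGS84 values and are not taken from the slides, and the function names are illustrative.

```python
WGS84_SEMI_MAJOR_AXIS_M = 6378137.0        # equatorial radius a
WGS84_INVERSE_FLATTENING = 298.257223563   # 1 / f


def semi_minor_axis_m(a, inverse_flattening):
    """Polar radius b = a * (1 - f)."""
    return a * (1.0 - 1.0 / inverse_flattening)


def first_eccentricity_squared(inverse_flattening):
    """e^2 = f * (2 - f), a measure of how far the ellipse departs from a circle."""
    f = 1.0 / inverse_flattening
    return f * (2.0 - f)


b = semi_minor_axis_m(WGS84_SEMI_MAJOR_AXIS_M, WGS84_INVERSE_FLATTENING)
print(f"b   = {b:.3f} m")
print(f"e^2 = {first_eccentricity_squared(WGS84_INVERSE_FLATTENING):.12f}")
```

The polar radius comes out roughly 21 km shorter than the equatorial radius, which is why plane sections through the poles are ellipses while the equatorial section is a circle.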
