Spatial Information Systems (SIS)

COMP 30110

Spatial data modelling

GIS and SDBMS

Post on 20-Dec-2015



Spatial data modelling

Spatial data models must allow the representation of the spatial extent of data and support spatial queries.

Two types of models:

• Object-based model: also called entity-based or feature-based

• Field-based model: also called space-based

Object-based model

• Individual objects are represented explicitly using their geometric counterpart

• The representation of the spatial extent (e.g., lines representing the boundary of a lake) using directed line segments (vectors) has given rise to the term vector model

• More suitable for applications requiring high-quality cartographic operations, coordinate geometry, and networks

2D Vector-based representations

• Primitive spatial data objects are points, lines and polygons located by Cartesian coordinates in a spatial reference frame

• These geometric primitives indicate static locations and spatial extents of geographic phenomena in terms of XY coordinates

2D Vector-based representations (cont.d)

• Point object marks the location of a geographic entity (e.g., a paper mill) by a pair of XY coordinates

• Line object shows location and linear extent of a geographic entity (e.g., a river) by a series of XY coordinates

• Polygon object shows location and 2D extent of a geographic area (region), e.g. a lake, by a series of XY coordinates along the boundary of the region.

[Figure: example map with labelled features: major road, minor road A, minor road B, river, creek, creek mouth, lake, paper mill]
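The three vector primitives can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (`PointObject`, `LineObject`, `PolygonObject`), the helper functions and all coordinates are hypothetical and do not come from the slides.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]  # one (x, y) pair in the reference frame


@dataclass
class PointObject:
    name: str
    location: Coord          # a single pair of XY coordinates


@dataclass
class LineObject:
    name: str
    vertices: List[Coord]    # a series of XY coordinates along the line


@dataclass
class PolygonObject:
    name: str
    boundary: List[Coord]    # XY coordinates along the boundary (ring left open)


def line_length(line: LineObject) -> float:
    """Sum of the lengths of the consecutive segments."""
    return sum(math.dist(a, b) for a, b in zip(line.vertices, line.vertices[1:]))


def polygon_area(poly: PolygonObject) -> float:
    """Area of a simple polygon computed with the shoelace formula."""
    pts = poly.boundary
    twice_area = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Made-up coordinates, for illustration only.
mill = PointObject("paper mill", (2.0, 3.0))
river = LineObject("river", [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)])
lake = PolygonObject("lake", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)])
```

With these sample values, `line_length(river)` adds a segment of length 5 and one of length 6, and `polygon_area(lake)` is the area of a 4 by 3 rectangle.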

Field-based model

• The underlying (geographic) space is partitioned (also: tessellated) into cells that cover it entirely.

• Spatial objects are embedded in the space and are described and manipulated in terms of the cells they intersect.

• Example: a lake is described by the cells that cover its interior rather than by the lines that form its boundary

Raster-based representations

• Usually the partition is composed of polygonal units of equal size (fixed or regular grid: raster)

• Each cell in the grid has two associated values:

– a positional value that marks its identity

– attribute values of the underlying area it represents (e.g. land elevation, land use, etc.)
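A minimal sketch of a raster, assuming a small land-use grid stored as a list of rows. The grid contents, the 10-unit cell size and the function names are hypothetical. The (row, column) index plays the role of the positional value and the string stored in the cell is its attribute value.

```python
# Each cell is addressed by (row, col) and carries one attribute (land use).
grid = [
    ["forest", "forest", "lake", "lake"],
    ["forest", "lake",   "lake", "lake"],
    ["urban",  "forest", "lake", "forest"],
]
CELL_SIZE = 10.0  # assumed side length of a square cell, in map units


def cells_of(grid, value):
    """Positions of all cells whose attribute equals `value`."""
    return {
        (r, c)
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
        if cell == value
    }


def area_of(grid, value, cell_size):
    """Area covered by `value`: number of matching cells times the cell area."""
    return len(cells_of(grid, value)) * cell_size ** 2


# The lake is described by the cells covering its interior, not by a boundary.
lake_cells = cells_of(grid, "lake")
```

Here the lake is the set of six cells marked `"lake"`, so its area is simply six times the area of one cell.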

Spatial data formats

• Vector

• Raster

Spatial data in vector format

Sets of spatial entities (points, lines, regions) and spatial relations, e.g. geographic maps, digital terrain models, etc.

Spatial Relations

• Topological Relations: containment, overlapping, etc. [Egenhofer et al. 1991]

• Metric Relations: distance between objects, etc. [Gold and Roos 1994]

• Direction Relations: north of, south of, etc. [Hernandez et al. 1990; Frank et al. 1991]

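[Figure: pairs of objects A and B illustrating the three kinds of relation; in the metric example A and B are 1 km apart]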
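The three families of relations can be illustrated on axis-aligned rectangles. This is a simplified sketch: the `Rect` class, the predicate names and the particular definitions chosen (for instance, "north of" meaning entirely above the other rectangle) are assumptions made for the example, not definitions taken from the cited papers.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def centre(self):
        return ((self.xmin + self.xmax) / 2.0, (self.ymin + self.ymax) / 2.0)


# Topological relations
def contains(a: Rect, b: Rect) -> bool:
    """True if b lies entirely inside a."""
    return (a.xmin <= b.xmin and a.ymin <= b.ymin
            and b.xmax <= a.xmax and b.ymax <= a.ymax)


def overlaps(a: Rect, b: Rect) -> bool:
    """True if the interiors of a and b share at least one point."""
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


# Metric relation
def centre_distance(a: Rect, b: Rect) -> float:
    """Euclidean distance between the centres of a and b."""
    return math.dist(a.centre(), b.centre())


# Direction relation (y grows towards the north in this sketch)
def north_of(a: Rect, b: Rect) -> bool:
    """True if a lies entirely to the north of b."""
    return a.ymin >= b.ymax


A = Rect(0.0, 0.0, 10.0, 10.0)
B = Rect(2.0, 2.0, 4.0, 4.0)
C = Rect(0.0, 20.0, 10.0, 30.0)
```

With these sample rectangles, A contains B, C is north of A, and A and C do not overlap.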

Spatial data in raster format

Sets of pixels, e.g. satellite images, aerial photos, scanned maps

DBMS: motivations

• Organizations depend on the ability to efficiently

– acquire,

– manage, and

– analyse (spatial) data.

• Data Management Aspect:

– Data needs to be accurate and timely, and also stay accurate even if many people access and use the same data

– The amount of spatial data is exploding: how to find data?

• Data Analysis Aspect:

– Quickly find the data that is relevant to a given question

Powerful data management systems are needed.

DBMS and Spatial Data

• In a GIS/SIS we need to store and manipulate both spatial and non-spatial data

• Storage can be controlled directly by application programs or by a DBMS

• Early GIS were built directly on top of file systems:

– No DB used

– Spatial and non-spatial data both stored in files controlled by the application

– Functions are defined on the data

– Problem with this approach: no data independence, data security, or concurrency control

• Solution: using a DBMS

Use of Relational DBMS

• Each relation/table represents a theme (e.g., country, landuse, etc.)

• A geographic object is a tuple/row of such a relation

• Each column is an attribute

• Attributes have alphanumeric types (non-spatial aspects)

• SQL-based querying

Example

COUNTRIES

name      capital   population   Id-boundary
Germany   Berlin    78.5         B1
France    Paris     58           B2
…         …         …            …

BOUNDARIES

Id-boundary   Id-contour
B1            C1
B2            C2
B2            C3
B3            C4
B3            C5
…             …

CONTOURS

Id-contour   Point-num   Id-point
C1           2           P1
C1           1           P2
C1           3           P3
C1           …           …
C2           1           P4
C2           2           P5
C2           …           …
…            …           …

POINTS

Id-point   x     y
P1         452   1000
P2         365   875
P3         386   985
P4         296   825
P5         589   189
…          …     …

Example (cont.d)

• Relation COUNTRIES with schema (name, capital, population, Id-boundary)

• Id-boundary is the spatial attribute corresponding to the boundary of the country

• A boundary is made of several components (contours) when a country is made of several parts. A contour is characterised by an Id and a list of points, each of which is stored in relation POINTS.

• Point-num is used to represent an ordering of the points along the boundary of a country. For example, contour C1 is represented by the sequence <P2, P1, P3> of points (in increasing Point-num), although the rows of CONTOURS are not stored in that order

Example (cont.d)

• Querying via SQL: “Return the contours of Italy”

-- column names containing a hyphen must be double-quoted in standard SQL
select B."Id-contour", CT."Point-num", P.x, P.y
from COUNTRIES C, BOUNDARIES B, CONTOURS CT, POINTS P
where C.name = 'Italy'
and C."Id-boundary" = B."Id-boundary"
and B."Id-contour" = CT."Id-contour"
and CT."Id-point" = P."Id-point"
order by B."Id-contour", CT."Point-num"

The query retrieves the coordinates of the vertices on the boundary of the polygons (objects) that represent Italy.
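The example can be tried out with a small script. The sketch below assumes an in-memory SQLite database (via Python's standard `sqlite3` module) as a stand-in for the relational DBMS, and loads only the rows that are visible in the example tables. Because Italy is not among those rows, the query is run for Germany instead; the function name `contours_of` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hyphenated column names from the slides are kept, so they must be quoted.
conn.executescript('''
create table COUNTRIES  (name text, capital text, population real, "Id-boundary" text);
create table BOUNDARIES ("Id-boundary" text, "Id-contour" text);
create table CONTOURS   ("Id-contour" text, "Point-num" integer, "Id-point" text);
create table POINTS     ("Id-point" text, x integer, y integer);
''')

# Only the rows shown in the example tables (elided rows are left out).
conn.executemany("insert into COUNTRIES values (?, ?, ?, ?)",
                 [("Germany", "Berlin", 78.5, "B1"), ("France", "Paris", 58, "B2")])
conn.executemany("insert into BOUNDARIES values (?, ?)",
                 [("B1", "C1"), ("B2", "C2"), ("B2", "C3")])
conn.executemany("insert into CONTOURS values (?, ?, ?)",
                 [("C1", 2, "P1"), ("C1", 1, "P2"), ("C1", 3, "P3"),
                  ("C2", 1, "P4"), ("C2", 2, "P5")])
conn.executemany("insert into POINTS values (?, ?, ?)",
                 [("P1", 452, 1000), ("P2", 365, 875), ("P3", 386, 985),
                  ("P4", 296, 825), ("P5", 589, 189)])


def contours_of(conn, country):
    """Return (contour id, point number, x, y) rows for one country,
    ordered along each contour."""
    query = '''
        select B."Id-contour", CT."Point-num", P.x, P.y
        from COUNTRIES C, BOUNDARIES B, CONTOURS CT, POINTS P
        where C.name = ?
          and C."Id-boundary" = B."Id-boundary"
          and B."Id-contour" = CT."Id-contour"
          and CT."Id-point" = P."Id-point"
        order by B."Id-contour", CT."Point-num"
    '''
    return conn.execute(query, (country,)).fetchall()


for row in contours_of(conn, "Germany"):
    print(row)
```

Even for this tiny data set, three joins are needed just to list the vertices of one boundary in order, which is the cost the following slide comments on.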

Use of relational DBMS: pros & cons

• PRO: use of standard DBMS and languages (SQL)

• CONS:

– Querying requires knowledge of the structure of the spatial objects (no data independence from this point of view)

– Representing spatial information requires a large number of tuples (inefficient)

– Need to manipulate (possibly very large) tables of points (no user friendliness)

– Difficult to perform spatial computations (e.g., “when are two objects adjacent?”) and to define new “data types” for spatial objects

– Spatial queries (see example) are not directly supported and require joins of several different tables (inefficient)

Loosely coupled approach

• Separation between spatial and non-spatial data

• Approach used by the majority of traditional GIS vendors (ESRI, MapInfo, Intergraph, etc.)

• Two systems coexist:

– A (usually relational) DBMS, or some component of it, for descriptive alphanumeric data

– A specific module for spatial data management

[Figure: architecture of the loosely coupled approach, with components labelled Application programs, Relational DBMS (standard SQL), Spatial data processing, and DB files]
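A toy sketch of the loosely coupled idea, under several assumptions: SQLite (through Python's `sqlite3`) stands in for the relational DBMS, a plain dictionary stands in for the separate spatial module and its files, and the table `lakes`, its columns, the sample rows and the function names are all hypothetical. The only link between the two sides is a shared object identifier.

```python
import sqlite3

# Descriptive (alphanumeric) data: managed by the relational DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("create table lakes (id integer primary key, name text, max_depth_m real)")
conn.executemany("insert into lakes values (?, ?, ?)",
                 [(1, "North Lake", 12.5), (2, "South Lake", 40.0)])

# Spatial data: managed outside the DBMS, keyed by the same object id.
# A dict stands in for the spatial module's own storage.
geometry_store = {
    1: [(0.0, 0.0), (6.0, 0.0), (6.0, 2.0), (0.0, 2.0)],
    2: [(10.0, 0.0), (14.0, 0.0), (14.0, 5.0), (10.0, 5.0)],
}


def bbox(vertices):
    """Bounding box (xmin, ymin, xmax, ymax): done by the spatial side."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (min(xs), min(ys), max(xs), max(ys))


def lakes_deeper_than(depth):
    """The application combines both systems: SQL selects on attributes,
    then the geometry is looked up by id and processed separately."""
    rows = conn.execute(
        "select id, name from lakes where max_depth_m > ? order by id", (depth,)
    ).fetchall()
    return [(name, bbox(geometry_store[obj_id])) for obj_id, name in rows]
```

Note that the combination of the two results happens in the application code, not in the query language, which is one way to read the integration difficulties listed next.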

Loosely coupled approach: pros & cons

• Pros:

– Proper geo-spatial data management

– Spatial queries are directly supported

• Cons:

– Difficulty in modelling, using and integrating heterogeneous models within the same system

– Partial loss of basic DBMS functionality (e.g., for querying spatial data)

– Need to learn complex software packages

Integrated approach: SDBMS

• Based on DBMS extensibility

• Main concept: ability to add new types and operations to an existing relational DBMS

• For geo-spatial applications, extensions to relational DBMS include:

– extension of the query language SQL to allow for manipulation of spatial data as well as descriptive data. New spatial types (points, lines, and regions) should be handled in the same way as alphanumeric types

– adaptation of the usual DBMS functions (such as indexing and query optimisation) so that spatial data is also handled efficiently

• Only a few of the available DBMSs offer a spatial extension:

– Oracle 8i/9i (more later)

– Postgres
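The extensibility idea (adding a new type and a new operation to a relational DBMS) can be imitated on a very small scale with SQLite's extension hooks in Python's `sqlite3` module. This is only an analogy, not how Oracle or Postgres implement their spatial extensions: the `Point` class, the `point` column type, the text encoding `"x;y"` and the SQL function `distance` are all assumptions made for this sketch.

```python
import math
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


def adapt_point(p):
    """New type -> storable value: encode a Point as the text 'x;y'."""
    return f"{p.x};{p.y}"


def convert_point(raw):
    """Storable value -> new type: decode the bytes b'x;y' into a Point."""
    x, y = map(float, raw.split(b";"))
    return Point(x, y)


def sql_distance(a, b):
    """New operation callable from SQL; arguments arrive as stored text."""
    pa, pb = convert_point(a.encode()), convert_point(b.encode())
    return math.hypot(pa.x - pb.x, pa.y - pb.y)


sqlite3.register_adapter(Point, adapt_point)
sqlite3.register_converter("point", convert_point)

conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.create_function("distance", 2, sql_distance)

conn.execute("create table mills (name text, location point)")
conn.execute("insert into mills values (?, ?)", ("mill A", Point(0.0, 0.0)))
conn.execute("insert into mills values (?, ?)", ("mill B", Point(3.0, 4.0)))

# The spatial value is used in SQL next to ordinary alphanumeric columns.
rows = conn.execute(
    "select a.name, b.name, distance(a.location, b.location) "
    "from mills a, mills b where a.name < b.name"
).fetchall()
stored = conn.execute("select location from mills where name = 'mill B'").fetchone()[0]
```

Unlike the purely relational example, the query here mentions the spatial attribute directly and needs no knowledge of how a point is stored. A real spatial extension would additionally adapt indexing and query optimisation, which this sketch does not attempt.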

Summary: GIS vs SDBMS

• Classical GIS adopt hybrid data management

– Non-spatial (attribute) data is stored in a “standard” DB

– Spatial data is stored in proprietary (vector) files, and spatial objects in these files are linked to tables containing the corresponding attribute information

• SDBMS adopt an integrated approach

– Better for data integration but still “behind” in terms of functionality (we will see Oracle Spatial later)