Spatial Databases: Building Spatial DB
Spring, 2015
Ki-Joune Li
STEMPNU

Importance of Database
Application of Spatial Databases (e.g. GIS)
  - Garbage-In, Garbage-Out
  - About 70% of GIS development cost is DB cost

Comparison with Software Lifecycle
Step  Software Life Cycle (Waterfall Model)  DB Life Cycle
1     Requirement Analysis                   Requirement Analysis
2     Functional Specification               Modeling
3     Design                                 Schema Design
4     Development Environments               DB Environments
5     Coding                                 Data Collection and Input
6     Test                                   Quality Control
7     Maintenance                            Maintenance

Requirement Analysis
Analysis of the current state (as it is) and of the required state (as it shall be).
Output of Analysis
  - Use-Case Diagram of UML: Workflow Analysis
  - Data items that have been maintained and are to be maintained
  - Description of each item: Data Dictionary
  - Relationships and constraints on items
  - Required accuracy: spatial precision, temporal precision

Data Dictionary
Definitions and representation of data items, such as
  - Precise definition of data elements
  - Integrity constraints
  - Stored procedures and trigger rules
  - Specification of the producer and consumer of each data element
Why is it so important?
  - Common understanding of data items
  - Consistency of databases
  - Important input to data modeling
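A minimal sketch of how one data dictionary entry could be recorded and checked for completeness. The class name, field names and sample items are hypothetical and only illustrate the kinds of information listed above.

```python
from dataclasses import dataclass, field


@dataclass
class DataDictionaryEntry:
    """One data item of the data dictionary (all field names are assumptions)."""
    name: str
    definition: str                                   # precise definition of the data element
    data_type: str                                    # representation of the item
    constraints: list = field(default_factory=list)  # integrity constraints, as text
    producer: str = ""                                # who creates the data element
    consumers: list = field(default_factory=list)    # who uses the data element


def undocumented_items(entries):
    """Return the names of entries that lack a definition or a producer."""
    return [e.name for e in entries
            if not e.definition.strip() or not e.producer.strip()]


entries = [
    DataDictionaryEntry("road_width", "Paved width of a road segment in meters",
                        "float", ["road_width > 0"], "Survey team", ["Traffic planning"]),
    DataDictionaryEntry("pole_id", "", "int"),
]
print(undocumented_items(entries))
```

A check like this is one way to make the dictionary usable as input to data modeling: items without an agreed definition are found before they reach the schema.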

Data Modeling
Data Modeling
  - Understanding the real world and the application
  - Covers only a very small piece of the real world: according to the viewpoint, determined by the applications
  - Drawing what you have understood in a formal method: Class Diagram in UML
4 steps
  1. Definition of Entities
  2. Attributes of each Entity
  3. Relationships
  4. Constraints

Class Diagram: Basic
[Figure: UML class diagram of a rental example. Classes: DVD Movie, VHS Movie, Video Game, Rental Item {abstract}, Rental Invoice, Customer, Checkout Screen. Multiplicity labels: 1..*, 1, 0..1, 1.]
Notation shown: Class, Abstract Class, Simple Association, Simple Aggregation, Composition, Generalization, (Dependency), Multiplicity
Class box
  MyClassName
  +SomePublicAttribute : SomeType
  -SomePrivateAttribute : SomeType
  #SomeProtectedAttribute : SomeType
  +ClassMethodOne()
  +ClassMethodTwo()
  Responsibilities -- can optionally be described here.

Definition of Entities
  - Extract nouns from the problem statement and the Use-Case Diagram
  - Delete unnecessary entities
      - Duplication
      - Attributes rather than entities (ex. loan amount)
  - Definition of Features
      - Geographic entity
      - Granularity

Definition of Features
Feature
  - Meaningful object of GIS in the real world
  - Must have a geometry: point, line, polygon, etc.
How to define the granularity of features? Examples:
  - How to define “a” coastal line?
  - The highway from Pusan to Seoul is a long feature. How to separate this long road?

Definition of Attributes
Attributes of Feature
  - Geometric type: Spatial Attribute
  - Non-Spatial Attributes
Geometric Type: Different Levels of Detail (LOD)
  - Building: Polygon in 1/1,000 scale, Point in 1/1,000,000 scale
  - Road: Polygon in 1/1,000 scale, Polyline in 1/1,000,000 scale
Class box
  MyClassName
  +SomePublicAttribute : SomeType
  -SomePrivateAttribute : SomeType
  #SomeProtectedAttribute : SomeType
  +GeometricAttribute
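The LOD idea above can be sketched as a feature that stores one geometric type per scale. The class, its fields and the selection rule (take the coarsest stored level that is still at least as detailed as the requested scale) are assumptions made for illustration; the slides do not prescribe a rule.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    name: str
    # Non-spatial attributes, e.g. number of floors or lanes.
    attributes: dict = field(default_factory=dict)
    # Spatial attribute: geometric type keyed by scale denominator
    # (1_000 stands for 1/1,000 scale).
    lod_geometry: dict = field(default_factory=dict)

    def geometry_type_at(self, scale_denominator):
        """Pick the stored LOD with the largest denominator not exceeding the request.

        Falls back to the most detailed stored LOD when the request is more
        detailed than anything stored (hypothetical rule).
        """
        candidates = [d for d in self.lod_geometry if d <= scale_denominator]
        chosen = max(candidates) if candidates else min(self.lod_geometry)
        return self.lod_geometry[chosen]


building = Feature("Building", {"floors": 3}, {1_000: "Polygon", 1_000_000: "Point"})
road = Feature("Road", {"lanes": 4}, {1_000: "Polygon", 1_000_000: "Polyline"})

print(building.geometry_type_at(1_000))      # 1/1,000 scale
print(building.geometry_type_at(1_000_000))  # 1/1,000,000 scale
```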

Relationship
  - Non-Spatial Relationship
  - Spatial Relationship: Topology

Constraints
Example
  - No building on road surface
  - More than 50 meters between two poles
Implementation
  - Internal functions for checking constraints (or constructor)
  - Spatial OCL (Object Constraint Language)
More detailed and complete constraints: better quality of DB
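A minimal sketch of an internal checking function for the second example constraint (more than 50 meters between two poles). It assumes planar coordinates in meters; the function name and the sample poles are hypothetical.

```python
import math
from itertools import combinations

MIN_POLE_SPACING_M = 50.0  # from the example constraint above


def pole_spacing_violations(poles, min_spacing=MIN_POLE_SPACING_M):
    """Return the pairs of pole ids whose distance is not more than min_spacing.

    `poles` maps a pole id to an (x, y) position in meters (assumed planar).
    """
    violations = []
    for (id_a, (xa, ya)), (id_b, (xb, yb)) in combinations(sorted(poles.items()), 2):
        if math.hypot(xa - xb, ya - yb) <= min_spacing:
            violations.append((id_a, id_b))
    return violations


poles = {"P1": (0.0, 0.0), "P2": (60.0, 0.0), "P3": (60.0, 30.0)}
print(pole_spacing_violations(poles))
```

The pairwise loop is quadratic in the number of poles; on a large database the same check would normally be restricted to nearby candidates through a spatial index.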

Quality Control for Data Modeling
For quality control: a simulation with a pre-defined test scenario

Schema Design
Automatic conversion from data model to schema
Check points: performance issues
  - Materialization
  - Index
  - Geographic distribution of DB: Clustering
Based on workload analysis
  - Distribution of operations
  - Distribution of values

Materialization
In SQL, a view is a virtual table derived from a SELECT statement.
Example
  CREATE VIEW ExcellentStudents AS
  SELECT Name, Department, Score
  FROM Students
  WHERE Score > 4.0;

  SELECT Name
  FROM ExcellentStudents
  WHERE Department = 'CS';
The second query invokes the view ExcellentStudents. Materialization stores the result of the view as a table instead of deriving it at query time.
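A small runnable sketch of the difference, using Python's `sqlite3` module. SQLite has no materialized views, so the materialized copy is emulated here with `CREATE TABLE ... AS SELECT`; the table name `ExcellentStudentsMat` and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (Name TEXT, Department TEXT, Score REAL);
INSERT INTO Students VALUES ('Kim', 'CS', 4.3), ('Lee', 'CS', 3.5), ('Park', 'EE', 4.2);

-- Virtual table: evaluated every time it is queried.
CREATE VIEW ExcellentStudents AS
    SELECT Name, Department, Score FROM Students WHERE Score > 4.0;

-- Emulated materialization: the result is computed once and stored.
CREATE TABLE ExcellentStudentsMat AS
    SELECT Name, Department, Score FROM Students WHERE Score > 4.0;
""")


def cs_names(relation):
    """Names of CS students found in the given view or table."""
    rows = conn.execute(
        f"SELECT Name FROM {relation} WHERE Department = 'CS' ORDER BY Name")
    return [name for (name,) in rows]


before = (cs_names("ExcellentStudents"), cs_names("ExcellentStudentsMat"))
conn.execute("UPDATE Students SET Score = 4.5 WHERE Name = 'Lee'")
after = (cs_names("ExcellentStudents"), cs_names("ExcellentStudentsMat"))
print(before, after)
```

After the update the view reflects the new score, while the stored copy is stale until it is refreshed.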

Materialize or Not?
Materialization
  - Duplication
      - Not 3NF (BCNF)
      - May cause an inconsistency between the original and derived tables
      - Update: overhead due to update propagation
  - Extra space requirements
Should be determined depending on the WORKLOAD
  - Frequency of updates
  - Cost of update propagation, especially when the materialized view is geographically distributed
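One way to picture update propagation is a trigger that keeps a stored copy consistent with the original table, sketched below with `sqlite3`. The schema reuses the ExcellentStudents example; the table `ExcellentStudentsMat`, the trigger name, the primary key on `Name` and the sample rows are assumptions. Only score updates are propagated; inserts and deletes would need their own triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (Name TEXT PRIMARY KEY, Department TEXT, Score REAL);
CREATE TABLE ExcellentStudentsMat (Name TEXT PRIMARY KEY, Department TEXT, Score REAL);

-- Extra work executed on every score update: this is the propagation overhead.
CREATE TRIGGER propagate_score AFTER UPDATE OF Score ON Students
BEGIN
    DELETE FROM ExcellentStudentsMat WHERE Name = OLD.Name;
    INSERT INTO ExcellentStudentsMat
        SELECT NEW.Name, NEW.Department, NEW.Score WHERE NEW.Score > 4.0;
END;

INSERT INTO Students VALUES ('Kim', 'CS', 4.3), ('Lee', 'CS', 3.5);
INSERT INTO ExcellentStudentsMat
    SELECT Name, Department, Score FROM Students WHERE Score > 4.0;
""")

conn.execute("UPDATE Students SET Score = 4.5 WHERE Name = 'Lee'")  # Lee enters the copy
conn.execute("UPDATE Students SET Score = 3.0 WHERE Name = 'Kim'")  # Kim leaves the copy
materialized = [name for (name,) in
                conn.execute("SELECT Name FROM ExcellentStudentsMat ORDER BY Name")]
print(materialized)
```

Each update now costs two additional statements on the stored copy, which is why the update frequency of the workload matters for the decision.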

Spatial Index
Index: accelerates search
Spatial Index
  - Spatial predicates: contain, overlap, k-NN
  - Greatly improves query processing performance
  - Has a performance overhead for insertion/deletion
[Figure: two-phase search. 1st phase: the search condition is evaluated on the index, which returns block numbers { Block# }. 2nd phase: the blocks with these numbers are searched in the database on disk.]
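The two phases can be sketched with a uniform grid used as a very simple spatial index. The grid, the cell size and all names are assumptions chosen for brevity; the slides do not name a particular index structure.

```python
from collections import defaultdict

CELL = 10.0  # hypothetical grid cell size


def build_grid_index(points):
    """Map each grid cell (playing the role of a block number) to the point ids in it."""
    index = defaultdict(list)
    for pid, (x, y) in points.items():
        index[(int(x // CELL), int(y // CELL))].append(pid)
    return index


def range_query(points, index, xmin, ymin, xmax, ymax):
    """Return (ids inside the window, number of candidates read)."""
    # 1st phase: the index yields the cells that overlap the search window.
    candidates = []
    for cx in range(int(xmin // CELL), int(xmax // CELL) + 1):
        for cy in range(int(ymin // CELL), int(ymax // CELL) + 1):
            candidates.extend(index.get((cx, cy), []))
    # 2nd phase: only the candidates are read and tested against the exact condition.
    hits = sorted(pid for pid in candidates
                  if xmin <= points[pid][0] <= xmax and ymin <= points[pid][1] <= ymax)
    return hits, len(candidates)


points = {"a": (1.0, 1.0), "b": (8.0, 9.0), "c": (12.0, 3.0), "d": (55.0, 70.0)}
index = build_grid_index(points)
hits, examined = range_query(points, index, 5.0, 0.0, 15.0, 5.0)
print(hits, examined)
```

The insertion overhead mentioned above shows up in `build_grid_index`: every new object must also be registered in the index.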

Clustering: Placement of records
Vertical Fragmentation vs. Horizontal Fragmentation
  - Vertical Fragmentation: decomposition of a table
  - Horizontal Fragmentation: placement of objects
  - Consideration of workload
[Figure: Vertical Fragmentation, Horizontal Fragmentation]
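A minimal sketch of both kinds of fragmentation on an in-memory table (a list of rows). The column groups, the `west`/`east` placement rule and the sample rows are hypothetical; in practice they would follow from the workload.

```python
def vertical_fragments(rows, key, column_groups):
    """Decompose the table by columns; every fragment keeps the key for re-joining."""
    return [[{c: row[c] for c in (key, *group)} for row in rows]
            for group in column_groups]


def horizontal_fragments(rows, site_of):
    """Place whole rows (objects) into fragments chosen by site_of(row)."""
    fragments = {}
    for row in rows:
        fragments.setdefault(site_of(row), []).append(row)
    return fragments


buildings = [
    {"id": 1, "name": "City Hall", "owner": "City", "x": 10.0, "y": 20.0},
    {"id": 2, "name": "Library", "owner": "City", "x": 80.0, "y": 25.0},
]

spatial, non_spatial = vertical_fragments(buildings, "id", [("x", "y"), ("name", "owner")])
by_site = horizontal_fragments(buildings, lambda r: "west" if r["x"] < 50.0 else "east")
print(spatial, non_spatial, by_site)
```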
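
Clustering
Clustering: grouping objects so as to maximize $\mathrm{Prob}(a \in C, b \in C)$ when $O_K = a$ and $O_{K+1} = b$, for any two objects $a$ and $b$ of the same group $C$. Here $O_K$ and $O_{K+1}$ are two consecutive accesses.
Spatial Clustering, basic assumption:
  If $\mathrm{dist}(a,b) < \mathrm{dist}(a,c)$, then $\mathrm{Prob}(O_K = a, O_{K+1} = b) > \mathrm{Prob}(O_K = a, O_{K+1} = c)$
[Figure: three objects a, b and c]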
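Under the basic assumption above, placing nearby objects in the same block should make consecutive accesses hit the same block more often. The sketch below uses Z-order (Morton) sorting as a simple stand-in; it is not one of the methods named in these slides, and the integer coordinates, block size and names are assumptions. Z-order keeps many, but not all, neighbors together.

```python
def interleave_bits(x, y, bits=16):
    """Morton code: interleave the bits of non-negative integers x and y."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def cluster_into_blocks(points, block_size):
    """Sort object names along the Z-order curve and cut the order into blocks."""
    ordered = sorted(points, key=lambda name: interleave_bits(*points[name]))
    return [ordered[i:i + block_size] for i in range(0, len(ordered), block_size)]


# Two nearby pairs: (a, b) and (c, d).
points = {"a": (1, 1), "b": (2, 1), "c": (9, 8), "d": (8, 9)}
blocks = cluster_into_blocks(points, block_size=2)
print(blocks)
```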

Spatial Clustering Methods
  - k-Means
  - CLARANS, in IEEE TKDE 2002, 14(5)
  - BIRCH, in Proc. ACM SIGMOD 1996
  - DBSCAN, in Proc. KDD 1996
  - SMTIN, in Proc. ACM-GIS 1997
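As an illustration of the first method in the list, here is a minimal k-means sketch in plain Python. The initial centers are passed in explicitly so that the run is deterministic; the fixed iteration count, the function name and the sample points are assumptions, and a production version would add a convergence test and a better initialization.

```python
import math


def k_means(points, centers, iterations=10):
    """Alternate between assigning points to the nearest center and re-centering."""
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # New center = mean of the group; an empty group keeps its old center.
        centers = [
            tuple(sum(coord) / len(group) for coord in zip(*group)) if group else centers[i]
            for i, group in enumerate(groups)
        ]
    return centers, groups


points = [(0, 0), (0, 2), (10, 10), (10, 12)]
centers, groups = k_means(points, centers=[(0, 0), (10, 10)])
print(centers, groups)
```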