outlinechapter 10 creating and maintaining geographic databases b. klinkenberg geog 376 07 outline...

12
Databases October 2007 Geography 376: B. Klinkenberg 1 Attribute data handling in a GIS environment. Chapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational model SQL Database design Linking data to place Having defined how the geography can be modeled within a GIS, we now need to consider how the characteristics (or attributes) of the geographic features are associated with that geography. GIS Data Spatial Data Attribute Linkages Attribute Data

Upload: others

Post on 21-May-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 1

Attribute data handling in a GIS environment.

Chapter 10 Creating and maintaining geographic databases

B. Klinkenberg Geog 376 07

OutlineLinking data to placeDefinitionsCharacteristics of DBMSTypes of databaseRelational modelSQLDatabase design

Linking data to placeHaving defined how the geography can be modeled within a GIS, we now need to consider how the characteristics (or attributes) of the geographic features are associated with that geography.

GIS Data

SpatialData

AttributeLinkages

AttributeData

Page 2: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 2

Storing attribute data

attribute data are stored separately from the coordinate datafeature identifier points to an attribute table:

point attribute tableline or arc attribute tablepolygon attribute table

1 2

3

polygon attribute table

similarly we can define point or line attribute tables if the spatial features are, for example, villages androads

Storing attribute data

A45233

A23482

A23431

Feature idpolygon id

Storing attribute datagood organization of the attribute data is very importantin socioeconomic GIS applications, the attribute data component is often much larger than the database component (e.g., few provinces, but hundreds of variables)

Id District Province101 Palma Merida102 S. Maria Merida103 Veralo Merida104 Bolo La Paz105 Jose La Paz106 Malabo La Paz107 Chilabo La Paz… … …

101102

103

P_Pop P_TFR Province397881 3.7 La Paz214084 3.2 Merida

… … …104

105107

106

Page 3: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 3

OutlineLinking data to placeDefinitionsCharacteristics of DBMSTypes of databaseRelational modelSQLDatabase design

DefinitionsDatabase – an integrated set of data on a particular subjectGeographic (=spatial) database -database containing geographic data of a particular subject for a particular areaDatabase Management System (DBMS)– software to create, maintain and access databases

Storing dataThere are two fundamental ways to store data:

As a simple file (e.g., a text file)In a ‘database’

Simple file structures

Page 4: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 4

Moving from files to databases

http://cs1.mcm.edu/~rob/class/dbms/notes/Chapt01/index.html

Advantages of Databases over Files

Avoids redundancy and duplicationReduces data maintenance costsApplications are separated from the data

Applications persist over timeSupport multiple concurrent applications

Better data sharingSecurity and standards can be defined and enforced

Disadvantages of Databases over Files

ExpenseComplexityPerformance – especially complex data typesIntegration with other systems can be difficult

Characteristics of DBMS (1)Data model support for multiple data types

MS Access: Text, Memo, Number, Date/Time, Currency, AutoNumber, Yes/No, BLOBs, OLE Object, Hyperlink, Lookup Wizard (other DBMS’s are similar)

Load data from files, databases and other applicationsIndex for rapid retrieval

Page 5: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 5

Characteristics of DBMS (2)Query language – SQL (also QBE, …)Security – controlled access to data

Multi-level groupsControlled update using a transaction managerBackup and recoveryDBA tools

Configuration, tuning

Characteristics of DBMS (3)Applications

CASE tools (Computer-aided software engineering)

Forms builderReportwriterInternet Application Server

Programmable API (Application Programming Interface)

OutlineLinking data to placeDefinitionsCharacteristics of DBMSTypes of databasesRelational modelSQL

Geographic Information

System

Database Management

System

• Data load• Editing• Visualization• Mapping• Analysis

• Storage• Indexing• Security• Query

Data

System TaskRole of DBMS

Page 6: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 6

Types of DBMS ModelsHierarchicalNetworkRelational - RDBMSObject-oriented - OODBMSObject-relational - ORDBMS

Hierarchical DBMS

Hierarchical and Network

Hierarchical

Network

Node X YI 1 4II 4 4III 6 4IV 4 1

Line From To Left Right1 I III O A2 I IV B O3 III IV O C4 I II A B5 II III A C6 II IV C B

Poly LinesA 1,4,5B 2,4,6C 3,5,6

1

1

5

4

3

2

6

2 3 4 5 6

A

B C

1

2 3

4 5

6

III

III

IV

O = “outside” polygon

Relational Tables: Topological data model

Keys

Foreign keys

Page 7: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 7

Object-oriented DBMS

Inheritance, encapsulation

OverviewNetwork

essentially a programmer's database model efficient but inflexible and hard to understand

Relational its only complex data type is the relationit is the only complete data model aimed at users instead of programmers relational query languages are easier to use than full-blown programming languages rich underlying theory separation of implementation and design

Object-Oriented an extension of object-oriented programming no generally agreed upon formal data model great freedom regarding complex data structures inheritance user-defined types encapsulation

OutlineLinking data to placeDefinitionsCharacteristics of DBMSTypes of databasesRelational modelSQLDatabase design

Relational DBMS (1)Data stored as tuples (tup-el), conceptualized as tablesTable – data about a class of objects

Two-dimensional list (array)Rows = objectsColumns = object states (properties, attributes)

Page 8: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 8

Table

Row = object

Column = property Table = Object Class

Row = record =tuple

[# rows = cardinality]

Column = field = attribute[# of columns = degree]

Table = file = relation

Text Table 4.1

FID = Primary Key= Index

Relational DBMS (2)Most popular type of DBMS

Over 95% of data in a DBMS is in a RDBMS

Commercial systemsIBM DB2InformixMicrosoft AccessMicrosoft SQL ServerOracleSybase

Relation Rules (Codd, 1970)Only one value in each cell (intersection of row and column)All values in a column are about the same subjectEach row is uniqueNo significance in column sequenceNo significance in row sequence

Page 9: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 9

NormalizationProcess of converting tables to conform to Codd’s relational rulesSplit tables into new tables that can be joined at query time

The relational joinSeveral levels of normalization

Forms: 1NF, 2NF, 3NF, etc.Normalization creates many expensive joinsDe-normalization is OK for performance optimization

Relational JoinFundamental query operationOccurs because

NormalizationData created/maintained by different users, but integration needed for queries

Table joins use common keys (column values -- foreign keys)Table (attribute) join concept has been extended to geographic case

OutlineLinking data to placeDefinitionsCharacteristics of DBMSTypes of databaseRelational modelSQLDatabase design

SQLStructured (or Standard) Query Language –(pronounced SEQUEL)Developed by IBM in 1970sNow de facto and de jure standard for accessing relational databasesThree types of usage

Stand alone queriesHigh level programmingEmbedded in other applications

Page 10: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 10

Types of SQL StatementsData Definition Language (DDL)

Create, alter and delete dataCREATE TABLE, CREATE INDEX

Data Manipulation Language (DML)Retrieve and manipulate dataSELECT, UPDATE, DELETE, INSERT

Data Control Languages (DCL)Control security of dataGRANT, CREATE USER, DROP USER

OutlineLinking data to placeDefinitionsCharacteristics of DBMSTypes of databaseRelational modelSQLDatabase design

Steps involved in database creationData investigation: consider the type, quantity and qualities of data to be included in the database; the nature of the entities and attributes is decided (inventory of data, needs analysis).

Data modeling: form a conceptual model of data by examining the relationships between entities and the characteristics of entities and attributes (logical design--infological model).

Steps involved in database creationDatabase design: creation of a practical design for the database. This step depends upon and is constrained by the software being used. Field names, specific attribute types and structures (e.g., tables) are decided (physical design--datalogicalmodel).

Database implementation: populating the database with attribute data. This is followed by monitoring and upkeep, fine tuning, modification and updating.

Page 11: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 11

Database design perspectivesInfological problems deal with how to define the information to be provided by the system to satisfy the needs of its users. Datalogical problems are about how to design the structure and operation of the system and to take full advantage of current information technology available. Essentially, infological work refers to system analysis and conceptual modeling, and datalogical work to technical design and physical implementation of the system.

ERMThe identification of entitiesThe identification of relations between entitiesThe identification of attributes of entities (Infological steps)The derivation of tables from this (datalogicalsteps)

Entity Relationship Modelling

ERM ERM RelationsMapping an ER Model intoa table.

Example of 1:1 relations

Example of a 1:M relation

Example of a M:N relation

Page 12: OutlineChapter 10 Creating and maintaining geographic databases B. Klinkenberg Geog 376 07 Outline Linking data to place Definitions Characteristics of DBMS Types of database Relational

Databases October 2007

Geography 376: B. Klinkenberg 12

Prof. Börje Langefors recognized the importance of three contexts in the infological approach. They are the “organizational context, wherein organized collections of people/individuals are perceived; the language context, wherein organized collections of symbols and linguistic behaviors are perceived; and technical context, wherein organized collections of technical artifacts (computers, telecommunication technologies, software) are perceived” (Iivari & Lyytinen, 1998 ; p. 170).

http://isworld.student.cwru.edu/tiki/tiki-index.php?page=Langefors_Review

Database design perspectives

This quote perfectly describes the situation wrt GIS within an organization, as well.

SummaryDatabase – an integrated set of data on a particular subjectDatabases offer many advantages over filesRelational databases dominateDatabase design issues