quick lesson on databases relational databases are key to managing complex data you’ve been using...
TRANSCRIPT
![Page 1: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/1.jpg)
Quick Lesson on Databases
• Relational databases are key to managing complex data
• You’ve been using relational databases with “Joins” and “Relates” in ArcGIS
• GeoDatabases are relational databases
• Structured Query Language (SQL) is the primary language for relational databases
• You’ve been using SQL statements in ArcGIS to query data
![Page 2: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/2.jpg)
Relational Databases
• Need to represent data with a complex structure
Plot
TreeSpecies
![Page 3: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/3.jpg)
Database Tables
• What you’ve seen in ArcGIS only more flexible
• Tables are made up of “fields” (columns) and “records” (rows)
• Queries are used to combine and subset tables into new tables
• Each table should have a unique, integer, ID, referred to as a primary key– Greatly improves query performance
![Page 4: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/4.jpg)
Field Data Types
• Numeric– Float or integer– Auto numbered, use for primary keys
• Dates– YYYY-MM-DD HH:MM:SS.SS– 2013-04-05 14:23:12.34
• Text– Specified width– “Variant” width
• Binary Large Objects (BLOB)
![Page 5: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/5.jpg)
What’s Wrong With This?Tree QueryLAT LON MEASYEAR MEASMON MEASDAY COMMON_NAME HT
45.446392 -122.236107 1995 6 22 Douglas-fir 4945.446392 -122.236107 1995 6 22 Douglas-fir 2745.446392 -122.236107 1995 6 22 Douglas-fir 9545.446392 -122.236107 1995 6 22 Douglas-fir 6645.446392 -122.236107 1995 6 22 Douglas-fir 11845.446392 -122.236107 1995 6 22 Douglas-fir 7645.446392 -122.236107 1995 6 22 Douglas-fir 14745.456116 -122.397774 1995 6 22 Douglas-fir 18545.456116 -122.397774 1995 6 22 Douglas-fir 10545.456116 -122.397774 1995 6 22 Douglas-fir 10545.456116 -122.397774 1995 6 22 Douglas-fir 8945.193054 -122.51667 1996 6 23 Douglas-fir 9045.193054 -122.51667 1996 6 23 Douglas-fir 9545.193054 -122.51667 1996 6 23 Douglas-fir 9645.193054 -122.51667 1996 6 23 Douglas-fir 99
![Page 6: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/6.jpg)
Relational Databases
• Allow us to “relate” tables to:– Reduce the overall amount of data
• Removes duplicates
– Makes updates much easier– Improves search speeds
![Page 7: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/7.jpg)
Entity-Relationship Diagram
• ERD– Unified Markup Language (UML)
Plot
TreeSpecies
Entities
Relationships
One to one
One to many
Many to many
Relationship Types
![Page 8: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/8.jpg)
ID Lat Lon Year Month Day
1 45.446392 -122.236107 1995 6 22
2 45.193054 -122.51667 1995 6 22
Plot
ID Common Name
1 Douglas-fir
2 Ponderosa Pine
Species
ID PlotID SpeciesID Height
1 1 1 49
2 1 1 27
3 1 1 95
4 1 1 66
5 1 1 118
… 1 … …
12 2 1 90
13 2 1 95
Tree
Primary Key
Foreign Key
![Page 9: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/9.jpg)
Database Normalization
1. Eliminate duplicate columns from the same table
2. Move fields that have “duplicate” row entries and move them to a related table
3. All field entries should be dependent on the primary key
4. There should be only one primary key in each table
![Page 10: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/10.jpg)
Database Dictionary
• Defines each of the tables and fields in a database
• A database forms the basis for data management behind many GIS projects, web sites, and organizations
• Proper documentation is key to long term success!– Database design (including ERDs)– Database Dictionary
![Page 11: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/11.jpg)
Geospatial Databases
• Not required to store spatial data!
• Provide:– Field types for spatial data: point, polyline,
polygon, etc.– Spatial operations: union, intersect, etc.– Spatial queries: return records that overlap
with a polygon, etc.– Some provide spatial reference control
![Page 12: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/12.jpg)
What we really want
• What we need from a database:– Distributed, concurrent access
(concurrency)– Automatic Backup– Version control– Unlimited amounts of data– Quick data access– Inexpensive– Broad OS Support– File-level copying– GeoSpatial queries, operations, data types
![Page 13: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/13.jpg)
Relational Databases
• Enterprise-Level – SQL Server– PostgreSQL– MySQL– Oracle– Sybase
• File-Level– Geodatabase– MS-Access
![Page 14: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/14.jpg)
What we haveSQL Server PostgreSQL ESRI
GeodatabaseMS-Access
Concurrency Yes Yes No No
Automatic backup
Yes Yes No No
Versioning No No No No
Data Size 100s of millions 100s of millions 100,000? 100,000?
Performance Fast Fast Good Poor
Cost $600 per CPU Free ~$10,000 w/ArcGIS
~$400
OS Windows Any Windows Windows
File-level copy No No Yes Yes
Spatial Queries Yes Yes Yes No
Spatial data types
Yes Yes Yes No
Spatial operations
Yes Yes Yes No
![Page 15: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/15.jpg)
Structured Query Language (SQL)
• Comes from the database industry
• “INSERT”, “DELETE”, and “SELECT” rows in tables
• Very rich syntax
• Portions of “SELECT” grammar used heavily in ArcGIS:– Selecting attributes– Raster calculator– Geodatabases
![Page 16: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/16.jpg)
Transaction SQL
• “SQL” is a subset of T-SQL
• T-SQL allows full management of a database:– Create & drop:
• Tables, fields/columns, relationships, indexes, views, etc.
– Administrative functions
• Varies some between databases
![Page 17: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/17.jpg)
Using SQL
• All Databases have “query editors” that allow us to write, save, edit, and use SQL queries
• Use programming languages to “write” queries and “fetch” records from the database
![Page 18: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/18.jpg)
SQL: SELECT
SELECT Field1, Field2
FROM TableName INNER JOIN TableName2
ON TableName2.FK=TableName.PK
WHERE Filter1 AND Filter 2
GROUP BY Field1,Field2
ORDER BY Field1 [DESC], Field2 [DESC]
FK=Foreign Key, PK=Primary Key
![Page 19: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/19.jpg)
Selecting Fields
• SELECT *– Returns all fields as new table
• SELECT Field1,Field2
• SELECT Table1.Field1,Table2.Field1– Return specified fields
• SELECT Table1.Field1 AS NewName– Avoids name collisions
![Page 20: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/20.jpg)
Selecting Tables
• FROM Table1– Returns contents of one table
• FROM Table1 INNER JOIN Table2 ON Table2.ForeignKey=Table1.PrimaryKey– Returns records from Table2 that match
primary keys in Table1– Does not return all rows in Table1
![Page 21: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/21.jpg)
Selecting Tables (con’t)
• FROM Table1 OUTER JOIN Table2 ON Table2.ForeignKey=Table1.PrimaryKey– Returns all matches between Table1 and
Table2 and any records in Table1 that don’t match records in Table2
– Missing values are NULL
![Page 22: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/22.jpg)
Filters or “WHERE” clauses
SELECT *
FROM Table1
WHERE (Field1 Operator Value1) BooleanOperator (Field1 Operator Field2)
![Page 23: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/23.jpg)
Filter Examples
• WHERE:– ID = 1– Area < 10000– Area <= 10000– Name = “Crater Lake” (case dependent)– Name LIKE “Crater Lake” (ignores case)
• Notice:– String values have double quotes– Syntax for strings vary some between
databases
![Page 24: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/24.jpg)
SQL Comparisons
• Equals: =
• Greater than: >
• Less than: <
• Greater than or equal: >=
• Less than or equal: <=
• Not equal: <>
• Like: case independent string comparison with wild cards (%)
![Page 25: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/25.jpg)
Boolean Operators
A B A AND B A OR B NOT A NOT B
T T T T F F
T F F T F T
F T F T T F
F F F F T T
![Page 26: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/26.jpg)
More Complex Filter Examples
• WHERE:– Name LIKE “Hawaii” AND Area < 10000– Species LIKE “Ponderosa” AND DBH > 1
![Page 27: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/27.jpg)
ORDER BY
SELECT *
FROM Table 1
ORDER BY LastName DESC, FirstName DESC
•Careful with performance on large datasets and string fields
![Page 28: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/28.jpg)
GROUP BY
• Aggregates data
SELECT Species ,AVG(Height)
FROM Trees
GROUP BY Species
• Only aggregated fields can appear in SELECT list
![Page 29: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/29.jpg)
SQL INSERT
• INSERT INTO TableName (Field1,Field2) VALUES (Value1,”Value2”)
• String values must be in quotes– Other values can also be in quotes
• If the table has an “auto numbered” ID field, it will be added automatically
• Otherwise, very difficult to set the ID field
![Page 30: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/30.jpg)
SQL DELETEDELETE FROM TableName
WHERE ID=Value- Deletes one row
DELETE FROM Plot
WHERE PlotID=12
- Deletes all rows with PlotID=12
DELETE FROM TableName- Deletes everything in TableName!
![Page 31: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/31.jpg)
Database Performance
Default Search
Indexed Search
Primary Key Search
![Page 32: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/32.jpg)
Indexes
• Added to a table– Typically for one field
• Adds overhead to INSERT and DELETEs
• Important for:– Large tables– Complex queries– Especially text searches!
![Page 33: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/33.jpg)
Maintaining Performance
• Always use integer, auto numbered primary keys
• Avoid iterative or hierarchical queries
• Sometimes code is faster:– Do simple query, load into RAM and sort
• With REALLY big data, don’t use SQL– NoSQL, accessing data directly, without the
use of a relational database package– There are “NoSQL” products in the works
• Avoid text searches and sorts
![Page 34: Quick Lesson on Databases Relational databases are key to managing complex data You’ve been using relational databases with “Joins” and “Relates” in ArcGIS](https://reader030.vdocuments.us/reader030/viewer/2022032604/56649e665503460f94b606c4/html5/thumbnails/34.jpg)
Rasters and Databases
• Don’t put rasters into a database!– Makes it impossible to backup and restore
the database– Put a file path to the rasters in the database