by xiaohui liu...o postgres 9.3.1 - use the 32-bit version as additional computing power is not...

33
1 GeoDatabase Practice with PostGreSQL by Xiaohui Liu Objectives Create a new database schema in Postgres Use the Postgres shapefile loader plugin to import data in Esri's shapefile format Write and execute queries in Postgres Describe the spatial data types built into PostGIS Translate features in textual format into PostGIS geometries View PostGIS tables as layers in QGIS Use QGIS to conduct basic GIS operations Activities 1. Please download and install Postgres, PostGIS and QGIS (all open-source desktop GIS packages): o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted to provide a Postgres user name and password, accept the default user name of 'postgres' and specify a password you will remember. After Postgres is installed, when prompted “want to launch the "Stack Builder," a separate package that allows for the installation of add-ons to Postgres answer Yes to this prompt and under the category of Spatial Extensions, choose PostGIS 2.0. o PostGIS - You can use a separate installer file if for some reason you do not use the Stackbulder, you can use this installer. Follow the link go to PostGIS Windows download pg92 directory postgis-pg92- setup-2.0.3-2.exe o QGIS - click the Download QGIS link under the Windows heading Getting Started with PostGIS Download the data file PGIS_Data1.zip unzip it to a folder of your choice You will use Postgres's graphical interface pgAdmin. You will import a shapefile, load data from a text file and see how queries are performed in pgAdmin. A. Create a new schema 1. Open pgAdmin (Start PostgreSQL 9.2 pgAdmin III a. The application should open with an Object Browser pane displayed on the left side of the window displaying a tree with Server Groups at the top. b. Beneath Server Groups, you should see a Servers node and beneath that you should see a Postgres 9.2 server called localhost (which is accessed through port 5432). 2. Double-click on your localhost server and log in using the default user name of postgres. 3. Enter the password you defined earlier for the postgres account when you installed the software. You should now see 4 nodes beneath the localhost server: Databases, Tablespaces, Group Roles and Login Roles. 4. Expand the Databases list. You should see two "starter" databases: postgres and template_postgis_20. The first was created when you installed Postgres; the second when you installed PostGIS. 5. Click on template_postgis_20 to connect to the database Click on the plus sign to view the objects associated with the database. We will be using the schemas (for a database). When you begin a new project, you'll want to create a new schema. 6. Expand the Schemas list there should be only two schemas - public and topology. Create a new schema before looking into the other existing schemas. 7. Right-click on Schemas select New Schema specify the name usa in the dialog box set the Owner of the schema to postgres Click OK. B. Load data from a shapefile

Upload: others

Post on 21-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

1

GeoDatabase Practice with PostGreSQL

by Xiaohui Liu

Objectives

• Create a new database schema in Postgres

• Use the Postgres shapefile loader plugin to import data in Esri's shapefile format

• Write and execute queries in Postgres

• Describe the spatial data types built into PostGIS

• Translate features in textual format into PostGIS geometries

• View PostGIS tables as layers in QGIS

• Use QGIS to conduct basic GIS operations

Activities

1. Please download and install Postgres, PostGIS and QGIS (all open-source desktop GIS packages):

o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may

use 64 bit for your project and production).

▪ When prompted to provide a Postgres user name and password, accept the default user name of

'postgres' and specify a password you will remember.

▪ After Postgres is installed, when prompted “want to launch the "Stack Builder," a separate package

that allows for the installation of add-ons to Postgres → answer Yes to this prompt and under the

category of Spatial Extensions, choose PostGIS 2.0.

o PostGIS - You can use a separate installer file if for some reason you do not use the Stackbulder, you can

use this installer. Follow the link → go to PostGIS Windows download → pg92 directory → postgis-pg92-

setup-2.0.3-2.exe

o QGIS - click the Download QGIS link under the Windows heading

Getting Started with PostGIS

• Download the data file PGIS_Data1.zip → unzip it to a folder of your choice

• You will use Postgres's graphical interface pgAdmin. You will import a shapefile, load data from a text file and see

how queries are performed in pgAdmin.

A. Create a new schema

1. Open pgAdmin (Start → PostgreSQL 9.2 → pgAdmin III

a. The application should open with an Object Browser pane displayed on the left side of the window

displaying a tree with Server Groups at the top.

b. Beneath Server Groups, you should see a Servers node and beneath that you should see a Postgres 9.2

server called localhost (which is accessed through port 5432).

2. Double-click on your localhost server and log in using the default user name of postgres.

3. Enter the password you defined earlier for the postgres account when you installed the software. You should now

see 4 nodes beneath the localhost server: Databases, Tablespaces, Group Roles and Login Roles.

4. Expand the Databases list. You should see two "starter" databases: postgres and template_postgis_20. The first

was created when you installed Postgres; the second when you installed PostGIS.

5. Click on template_postgis_20 to connect to the database → Click on the plus sign to view the objects associated

with the database. We will be using the schemas (for a database). When you begin a new project, you'll want to

create a new schema.

6. Expand the Schemas list → there should be only two schemas - public and topology. Create a new schema

before looking into the other existing schemas.

7. Right-click on Schemas → select New Schema → specify the name usa in the dialog box → set the Owner of

the schema to postgres → Click OK.

B. Load data from a shapefile

Page 2: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

2

• PostGIS users should convert Esri shapefile format data to PostGIS tables. We will use plug-ins to do this conversion.

1. In pgAdmin → select Plugins → PostGIS Shapefile and DBF loader 2.0.

a. Click the Add File button → navigate to your data folder →Select States.shp → Click Open.

2. Confirm using the View Connection Details button that the PostGIS Connection parameters are set to:

Username: postgres

Password: should already be set to your password

Server Host: localhost (port 5432)

Database: template_postgis_20

3. In the Import List section, supply the following settings by clicking on the current values under each heading:

Schema: usa

Table: states

Geo Column: geom

SRID: 4269

(After setting the last value, click away from the row of specifications to avoid changing the value to its original

value.)

4. Look at the SRID (Spatial Reference IDentification) setting. This ID is set to 0 (spatial reference of the shapefile

is unknown) by default. For now, use the value 4269 (decimal degree/NAD83) for the ID.

5. Click Import → the Log Window area should report that the import process has been completed →Close the

import manager.

6. Expand the object list associated with the usa schema → Click on Tables → the newly imported states table must

be visible under the Properties tab → Right-click on the states table → select View Data → View Top 100

Rows.

a. Refresh the GUI if necessary by right-clicking on the schema or Tables node in the Object browser and

selecting Refresh (or hitting F5 on the keyboard).

b. Note the other options in this context menu.

c. Column headers include column names and data types. The gid column is an auto-incrementing integer

column added by the importer. The presence of [PK] indicates that gid is designated as the primary key.

d. The geom column shows values for only a couple of rows (geometries that too long are not displayed

though stored).

7. Follow the steps to import us_cities shapefile and rename it to cities.

C. Create a new table

• Loading data from a comma-delimited text file is common. Start with creating a new blank table.

1. Right-click on Tables under the usa schema → select New Table

a. Under the Properties tab, set the table's Name to census2010 and the Owner to postgres

b. Under the Columns tab → click Add →Under the Properties tab, set the column's Name to state, its

Data type to character varying and its Length to 50 → Click OK.

i. Add a column with the Name total and Data type of integer

ii. Add following columns of integer type. (Note: follow the order so that data from the text is

inserted to correct column: male, female, white, black, amind, asian, hawaiian, other, mixed )

2. After adding columns → click Constraints tab → Confirm that Primary Key is selected from the drop-down list

near the bottom of the dialog → Click Add.

a. In the New Primary Key dialog, set the Name to census2010_pk.

b. Under the Columns tab, select state from Columns drop-down list → click Add → Click OK to finish

adding the primary key.

3. Click OK on the New Table dialog to finish creating the table.

D. Load data using the COPY command

• Before completing data import, look at the data in a text editor and note its location.

1. Determine the full path name of the location of your census2010.csv file. Copy the exact path of your file

Page 3: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

3

a. In pgAdmin → Click SQL → Open Query builder dialog → in the SQL Editor text box, enter the

following command,

b. COPY usa.census2010 FROM 'ENTER YOUR FILE PATH AS COPIED' WITH (FORMAT csv,

HEADER True, QUOTE ' " '); File path should look like: 'C:\temp\PGIS_Data1\census2010.csv'

2. Look at the options in the WITH clause. In case of tab-delimited data, the FORMAT setting would be

text. Because the data contains a header row, HEADER option is set to True. The QUOTE option was set to the

double-quote character to indicate that the input file encloses text strings with that character. A number of other

options are available on the documentation. Among the other options that you may need to set is DELIMITER,

which defaults to comma for csv files and tab for text files. If your file uses another delimiter, such as the pipe

character (|), you can indicate that using the DELIMITER option.

3. COPY command inserts values from the first column of the input file into the first column of the table, values

from the second column of the input file into the second column of the table, etc. The HEADER option simply

tells Postgres to skip the first line, not to read the column headers and intelligently match the columns of the input

file to the columns of the table. If your table happens to have more columns than the input file and/or the columns

are in a different order, you can deal with this by supplying a list of column names that matches the input file after

the table name. For example:

COPY usa.census2010 (state, total, male....) FROM ....

4. Click Execute query to execute the command. You should receive a successful implementation message (there

should be 52 rows).

a. If you encounter a "permission denied" error, the "postgres" database login doesn't have permission to

read the csv file where you downloaded it. Try copying it to a sub-directory belonging to the "Public"

user (e.g., 'C:\Users\Public\Public Documents') or to a location that has no permission restrictions (e.g.,

'C:\temp').

E. Write queries in pgAdmin

1. Click on the SQL button on the pgAdmin toolbar to open the Query GUI. You can write your own query or

build one the Graphical Query Builder.

2. In the SQL Editor box, write the query to identify the states where most of the population uses the term

'Soda' when referring to soft drinks:

3. Run the query by clicking the Execute query ("play") button on the toolbar. You should receive an error relation

"states" does not exist. This is because the states table is part of usa schema. You must specify the schema name

followed by table name: usa.states in the FROM syntax.

4. You can reset pgAdmin's search path to your schema (By default, pgAdmin searches only the public schema.).

a. Cut the query from SQL editor and paste it in a text pad.

b. Enter the following statement into the SQL Editor: SET search_path TO usa, public;

5. Click Execute query → this query will set usa schema in the public schema.

a. Write the following query: SHOW search_path; to find out what the search path is.

6. Rewrite the query to select states. You should get 17 states in the Output pane. pgAdmin converts all

table/column names to lower-case prior to execution by default. So, pay attention to table name as SQL is case

sensitive. To override this case conversion, you can put the table/column name in double quotes. SELECT name,

sub_region FROM "states" WHERE sub_region = 'Soda';

F. Query-Writing Practice Exercises

• Implement the following queries using the following tables: the 2008 population data, soft drink data and geometries

are in the states table; the 2010 data are in the census2010 table.

1. Select the states with a 2008 population over 10 million.

2. Select the state capitals.

3. Select the states whose names begin with the word "New".

4. Select the cities whose names contain the letter "z".

5. Sort the states by their 2008 population from high to low.

6. Sort the states first by soft drink name, then by state name.

Page 4: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

4

7. Select the states with a 2008 population over 10 million and where the majority of population refers to soft drinks

as pop.

8. Select cities in the states of NY, NJ and PA (using the stateabb column).

9. For each state, compute the percentage of the 2010 population that is white. Give this output column an alias of

pctwhite. Besides pctwhite, include only the name of the state in the output. Note: the columns involved in this

calculation are defined as integers, which means the resulting value will be rounded to the nearest integer (0). To

avoid this rounding and obtain the desired percentages, add '::double precision' after the white column in the

calculation. This will convert the integer values to double precision values prior to the calculation. It is only

necessary to perform this conversion for one of the columns involved in the calculation.

10. Sum the 2008 state populations across the soft drink categories (i.e., What is the population of the 'soda' states? Of

the 'pop' states? Of the 'coke' states?)

11. Bring together data from the states and census2010 tables, outputting the name from the states table, total

population from the census2010 table and geom from the states table. Look into the syntax for INNER JOIN

12. Calculate the average 2010 male population across the soft drink categories.

G. Introduction to Spatial Select Queries

• Spatial data sets help answer geometric and topological questions.

1. Go to pgAdmin → Query → Execute the following query:

SELECT name, ST_Centroid(geom) AS centroid

FROM states

WHERE sub_region = 'Soda';

i. This query calls a function called ST_Centroid() that accepts inputs and returns outputs. The geom

column is used as an input to the function to get geometric centers of the shapes.

ii. The output is a point in the coordinate system of the input column, but expressed in hexadecimal

notation. To display the coordinate values in a readable form, convert it to a text using the function

ST_AsText().

2. Modify the query as follows, then execute:

SELECT name, ST_AsText(ST_Centroid(geom)) AS centroid

FROM states

WHERE sub_region = 'Soda';

3. Modify the query as follows, then execute:

SELECT name, ST_AsText(ST_Centroid(geom)) AS centroid, ST_Area(geom) AS area

FROM states

WHERE sub_region = 'Soda';

i. The values returned by ST_Area() are in the units of the input geometry (squared). As the coordinate

system is in latitude/longitude coordinates, the area is in square degrees. Use the ST_Transform()

function to reproject the geometry to a different spatial reference.

4. Modify the query as follows, then execute:

SELECT name, ST_AsText(ST_Centroid(geom)) AS centroid, ST_Area(ST_Transform(geom,2163))???? AS area

FROM states

WHERE sub_region = 'Soda';

i. The code 2163 is an equal-area projection in meters that is suitable for the continental US

ii. ST_Transform() re-projects input geometries in memory only; the input geometries stored in the table

remain in the same spatial reference they had before.

iii. "ST_" at the beginning of these functions stands for spatial-temporal.

H. PostGIS Geometry Types

Page 5: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

5

• Though polygon, line and point are the three main geometry types, other geometry types are available in PostGIS

that allow storage of multipart shapes, 3-dimensional shapes and shapes that have a measure (or M value)

associated with its vertices.

• You will create a new schema to store points of interest in New York City. Instead of using Shapefile Loader to

both create and populate a table at the same time, you will do these steps separately.

I. Create a new empty spatial table

1. pgAdmin → Object browser → right-click on the Schemas node beneath the template_postgis_20 database →

select New Schema.

i. Set the Name →nyc_poi

ii. Owner →postgres → Click OK.

2. Expand the object listing associated with the nyc_poi schema → Right-click on Tables → select New Table

i. Properties → table's Name →pts and Owner →postgres.

ii. Columns → Add →New Column → Name to gid and the Data type to serial. (This data type is roughly

equivalent to the AutoNumber type in Access.) →Click OK.

iii. Create a column name; Data type →character varying and its Length →50.

iv. Last column will be to hold geometries. Though a column of type 'point' through the GUI can be added, a

number of settings should be made such as its spatial reference ID or SRID. These settings are all handled

by a PostGIS maintenance function called AddGeometryColumn().

3. To specify gid column is the table's primary key: New Table dialog → click Constraints →Confirm that

Primary Key is selected from the drop-down list near the bottom of the dialog → Click Add.

i. New Primary Key → set the Name to pts_pk.

ii. Columns tab →select gid from the Column drop-down list → Click Add → Click OK →Click OK

again to dismiss the New Table dialog and create the table.

4. Reset the search path of pgAdmin to nyc_poi by executing the following statement in the Query dialog:

SET search_path TO nyc_poi, public;

5. Add a geometry column called geom to the table by executing this statement:

SELECT AddGeometryColumn('nyc_poi','pts','geom',4269,'POINT',2);

i. SELECT is used in this case because SQL rules don't allow for invoking functions directly. Function

calls must be made in one of the statement types we've encountered so far (SELECT, INSERT,

UPDATE or DELETE). In this situation, a SELECT statement is the most appropriate.

ii. Function arguments are: the schema name, the table name, the name to be given to the geometry

column, its spatial reference ID, the type of geometry it will store and the dimension of the

coordinates it will hold.

iii. dimension argument: if storing X and Y coordinates, then the dimension value is 2. In case of 3

dimensions, it will be 3. It is also possible to store some type of measure (M value) with each point

(e.g., the time the point was recorded with a GPS unit). In that scenario, you would also assign a

dimension value of 3, but would differentiate that XYM type of point from an XYZ point by setting

the geometry type to POINTM instead of POINT. Finally, it is possible to store all four values (X, Y,

Z and M). In that situation, you would assign a dimension value of 4.

II. Add rows to the spatial table

You will use INSERT statements to insert rows. Copy and paste them rather than typing manually. Can you store it in a

SQL file to implement all insertions one time???

1. SQL statement to insert a row into the pts table.

INSERT INTO pts (name, geom)

VALUES ('Empire State Building', ST_GeomFromText('POINT(-73.985744 40.748549)',4269))

Page 6: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

6

i. The ST_GeomFromText() function will convert a geometry supplied in text format to the

hexadecimal form that PostGIS geometries are stored in.

ii. The other argument “4269”is the spatial reference of the geometry. As the geom column using

AddGeometryColumn() was created with a constraint that values in that column must be in a

particular spatial reference (which we specified as 4269).

2. Execute the following statements to add a couple more rows to the table. You can write all the insert statements

simultaneously.

INSERT INTO pts (name, geom)

VALUES ('Statue of Liberty', ST_GeomFromText('POINT(-74.044508 40.689229)',4269));

INSERT INTO pts (name, geom)

VALUES ('World Trade Center', ST_GeomFromText('POINT(-74.013371 40.711549)',4269));

3. Another example of adding multiple rows using a single statement.

INSERT INTO pts (name, geom)

VALUES ('Radio City Music Hall', ST_GeomFromText('POINT(-73.97988 40.760171)',4269)),

('Madison Square Garden', ST_GeomFromText('POINT(-73.993544 40.750541)',4269));

4. pgAdmin → right-click on the pts table → select View Data →View All Rows to confirm that the INSERT

statements executed properly.

III. Create and populate a table of linestrings

1. Repeat the steps in Part I above to create a new table that will hold NYC line features.

o Give the table a name of lines.

o The table should have the same column definitions, but the geometry type should be set to LINESTRING

rather than POINT:

SELECT AddGeometryColumn('nyc_poi','lines','geom',4269,'LINESTRING',2)

o Search path is already set. So, don’t have to do it, but you can check.

2. Execute the following statement to insert 3 new rows into the lines table:

3.

INSERT INTO lines(name,geom)

VALUES

('Holland Tunnel', ST_GeomFromText('LINESTRING(-74.036486 40.730121,-74.03125 40.72882,-74.011123

40.725958)',4269)),

('Lincoln Tunnel', ST_GeomFromText('LINESTRING(-74.019921 40.767119,-74.002841 40.759773)',4269)),

('Brooklyn Bridge', ST_GeomFromText('LINESTRING(-73.99945 40.708231,-73.9937 40.703676)',4269));

i. The syntax for constructing linestrings is similar to that of points, but you supply more than one pair

of lat/lon to delineate the feature (with the pairs being connected sequentially by straight line

segments). The longitude (X) is followed by latitude (Y) by a space. Each lon/lat pair is separated by

comma.

ii. There may be curvy lines with too many lat/lon pairs in real world.

IV. Create and populate a table of polygons

1. Repeat the steps to create a polygon table, but give the table name of polys and set the geometry columns to

POLYGON.

2. Execute the following statement to add a row to your polys table:

INSERT INTO polys (name, geom)

VALUES ('Central Park',ST_GeomFromText('POLYGON((

-73.973057 40.764356,

The pink () means it may contain multiple

interior rings (multiple polygons inside)

Page 7: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

7

-73.981898 40.768094,

-73.958209 40.800621,

-73.949282 40.796853,

-73.973057 40.764356))',4269));

i. The syntax is similar to the syntax for line strings, but

a. The first lon/lat pair should be the same as the last (to close the polygon).

1. The coordinate list is enclosed in an additional set of parentheses because polygons are composed

of potentially multiple rings and every polygon has a ring that defines its exterior.

2. In case of polygons with rings inside, the exterior ring is supplied first followed by the

interior rings. Each ring is enclosed in a set of parentheses and the rings are separated by

commas. For example, add Central Park again with the reservoir near its center removed.

3. Remove the original Central Park row: pgAdmin →right-click on the polys table → select Truncate to delete all

rows from the table.

4. Add Central Park (minus the reservoir) back into the table using this statement:

INSERT INTO polys (name, geom)

VALUES ('Central Park',ST_GeomFromText('POLYGON((

-73.973057 40.764356, -73.981898 40.768094, -73.958209 40.800621, -73.949282 40.796853, -73.973057

40.764356), (-73.966681 40.785221, -73.966058 40.787674, -73.965586 40.788064, -73.9649 40.788291, -

73.963913 40.788194, -73.963333 40.788291, -73.962539 40.788259, -73.962153 40.788389, -73.96181

40.788714, -73.961359 40.788909, -73.960887 40.788925, -73.959986 40.788649, -73.959492 40.788649, -

73.958913 40.78873, -73.958269 40.788974, -73.957797 40.788844, -73.957497 40.788568, -73.957497

40.788259, -73.957776 40.787739, -73.95784 40.787057, -73.957819 40.786569, -73.960801 40.782394, -

73.961145 40.78215, -73.961638 40.782036, -73.962518 40.782199, -73.963076 40.78267, -73.963677

40.783661, -73.965694 40.784457, -73.966681 40.785221))',4269));

V. 3- and 4-dimensional geometries

• You won’t create any 3-dimensional (XYZ and XYM) and 4-dimensional (XYZM) geometries, but check out the

syntax.

• Define a column that can store M values as part of the geometry, use the POINTM, LINESTRINGM and

POLYGONM data types. When specifying objects of these types, the M value should appear last. For example,

an M value of 9999 is attached to each coordinate in these features from nyc_poi schema:

POINTM(-73.985744 40.748549 9999)

LINESTRINGM(-74.019921 40.767119 9999, -74.002841 40.759773 9999)

POLYGON((-73.973057 40.764356 9999, -73.981898 40.768094 9999, -73.958209 40.800621 9999, -

73.949282 40.796853 9999, -73.973057 40.764356 9999)

• The most common usage of M coordinates is in linear referencing (e.g., to store the distance from the start of a

road, power line, pipeline, etc.). Find a good article about this topic of linear referencing.

• To define a column capable of storing Z values along with X and Y, use the "plain" POINT, LINESTRING and

POLYGON data types rather than their "M" counterparts. The syntax for specifying an XYZ coordinate is the

same as that for an XYM coordinate, but the "plain" data type indicates the third coordinate is for Z value. E.g.,

you can include sea level elevation in the coordinates for the Empire State Building (in feet): POINT(-73.985744

40.748549 190)

• If you want to store both Z and M values, again use the "plain" POINT, LINESTRING and POLYGON data

types. The Z value should be listed third and the M value last. For example:

POINT(-73.985744 40.748549 190 9999)

VI. Multipart geometries

Page 8: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

8

• PostGIS provides support for features with multiple parts through the MULTIPOINT, MULTILINESTRING and

MULTIPOLYGON data types.

• E.g, state of Hawaii is composed of multiple disconnected islands. The syntax for specifying a

MULTIPOLYGON builds upon the rules for a regular POLYGON; the parts are separated by commas and an

additional set of parentheses is used to enclose the full coordinate list.

• The footprints of the World Trade Center Towers 1 and 2 (now fountains in the 9/11 Memorial) can be

represented as a single multipart polygon as follows:

MULTIPOLYGON(((-74.013751 40.711976, -74.01344 40.712439, -74.012834 40.712191, -74.013145

40.711732, -74.013751 40.711976)), ((-74.013622 40.710772, -74.013311 40.711236, -74.012699 40.710992, -

74.013021 40.710532, -74.013622 40.710772)))

• The Z values and M values are also supported for multipart geometries. The "MULTI" data types have "M"

counterparts too: MULTIPOINTM, MULTILINESTRINGM and MULTIPOLYGONM.)

VII. Mixing geometries

• The tables you have created use a Esri-centric design: with each table storing a single column of homogeneous

geometries (i.e., all points, all lines, or all polygons, but not a mix). PostGIS supports two design approaches:

• It is possible to store multiple geometry columns in a table. This capability could be used to store data in two

or more different spatial reference systems. Note: this should be done under limited circumstances given the

existence of the ST_Transform() function and the additional maintenance such a design would require.)

• It is possible to store multiple geometry types in a single column. This capability can simplify schema design

and certain types of queries, though there are cons to this approach, such as the fact that some third-party tools

can't deal with mixed-geometry tables.

• Implement this heterogeneous column approach to store all of our nyc_poi data in the same table.

1. Repeat the steps in Part I above to create a new table. Pay particular attention to these differences:

o Give the table a name of mixed.

o The table should have the same column definitions, with the exception that the geometry type should be

set to GEOMETRY rather than POINT.

2. Add the same features to this new table by executing the following statement:

INSERT INTO mixed (name, geom)

VALUES ('Empire State Building', ST_GeomFromText('POINT(-73.985744 40.748549)',4269)),

('Statue of Liberty', ST_GeomFromText('POINT(-74.044508 40.689229)',4269)),

('World Trade Center', ST_GeomFromText('POINT(-74.013371 40.711549)',4269)),

('Radio City Music Hall', ST_GeomFromText('POINT(-73.97988 40.760171)',4269)),

('Madison Square Garden', ST_GeomFromText('POINT(-73.993544 40.750541)',4269)),

('Holland Tunnel',ST_GeomFromText('LINESTRING(-74.036486 40.730121, -74.03125 40.72882, -74.011123

40.725958)',4269)),

('Lincoln Tunnel',ST_GeomFromText('LINESTRING( -74.019921 40.767119, -74.002841 40.759773)',4269)),

('Brooklyn Bridge',ST_GeomFromText('LINESTRING( -73.99945 40.708231, -73.9937 40.703676)',4269)),

('Central Park',ST_GeomFromText('POLYGON(( -73.973057 40.764356, -73.981898 40.768094, -73.958209

40.800621, -73.949282 40.796853, -73.973057 40.764356))',4269));

3. pgAdmin → Right-click on the mixed table → Select View Data → View All Rows to confirm that the INSERT

statement executed properly.

I. Viewing Data in QGIS

Quantum GIS (QGIS) is a free and open-source desktop GIS package analogous to Esri's ArcMap and supports viewing

PostGIS data and has strong cartographic capabilities. OpenJUMP is another desktop application often used in

combination with PostGIS though its strengths are in spatial querying and geoprocessing.)

I. Add PostGIS data to QGIS

1. Click on Start → All Programs → Quantum GIS Lisboa → Quantum GIS Desktop (1.8.0) to launch QGIS.

Page 9: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

9

2. The Manage Layers toolbar provides access to a set of buttons used to add data to the project (the buttons with

plus signs). Moving from left to right:

o Add Vector Layer - for data that is supported by the OGR library, an open-source library that allows

working with vector data in many different formats including Esri shapefiles and geodatabases

o Add Raster Layer - for data that is supported by the GDAL library the raster counterpart to the OGR

library

o Add PostGIS Layers - as the name implies, for data stored in PostGIS tables

o Add SpatiaLite Layer - for data stored in a SpatiaLite [1] database (another free and open-source spatial

database extension similar to PostGIS, built to add spatial functionality to an RDBMS called SQLite [2])

o Add MSSQL Spatial Layer - for data stored in Microsoft SQL Spatial Server spatial databases

o Add WMS Layer - for data streamed via a Web Mapping Service (WMS)

o Add WFS Layer - for data streamed via a Web Feature Service (WFS)

3. Click the Add PostGIS Layers button → Add PostGIS layers →Click New button.

4. In the Create a New PostGIS connection dialog supply the following information.

o Name: Lesson3

o Service - (leave blank)

o Host: localhost

o Port: 5432

o Database: template_postgis_20

o SSL mode: disable

o Username: postgres

o Password: <the password you established earlier in the lesson>

5. Click Test Connection to make sure you have typed things correctly → Click OK.

6. Lesson3 name should appear in the pick list in the Connections section of Add PostGIS layers dialog.

Page 10: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

10

7. Click Connect → you should see something like the following, includes nyc_poi in the Schema list.

8. Expand the list of layers to see all the tables you have created so far.

9. Highlight all 6 nyc_poi layer rows → click Add. Your data should display in the map display area and the layers

listed in the Layers pane on the left side of the window (just like ArcMap's Table of Contents).

I. Explore basic functions of QGIS

1. Zoom to full extent View →Zoom Full.

2. Reorganize the layers: point, line, polygon.

3. The pts, lines and polys layers are redundant with the three layers based on the mixed table, so turn off pts, lines

and polys.

4. Do you see any difference between the polys layer and the mixed layer for polygon for central park?

5. Rename the mixed table layers as mixed - lines, mixed - pts and mixed - polys.

i. See that each geometry type has a separate layer

6. Right-click on the mixed - lines layer and select Query.

i. In the SQL where clause

a. You see this expression: upper(geometrytype("geom")) IN ('LINESTRING,'MULTILINESTRING')

b. It was automatically created when we chose to add the lines from the mixed table. The geometrytype()

function returns the type of each geometry in the geom column (returning 'POINT', 'LINESTRING' or

'POLYGON', in this case). That value is then passed to the upper() function to ensure that it is all upper-

case. If the value matches one of the items in the IN clause (in this case, 'LINESTRING' and

'MULTILINESTRING'), the associated row is included in the layer. If the value does not match, then the

Page 11: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

11

row is excluded.

The mixed - pts layer similarly is defined by a query that looks for geometries of type POINT and

MULTIPOINT; mixed - polys by a query that looks for POLYGON and MULTIPOLYGON.

Check the Query property of the pts, lines and polys layers. Do you see this expression?

7. You need to extract only the tunnels from the mixed-lines layer. Go to mixed - lines Query dialog. Use the GUI

buttons and/or type directly into the box

"name" LIKE '%Tunnel' AND upper(geometrytype("geom")) IN ('LINESTRING','MULTILINESTRING')

You should see that the Brooklyn Bridge is no longer displayed as part of the layer.

8. Play with other tools available on: Map Navigation toolbar, Attributes toolbar, Identify Features tool,

9. Layer Properties dialog, Style tab, Actions tab, Joins tab, Diagrams tab.

10. Right-click on the mixed - pts layer → Open Attribute Table → Enter Statue in the Look for text box → select

name from the in drop-down list → click Search. This is a quick way to select features. For more complicated

queries, a Search query builder dialog can be invoked by clicking the Advanced search button.

11. Click Advanced search → enter the following example expression: gid < 4 → Click OK.

12. Finally, select Settings → Project Properties

i. Explore the General tab, the Coordinate Reference System (CRS) tab.

13. Click on the Coordinate Reference System (CRS) tab →Check the Enable 'on the fly' CRS transformation

box.

14. The Coordinate reference system of the world list has two categories Geographic Coordinate Systems and

Projected Coordinate Systems. Re-project your data into the New York East State Plane system.

i. Note: QGIS doesn't include a State Plane sub-category in its Projected Coordinate System list. You

should, however, find all of the state plane projections in either the Lambert Conformal Conic or

Transverse Mercator sub-categories.

15. Find and expand the Transverse Mercator category, then scroll down, select NAD83 / New York East EPSG:

32115 and click OK.

16. View →Zoom Full to see all your data. The coordinate readout at the bottom of the application window now

reports values in the ballpark of 190000,210000.

17. Save your project by clicking the Save Project button → name it Lesson3 with file for project files .qgs.

J. Working with Views

You have done queries using MS Access that can be stored. Similar things can be done in Postgres and they are called

views (though it is not the same conceptually).

I. Create a view based on attribute criteria

1. In the pgAdmin Query dialog, execute the following query (which identifies the state capitals):

SELECT * FROM usa.cities WHERE capital = 1 ORDER BY stateabb;

2. After confirming that the query returns the correct rows, copy the SQL to the clipboard.

3. Go to pgAdmin window → usa schema → right-click on the Views node → select New View → Set the view's

Name to vw_capitals and Owner to postgres → Click on the Definition tab → paste the SQL statement held on

the clipboard → Click OK.

4. Select View → Refresh → Right-click on vw_capitals and select View Data →View All Rows. Any updates

made to the table will be displayed in the view.

II. Build a query based on a view

Page 12: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

12

1. In the pgAdmin Query dialog, execute the following query (which identifies the relatively small capitals):

SELECT * FROM usa.vw_capitals WHERE popclass = 2;

III. Create a view based on a spatial function

Create views that re-project our states and cities data on the fly.

1. Execute the following query:

SELECT gid, name, pop2008, sub_region, ST_Transform(geom,2163) AS geom FROM usa.states;

Does not work here?? 2. Follow the procedure outlined above to create a new view based on this query. Assign a name of vw_states_2163

to this view.

3. Repeat this process to create an on-the-fly re-projection of the cities data called vw_cities_2163. Define the view

using the following query:

SELECT *, ST_Transform(geom,2163) AS geom_2163 FROM usa.cities;

IV. Display views in QGIS

You can add views as layers.

1. Follow the same steps you followed above to get to usa schema.

2. Expand the object list associated with the usa schema → You should see the original cities and states tables and

vw_capitals, vw_states_2163 and two versions of vw_cities_2163.

You see two versions of vw_cities_2163 because that view outputs all of the columns from the cities table

(including geom) plus a column of geometries re-projected into SRID 2163 (geom_2163).

Add these layers to the QGIS project and play around with the view (zoom to the extent of the different layers,

and turn the views on and off), you'll note that cities, states, vw_capitals and vw_cities_2163 (the one based on

the original geom column) align correctly with one another, but they do not align with the other vw_cities_2163

(the one based on the geom_2163 column) and vw_states_2163 due to projection difference. You need on-the-fly

re-projection.

3. Click Settings → Project Properties → Coordinate Reference System (CRS) tab → check the Enable 'on the

fly' CRS transformation box → Click OK.

K. Spatial Reference Considerations

I. Spatial Reference ID lookup

• Populating a geometry column with usable data requires specifying the spatial reference of the data. You can re-

project using the ST_Transform() function, but you had to use SRID (Spatial Reference ID). These IDs come

from the spatial reference IDs defined by the European Petroleum Survey Group (EPSG). How to find these Ids?

Using pgAdmin

SRID are stored in a Postgres table in the public schema called spatial_ref_sys.

1. In pgAdmin → spatial_ref_sys table → view top 100 rows → check the srid and srtext columns.

2. One way to find an SRID is to query the spatial_ref_sys table.

3. Open the query dialog and execute the following query:

SELECT srid, srtext FROM spatial_ref_sys

WHERE srtext LIKE '%Mississippi%';

This query shows the SRIDs of each Mississippi-specific spatial reference supported in PostGIS.

Page 13: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

13

Using QGIS

Another way to find SRIDs is to look them up in QGIS.

1. QGIS → Settings > Project Properties → Under the Coordinate Reference System (CRS) tab

2. Expand the Projected Coordinate Systems → Universal Transverse Mercator (UTM) category.

3. Scroll down through the list and find NAD83 / UTM zone 18N.

IN QGIS 2.6, Settings→Options→CRS→CRS for new Layers→use default CRS displayed below→”Select”

button

4. On the right side of the dialog, you should see a column called Authority ID. Note that most of the Authority ID

values are prefixed with EPSG, which means those are the values you should use in PostGIS. In this case, you

would find that the desired SRID, for UTM NAD83 Zone 18N, is 26918.

Using the Prj2EPSG service

The website http://prj2epsg.org/ [1] provides an easy-to-use interface for finding EPSG IDs. You can upload a .prj file

(used by Esri to store projection metadata) and get back the matching EPSG ID. The site also makes it possible to enter

search terms.

II. Geometry metadata in PostGIS

The new version of the PostgreSQL has view, which used to be the geometry_columns table and is found in the Views list

of the Public schema. What is a table?

Pre-version 2.0 of PostGIS: The public schema contains a table called spatial_ref_sys that stores all of the spatial

references supported by PostGIS. Another important table in that schema is geometry_columns. This table stores the

metadata, the parent schema, the parent table, the geometry column’s name, the coordinate dimension, the SRID and the

geometry type (e.g., POINT, LINESTRING, etc.). Being able to conduct spatial analysis with PostGIS requires accurate

data in this table, so it’s best not to go mucking around in it unless you’re confident you know what you’re doing. You

used the AddGeometryColumn() function instead of adding the geometry column through the table definition GUI. An

important reason for adding the column in that manner is that it adds a record to the geometry_columns table, something

that would not happen if we had used the GUI.

III. Spherical measurements and the geography data type

• What is the difference between Cartesian (2D) measurement versus spherical (3D) measurement. For example,

the PostGIS function ST_Distance() can be used to calculate the distance between two geometries. For

geometries you used so far, ST_Distance() will calculate distances in Cartesian space. This is OK at a local or

regional scale (earth’s curvature is minimum), but over a continental or global scale a significant error would

result.

• To address this issue, you can use ST_Distance_Spheroid(): calculates the minimum great-circle distance

between two geometries.

• You can also use a data type called geography. Unlike the geometry data type, the geography data type is

meant for storing only latitude/longitude coordinates. In this data type, the measurement functions like

ST_Distance(), ST_Length() and ST_Area() will return measures calculated in 3D space rather than 2D space.

The disadvantage is that the geography data type is compatible with a significantly smaller subset of functions as

compared to the geometry type. Calculating spherical measures can also take longer than Cartesian measures

since the mathematics involved is more complex.

• This data type: geography can simplify data handling for projects covering a continental-to-global scale. For

projects covering a smaller portion of the earth’s surface, better to use geometry data type.

Project 3: Mapping the Class Roster • Map the hometowns of everyone in your class based on their zip code centroids using Postgres/PostGIS and

QGIS.

• The zip file holds a counties shapefile and a comma-separated values file called postal_codes.txt that stores U.S.

and Canadian postal codes along with the coordinates of their centroids.

Page 14: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

14

• Here are the broad steps you should follow:

1. Import the counties shapefile into Postgres.

2. Create a table to store the postal code centroids.

3. Load the centroid data from the text file.

The format of the code in postal_codes text file (double quotes) causes syntax error when copying to

prj.postcode, so I first import it into excel, and then copy the data as csv file.

Page 15: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

15

Here is the SQL statement used to copy postal_code.csv to postgres:

COPY prj.postcode FROM 'C:\Xiaohui\GeoDB\w8\PGIS_Data1\postal_codes.csv' WITH (FORMAT csv,

HEADER True);

4. Add a geometry column and use an UPDATE query to populate it with POINT geometries.

1) SELECT AddGeometryColumn('prj','postcode','geom',4269,'POINT',2);

Page 16: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

16

2) use an UPDATE query to populate it with POINT geometries:

UPDATE prj.postcode

SET geom = ST_SetSRID(ST_MakePoint(postcode.long,postcode.lat),4269);

3) After populated the point geometry, the prj.postcode table is shown below:

Page 17: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

17

5. Create a table to store the class roster.

6. 1) The format of the names in class_roster text file (double quotes) causes syntax error when copying to

prj_classroster, so I first import it into excel, and then copy the data as csv file. Load the class roster from

class_roster.csv file.

Page 18: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

18

2) Here is the SQL statement used to copy class_roster.csv to postgres:

COPY prj.classroster(last,first,postalcode)

FROM 'C:\Xiaohui\GeoDB\w8\PGIS_Data1\class_roster.csv'

WITH (FORMAT csv,DELIMITER ',');

3) The updated prj.classroster is shown below:

Page 19: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

19

7. Determine a way to associate the geometries in the postal codes table with the records in the roster table.

1) Associate the geometries

SELECT classroster.last,classroster.first,postcode.geom,postcode.code

FROM prj.postcode,prj.classroster

WHERE postcode.code = classroster.postalcode

Page 20: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

20

2) Create a view to store the result from the above query

a. In the pgAdmin Query dialog, execute the following query (which identifies the state capitals):

SELECT classroster.last,classroster.first,postcode.geom,postcode.code

FROM prj.postcode,prj.classroster

WHERE postcode.code = classroster.postalcode

b. After confirming that the query returns the correct rows, copy the SQL to the clipboard.

c. Go to pgAdmin window → prj schema → right-click on the Views node → select New View → Set the

view's Name to vw_postal and Owner to postgres → Click on the Definition tab → paste the SQL

statement held on the clipboard → Click OK.

d. Select View → Refresh → Right-click on vw_postal and select View Data →View All Rows. Any updates

made to the table will be displayed in the view.

8. Use QGIS to create a map of the hometowns. Include state and county boundaries for some context.

Page 21: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

21

Tips:

• The copy command can be used to load data from a text file into a table. Use the Postgres documentation [2] to

determine the correct usage for this command. Pay attention to the following options: FORMAT, HEADER,

DELIMITER and QUOTES. The copy command expects the columns of the "to" table to match and be in the

same order as the columns of the "from" file.

• You can assume that the centroid coordinates are in the NAD83/Geographic coordinate system.

• Assuming your postal codes table contains columns named lat and lon, you could use the following expression in

an UPDATE statement to create a textual representation of the POINT geometry:

'POINT(' || lon || ' ' || lat || ')'

Note that the double-pipe character string (||) is the concatenation operator in Postgres.

• Tables must have a primary key column defined in order to be added as a layer in QGIS. Also, it is important that

this primary key column be an integer.

• There are multiple ways to address step 7, some more well-designed than others. The ideal design would allow

you to make changes to the class roster table (e.g., add new students or change erroneous postal codes) and those

changes will automatically be reflected in QGIS (i.e., it should not be necessary to update a geometry column

when a new student is added). Coming up with this design will be worth 10% of the total project points.

• Draw the state and county boundaries in light hues so they don't distract from the main thing the map is meant to

convey.

Deliverables

1. A map for the final output. The map should have all the cartographic requirements.

2. Describe the process you implemented to get to the map

o Ease of making updates/insertions to the class roster (10 of 100 points)Thanks to PSU online education

for the information for the lab.

Page 22: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

22

Homework 9

GHY 467/567 – Spring 2015

DUE: April 14, 2015

Email the answers to [email protected]

PostGIS Spatial Function Practice Exercises

1. Add Times Square to the nyc_poi pts table (located at 40.757685,-73.985727).

INSERT INTO nyc_poi.pts(name,geom) VALUES ('Times Square', ST_SetSRID(ST_MakePoint(40.757685,-

73.985727),4269))

2. Report the number of points in the geometries of states with a 2010 population over 10 million.

SELECT name, ST_NPoints(geom)

FROM usa.states INNER JOIN usa.census2010

ON states.name = census2010.state

WHERE census2010.total > 10000000;

3. Perform an on-the-fly re-projection of your nyc_poi pts data into the New York East NAD27 state plane

coordinate system.

In QGIS, settings→options→CRS→Default CRS for new projects→Select →use “new york” as the filter and find

the “New York East NAD27 state plane coordinate system” shown in the figure:

Page 23: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

23

The SRID is 32015.

SELECT name, ST_Transform(geom,32015) FROM nyc_poi.pts;

4. Output the US cities coordinates in human-readable format, including the SRID.

SELECT name,ST_AsEWKT(geom)

FROM usa.cities

Page 24: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

24

5. Select the names and centroids of states where the 2010 male population is greater than the female population.

SELECT name,ST_AsText(ST_Centroid(geom))

FROM usa.states INNER JOIN usa.census2010

ON states.name = census2010.state

WHERE male>female

6. List the states that contain a city named Springfield (based on points in the cities table).

SELECT states.name AS states,cities.name AS city

FROM usa.cities CROSS JOIN usa.states

WHERE ST_Contains(ST_SetSRID(states.geom,4269), cities.geom)

AND cities.name LIKE '%Springfield%'

Page 25: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

25

7. List the cities that are found in ‘Soda’ states. Sort the cities first by state name, then city name.

SELECT states.name AS states,cities.name AS city

FROM usa.cities CROSS JOIN usa.states

WHERE ST_Contains(ST_SetSRID(states.geom,4269),cities.geom)

AND states.sub_region = 'Soda'

ORDER BY states.name,cities.name

8. Select features from the nyc_poi lines table that are within 1 mile of Madison Square Garden.

SELECT lines.name AS LineFeature

FROM nyc_poi.lines CROSS JOIN nyc_poi.pts

WHERE ST_DWithin(ST_Transform(lines.geom,4269),ST_Transform(pts.geom,4269),5280*2)

AND pts.name = 'Madison Square Garden'

9. Calculate the distance in kilometers (based on a sphere) between all of the state capitals.

SELECT cityA.name,cityB.name,ST_Distance_Sphere(cityA.geom,cityB.geom)/1000 AS dist_sphere

FROM usa.cities As cityA CROSS JOIN usa.cities AS cityB

Page 26: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

26

WHERE cityA.capital = -1 AND cityB.capital = -1

10. Find the SRID of the Pennsylvania North NAD83 state plane feet coordinate system.

SELECT * FROM spatial_ref_sys

WHERE srtext LIKE '%Pennsylvania North%'

11. Using data in the 'usa' schema, select states containing cities that have a 'popclass' value of 4 or 5. Pretend that

the stateabb column in the cities table doesn't exist. Don't fret if your results include duplicate states. How to

handle duplicate results?

SELECT DISTINCT states.name AS states,cities.name AS city

FROM usa.cities CROSS JOIN usa.states

WHERE ST_Contains(ST_SetSRID(states.geom,4269),cities.geom)

AND cities.popclass = 4 OR cities.popclass = 5

Page 27: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

27

Project 4: Jen and Barry's Site Selection in PostGIS

Suppose "Jen and Barry" are looking for the best place to open an ice cream business. Your task is to import the project

shapefiles into a PostGIS schema and then write a series of SQL statements that automate the site selection process.

Download the file PGIS_Data2.zip. Here are the selection criteria:

• Greater than 500 farms for milk production

• A labor pool of at least 25,000 individuals between the ages of 18 and 64 years

• A low crime index (less than or equal to 0.02)

• A population of less than 150 individuals per square mile

• Located near a university or college

• At least one recreation area within 10 miles

• Interstate within 20 miles

• You should get 9 candidate sites without using the interstate and recreation area criteria. You should have 4

candidate sites will all the criteria included.

o Create a new schema “prj2”, then import the project shapefiles into it.

Page 28: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

28

o Browse through each table in schema prj2 and identity the level of query for each of the

selection criteria.

Query Level Selection Criteria

county-level • Greater than 500 farms for milk production

• A labor pool of at least 25,000 individuals between the

ages of 18 and 64 years

• A population of less than 150 individuals per square mile

city-level • A low crime index (less than or equal to 0.02)

• Located near a university or college

"near an interstate" • Interstate within 20 miles

"near a recreation area" • At least one recreation area within 10 miles

o 1) Implement query at county level.

SELECT name, geom

FROM prj2.counties

WHERE no_farms87 >500 AND age_18_64 > 25000

AND pop_sqmile <150

Page 29: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

29

2) Create a view to store the result from the above query.

a. Copy the above query

b. c. Go to pgAdmin window → prj2 schema → right-click on the Views node → select New

View → Set the view's Name to vw_county and Owner to postgres → Click on the Definition tab

→ paste the SQL statement held on the clipboard → Click OK.

o 1) Implement query at city level from the saved view vw_county table and cities table.

SELECT vw_county.name AS county, cities.name AS city,cities.geom AS city_geom

FROM prj2.vw_county CROSS JOIN prj2.cities

WHERE ST_Contains(vw_county.geom,cities.geom)

AND cities.crime_inde <= 0.02 AND cities.university > 0

2) Copy the above query to create a view for the result: vw_city

o Implement query to find cities within 20 miles of interstate roads.

▪ Be sure to re-project the data into a spatial reference that is appropriate for distance

measurement. Use Pennsylvania State Plane North NAD83 (SRID = 2271) for re-projection.

SELECT DISTINCT vw_city.city AS city, vw_city.city_geom AS city_geom

FROM prj2.vw_city CROSS JOIN prj2.interstates

WHERE ST_DWithin(ST_Transform(vw_city.city_geom,2271),

ST_Transform(interstates.geom,2271), 5280*20)

Page 30: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

30

▪ Copy the above query to create a view for the result: vw_interstate.

o 1) Implement a query to find cities have at least one recreation area within 10 miles.

SELECT DISTINCT vw_interstate.city AS city, vw_interstate.city_geom AS city_geom

FROM prj2.vw_interstate CROSS JOIN prj2.recareas

WHERE ST_DWithin(ST_Transform(vw_interstate.city_geom,

2271),ST_Transform(recareas.geom,2271), 5280*10)

▪ 2) As shown in the above screenshot, there are 4 cities returned for final result. Copy the

above query to create a view for the result: vw_final.

o Open QGIS→add the final candidate layer and the other original datasets.

Page 31: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

31

o In QGIS interface, change the color schema and the label display of each layer to make sure all

the map content are clearly displayed.

o Project→New print composer→specify a name for the map composer as “Jen and Barry's Site

Selection”.

o In the map composer window, layout→add map, then the map shown in QGIS desktop will be

added.

o Insert map frame, legend, title, scale bar, north arrow and adjust the layout. The final map is

shown below:

Page 32: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

32

B. Tips for completing Project 4

• Do not use a long query.

• The most logical workflow will be:

1. Write a query that selects the suitable counties based on the county-level criteria and save this as a view.

2. Write a query that selects cities in the suitable counties that also meet the city-level criteria and save this

as a view. This would meet the project requirements.

3. Select from cities in the view saved in #2 that meet the "near an interstate" criterion and save this as a

view.

4. Then select from cities in the view saved in #3 that meet the "near a recreation area" criterion.

• It should not be necessary to create any new tables to hold intermediate results.

• This is a scenario that calls for the use of cross joins.

• If you look at the .prj files for the project shapefiles, you'll see that they are in geographic coordinates, NAD27

(SRID = 4267). Be sure to re-project the data into a spatial reference that is appropriate for distance

measurement. Pennsylvania State Plane North (NAD27 or NAD83), units = feet, would be a logical choice for

that purpose.

C. Deliverables

1. Submit a write-up that summarizes your approach to this project. Most importantly, include the SQL code that

you developed. Don't forget to include the code built into your views, if you create any. Create a map in QGIS of

the candidate cities and all other data sets.

Thanks to PSU online education for the information for the lab.

Page 33: by Xiaohui Liu...o Postgres 9.3.1 - Use the 32-bit version as additional computing power is not necessary for the class (you may use 64 bit for your project and production). When prompted

33