open geoda - chicago public health gis...2014/01/24  · geoda open geoda on desktop file/open...

OPEN GEODA WORKSHOP / CRASH COURSE FACILITATED BY M. KOLAK



WHAT IS GEODA?

• Software program that serves as an introduction to spatial data analysis

• Free

• Open Source

• Source code is available under GNU license

• As of final version, runs on Windows, Mac OS, and Linux

• Can open shapefiles or tables


WHAT IS GEODA?

• Developed by Dr. Luc Anselin's team

• Spatial econometrics

• Epidemiology applications

• Supported by the National Science Foundation and the Center for Spatially Integrated Social Science

• Flagship of the GeoDa Center at Arizona State University

geodacenter.asu.edu/projects/opengeoda


PART I

• Open a file in GeoDa

• Make different Choropleth Maps

• Open a Table in GeoDa

• Link between table and maps

• Navigate, sort, select, and query data in the Table

• Create a new variable

• Calculate raw rate for new variable

• Save as a new shapefile


OPENING A FILE IN GEODA

Open GeoDa on Desktop

File/Open Shapefile

• Open SIDS.shp

Many ways to change the “map” you see in view:

• Right-click on display and change the Category

• Go to Map/ in the Navigation Menu


CHOROPLETH MAPS

A choropleth map is a thematic map in which areas are shaded or patterned in proportion to the measurement of the statistical variable being displayed on the map. - Wikipedia

Quantile Map

• Create a quantile map for NWBIR74 and SID74 (using defaults)


CHOROPLETH MAPS

Percentile Map:

• Create a percentile map for NWBIR74 and SID74 (using defaults)
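The difference between the two classifications can be sketched in a few lines of Python. This is an illustration only, not GeoDa's own code: the four-class default for the quantile map and the percentile cut points (1, 10, 50, 90, 99) are assumptions about GeoDa's defaults, and the function names and data are made up.

```python
import numpy as np

def quantile_breaks(values, k=4):
    """Cut points that split the data into k classes of (roughly) equal count."""
    probs = np.linspace(0, 100, k + 1)[1:-1]
    return np.percentile(values, probs)

def percentile_breaks(values):
    """Assumed percentile-map cut points: 1, 10, 50, 90, 99 (six classes)."""
    return np.percentile(values, [1, 10, 50, 90, 99])

def classify(values, breaks):
    """Class index (0 = lowest class) for each value."""
    return np.searchsorted(breaks, values, side="right")

values = np.arange(1, 101, dtype=float)  # made-up data: 1, 2, ..., 100
q_classes = classify(values, quantile_breaks(values, k=4))
p_classes = classify(values, percentile_breaks(values))

print(np.bincount(q_classes))  # observations per quantile class
print(np.bincount(p_classes))  # observations per percentile class
```

The quantile map puts the same number of areas in every class, while the percentile map isolates the extreme low and high values in their own small classes.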


WORKING WITH DATA IN TABLE

Navigation, selection, and sorting: (live demos)

• Linking between data table and map

• Moving selection to top

Queries:

• Selection Dialog to select something specific

• Can add as a variable

• Could assign value as 1 if query is true, for example

• Can move selection to top


WORKING WITH DATA IN TABLE

Creating a Variable

• Add Variable (Right-Click)

• Name your new variable “SIDR74” to record a raw rate for SIDS occurrence in the specified 1974 population


WORKING WITH DATA IN TABLE

Raw Rate

The raw rate is the same as the rate or the percentage. It consists of an event (numerator) and base (denominator) variable.

Event and Base variables

For rates, the Event field refers to the numerator, the Base field to the denominator. The Event field can be thought of as a count field since it refers to variables such as counts, dollar values, or indices. In the Base field, the reference universe for the Event variable is chosen (it cannot contain any zero values). For instance, in the St. Louis homicide dataset, an Event variable is HC7984 (homicide count, 1979-84) while a Base variable is PO7984 (population total, 1979-84).
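As a rough sketch of the event/base arithmetic (not GeoDa's implementation), the raw rate is an element-wise division that refuses a zero base. The function name and the counts below are hypothetical.

```python
import numpy as np

def raw_rate(event, base):
    """Event count divided by base count, area by area."""
    event = np.asarray(event, dtype=float)
    base = np.asarray(base, dtype=float)
    if np.any(base == 0):
        # The base (denominator) variable cannot contain zeros.
        raise ValueError("base variable must not contain zeros")
    return event / base

# Hypothetical counts for three areas (events, then population at risk).
sid = [1, 0, 5]
bir = [1000, 500, 2500]
rates = raw_rate(sid, bir)
print(rates)
```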


WORKING WITH DATA IN TABLE

Creating a Variable

• Assign the “SID74Rate” variable to equal the Raw Rate in the Variable Calculation tool


WORKING WITH DATA IN TABLE

Rescale by 100,000 births


WORKING WITH DATA IN TABLE

Confirm changes in Table


WORKING WITH DATA IN TABLE

Save as a New Shapefile (with a new name), under File


PRACTICE

• Create a SID Raw Rate variable for 1979/

• Save changes as a new shapefile.

• Try out other map options using Category options.


PART II

• Intro to Exploratory Data Analysis

• Make a Histogram and Box Plot from Data

• Investigate Outliers

• Make a Rate Map (Raw and Excess)

• Make an EB Smoothed Map

• Make a Spatial Weight file for your data


EDA BASICS - HISTOGRAM

• Create a Histogram for a Variable

• Click histogram icon in Navigation toolbar

• Select Variable (i.e., calculated SIDS rate)

• Right-click on histogram to adjust display

• Change the number of intervals in histogram

• “Link” histogram to map by clicking areas of interest


EDA BASICS – BOX PLOT

• Create a Box Plot for a Variable

• Click box plot icon in Navigation toolbar

• Select Variable (i.e., calculated SIDS rate)

• Right-click on box plot to adjust display

• Hinge can be adjusted to 1.5 or 3

• Create a map from Box Plot data


EDA BASICS – BOX PLOT

• Depicts non-spatial distribution of a variable

• Represents cumulative distribution of variable, sorted by value

• Value in parentheses in the upper right corner = # of observations

• Shows the median, first, and third quartiles of the distribution (50%, 25%, 75%) and any outliers

• Outliers: lie more than a given multiple of the interquartile range (difference in value between the 75% and 25% observations) above the third quartile or below the first quartile

• Standard Multiples used are 1.5 and 3 times the interquartile range
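The outlier rule above can be sketched as follows. This is a minimal illustration with made-up values; `np.percentile` uses linear interpolation for the quartiles, which may differ slightly from the convention GeoDa applies.

```python
import numpy as np

def box_plot_outliers(values, hinge=1.5):
    """Boolean mask of values beyond hinge * IQR from the quartiles."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower_fence = q1 - hinge * iqr
    upper_fence = q3 + hinge * iqr
    return (values < lower_fence) | (values > upper_fence)

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20]  # made-up data
print(box_plot_outliers(values, hinge=1.5))  # stricter rule
print(box_plot_outliers(values, hinge=3.0))  # more lenient rule
```

Raising the hinge from 1.5 to 3 widens the fences, so fewer observations are flagged.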


EDA BASICS

Explore the data further by clicking on interesting areas, outliers, etc.

Change the hinge and explore again.


BASIC RATE MAPPING

Raw Rate Map

• Keep your Box Plot, Hinge 1.5 Map Open

• Create a new, themeless map

• Right-click your map, and select “Rates/Raw Rate”

• Choose SID74 as event variable, and BIR74 as base

• Right-click map and select “Save Rate” to write as a new variable (default as R_RAWRATE)

• Drag and drop column next to previously calculated rate

• (The two columns should differ only by our multiplying factor)


BASIC RATE MAPPING

Excess Rate Map

• Standardized mortality ratio (SMR) is a commonly used notion to compare an observed rate to a standard

• In GeoDa, Excess Ratio is the ratio of the observed rate to the average rate computed for all data

• This average is NOT the average of all the rates

• Calculated as ratio of total sum of all events over sum of all populations at risk
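A small numeric sketch makes the distinction concrete. The counts are invented, and the function name is hypothetical; the calculation simply follows the description above.

```python
import numpy as np

def excess_risk(event, base):
    """Observed rate divided by the overall rate (sum of events / sum of base)."""
    event = np.asarray(event, dtype=float)
    base = np.asarray(base, dtype=float)
    reference_rate = event.sum() / base.sum()
    return (event / base) / reference_rate

event = np.array([2, 10, 30])        # made-up event counts
base = np.array([100, 1000, 10000])  # made-up populations at risk

excess = excess_risk(event, base)
print(excess)                        # > 1 means more risk than the reference
print(event.sum() / base.sum())      # reference rate used by the ratio
print((event / base).mean())         # plain average of rates, a different number
```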


BASIC RATE MAPPING

Excess Rate Map

• Right-Click Map, click on Rates/ Excess Rates

• Choose appropriate event and base variables

• Right-click on Map again to Save Rates, and add to table


BASIC RATE MAPPING

Excess Rate Map

• Areas with less risk are blue (<1.00)

• Areas with more risk are red (>1.00)

• Legend Categories are hard-coded

• To do analysis or visualization, you must add the rates to the table (done in previous slide)

• Drag and drop column to appropriate place in table


PRACTICE

Create Histogram, Box Plots, and Rate Maps for the Ohio lung cancer sample data


RATE SMOOTHING

Rate Smoothing techniques:

• To correct for the inherent variance instability of rates

• Empirical Bayes Smoothing (according to L. Anselin):

• Computing weighted average between raw rate for each county and state average, with weights proportional to the underlying population at risk

• I.e., small counties, with small populations at risk, will tend to have their rates adjusted considerably, whereas rates for large counties will barely change
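The shrinkage idea can be sketched numerically. The version below uses a method-of-moments estimate of the prior mean and variance, which is an assumption about the estimator rather than a confirmed description of GeoDa's code; the function name and counts are made up.

```python
import numpy as np

def eb_smooth(event, base):
    """Weighted average of each raw rate and the overall rate."""
    event = np.asarray(event, dtype=float)
    base = np.asarray(base, dtype=float)
    raw = event / base
    prior_mean = event.sum() / base.sum()
    # Assumed method-of-moments prior variance, truncated at zero.
    prior_var = (base * (raw - prior_mean) ** 2).sum() / base.sum()
    prior_var -= prior_mean / base.mean()
    prior_var = max(prior_var, 0.0)
    # Larger populations at risk give a weight closer to 1 (less shrinkage).
    weight = prior_var / (prior_var + prior_mean / base)
    return weight * raw + (1 - weight) * prior_mean

event = np.array([2, 10, 30])        # made-up event counts
base = np.array([100, 1000, 10000])  # made-up populations at risk
smoothed = eb_smooth(event, base)
print(event / base)  # raw rates
print(smoothed)      # smoothed rates, pulled toward the overall rate
```

In this example the smallest area moves much further toward the overall rate than the largest one.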


RATE SMOOTHING

Empirical Bayes (EB) Smoothed Rates

• Right-click map, Select Rates/ Empirical Bayes

• Choose your event and base variables

• Use a 1.5-hinge box plot

• Can use a Percentile Map if Appropriate

• Use Box Plot if <100 observations

• Right-Click to Save Rates and add to table

• Compare EB-smoothed map with previous rate maps

• How are outliers affected?


RATE SMOOTHING

Spatial Weight Smoothing

• Does proximity to neighbors affect the results?

• In GeoDa, neighbors are defined in a spatial weights file

• Create a simple spatial weights file for 8 nearest neighbors for each county:

• Go to the menu: Tools/ Weights/ Create

• Choose “FIPSNO” for the ID variable

• Each county (or tract or block) will have a unique ID no.

• Leave the defaults for the “Distance Weights” Section

• Click on the k-Nearest Neighbors radio button, and adjust for 8 neighbors

• Save as a .gwt file in your folder
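What a k-nearest-neighbors weights file encodes can be sketched in Python. This is a conceptual illustration, assuming each area is represented by a single point such as its centroid; it does not read or write the .gwt format, and the coordinates are invented.

```python
import numpy as np

def knn_neighbors(coords, k):
    """For each point, the indices of its k nearest other points."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    return {i: nearest[i].tolist() for i in range(len(coords))}

# Made-up centroids for four areas along a line.
coords = [(0, 0), (1, 0), (3, 0), (7, 0)]
neighbors = knn_neighbors(coords, k=2)
print(neighbors)
```

Every area gets exactly k neighbors, but the relation need not be symmetric: here area 3 lists area 2 as a neighbor while area 2 does not list area 3.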


RATE SMOOTHING

Spatial Weight Smoothing

• Load the spatial weights file you just created

• Go to the menu: Tools/ Weights/ Open

• Spatial Weights will now be loaded for next maps

• Create a new map with spatial rate smoothing

• Right-click and choose Rates / Spatial Rates

• Use the same Base and Event variables

• Use the Box Plot with 1.5 Hinge

• Compare to previous box plot maps!
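One common way to define a spatially smoothed rate is to pool events and population over each area and its neighbors. The sketch below assumes that definition, including the area itself in the window, which may not match GeoDa's computation exactly; the neighbor lists and counts are made up.

```python
import numpy as np

def spatial_rate(event, base, neighbors):
    """Pooled rate over each area and its neighbors."""
    event = np.asarray(event, dtype=float)
    base = np.asarray(base, dtype=float)
    smoothed = np.empty(len(event))
    for i, nbrs in neighbors.items():
        window = [i] + list(nbrs)  # assumption: the area itself is included
        smoothed[i] = event[window].sum() / base[window].sum()
    return smoothed

event = [0, 2, 4]                         # made-up event counts
base = [100, 100, 200]                    # made-up populations at risk
neighbors = {0: [1], 1: [0, 2], 2: [1]}   # hypothetical neighbor lists
print(spatial_rate(event, base, neighbors))
```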


RATE SMOOTHING

Spatial Weight Smoothing

• Spatially smoothed maps emphasize broad regional patterns.

• What happened to the outliers?


SPATIAL WEIGHTS

Contiguity-Based Spatial Weights

Definition of a neighbor is based on sharing a common boundary.

Connectivity Histogram (according to L. Anselin)

• Histogram reflects connectivity distribution in data set

• Detects strange features in the distribution which could affect spatial autocorrelation and spatial regression specifications

• Beware of 1) islands, or unconnected observations, and 2) bimodal distribution of locations


SPATIAL WEIGHTS

Rook-Based Contiguity

• Go to Tools/ Weights /Create

• Create a Rook-Based Weights File

• Use the Key variable

• Go to Tools/ Weights/ Connectivity Histogram to see results


SPATIAL WEIGHTS

Queen-Based Contiguity

• Go to Tools/ Weights /Create

• Create a Queen-Based Weights File

• Use the Key variable

• Go to Tools/ Weights/ Connectivity Histogram to see results


SPATIAL WEIGHTS

How are neighboring units determined?

• Queen criterion determines neighboring units as those that have any point in common, including both common boundaries and common corners

• Number of neighbors for any given unit will be equal to or greater than under the rook criterion
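The two criteria are easy to compare on a regular grid, where rook neighbors share an edge and queen neighbors share an edge or a corner. The sketch below is an illustration on a made-up grid, not a reader for real polygon data; the function name is hypothetical.

```python
def grid_neighbors(n_rows, n_cols, criterion="queen"):
    """Neighbor lists for the cells of a regular grid, numbered row by row."""
    edge_steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    corner_steps = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    if criterion == "rook":
        steps = edge_steps
    elif criterion == "queen":
        steps = edge_steps + corner_steps
    else:
        raise ValueError("criterion must be 'rook' or 'queen'")
    neighbors = {}
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors[r * n_cols + c] = sorted(
                (r + dr) * n_cols + (c + dc)
                for dr, dc in steps
                if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols
            )
    return neighbors

rook = grid_neighbors(3, 3, "rook")
queen = grid_neighbors(3, 3, "queen")
print(rook[4], queen[4])  # neighbors of the center cell under each criterion
```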


SPATIAL WEIGHTS

Higher Order Contiguity

• Two definitions of higher order contiguity:

• Pure: does not include locations that were also contiguous at a lower order

• Cumulative: includes all lower order neighbors
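The two definitions can be illustrated by walking outward through first-order neighbor lists. This is a sketch under the assumption that order is at least 1; the function name and the four-unit chain are made up.

```python
def higher_order(neighbors, order, cumulative=False):
    """Neighbors reached in exactly `order` steps (pure) or within it (cumulative)."""
    result = {}
    for start in neighbors:
        seen = {start}
        frontier = {start}
        reached = []  # reached[d - 1] holds units first reached at step d
        for _ in range(order):
            frontier = {n for u in frontier for n in neighbors[u]} - seen
            seen |= frontier
            reached.append(frontier)
        if cumulative:
            result[start] = sorted(set().union(*reached))
        else:
            result[start] = sorted(reached[-1])
    return result

# Hypothetical chain of four units: 0 - 1 - 2 - 3
first_order = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
pure = higher_order(first_order, order=2)
cumul = higher_order(first_order, order=2, cumulative=True)
print(pure[0], cumul[0])
```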


SPATIAL LAG CONSTRUCTION

Spatially Lagged Variables

• Load a weights file

• Open Table, Right-Click and select “Variable Calculation”

• Choose “Spatial Lag” construction

• Can Add Variable with new name (W_INC)

• Spatial Weights file will already be loaded

• Choose Variable to be spatially lagged (HH_INC)

• New Variable is calculated and added to Table

• For a contiguity weights file, the spatially lagged variable is the simple average of the values for the neighboring units


SPATIAL LAG CONSTRUCTION

The spatial lag for one unit is the average of the variable's values in its neighboring units.
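For contiguity weights, the calculation amounts to averaging over each unit's neighbor list. A minimal sketch with invented income values and a hypothetical four-unit chain:

```python
import numpy as np

def spatial_lag(values, neighbors):
    """Average of the variable over each unit's neighbors."""
    values = np.asarray(values, dtype=float)
    # Units without neighbors (islands) would yield NaN here.
    return np.array([values[neighbors[i]].mean() for i in range(len(values))])

hh_inc = [10, 20, 30, 40]                         # made-up values
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # hypothetical chain
w_inc = spatial_lag(hh_inc, neighbors)
print(w_inc)
```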


SPATIAL AUTOCORRELATION

Moran Scatter Plot

Plot with variable of interest on x-axis, and spatial lag on y-axis

Use the Scatter Plot icon to manually create a Moran Scatter Plot:

• W_INC on the left side (y-axis), HH_INC on the right side (x-axis)

• Slope of the regression line is the Moran’s I statistic for HH_INC using a rook contiguity weights definition
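The link between the slope and Moran's I can be checked numerically. The sketch assumes row-standardized weights (the lag is a plain neighbor average), and the values and neighbor lists are made up.

```python
import numpy as np

def morans_i(values, neighbors):
    """Moran's I for neighbor-average (row-standardized) weights."""
    values = np.asarray(values, dtype=float)
    z = values - values.mean()
    lag = np.array([z[neighbors[i]].mean() for i in range(len(z))])
    return (z * lag).sum() / (z ** 2).sum()

hh_inc = np.array([10.0, 20.0, 30.0, 40.0])        # made-up values
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # hypothetical chain

# Slope of the least-squares line through the Moran scatter plot points.
w_inc = np.array([hh_inc[neighbors[i]].mean() for i in range(len(hh_inc))])
slope = np.polyfit(hh_inc, w_inc, 1)[0]

print(morans_i(hh_inc, neighbors), slope)  # the two numbers should agree
```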


SPATIAL AUTOCORRELATION

Global Spatial Autocorrelation

We will work with the univariate case and Moran scatter plot.

Scottish Lip Cancer Data:

• Map/ Raw Rate

• Cancer as Event, and Pop as Base variable

• Set map to the Box Type with Hinge 1.5

• Save Rates (R_RAWRATE is the default)

• Create a weights file with 5 nearest neighbors (try k)


SPATIAL AUTOCORRELATION

Moran I Plot and Statistic

• Go to the Menu, and select Space/ Univariate Moran I

• Select R_RAWRATE as variable

• Select your weights file

• Notice x and y axis set up accordingly

• Spatial lag variable constructed for y-axis

• R_RAWRATE on x-axis has been standardized to correspond to standard deviations (beyond 2SD as outlier)

• Centered on Mean with axes drawn in 4 quadrants


SPATIAL AUTOCORRELATION

Moran I Plot and Statistic

• 4 quadrants correspond to different types of spatial autocorrelation:

• High-high and low-low for positive spatial autocorrelation

• Low-high and high-low for negative spatial autocorrelation

• Value listed at the top is the Moran’s I Statistic

• Excluding the selected observations is available as an option

• Intermediate calculations can be saved to data table

• Right-click on graph and select Save Results


SPATIAL AUTOCORRELATION

Inference

• Inference for Moran I is based on random permutation procedure (calculates statistic many times to generate reference distribution)

• Obtained statistic compared to reference distribution for a pseudo significance level computation

• Right-click plot, Select Randomization > 999 permutations

• Click on Run to assess sensitivity of results

• The most significant attainable pseudo p-value depends directly on the number of permutations
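The permutation procedure can be sketched as follows. It assumes a row-standardized weights matrix and a one-sided pseudo p-value of the form (M + 1) / (R + 1), where M counts permuted statistics at least as large as the observed one and R is the number of permutations; GeoDa's exact computation may differ. The data and weights are invented.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I for a row-standardized weights matrix."""
    z = values - values.mean()
    return (z @ weights @ z) / (z @ z)

def permutation_test(values, weights, permutations=999, seed=0):
    """Observed Moran's I and a one-sided pseudo p-value."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    observed = morans_i(values, weights)
    reference = np.array(
        [morans_i(rng.permutation(values), weights) for _ in range(permutations)]
    )
    extreme = np.sum(reference >= observed)
    return observed, (extreme + 1) / (permutations + 1)

# Made-up example: 20 units in a chain with a steadily increasing value.
n = 20
weights = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            weights[i, j] = 1.0
weights /= weights.sum(axis=1, keepdims=True)  # row-standardize

observed, pseudo_p = permutation_test(np.arange(n, dtype=float), weights)
print(observed, pseudo_p)
```

With 999 permutations the pseudo p-value can never fall below 1 / 1000, which is why the smallest reportable value shrinks only when more permutations are run.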