
Page 1: Introduction to Spatial Data Analysis in the Social Sciences RSOC597A: Special Topics in Methods/Statistics Kathy Brasier Penn State University June 14,

Introduction to Spatial Data Analysis in the Social Sciences

RSOC597A: Special Topics in Methods/Statistics

Kathy Brasier
Penn State University

June 14, 2005

Page 2:

Session Objectives

- Understand why spatial data analysis (SDA) is important
- Identify types of questions for which SDA is relevant
- Gain basic knowledge of the concepts, statistics, and methods of SDA
- Identify some important issues and decision points within SDA
- Learn about some resources for doing spatial data analysis (software, web sites, books, etc.)
- Avoid getting lost in equations!

Page 3:

Why Do Spatial Analysis?

“Everything is related to everything else, but closer things more so.” (attributed to Tobler)

Page 4:

Examples

- Is your educational level likely to be similar to your neighbor’s?
- Are farm practices likely to be similar on neighboring farms?
- Are housing values likely to be similar in nearby developments?
- Do nearby neighborhoods have similar burglary rates?

Page 5:

[Map: County Homicide Rates, 1990 (South.shp). Legend classes: 0 - 5.832; 5.832 - 11.983; 11.983 - 20.305; 20.305 - 64.261]

Page 6:

What Is Spatial Data?

- 4 main types: event data, spatially continuous data, zonal data, spatial interaction data
- Most frequently used in social sciences is zonal data
  - Data aggregated to a set of areal units (counties, MSAs, census blocks, ZIP codes, watersheds, etc.)
  - Variables measured over the set of units
  - Examples: Census, REIS, County and City Data Book, etc.

Page 7:

What Is Spatial Data Analysis?

“The analysis of data on some process operating in space, where methods are sought to describe or explain the behavior of this process and its possible relationship to other spatial phenomena.”
Bailey and Gatrell (1995:7)

Objective of spatial data analysis: to understand the spatial arrangement of variable values, detect patterns, and examine relationships among variables

Page 8:

Why Do Spatial Data Analysis?

- To learn more about what you’re studying
- To avoid specification problems (missing variables, measurement error)
- To ensure satisfaction of statistical assumptions
- To be cool! To go crazy! To learn more about statistics than you ever wanted or thought possible!
- To learn the limitations of statistics

Page 9:

Theoretical Reasons for Spatial Analysis

- It tells us something more about what we’re studying
  - Is there an unmeasured process that affects the phenomenon?
  - Does this process manifest itself in space?
  - Examples: interaction processes, diffusion, historical or ethnic legacy, programmatic effects

Page 10:

Statistical Reasons for Spatial Analysis

- Violation of regression assumptions
  - Units of analysis might not be independent
  - Parameter estimates are inefficient
  - Estimated error variance is downwardly biased, which inflates the observed R² values
- If spatial effects are present, and you don’t account for them, your model is not accurate!

Page 11:

Examples of Research Using SDA

- Epidemiology (environmental exposure research)
- Criminology (crime patterns)
- Education (neighborhood effects on attainment)
- Diffusion/adoption (technologies)
- Social movements (trade unions, demonstrations)
- Market analysis (housing and land price variation)
- Spillover effects (economic spillovers of universities)
- Regional studies (regional income variation & inequality)
- Demography (segregation patterns)
- Political science (election studies)

Page 12:

BREAK!!

Page 13:

When Do You Need to Do SDA?

- Is there a theoretical reason to suspect differences across space?
  - Differences in phenomena (variable values)
  - Differences in relationships between phenomena (covariances)
- Are you using data with a spatial referent?
- If yes to both, it is a good idea to at least explore any potential spatial effects
- Exploration will tell you more about the subject you’re studying

Page 14:

Spatial Independence

- Null hypothesis (H₀)
  - Any event has an equal probability of occurring at any position in the region
  - Position of any event is independent of the position of any other
- Implicit assumption of much work in social sciences

Page 15:

Spatial Effects

- Test hypothesis (H₁)
  - Probability of an event occurring is not equal for each location within the region
  - Position of any one event is dependent on the position of any other event
- Methods and statistics of SDA test this hypothesis
  - If supported, can tell us more about what we’re studying; can improve our models
  - If not supported, we can be more confident that the independence assumption is satisfied

Page 16:

First Order Spatial Effects

- Non-uniform distribution of observations over space
- Large-scale variation in mean across the spatial units
- Values of the variables are not independent of their spatial location
- Results from interaction of unique characteristics of the units and their spatial location
- Ex: magnets and iron filings (Bailey & Gatrell)
- Referred to as spatial heterogeneity

Page 17:

Causes of Spatial Heterogeneity

- Patterns of social interaction that create unique characteristics of spatial units
  - Spatial regimes: legacies of regional core-periphery relationships => differences between units (population, economic development, etc.)
- Differences in physical features of spatial units
  - Size of counties
- Combination:
  - Differences in topography of units => different patterns of economic development (extractive industries)

Page 18:

[Map: County Homicide Rates, 1990 (South.shp). Legend classes: 0 - 5.832; 5.832 - 11.983; 11.983 - 20.305; 20.305 - 64.261]

First order effects?

Page 19:

Second Order Spatial Effects

- Localized covariation among means (or other statistics) within the region
- Tendency for means to ‘follow’ each other in space
- Results in clusters of similar values
- Ex: magnets and iron filings (Bailey & Gatrell)
- Referred to as spatial dependence (spatial autocorrelation)

Page 20:

Causes of Spatial Dependence

- Underlying socio-economic process has led to clustered distribution of variable values
  - Grouping processes
    - Grouping of similar people in localized areas
  - Spatial interaction processes
    - People near each other are more likely to interact, share
  - Diffusion processes
    - Neighbors learn from each other
  - Dispersal processes
    - People move, but tend to move short distances, and take their knowledge with them
  - Spatial hierarchies
    - Economic influences that bind people together
- Mis-match of process and spatial units
  - Counties vs retail trade zones
  - Census block groups vs neighborhood networks

Page 21:

[Map: County Homicide Rates, 1990 (South.shp). Legend classes: 0 - 5.832; 5.832 - 11.983; 11.983 - 20.305; 20.305 - 64.261]

Second order effects?

Page 22:

So now that I’ve convinced you that spatial data analysis is an important consideration…

What Do We Do About It?

Page 23:

Goals of SDA

- To identify spatial effects and their causes
- To appropriately measure spatial effects
- To incorporate spatial effects into models
- To improve our knowledge of the process and how it occurs over space
- All of these goals require both theory and methods

Page 24:

Exploratory Spatial Data Analysis (ESDA)

- Start with questions about your theory and data:
  - Are there likely to be spatial processes at work (diffusion, interaction, etc.)?
  - Do your data units match the process? (Messner et al. reading)
- Visually and statistically explore your data
  - Run basic descriptive statistics
  - Map variables
    - Look for patterns, outliers
    - Look for spatial effects (large-scale variation, localized clusters)

Page 25:

[Map: Gini Index, 1989 (South.shp). Legend classes: 0.263 - 0.36; 0.36 - 0.396; 0.396 - 0.435; 0.435 - 0.533]

[Histogram: GI89 (x axis) against frequency (y axis). Std. Dev = .03; Mean = .393; N = 1412]

Page 26:

How to Measure ‘Space’?

- Need to define space in order to measure its effects
- Traditional ways (regional dummy variables, distance measures, etc.)
- Neighborhood structure
  - Weights matrix
    - n x n matrix, where: 0 = not neighbor, 1 = neighbor

Page 27:

Weights Matrix

- ‘Neighbors’ can be defined as:
  - Boundaries:
    - Adjacent units (rook or queen)
    - Those units sharing some minimum/maximum proportion of common boundary
  - Centroids
    - If centroids are within some specified distance
    - If unit is one of k nearest neighbors defined by centroid distance
  - Others?
- Decision to use one over another somewhat arbitrary
  - Simpler is generally better
  - Closer is generally better
  - Rely on theory, your knowledge, and the ESDA to guide you
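
The two centroid-based definitions above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names, the four centroids, and the threshold of 2.0 are all made up for the example, and distances are plain Euclidean distances between centroid coordinates.

```python
import numpy as np


def pairwise_distances(centroids):
    """Euclidean distance between every pair of centroids."""
    pts = np.asarray(centroids, dtype=float)
    return np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)


def distance_band_weights(centroids, threshold):
    """1 if two centroids lie within `threshold` of each other, else 0."""
    dist = pairwise_distances(centroids)
    w = (dist <= threshold).astype(int)
    np.fill_diagonal(w, 0)  # a unit is not its own neighbor
    return w


def knn_weights(centroids, k):
    """1 for each unit's k nearest centroids; the result can be asymmetric."""
    dist = pairwise_distances(centroids)
    np.fill_diagonal(dist, np.inf)  # exclude the unit itself from the ranking
    nearest = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros(dist.shape, dtype=int)
    rows = np.repeat(np.arange(dist.shape[0]), k)
    w[rows, nearest.ravel()] = 1
    return w


# Hypothetical centroids for four units along a line.
centroids = [(0, 0), (1, 0), (2.5, 0), (6, 0)]
w_band = distance_band_weights(centroids, threshold=2.0)
w_knn = knn_weights(centroids, k=1)
print(w_band)
print(w_knn)
```

In this toy layout the distance band leaves the fourth unit with no neighbors at all, while the k-nearest rule gives every unit a neighbor but produces a non-symmetric matrix. Both are trade-offs worth checking when choosing a definition.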

Page 28:

Weights Matrix Example

Sample Region and Units

1  2  3
4  5  6
7  8  9

Simple Contiguity (rook) Matrix

     1  2  3  4  5  6  7  8  9
1    0  1  0  1  0  0  0  0  0
2    1  0  1  0  1  0  0  0  0
3    0  1  0  0  0  1  0  0  0
4    1  0  0  0  1  0  1  0  0
5    0  1  0  1  0  1  0  1  0
6    0  0  1  0  1  0  0  0  1
7    0  0  0  1  0  0  0  1  0
8    0  0  0  0  1  0  1  0  1
9    0  0  0  0  0  1  0  1  0
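
A rook matrix for a regular grid like this one can be generated rather than typed by hand. The sketch below assumes units are numbered row by row, as in the sample region; `rook_weights` is an illustrative name, not a function from any package. Because sharing an edge is mutual, the result is symmetric: unit 9, for instance, has both 6 and 8 as neighbors.

```python
import numpy as np


def rook_weights(n_rows, n_cols):
    """Binary rook-contiguity matrix for a regular grid numbered row by row."""
    n = n_rows * n_cols
    w = np.zeros((n, n), dtype=int)
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            # Rook neighbors share an edge: up, down, left, right.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    w[i, rr * n_cols + cc] = 1
    return w


W = rook_weights(3, 3)
# Row-standardize so that each unit's neighbor weights sum to 1.
W_std = W / W.sum(axis=1, keepdims=True)
print(W)
```

Row-standardizing is a common next step because it turns a weighted sum of neighbors into a neighborhood average.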

Page 29:

Statistical Tests for Spatial Dependence (Autocorrelation)

- Univariate Global Moran’s I
  - Indicates presence and degree of spatial autocorrelation among variable values across spatial units

$$I = \frac{z' W z}{z' z}$$

  - Where z is a vector of variable values expressed as deviations from the mean
  - Where W is the weights matrix
  - Expected value of I converges to 0 when n is large; can do significance tests
  - Large positive => strong clustering of similar values
  - Large negative => strong clustering of dissimilar values
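
To make the statistic concrete, here is a small sketch of Moran’s I with a permutation-based significance test. It assumes a row-standardized W, in which case the general scaling factor n / (sum of all weights) equals 1 and the formula reduces to z'Wz / z'z. The function names, the four-unit line of neighbors, and the data values are all invented for illustration; a random-permutation test is one common way to do the significance test, not necessarily the one used for the results in these slides.

```python
import numpy as np


def morans_i(x, w):
    """Global Moran's I; equals z'Wz / z'z when W is row-standardized."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                      # deviations from the mean
    scale = len(x) / w.sum()              # equals 1 for row-standardized W
    return scale * (z @ w @ z) / (z @ z)


def permutation_test(x, w, n_perm=999, seed=0):
    """Pseudo p-value: share of random relabelings with I at least as large."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, w)
    sims = np.array([morans_i(rng.permutation(x), w) for _ in range(n_perm)])
    p_value = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical example: four units on a line, row-standardized weights.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
x = [1, 2, 3, 4]                          # values rise steadily along the line
i_obs, p = permutation_test(x, W)
print(round(i_obs, 3))                    # 0.4
```

With only four units the permutation p-value is not informative; the point is the mechanics. An alternating pattern such as 1, 4, 1, 4 on the same line gives I = -1, the "clustering of dissimilar values" case.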

Page 30:

Global Moran’s I and Moran Scatterplot

- Assesses the relationship between the variable value for the unit of origin (x axis) and the average of the values of its neighbors (y axis)
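
The scatterplot and the statistic are linked: with a row-standardized W, the slope of the least-squares line through the Moran scatterplot equals global Moran’s I. The sketch below checks this on a made-up four-unit example (the weights and values are assumptions for illustration) and labels each unit by quadrant, using the High-High / Low-Low naming that appears on LISA maps.

```python
import numpy as np

# Hypothetical example: four units on a line, row-standardized weights.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
x = np.array([1.0, 2.0, 3.0, 4.0])

z = x - x.mean()          # x axis: each unit's own value (centered)
lag = W @ z               # y axis: average of its neighbors' values

# Slope of the least-squares line through the scatterplot points.
slope, intercept = np.polyfit(z, lag, 1)


def quadrant(own, neighbors):
    """Label a point by the signs of its own value and its spatial lag."""
    if own > 0 and neighbors > 0:
        return "High-High"
    if own < 0 and neighbors < 0:
        return "Low-Low"
    if own > 0 and neighbors < 0:
        return "High-Low"
    if own < 0 and neighbors > 0:
        return "Low-High"
    return ""             # points exactly on an axis are left unlabeled


labels = [quadrant(a, b) for a, b in zip(z, lag)]
print(round(slope, 3))    # 0.4
print(labels)
```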

Page 31:

Local Indicators of Spatial Autocorrelation (LISA)

- Local Moran’s I
  - Decomposes global measure into each unit’s contribution
  - Identifies the local ‘hotspots’, areas which contribute disproportionately to global Moran’s I
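
The decomposition can be illustrated with one common form of the local statistic, I_i = z_i (Σ_j w_ij z_j) / m2 with m2 = z'z / n. Under that form, and with a row-standardized W, the local values average to the global I. The function name and the data below are invented for the example, and the significance testing that LISA maps rely on is not shown.

```python
import numpy as np


def local_morans_i(x, w):
    """Local Moran's I_i = z_i * (sum_j w_ij z_j) / m2, with m2 = z'z / n."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()
    m2 = (z @ z) / len(z)
    return z * (w @ z) / m2


# Hypothetical example: four units on a line; the last unit is an outlier.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
x = [1, 2, 3, 10]

local = local_morans_i(x, W)
print(np.round(local, 2))         # one contribution per unit
print(round(local.mean(), 2))     # the average equals global Moran's I here
```

In this example the first two units contribute positively (low values next to low values), while the outlier and its neighbor contribute negatively, which pulls the global statistic toward zero.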

Page 32:

LISA Cluster Maps

[Map: Homicide Rate, 1990]

[Map: Gini Index, 1989]

Page 33:

Additional Suggestions for ESDA

- Identify outliers and hotspots both statistically and visually
- Try taking outlier units out of analysis and see what happens (does Moran’s I change?)
- Explore changes in spatial patterns over time
- Compare two (or more) regions
- Split your sample by a variable of interest
- Try different weights matrices
- Play around with different covariates – get into your data!

Page 34:

BREAK!!!

Page 35:

Regression Modeling and SDA

- Use theory and ESDA findings to craft your model
- Procedure:
  - Run OLS model
  - Assess diagnostics
    - If diagnostics indicate no spatial autocorrelation (or other violations of regression assumptions), OLS model is fine
    - If diagnostics indicate spatial autocorrelation present, need to consider ways to measure and incorporate spatial structure

Page 36:

OLS Diagnostics

- Diagnostics of OLS model will indicate type of spatial effects
- If either present, need to identify likely source
- Remedies
  - Spatial heterogeneity (Koenker-Bassett test)
    - Include covariate which accounts for heterogeneity?
    - Split region?
  - Spatial autocorrelation (Lagrange Multiplier tests)
    - Identify missing variables?
    - Explore effects of spatially-lagged independent variables?
    - Use appropriate spatial regression model?
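
For readers who want to see what the Lagrange Multiplier diagnostics compute, here is a sketch of the basic (non-robust) LM-Error and LM-Lag statistics built from OLS residuals, using the textbook formulas; each is compared with a chi-square distribution with 1 degree of freedom. Everything in the example is an assumption for illustration: the function names, the ring-shaped neighbor structure, and the simulated data with lag dependence of 0.6. Spatial regression software reports these tests directly, along with the robust variants not shown here.

```python
import math
import numpy as np


def lm_spatial_tests(y, X, W):
    """LM-Error and LM-Lag statistics computed from OLS residuals."""
    y, X, W = (np.asarray(a, dtype=float) for a in (y, X, W))
    n = len(y)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta                                    # OLS residuals
    sigma2 = (e @ e) / n                                # ML error variance
    T = np.trace((W.T + W) @ W)
    lm_error = (e @ W @ e / sigma2) ** 2 / T
    WXb = W @ X @ beta                                  # lag of fitted values
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)   # residual-maker matrix
    lm_lag = (e @ W @ y / sigma2) ** 2 / (T + (WXb @ M @ WXb) / sigma2)
    return lm_error, lm_lag


def chi2_1df_pvalue(stat):
    """Upper-tail probability of a chi-square(1) variable."""
    return math.erfc(math.sqrt(stat / 2.0))


# Simulated (hypothetical) data: 60 units on a ring, two neighbors each.
rng = np.random.default_rng(42)
n = 60
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = W[i, (i + 1) % n] = 0.5         # row-standardized
X = np.column_stack([np.ones(n), rng.normal(size=n)])
eps = rng.normal(size=n)
rho = 0.6                                               # assumed lag dependence
y = np.linalg.solve(np.eye(n) - rho * W, X @ np.array([1.0, 2.0]) + eps)

lm_error, lm_lag = lm_spatial_tests(y, X, W)
print(chi2_1df_pvalue(lm_error), chi2_1df_pvalue(lm_lag))
```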

Page 37:

Spatial Regression Models

- ESDA and OLS diagnostics tell you that there is spatial autocorrelation
- Identify the source (LM tests will help)
  - Regression residuals (LM-Error)
    - Mis-match of process and spatial units => systematic errors, correlated across spatial units
  - Dependent variable (LM-Lag)
    - Underlying socio-economic process has led to clustered distribution of variable values => influence of neighboring values on unit values
  - Spatial autocorrelation in both

Page 38:

Spatial Autocorrelation in Residuals => Spatial Error Model

$$y = X\beta + \varepsilon$$

$$\varepsilon = \lambda W \varepsilon + \xi$$

- ε is the vector of error terms, spatially weighted (W); λ is the coefficient; and ξ is the vector of uncorrelated, homoskedastic errors
- Incorporates spatial effects through error term
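
Solving the error equation for ε shows how the spatial structure enters. The short derivation below assumes that (I − λW) is invertible and that Var(ξ) = σ²I; the symbol σ² is introduced here and does not appear on the slide.

```latex
\begin{aligned}
\varepsilon = \lambda W \varepsilon + \xi
  \;&\Longrightarrow\; (I - \lambda W)\,\varepsilon = \xi
  \;\Longrightarrow\; \varepsilon = (I - \lambda W)^{-1}\xi, \\
y &= X\beta + (I - \lambda W)^{-1}\xi, \\
\operatorname{Var}(\varepsilon) &= \sigma^{2}\,\bigl[(I - \lambda W)'(I - \lambda W)\bigr]^{-1}.
\end{aligned}
```

When λ = 0 the covariance collapses to σ²I and the model is ordinary OLS; when λ ≠ 0 the off-diagonal terms are nonzero, so errors in different units are correlated.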

Page 39:

Spatial Autocorrelation in Dep. Variable => Spatial Lag Model

$$y = \rho W y + X\beta + \varepsilon$$

- y is the vector of the dependent variable, spatially weighted (W); ρ is the coefficient
- Incorporates spatial effects by including a spatially lagged dependent variable as an additional predictor
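
Because y appears on both sides, the lag model implies the reduced form y = (I − ρW)⁻¹(Xβ + ε), so a change in one unit’s predictor spills over to its neighbors. The sketch below illustrates this with assumed values (ρ = 0.4, β = 2, five units on a line, error term set to zero); none of these numbers come from the slides.

```python
import numpy as np

n = 5
# Hypothetical units on a line; neighbors are the adjacent units.
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i + 1):
        if 0 <= j < n:
            W[i, j] = 1.0
W /= W.sum(axis=1, keepdims=True)        # row-standardize

rho, beta = 0.4, 2.0                     # assumed values for illustration
x = np.zeros(n)
x[0] = 1.0                               # raise x by one unit in unit 1 only

# Reduced form of y = rho*W*y + x*beta (error term set to zero).
multiplier = np.linalg.inv(np.eye(n) - rho * W)
y = multiplier @ (beta * x)
print(np.round(y, 3))
```

The effect in unit 1 exceeds β because of feedback through its neighbor, and the other units show nonzero effects even though their own x did not change.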

Page 40:

Spatial Lag Example

Sample Region and Units (unit number: value)

1: 7    2: 6    3: 4
4: 4    5: 5    6: 4
7: 5    8: 6    9: 3

Spatial lag = sum of spatially-weighted values of neighboring cells
= 1/3(7) + 1/3(5) + 1/3(4)
= 5.3

(The values 7, 5, and 4 are those of units 1, 5, and 3, the rook neighbors of unit 2.)
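
The same calculation can be done for every unit at once by multiplying a row-standardized weights matrix by the vector of values. The sketch assumes the grid values, read row by row for units 1 to 9, are 7, 6, 4 / 4, 5, 4 / 5, 6, 3, and that neighbors are defined by rook contiguity; under that reading, the slide’s result corresponds to unit 2.

```python
import numpy as np

# Values for units 1-9, row by row (an assumed reading of the slide's grid).
values = np.array([7, 6, 4,
                   4, 5, 4,
                   5, 6, 3], dtype=float)

# Rook contiguity: units sharing an edge are neighbors.
n_rows = n_cols = 3
W = np.zeros((9, 9))
for r in range(n_rows):
    for c in range(n_cols):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < n_rows and 0 <= cc < n_cols:
                W[r * n_cols + c, rr * n_cols + cc] = 1.0
W /= W.sum(axis=1, keepdims=True)   # each neighbor gets weight 1/(number of neighbors)

lag = W @ values                    # spatial lag for all nine units
print(round(lag[1], 2))             # unit 2: (7 + 4 + 5) / 3 = 5.33
```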

Page 41:

Example: Change in Farm Numbers 1982-1992

- RQ: How do changes in agricultural structure affect the rates of farm loss during the Farm Crisis?
- Hypothesized spatial effect: spatial dependence through clustering of similar types of farms

Page 42:

Farm Structure Example: Moran’s I Statistics

Matrix        Moran’s I for dep var
Contiguity    0.465***
45-mile       0.413***
100-mile      0.267***

Figure 6.38: Moran Scatterplot for Dependent Variable, Change in Number of Farms, Under 45-mile Distance Weights Matrix

Page 43:

Farm Structure Example: LISA Maps

[Map: Moran Scatterplot Map for Change in Number of Farms, Great Plains Region. Legend: High-High, Low-Low, High-Low, Low-High. Scale bar: 0 to 400 km]

[Map: Moran Significance Map for Change in Number of Farms, Great Plains Region. Legend: not significant, High-High, Low-Low, High-Low, Low-High. Scale bar: 0 to 400 km]

Page 44:

Farm Structure Example: OLS Regression & Diagnostics

Variable (sig. only)    Coeff.
Prime farmland          -0.343*
Corporate Farming       0.196***
Small-scale Farming     0.904***
…
Adj. R²                 0.696
Likelihood (L)          -410.187
AIC                     862.374

Diagnostic              Prob.
LM-Error                0.000
R-LM-Error              0.024
LM-Lag                  0.000
R-LM-Lag                0.000

Page 45: Introduction to Spatial Data Analysis in the Social Sciences RSOC597A: Special Topics in Methods/Statistics Kathy Brasier Penn State University June 14,

Farm Structure Example: Spatial Error – Spatial Lag Regression

Variable (sig. only)    Coeff.
Prime farmland          -0.243*
Corporate Farming       0.180***
Small-scale Farming     0.820***
Rho (dep var)           0.381***
Lambda (error)          0.044
Adj. R²                 0.740
Likelihood (L)          -381.736
AIC                     807.473

Diagnostic                                          Prob.
LM-Error                                            0.212
Likelihood ratio test for spatial lag dependence    0.768
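To compare this model's fit with the OLS results on the previous slide, the reported log-likelihood and AIC values can be checked against the usual definition AIC = -2L + 2k, where k is the number of estimated parameters. The sketch below back-calculates k from the reported figures; the parameter counts it produces are implied by the arithmetic and are not stated anywhere on the slides.

```python
def implied_parameter_count(log_lik, aic):
    """Solve AIC = -2 * log_lik + 2 * k for k."""
    return (aic + 2.0 * log_lik) / 2.0

# Values as reported on the OLS slide and on this slide.
ols = {"log_lik": -410.187, "aic": 862.374}
spatial = {"log_lik": -381.736, "aic": 807.473}

k_ols = implied_parameter_count(ols["log_lik"], ols["aic"])
k_spatial = implied_parameter_count(spatial["log_lik"], spatial["aic"])
aic_drop = ols["aic"] - spatial["aic"]   # positive means the spatial model fits better

print(round(k_ols, 2), round(k_spatial, 2), round(aic_drop, 3))
```

The lower AIC for the spatial model indicates a better fit after penalizing the extra spatial parameters, which is consistent with the higher adjusted R² shown above.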

Page 46:

Practical Issues with SDA

- Scale of observations vs. scale of process
- Time as a factor in analysis (no natural order)
- Definition of proximity
- Edge/boundary effects
- Modifiable areal unit problem
- Complexity of topography
- Assumptions related to ‘sample’ of attributes
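The "definition of proximity" issue can be made concrete with distance-band weights like the 45-mile and 100-mile matrices used in the farm structure example. The sketch below builds a binary distance-band matrix from centroid coordinates; the function name and the four centroids are hypothetical, and coordinates are assumed to be in a projected system measured in miles so that straight-line distance is meaningful.

```python
import numpy as np

def distance_band_weights(coords, threshold):
    """Binary weights: 1 if two centroids lie within `threshold` of each other."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))   # pairwise Euclidean distances
    w = (dist <= threshold).astype(float)
    np.fill_diagonal(w, 0.0)                  # an area is not its own neighbor
    return w

# Hypothetical county centroids, in miles.
centroids = [(0, 0), (30, 0), (60, 0), (0, 70)]

w45 = distance_band_weights(centroids, 45)
w100 = distance_band_weights(centroids, 100)

neighbors_45 = w45.sum(axis=1)     # neighbor counts under the 45-mile band
neighbors_100 = w100.sum(axis=1)   # neighbor counts under the 100-mile band
print(neighbors_45, neighbors_100)
```

In this toy layout the fourth centroid has no neighbors under the 45-mile band but is connected to every other area under the 100-mile band, which shows how the choice of threshold changes who counts as a neighbor.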

Page 47:

How in the Heck Do I Actually Do This?

- Existing statistical software packages (SPSS, SAS)
  - Have trouble with weights matrix, so need to bring in by hand
  - Some routines exist, but limited
- Comprehensive software packages
  - S+ Spatialstats
    - Linear spatial regression; weights construction
    - Not transparent; no diagnostics; not compatible with ArcView 8.2
  - Spatial Toolbox (LeSage)
    - Matlab routines
    - Linear spatial regression; weights construction; Bayesian estimation; spatial probit/tobit models

Page 48:

Software Packages (2)

- SpaceStat
  - Linear spatial regression; weights construction; diagnostics; multiple options
  - Outdated architecture and interface; not supported by Anselin; not compatible with ArcView 8.2
- GeoDa & Spdep (R)
  - GeoDa strong in ESDA, mapping; weights construction; basic linear spatial regression w/ diagnostics
  - Spdep has linear spatial regression w/ diagnostics; greater functionality than GeoDa; driven by command language
  - Both shareware, downloadable
  - Little support, other than network of those using software
- Anselin’s working on PySpace, software to have greater breadth of options, diagnostics, models, and estimation procedures

Page 49:

Additional Resources

- Handout has resources listed (web, articles, etc.)
  - Particularly CSISS, SAL
- If interested, consider joining Openspace listserv
- AERS faculty
- Geographic Information Analysis group within PRI

Page 50:

Assignment

- Details in handout
- Article choices – Use those with *
- Due Date
  - June 17 (Fri.) by 5:00 pm (email preferred)
  - NOTE CHANGE: I will email you comments/grades
  - Re-writes due June 23 (Thur.) by 5:00 pm

Questions?