Page 1: Geospatial Data and Spatial Data Analysis Tools For Ecologists

University of California – Santa Barbara

www.nceas.ucsb.edu

Rick Reeves / March 17, 2005

Geospatial Data and Spatial Data Analysis Tools For Ecologists

Page 2: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Presentation Goals

Overview: Geospatial Data Analysis

Defining and distinguishing between spatial, geospatial, geographic data

Addressing the particular attributes of geospatial data

Inventory of Geospatial Data Types

Primary data types and common sources for data

Survey of Geoprocessing Software Tools

Key issues driving choice of geospatial processing software

A Tour of NCEAS Scientific Computing Web Site

Spatial Datasets, Tools, Tutorials, and Project Archives

Some Examples: Geospatial Data Analysis at NCEAS

From the Annals of the NCEAS Scientific Programmer: ‘Real World’ solutions to Ecological research challenges

Page 3: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Meet the Scientific Programmer

Rick’s Academic and Professional Background

Undergraduate: Environmental Remote Sensing

Graduate: Spatial Operations Research / Location-Allocation Heuristic Development

Spatial Modeling branch of Geographic Data Analysis

Problem Domain: Transportation and Facility Location within networks

Professional: Software Development, geospatial database development, training curriculum development

Page 4: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Spatial Data: A Hierarchical Definition

Spatial Data: Observations are distributed in multidimensional space
• X / Y / Z coordinates attached to each data element

Geospatial Data: Spatial Data with attached geographic coordinates
• Latitude / Longitude, UTM
• Optional: data subjected to a map projection transformation

Geographic Data: Geospatial Data that captures ‘Earth System’ phenomena
• Terrain height
• Drainage network
• Land surface cover or urban land use
• Meteorological / climate data forecasts

Ecologists may work with any or all during a project
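To make the step from geographic coordinates (latitude / longitude) to a projected system such as UTM concrete, here is a minimal sketch that finds the standard 6-degree UTM zone for a point. The function name `utm_zone` and the sample coordinates (roughly Santa Barbara) are illustrative assumptions, and the special zone exceptions around Norway and Svalbard are deliberately ignored.

```python
import math
from typing import Tuple


def utm_zone(lon_deg: float, lat_deg: float) -> Tuple[int, str]:
    """Return (zone number, hemisphere letter) for a point given in degrees."""
    if not -180.0 <= lon_deg <= 180.0:
        raise ValueError("longitude must be within [-180, 180]")
    if not -80.0 <= lat_deg <= 84.0:
        raise ValueError("UTM is only defined between 80S and 84N")
    # Zones are 6 degrees wide and numbered 1..60 eastward from 180W.
    zone = int(math.floor((lon_deg + 180.0) / 6.0)) + 1
    zone = min(zone, 60)  # longitude exactly 180E falls in zone 60
    hemisphere = "N" if lat_deg >= 0 else "S"
    return zone, hemisphere


# Approximate coordinates, assumed for illustration only.
zone, hemisphere = utm_zone(-119.70, 34.42)
print(zone, hemisphere)
```

Computing the actual easting and northing requires a full map projection library; this sketch only shows how a geographic coordinate selects the projection zone.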

Page 5: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Overview: Geospatial / Geographic Data

Two Broad Primary Categories

• Raster: A multidimensional, regularly spaced grid of values (samples)
  • Dimensions: Northing, Easting, Altitude, Time
  • Examples: satellite image, digital terrain, land surface cover maps
• Vector: Three primary shapes stored in drawing-optimized format
  • Point, Line, Polygon (also TIN, vector field)

Thousands of datasets exist in hundreds of formats

• Remote Sensing Imagery / Digital Elevation Models
• Surface features (political, physiographic) as points / lines / polygons
• Meteorological data (observed / forecasted, short- and long-term)
• File format standards set by industry, government, user community

Data Ingestion: First Step in Geospatial Analysis

• Data input / format conversion / spatial registration
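The raster / vector distinction above can be sketched with plain data structures. The `Raster` class, its field names, and all sample values below are hypothetical and chosen only for illustration; real packages store far more metadata (projection, no-data value, and so on).

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]  # (easting, northing)


@dataclass
class Raster:
    """Regular grid of samples; values[row][col], row 0 is the northern edge."""
    x_min: float       # easting of the western edge
    y_max: float       # northing of the northern edge
    cell_size: float   # width and height of one square cell
    values: List[List[float]]

    def value_at(self, x: float, y: float) -> float:
        """Look up the cell that contains the coordinate (x, y)."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise ValueError("point lies outside the raster extent")
        return self.values[row][col]


# Vector data: the three primary shapes are just coordinate lists.
site: Point = (15.0, 25.0)                                    # point
stream: List[Point] = [(0.0, 0.0), (10.0, 10.0), (20.0, 15.0)]  # line
plot: List[Point] = [(0.0, 0.0), (30.0, 0.0), (30.0, 30.0), (0.0, 30.0)]  # polygon ring

# Raster data: a 3 x 3 elevation grid with 10-unit cells.
elevation = Raster(
    x_min=0.0, y_max=30.0, cell_size=10.0,
    values=[[300.0, 310.0, 320.0],
            [200.0, 210.0, 220.0],
            [100.0, 110.0, 120.0]],
)
print(elevation.value_at(*site))
```

The raster needs only an origin and a cell size to place every sample, while each vector feature carries its own explicit coordinates.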

Page 6: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Geospatial Data Analysis

Geospatial Information Analysis: 3 Categories From O’Sullivan & Unwin (2003)

Spatial Data Manipulation: Investigate the relationships between geographic dataset layers

Examples: ‘point-in-polygon’, buffer zones around spatial features

GIS software typically used to view / manipulate / create layers

Spatial/Statistical Data Analysis: Descriptive and Explanatory: What is there? How do we categorize it?

Data points treated as statistical ‘population’, compared to others

Spatial Modeling: Construct models to explore and understand geospatial systems

Based on ‘abstraction’ of domain-specific problem into a systems framework. Some examples:

Predicting network flows; optimizing facility locations among demands

Lessons learned building model as valuable as model’s ‘answers’
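The ‘point-in-polygon’ and buffer-zone operations named under Spatial Data Manipulation can be sketched as follows. This uses the common even-odd ray-casting test, which is one standard approach and not necessarily what any particular GIS implements; the function names and sample coordinates are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Even-odd ray casting. `ring` lists the vertices once (not closed)."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # Easting at which this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def within_buffer(pt: Point, feature: Point, radius: float) -> bool:
    """True if pt lies inside a circular buffer around a point feature."""
    return math.hypot(pt[0] - feature[0], pt[1] - feature[1]) <= radius


reserve = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)]
print(point_in_polygon((5.0, 5.0), reserve))
print(point_in_polygon((15.0, 5.0), reserve))
print(within_buffer((3.0, 4.0), (0.0, 0.0), 5.0))
```

Points that fall exactly on a polygon edge are a known edge case for this test and are not handled specially here.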

Page 7: Geospatial Data and Spatial Data Analysis Tools For Ecologists

The Challenge of Geospatial Analysis

Geospatial data violate some key statistical assumptions

• Must be addressed in the experimental design and sampling scheme
• Require specialized assessment techniques to factor out effects

Spatial Autocorrelation

• Samples are NOT randomly selected from a normally distributed population
• In fact, nearby samples are more likely to be similar than distant ones
• Autocorrelated data points introduce redundancy into the sample set

Spatial Scaling

• AKA Modifiable Areal Unit Problem
• Statistical relationships in an area may change at different aggregations
• The placement of the sampling grid can introduce artifacts

Nonuniform sampling space, edge effects

Geospatial data attributes have explanatory power

• Spatial relationships may be causes for observed phenomena
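One widely used measure of spatial autocorrelation is Moran's I; the slides do not name a specific statistic, so its use here is an assumption. The sketch below computes it for sites along a one-dimensional transect, where each site's neighbors are the adjacent sites. The helper `chain_weights` and the sample values are hypothetical.

```python
from typing import List


def morans_i(values: List[float], weights: List[List[float]]) -> float:
    """Moran's I = (n / sum of weights) * (weighted cross-products / sum of squares)."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    squares = sum(d * d for d in dev)
    return (n / w_sum) * (cross / squares)


def chain_weights(n: int) -> List[List[float]]:
    """Binary contiguity weights: sites i and j are neighbors when adjacent."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]


w = chain_weights(8)
clustered = [1.0, 1.0, 1.0, 1.0, 9.0, 9.0, 9.0, 9.0]    # similar values sit together
alternating = [1.0, 9.0, 1.0, 9.0, 1.0, 9.0, 1.0, 9.0]  # neighbors always differ

print(morans_i(clustered, w))    # positive: nearby samples are alike
print(morans_i(alternating, w))  # negative: nearby samples are unlike
```

Both sample sets have the same mean and variance, so ordinary summary statistics cannot tell them apart; only a measure that uses the spatial arrangement does.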

Page 8: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Selecting Geospatial Software Tools

Geospatial software: layered software architecture

• Data layer: Efficiently store geospatial data
  • Feature set + spatial coordinates
• Analytic layer: Spatial/statistical analysis algorithms
  • Statistical packages increasingly contain geospatial analysis tools
• Visualization layer: Creates data views (AKA maps)

Geospatial tools broadly divided into two categories

• Geographic Information Systems (GIS)
  • Three software layers are each extensive, ‘feature rich’
• Geospatial Analysis Packages
  • Data layer is ‘thinner’, Analytic layer ‘thicker’
  • Visualization layer built on existing data plotting tools

Page 9: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Geospatial Software Tools: GIS ‘Value Added’

Data layer is optimized for efficient geospatial data storage/processing

Raster and Vector Data storage, ‘mixed mode’ operations

Georeferencing tools for data layer projection, spatial registration

Map Algebra tools foster analysis and creation of data layers

Comprehensive cartographic tools for output map design
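Map Algebra, in its simplest ‘local’ form, combines co-registered raster layers cell by cell to create a new layer. A minimal sketch, assuming two small grids of identical shape; the function name `local_op`, the layer values, and the 400 m threshold are all hypothetical.

```python
from typing import Callable, List

Grid = List[List[float]]


def local_op(a: Grid, b: Grid, fn: Callable[[float, float], float]) -> Grid:
    """Apply fn to each pair of co-located cells and return the new layer."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("grids must share the same shape")
    return [[fn(va, vb) for va, vb in zip(ra, rb)] for ra, rb in zip(a, b)]


elevation = [[120.0, 480.0],
             [950.0, 1500.0]]   # meters
forest = [[1.0, 1.0],
          [0.0, 1.0]]           # 1 = forested, 0 = not forested

# New layer: forested cells above 400 m.
high_forest = local_op(elevation, forest,
                       lambda e, f: 1.0 if f == 1.0 and e > 400.0 else 0.0)
print(high_forest)
```

Full map algebra also includes focal (neighborhood) and zonal operations, which are not shown here.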

Page 10: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Geospatial Software Tools: GIS Caveats

Underdeveloped geostatistical processing tools

• Vendors pressured to include them in product
• Yet validation data and algorithm details not available
• Often, these are critical tools for ecological analysis

Steep Learning Curve

• Identifying, mastering ‘essential’ features a challenge

Cost: GIS software can be expensive

• Upfront purchase and yearly license fees
• Time investment in training and data maintenance

Workload

• If non-GIS software must be used for part of the analysis, time must be spent moving between software packages

Page 12: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Geospatial Software Tools: Choosing

Some Suggested Selection Criteria

Research objectives should drive choice of tools

• Identify the project’s core geospatial processing needs

Platform Flexibility

• Select tools supported on multiple platforms (hardware / operating system)
• Widely supported/used platforms foster collaboration

Solution ‘Visibility’

• Can you obtain the details of the algorithm?
• Does the community recognize the accuracy of the algorithm?

Costs of implementing your research idea in software

• Scripted solutions using integrated environments are best: R, SAS, MATLAB
• Avoid development in high-level programming languages

Page 13: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Geospatial Software Tools: Choosing

Select GIS for core needs:

• Construct, compare, create multiple spatial data layers
• Simultaneously analyzing vector and raster data
• Creating detailed, production-quality study site maps
• Your data is exclusively in the GIS product format
• You require spatial analysis tools unavailable outside GIS

Select Geospatial Analysis tools for core needs:

• Spatial/Statistical data analysis is the focus
• Your mapping requirements are modest
  • Two-dimensional data plots with geographic coordinates, legend
• You need in-depth understanding of algorithms used
  • Or, you wish to extend / modify the algorithms

Page 14: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Sources for Geospatial Software Tools

Commercial Software Products

• For-profit corporations sell or license their software
• Major players produce comprehensive products
  • ESRI (maker of ArcGIS) is the dominant GIS vendor
  • Their goal: Provide a solution for every geospatial application
• Other vendors offer tailored solutions
  • Examples: ENVI / IDL, ERDAS: Remote Sensing oriented GIS
  • Example: S Plus Spatial Statistics: Geospatial statistics and spatial data visualization enhancements to statistical package
  • Example: MATLAB has mapping and image processing toolkits
  • Example: SAS offers GIS, geospatial software tools

Commercial products often drive geospatial data formats

• Example: ESRI Shape File, ERDAS IMG file

Page 15: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Sources for Geospatial Software Tools

Open Source Software

• Broad-based effort by worldwide scientific and research community
• Often distributed under the GNU General Public License (GPL)
• Software development and maintenance by the user community
• Most significant geospatial analysis products: R, GRASS GIS
• Examples of others: PostGIS, GDAL libraries

Visit FreeGIS.org, or the open software foundation sites.

Page 16: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Tradeoffs: Commercial GIS Software

Centralized documentation and product support…

• At a price of $100s to $1000s per year

Comprehensive, integrated software product

• Data/Analytic/Visualization layers populated with features
• Steep learning curve: Where are my ‘essential features?’

Training always available – at a cost…

Details of proprietary geospatial algorithms usually unavailable

Page 17: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Tradeoffs: Open Source GIS Software

Open Source Software

• Often distributed under the GNU General Public License (GPL)
• Software development and maintenance by the user community
• Most significant geospatial analysis products: R, GRASS GIS

Many applications available via the Internet, but…

• Quality, features, support, and documentation are inconsistent
• Algorithms and even source code are freely available

Open Source software drawbacks are shrinking as the user support community evolves and matures

• But active participation in the community is advised for those wishing to stay technically proficient

Page 18: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Sources for Geospatial Data

Government Agencies

• National Mapping and Survey Agencies: surface cover data
  • USGS
• Research Centers: Climate forecasting models
  • NOAA, NASA, NCDC

For-Profit Corporations

• The highest-quality UNCLASSIFIED imagery now acquired by the private sector
• Sometimes, no-cost government data is resold to public

Data widely available via the Internet

• Many data sets available at no or low cost
  • Notable exception: Satellite Remote Sensing data
  • Some discounts available to education and/or research entities
• The best sites allow ‘search by geographic coordinates’
  • Examples from NCEAS Scientific Computing web site

Page 19: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Popular Geospatial Data Formats

Meteorological and Climatological Data

• Historical measurements
• Short-term model-based forecasts (3 – 10 days from now)
• Long-term predictions (10 – 100 years): General Circulation Models
• Widely used formats: Gridded Binary (GRIB), NetCDF

Political and Physiographic Features

• Country boundaries
• Road networks
• Drainage networks
• Widely used formats: Digital Line Graphs (DLG), ESRI Shape Files (.shp)

Most GIS/Geospatial packages ingest these formats

• Or conversion utilities are available to ingest them

Page 20: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Popular Geospatial Data Formats

Remote Sensing Imagery

• Many operational systems provide many kinds of images
  • Multispectral Imagery: Landsat, SPOT, IKONOS
• Data formats tend to be sensor-specific
• Most GIS can ingest most imagery types
• Portal sites:
  • Commercial: http://www.vterrain.org/Imagery/commercial.html
  • Govt: http://www.nationalgeographic.com/maps/map_links.html

Digital Terrain Models

• Raster grid datasets containing elevation measurements
• Available for complete Earth land surface
• Primary format: USGS Digital Elevation Model (DEM)
  • AKA National Elevation Dataset (NED)
• Portal sites:
  • USGS: http://gisdata.usgs.net/Website/Seamless/
  • Terrainmap.org: http://www.terrainmap.org/

Page 21: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Tour of the Scientific Computing Web Site

• Links to Data Sources
• Links to Geospatial Software Sources
• Links to Tutorials and Research Papers
• Archive of NCEAS Research Projects

http://www.nceas.ucsb.edu/scicomp

Page 22: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Example: Spatial Modeling: Optimization

Route vehicles along network using environmental costs as a metric

Simultaneously locate facilities along shipment routes that mitigate environmental costs

Optimal location of species reserve sites

Develop and compare performance of alternate solution methods

• Mathematically optimal but operationally impractical
• Heuristically derived, near-optimal, usable solution
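The contrast between a mathematically optimal method and a heuristic one can be sketched with a small facility-location problem. The p-median formulation and the greedy heuristic below are stand-ins chosen for illustration; the slides do not say which formulation or heuristic was actually used, and the site positions are made up.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def total_cost(dist: List[List[float]], sites: Sequence[int]) -> float:
    """Sum over demand points of the distance to the nearest chosen site."""
    return sum(min(row[s] for s in sites) for row in dist)


def exhaustive(dist: List[List[float]], p: int) -> Tuple[int, ...]:
    """Optimal answer by trying every combination; cost grows combinatorially."""
    n_sites = len(dist[0])
    return min(combinations(range(n_sites), p),
               key=lambda s: total_cost(dist, s))


def greedy(dist: List[List[float]], p: int) -> Tuple[int, ...]:
    """Heuristic: repeatedly add the single site that lowers total cost most."""
    chosen: List[int] = []
    candidates = set(range(len(dist[0])))
    while len(chosen) < p:
        best = min(sorted(candidates),
                   key=lambda s: total_cost(dist, chosen + [s]))
        chosen.append(best)
        candidates.remove(best)
    return tuple(sorted(chosen))


# Six locations along a line serve as both demand points and candidate sites.
positions = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
dist = [[abs(a - b) for b in positions] for a in positions]

best = exhaustive(dist, 2)
quick = greedy(dist, 2)
print(best, total_cost(dist, best))
print(quick, total_cost(dist, quick))
```

On this toy instance the greedy answer costs slightly more than the optimum, because its first pick is made without knowing a second site will follow. On realistic problem sizes the exhaustive search becomes impractical, which is the tradeoff the slide describes.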

Page 23: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Spatial Modeling: The Problem Domain

Page 24: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Geospatial Dataset: Routes + Locations

Page 25: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Spatial Model Solution: Alternative Methods

Page 26: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Selecting Species Reserves Locations

Dr. Ross Gerrard, UCSB Biogeography Lab, 1996

Page 27: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Example: Spatial Data Manipulation

Elevation zone threshold calculation

• Digital Elevation Models for selected worldwide sites
• Classify sites into 100 meter ‘wide’ elevation zones

General Circulation Model climate data extraction

• Identify, obtain, import GCM data files
• Import the data into GIS as raster grid
• Overlay point file, extract matching climate values
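Both tasks on this slide can be sketched in a few lines. The code below is not the NCEAS workflow, which ran inside a GIS; it is a minimal illustration under stated assumptions: elevations are binned into 100 m zones by integer division, and each point takes the value of the nearest grid-cell center. All function names, grid values, and coordinates are hypothetical, and longitude wraparound is ignored.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def elevation_zone(elev_m: float, width_m: float = 100.0) -> int:
    """Zone index: 0 covers [0, 100) m, 1 covers [100, 200) m, and so on."""
    return math.floor(elev_m / width_m)


def zone_histogram(dem: List[List[float]], width_m: float = 100.0) -> Dict[int, int]:
    """Count how many DEM cells fall into each elevation zone."""
    return dict(Counter(elevation_zone(v, width_m) for row in dem for v in row))


def extract_at_points(grid: List[List[float]], lats: List[float],
                      lons: List[float],
                      points: List[Tuple[float, float]]) -> List[float]:
    """For each (lat, lon) point, return the value of the nearest cell center."""
    out = []
    for lat, lon in points:
        r = min(range(len(lats)), key=lambda i: abs(lats[i] - lat))
        c = min(range(len(lons)), key=lambda j: abs(lons[j] - lon))
        out.append(grid[r][c])
    return out


dem = [[95.0, 120.0, 199.0],
       [200.0, 250.0, 305.0]]          # elevations in meters
print(zone_histogram(dem))

# Coarse climate grid: rows indexed by latitude, columns by longitude.
lats = [35.0, 32.5]
lons = [-120.0, -117.5]
temperature = [[14.2, 15.1],
               [16.0, 17.3]]           # made-up values
sites = [(34.4, -119.7), (32.7, -117.2)]
print(extract_at_points(temperature, lats, lons, sites))
```

Nearest-cell lookup is the simplest extraction rule; interpolating between neighboring cells is a common alternative when the grid is coarse relative to the study sites.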

Page 28: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Digital Elevation Data Ingestion / Clipping

Page 29: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Elevation Zone Data Analysis

Page 30: Geospatial Data and Spatial Data Analysis Tools For Ecologists

General Circulation Model data extraction

Page 31: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Spatial Analysis: ArcGIS and R Platforms

• ESRI Shape files exported to the R programming environment

• R Geostatistical and Spatial Analysis methods can then be applied

Page 32: Geospatial Data and Spatial Data Analysis Tools For Ecologists

A Sampling: R Geospatial Analysis packages

clim.pact: Climate data analysis and downscaling tools

geoR: Geostatistical data analysis: variograms, et al.

maptools: read/manipulate polygon data (ESRI .shp)

shapefiles: read/manipulate ESRI shape files

sgeostat: Geostatistical modeling code

splancs: Spatial and space-time point patterns

spatstat: Spatial point pattern analysis

Page 33: Geospatial Data and Spatial Data Analysis Tools For Ecologists

Concluding thoughts

NCEAS Associates make extensive use of geospatial data in many creative ways

Geospatial Data Analysis requires specialized techniques

GIS and geospatial analysis tools are available from commercial vendors and the open source community

Choosing geospatial data and tools can be overwhelming and distract from the primary ‘science mission’

Scientific Programming Team has geospatial expertise, and can assist NCEAS Associates in this domain

Coming soon: Short course on the R Programming Language!