interfacing r with gis

Post on 24-Feb-2016

29 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

GIS in the Sciences ERTH 4750 (38031). Interfacing R with GIS. Xiaogang (Marshall) Ma School of Science Rensselaer Polytechnic Institute Tuesday, Apr 02, 2013. A review of our course outline. MapInfo. Week 1 (Jan. 22/25): Introduction to GIS - PowerPoint PPT Presentation

TRANSCRIPT

Interfacing R with GIS

Xiaogang (Marshall) MaSchool of Science

Rensselaer Polytechnic InstituteTuesday, Apr 02, 2013

GIS in the SciencesERTH 4750 (38031)

MapInfo

RMashup

• Week 1 (Jan. 22/25): Introduction to GIS• Week 2 (Jan. 29/Feb. 1): Geographic information and spatial data types • Week 3 (Feb. 5/8): Spatial referencing• Week 4 (Feb 12/15): Geostatistical computing• Week 5 (Feb. 19/22): Exploring and visualizing spatial data• Week 6 (Feb. 26/Mar. 1): Modeling spatial structure from point samples• Week 7 (Mar. 5/8): Spatial prediction from point samples (Part 1)• Week 8 (Mar. 12/15: no classes - spring break) • Week 9 (Mar. 19/22): Spatial prediction from point samples (Part 2)• Week 10 (Mar. 26/29): Assessing the quality of spatial predictions• Week 11 (Apr. 2/5): Interfacing R spatial statistics with GIS• Week 12 (Apr. 9/12: no class - Grand Marshal week)• Week 13 (Apr. 16/19): Efficient and effective result presentation with GIS• Week 14 (Apr. 23/26): Tuesday: Guest Lecture• Week 15 (Apr. 30): Tuesday: Short final project presentations

A review of our course outline

2

Acknowledgements

• This lecture is partly based on:– Bivand, R.S., Pebesma, E.J., Gómez-Rubio, V., 2008.

Applied Spatial Data Analysis with R. Springer, New York, NY, 374pp.

– Rossiter, D.G., 2012. R and GIS. Exercise in distance course Applied Geostatistics. ITC, University of Twente

3

Contents

1. GIS and R: Overview2. Classes for spatial data in R3. Handling coordinate reference systems in R4. Spatial data import and export in R5. Examples

4

1 GIS and R: Overview

• GIS: a computerized system that facilitates the phases of data entry, data analysis and data presentation especially in cases when we are dealing with georeferenced data.– Geographic phenomena: continuous field, discrete field, collection

of objects– Computer representations: raster and vector– Spatial autocorrelation and topology– Spatial reference: definitions, physical/geometric constructs and

tools required to describe the geometry and motion of objects near and on the Earth’s surface

– Datum (vertical and horizontal): a surface that represents the shape of the earth

– Map projection: map the curved surface of the Earth (as modeled by an ellipsoid) onto a flat plane

5

• So, data analysis is an essential part of a GIS• Statistics is a primary way of data analysis

– Descriptive and Inferential statistics– Geostatistics: statistics on a population with known location

• R: an open-source environment for statistical computing and visualization– Exploring and visualizing spatial data: spatial structure,

regional trends, local spatial dependence and anisotropy– Modeling spatial structure from point samples– Spatial prediction from point samples– Assessing the quality of spatial predictions

6

GIS and R

• Users approaching R with experience of GIS will want to see ‘layers’, ‘vectors’, ‘rasters’, and ‘georeferences’, etc.– They desire that R accommodates familiar concepts and well-

known data models and formats

• For statistician users of R, ‘everything’ is a data.frame, a rectangular table with rows of observations on columns of variables– They want their geo-data analysis result can be visualized and

presented in main-stream GIS software programs, and thus extend the usage of the result.

7

• In R, classes have grown that look like GIS data models to GIS people, and look and behave like data frames from the point of view of applied statisticians.– We have used the R packages sp, gstat and lattice quite often in

the past few weeks– sp: provides classes for importing, manipulating and exporting

spatial data in R, and for methods including print/show, plot, subset, names, dim, summary, and a number of methods specific to spatial data handling

– gstat: creates gstat objects that hold all the information necessary for univariate or multivariate geostatistical prediction (simple, ordinary or universal (co)kriging), or its conditional or unconditional Gaussian or indicator simulation equivalents

– lattice: a high-level data visualization system with an emphasis on multivariate data

8

• A simplified view

9

Data import Data

analysis

Data export

GIS(MapInfo, ArcGIS, Google

Earth, etc.)R GIS

(MapInfo, ArcGIS, Google Earth, etc.)

• The real world is much more complicated– Combining spatial data from different sources often means that

much more insight is needed into the data formats and data models involved

• Methods are needed for handling and combining multisource datasets– Transformation of coordinate reference systems– Data import and export– Mash-up for themed studies

10

2 Classes for spatial data in R

• Data frames are containers for data used everywhere in S– view data as a rectangle of rows of observations on columns of

variables

• New-style (S4) classes were introduced in the S language at release 4– The central advantage of new-style classes is that they have

formal definitions that specify the name and type of the components, called slots, that they contain.

11

• For spatial objects the foundation class is the Spatial class, with just two slots.– The first is a bounding box, a matrix of numerical coordinates with

column names c(‘min’, ‘max’), and at least two rows, with the first row eastings (x-axis) and the second northings (y-axis).

– The second is a CRS class object defining the coordinate reference system, and may be set to ‘missing’, represented by NA in R, by CRS(as.character(NA)), its default value.

• Operations on Spatial* objects should update or copy these values to the new Spatial* objects being created.

12

> m <- matrix(c(0, 0, 1, 1), ncol = 2, dimnames = list(NULL,+ c("min", "max")))

> crs <- CRS(projargs = as.character(NA))

> S <- Spatial(bbox = m, proj4string = crs)

13

Build a simple Spatial object from a bounding box matrix, and a missing coordinate reference system

SpatialPoints

• The SpatialPoints class is the first subclass of Spatial in S

> getClass("SpatialPoints")Slots:Name: coords bbox proj4stringClass: matrix matrix CRSExtends: "Spatial"Known Subclasses:Class "SpatialPointsDataFrame", directlyClass "SpatialPixels", directlyClass "SpatialGrid", by class "SpatialPixels", distance 2Class "SpatialPixelsDataFrame", by class "SpatialPixels", distance 2Class "SpatialGridDataFrame", by class "SpatialGrid", distance 3

14

SpatialLines

• SpatialLines object contains the bounding box and projection information for the list of Lines objects stored in its lines slot.

> getClass("SpatialLines")Slots:Name: lines bbox proj4stringClass: list matrix CRSExtends: "Spatial"Known Subclasses: "SpatialLinesDataFrame"

15

SpatialPolygons

• A SpatialPolygons object is a set of Polygons objects with the additional slots of a Spatial object to contain the bounding box and projection information of the set as a whole

> getClass("SpatialPolygons")Slots:Name: polygons plotOrder bbox proj4stringClass: list integer matrix CRSExtends: "Spatial"Known Subclasses: "SpatialPolygonsDataFrame"

16

3 Handling coordinate reference systems in R

• Datum: a surface that represents the shape of the earth• Projection: render the surface of an ellipsoid as a plane

• Projection and datum together are referred to as a coordinate reference systems (CRS)– Most countries have multiple CRS, often for very good reasons.

This section can be covered in the general topic of ‘Spatial data import and export in R’ (next section), but is worthy to be discussed separately

17

The EPSG List

• The European Petroleum Survey Group (EPSG; now Oil & Gas Producers (OGP) Surveying & Positioning Committee) began collecting a geodetic parameter data set starting in 1986, based on earlier work in member companies.

• The EPSG list is under continuous development and/or update. A copy of the list is provided in the rgdal package, because it permits the conversion of a large number of CRS into the PROJ.4 style description.– Datum transformation is based on transformation to the World

Geodetic System of 1984 (WGS84), or inverse transformation from it to an alternative specified datum.

18

The PROJ.4 CRS Specification

• PROJ.4 is an open-source of library of projection functions; it is not part of R or the sp package, and may be used stand-alone or linked in another program.

• The PROJ.4 library uses a ‘tag=value’ representation of coordinate reference systems, with the tag and value pairs enclosed in a single character string– The only values used autonomously in CRS class objects are

whether the string is a character NA (missing) value for an unknown CRS, and whether it contains the string longlat, in which case the CRS contains geographical coordinates

19

> data(meuse)> coordinates(meuse) = ~x + y> str(meuse)

We can browse the CRS information either by read the @proj4string slot in the result of str(meuse) or by using the proj4string function

> proj4string(meuse)

For the meuse data the result is NA

20

Browse CRS information of a spatial object

> ?meuseThe metadata of meuse dataset shows that the coordinates are in the RDH (Rijksdriehook = Dutch triangulation) CRSWe use the make_EPSG utility function of the rgdal package to load the definitions into a list, and then search with grep for the string Amersfoort, which is the name of the origin of the Rijksdriehook (Dutch triangulation) coordinate system> EPSG <- make_EPSG()> (EPSG[grep("Amersfoort", fixed = T, EPSG$note), ])> rm(EPSG)There are two systems ‘RD old’ and ‘RD new’. meuse data uses ‘RD new’, which is corresponding to reference 28992 in the EPSG database> proj4string(meuse) <- CRS("+init=epsg:28992")We can now browse the CRS information of meuse> proj4string(meuse)

21

Specify a CRS with the EPSG database (1)

Unfortunately, the EPSG is not (as of June 2012) up-to-date; in particular it is missing a key issue: the offset of the center of the Earth in the RD system to that in the “WGS84” CRSThe Netherlands Geodetic Commission [5] has published the following adjustment+towgs84=565.237,50.0087,465.658,-0.406857,0.350733,-1.87035,4.0812Add the WGS84 displacement to the CRS definition for the Meuse dataset> proj4string(meuse) <- CRS(paste(proj4string(meuse),+ "+towgs84=565.237,50.0087,465.658,-0.406857,0.350733,-1.87035,4.0812"))We can now browse the updated CRS information of meuse> proj4string(meuse)

22

Specify a CRS with the EPSG database (2)

Result of proj4string(meuse)

[1] " +init=epsg:28992 +proj=sterea +lat_0=52.15616055555555+lon_0=5.38763888888889 +k=0.9999079 +x_0=155000 +y_0=463000 +ellps=bessel+units=m +no_defs+towgs84=565.237,50.0087,465.658,-0.406857,0.350733,-1.87035,4.0812"Now browse the records of the coordinates> head(coordinates(meuse))Result

x y[1,] 181072 333611[2,] 181025 333558[3,] 181165 333537[4,] 181298 333484[5,] 181307 333330[6,] 181390 333260

23

Transformation of CRS

• The rgdal package provides methods to convert between projections and datums; this dual process we can call “transformation” between CRS– The work is done by the spTransform method, taking any Spatial*

object, and returning an object with coordinates transformed to the target CRS.

24

Transform the Meuse dataset from the Rijksdriehook CRS to geographic coordinates on the WGS84 ellipsoid> meuse.wgs84 <- spTransform(meuse, CRS("+proj=longlat +datum=WGS84"))We can now browse the updated CRS information of meuse> proj4string(meuse.wgs84)Result[1] " +proj=longlat +datum=WGS84 +ellps=WGS84 +towgs84=0,0,0"Now browse the records of the coordinates> head(coordinates(meuse))Result

x y[1,] 5.758560 50.99153[2,] 5.757887 50.99105[3,] 5.759880 50.99086[4,] 5.761770 50.99037[5,] 5.761887 50.98899[6,] 5.763064 50.98836

25

An example of CRS transformation

4 Spatial data import and export in R

• Transformation between sp objects in R and formats of external GIS programs– A number of open source projects made the transformation

possible, such as: – Geospatial Data Abstraction Library ( http://www.gdal.org )– PROJ.4 Cartographic Projections Library ( http://proj.maptools.org

)

26

Vector File Formats

• Spatial vector data are points, lines, polygons, and fit the equivalent sp classes.– GIS vector data can be either topological or simple. Legacy GIS

were topological, desktop GIS were simple – The sp vector classes are simple, meaning that for each polygon

all coordinates are stored without checking that boundaries have corresponding points

27

Using OGR Drivers in rgdal

• Using the OGR vector functions of the Geospatial Data Abstraction Library, interfaced in rgdal, lets us read spatial vector data for which drivers are available. – A driver is a software component plugged-in on demand –

here the OGR library tries to read the data using all the formats that it knows, using the appropriate driver if available.

• OGR also supports the handling of coordinate reference systems directly, so that if the imported data have a specification, it will be read.

28

• The availability of OGR drivers differs from platform to platform, and can be listed using the ogrDrivers function. The function also lists whether the driver supports the creation of output files.

• The readOGR function: reads an OGR data source and layer into a suitable Spatial vector object– It takes at least two arguments – they are the data source name

(dsn) and the layer (layer), and may take different forms for different drivers.

• The writeOGR function: allowing data to be written out using supported drivers– Like readOGR, it uses drivers to handle different data formats

29

• Other vector import/export functions (only with shapefiles)– The shapefiles package is written without external libraries, using

file connections. It can be very useful when a shapefile is malformed, because it gives access to the raw numbers.

– The maptools package contains a local copy of the library used in OGR for reading shapefiles (the DBF reader is in the foreign package), and provides a low-level import read.shape function, a helper function getinfo.shape to identify whether the shapefile contains points, lines, or polygons.

30

Raster File Formats

• There are now a number of R packages that support image import and export– Such as the rimage and biOps packages and the EBImage

package in the Bioconductor project– The requirements for spatial raster data handling include

respecting the coordinate reference system of the image, so that specific solutions are needed

• Using arguments to readGDAL, subregions or bands may be selected, and the data may be decimated, which helps handle large rasters.

• The writeGDAL function can be used directly for drivers that support file creation.

31

• Other raster import/export functions– There is a simple readAsciiGrid function in maptools that

reads ESRI™ Arc ASCII grids into SpatialGridDataFrame objects; it does not handle CRS and has a single band. The companion writeAsciiGrid is for writing Arc ASCII grids.

– It is also possible to use connections to read and write arbitrary binary files, provided that the content is not compressed.

32

5 Examples

1. Read and write MapInfo data2. Export meuse data to Google Earth

33

With this example we only show the data import and export First, load the libraries needed> require(sp)> require(gstat)> require(rgdal)> require(maptools)> require(lattice)

Set the working directory, where the MapInfo datasets ("CITY_125.DAT“, "CITY_125.ID“, "CITY_125.IND“, "CITY_125.MAP“, and "CITY_125.TAB") located> setwd("E:/RnGIS")

34

Read and write MapInfo data

Show the metadata of the layer CITY_125, including CRS information> ogrInfo("CITY_125.TAB", layer="CITY_125")ResultSource: "CITY_125.TAB", layer: "CITY_125"Driver: MapInfo File number of rows 125 Feature type: wkbPoint with 2 dimensionsExtent: (-157.8583 18.40056) - (-66.10611 61.21806)CRS: +proj=longlat +ellps=GRS80 +towgs84=0,0,0,-0,-0,-0,0 +no_defs Number of fields: 4 name type length typeName1 City 4 50 String2 State 4 2 String3 Tot_hu 0 0 Integer4 Tot_Pop 0 0 Integer

35

Read this source and layer into a suitable Spatial vector object> CITY125 <- readOGR("CITY_125.TAB", layer="CITY_125")Returned noticeOGR data source with driver: MapInfo File Source: "CITY_125.TAB", layer: "CITY_125"with 125 features and 4 fieldsFeature type: wkbPoint with 2 dimensions

36

Show the class of CITY125> class(CITY125)Result[1] "SpatialPointsDataFrame"attr(,"package")[1] "sp"

Show the CRS information of CITY125> proj4string(CITY125)Result"+proj=longlat +ellps=GRS80 +towgs84=0,0,0,-0,-0,-0,0 +no_defs“

37

Plot CITY125 in a diagram> plot(CITY125, pch=1, col="red", border="grey60", + axes=TRUE, asp=1,+ main="Big cities in USA")Result

38

Add name of cities in the diagram> text(coordinates(CITY125), + labels=as.character(CITY125$City),+ font=2, cex=0.25, pos=4)Result

39

Export the spatial object to a MapInfo file. dsn="." means exporting to the working folder> writeOGR(CITY125, dsn=".", + layer="CITY125RwriteToMapInfo",+ driver="MapInfo File", overwrite_layer=TRUE)Result: in the working folder there will be a list of generated MapInfo files

40

• Here we show a simple example of exporting meuse data Zn values as a KML file. In coming lab there will be a few enriched examples with exporting result of kriging predication into Google Earth

Load the meuse data> data(meuse)> coordinates(meuse) = ~ x + y

Specify the CRS infromation> EPSG <- make_EPSG()> (EPSG[grep("Amersfoort", fixed=T, EPSG$note), ])> proj4string(meuse) <- CRS("+init=epsg:28992")> proj4string(meuse) <- CRS(paste(proj4string(meuse),+ "+towgs84=565.237,50.0087,465.658,+ -0.406857,0.350733,-1.87035,4.0812"))

41

Export meuse data to Google Earth

Transform the Meuse dataset from the Rijksdriehook CRS to geographic coordinates on the WGS84 ellipsoid> meuse.wgs84 <- spTransform(meuse, + CRS("+proj=longlat +datum=WGS84"))

Export the point coverage of Meuse Zn values as a KML file.> writeOGR(meuse.wgs84["zinc"], "meuseZnPoints.kml",+ "zinc", driver = "KML", overwrite_layer = TRUE)

In the working folder there will be a KML file generated, which can be opened in Google Earth

42

43

We can also export the result of predication to Google Earth (details will be in the coming Friday lab)

• Reading assignments for this week– Handout: Chapter 4 Spatial Data Import and Export. In:

Bivand, R.S., Pebesma, E.J., Gómez-Rubio, V., 2008. Applied Spatial Data Analysis with R. Springer, New York, NY, 374pp.

– Chapter 9 Selecting and Querying Data. In: MapInfo Professional 11.0 User Guide

44

Next classes

• Friday class:– Interfacing R and GIS – practices

• Next week– no class – Grand Marshal week

• Term assignment

45

top related