Dave Bloom Museum of Vertebrate Zoology University of California, Berkeley Georeferencing Introduction: Collaboration to Automation

Upload: nathan-davidson

Post on 28-Dec-2015

TRANSCRIPT

Dave Bloom
Museum of Vertebrate Zoology

University of California, Berkeley

Georeferencing Introduction: Collaboration to Automation

Georeferencing

Collaborations

Automation

What is a georeference?

A numerical description of a place that can be mapped.

What is a georeference?

In other words…

ID  Species         Locality
1   Lynx rufus      Dawson Rd. N Whitehorse
2   Pudu puda       cerca de Valdivia
3   Canis lupus     20 mi NW Duluth
4   Felis concolor  Pichi Trafúl
5   Lama alpaca     near Cuzco
6   Panthera leo    San Diego Zoo
7   Sorex lyelli    Lyell Canyon, Yosemite
8   Orcinus orca    1 mi W San Juan Island
9   Ursus arctos    Bear Flat, Haines Junction

What we have: Localities we can read

Darwin Core Location Terms

– higherGeography
– waterbody, island, islandGroup
– continent, country, countryCode, stateProvince, county, municipality
– locality
– minimumElevationInMeters, maximumElevationInMeters, minimumDepthInMeters, maximumDepthInMeters

What we want: Localities we can map

Darwin Core Georeference Terms

– decimalLatitude, decimalLongitude
– geodeticDatum
– coordinateUncertaintyInMeters
– georeferencedBy, georeferenceProtocol
– georeferenceSources
– georeferenceVerificationStatus
– georeferenceRemarks
– coordinatePrecision
– pointRadiusSpatialFit
– footprintWKT, footprintSRS, footprintSpatialFit

What is a georeference?

A numerical description of a place that can be mapped.

“Davis, Yolo County, California”

“point method”

Coordinates: 38.5463 -121.7425
Horizontal Geodetic Datum: NAD27

Data Quality

• data have the potential to be used in ways unforeseen when collected.

• the value of the data is directly related to the fitness for a variety of uses.

• “as data become more accessible many more uses become apparent.” – Chapman 2005

• the GBIF Best Practices (Chapman and Wieczorek 2006) promote data quality and fitness for use.

What is an acceptable georeference?

A numerical description of a place that can be mapped

and that describes the spatial extent of a locality

and its associated uncertainties.

“Davis, Yolo County, California”

“bounding-box method”

Coordinates: 38.5486 -121.7542
             38.545 -121.7394

Horizontal Geodetic Datum: NAD27
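The "simple spatial queries" credited to the bounding-box method can be illustrated with a minimal Python sketch. It assumes the first coordinate pair above is the north-west corner and the second the south-east corner (the slide does not label them), and the class and method names are hypothetical. It also ignores boxes that cross the antimeridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Bounding-box georeference (hypothetical helper, not part of any standard)."""
    north: float
    west: float
    south: float
    east: float
    datum: str = "NAD27"

    def contains(self, lat: float, lon: float) -> bool:
        # A containment query is just four comparisons.
        # Assumes the box does not cross the antimeridian.
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Corner assignment (NW, SE) is an assumption about the slide's layout.
davis_box = BoundingBox(north=38.5486, west=-121.7542, south=38.545, east=-121.7394)

inside = davis_box.contains(38.5468, -121.7469)   # a point between the two corners
outside = davis_box.contains(38.6000, -121.7469)  # a point north of the box
```

The query is cheap, but the box by itself says nothing about how well it fits the locality, which is the quality-assessment difficulty noted in the method comparison.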

“Davis, Yolo County, California”

“point-radius method”

Coordinates: 38.5468 -121.7469
Horizontal Geodetic Datum: NAD27
Maximum Uncertainty: 8325 m
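As a rough illustration of how a point-radius georeference might be stored and queried, here is a minimal Python sketch. The field names mirror the Darwin Core terms listed earlier, while the class, the `may_contain` method, and the use of a spherical haversine distance are assumptions made for this example. It is not the MaNIS calculation, and it assumes the test point is expressed in the same datum as the georeference.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius; spherical approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


@dataclass(frozen=True)
class PointRadius:
    """Point-radius georeference; attribute names follow Darwin Core terms."""
    decimal_latitude: float
    decimal_longitude: float
    geodetic_datum: str
    coordinate_uncertainty_in_meters: float

    def may_contain(self, lat: float, lon: float) -> bool:
        # True when the point lies within the uncertainty circle.
        d = haversine_m(self.decimal_latitude, self.decimal_longitude, lat, lon)
        return d <= self.coordinate_uncertainty_in_meters


# Values taken from the "Davis, Yolo County, California" example above.
davis = PointRadius(38.5468, -121.7469, "NAD27", 8325)

near = davis.may_contain(38.5968, -121.7469)  # 0.05 degrees north, roughly 5.6 km away
far = davis.may_contain(38.6468, -121.7469)   # 0.10 degrees north, roughly 11.1 km away
```

A single radius makes the quality of a record easy to compare, but every spatial query needs a distance computation like this one rather than four comparisons.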

What is an ideal georeference?

A numerical description of a place that can be mapped

and that describes the spatial extent of a locality

and its associated uncertainties
as well as possible.

“Davis, Yolo County, California”

“shape method”

“20 mi E Hayfork, California”

“probability method”

Method Comparison

point          easy to produce           no data quality
bounding-box   simple spatial queries    difficult quality assessment
point-radius   easy quality assessment   difficult spatial queries
shape          accurate representation   complex, uniform
probability    accurate representation   complex, non-uniform

MaNIS/HerpNET/ORNIS (MHO) Guidelines

http://manisnet.org/GeorefGuide.html

• uses point-radius representation of georeferences

• circle encompasses all sources of uncertainty about the location

• methodology formalizes assumptions, algorithms, and documentation standards that promote reproducible results

• methods are universally applicable

Darwin Core Georeference Terms

– decimalLatitude, decimalLongitude
– geodeticDatum
– coordinateUncertaintyInMeters
– georeferencedBy, georeferenceProtocol
– georeferenceSources
– georeferenceVerificationStatus
– georeferenceRemarks
– coordinatePrecision
– pointRadiusSpatialFit
– footprintWKT, footprintSRS, footprintSpatialFit

Georeferencing

Collaborations

Automation

Collaborative Distributed Databases for Vertebrates

Collaborations

MaNIS Localities Georeferenced

n = 326k localities (1.4M specimens)
r = 14 localities/hr (point-radius method)
t = 3 yrs (~40 georeferencers)

ORNIS Localities Georeferenced

n = 267k localities (1.4M specimens)
r = 30 localities/hr (point-radius method)
t = 2 yrs (~30 georeferencers)

Scope of the Problem for Natural History Collections

~2.5 × 10^9 records

~6 records per locality*

~14 (30) localities per hour*

~15,500 (7233) years

* based on the MaNIS (ORNIS) Project
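The year estimates can be reproduced with a short calculation. The slide does not state how hours were converted to years; the sketch below assumes a work year of 1,920 hours (40 hours per week for 48 weeks), a value chosen only because it matches the published figures. All names are illustrative.

```python
RECORDS = 2.5e9                 # approximate records in natural history collections
RECORDS_PER_LOCALITY = 6        # from the MaNIS project
HOURS_PER_WORK_YEAR = 40 * 48   # ASSUMPTION: 1,920 h/yr, inferred from the slide's totals


def person_years(localities_per_hour: float) -> float:
    """Person-years needed to georeference every locality at the given rate."""
    localities = RECORDS / RECORDS_PER_LOCALITY
    hours = localities / localities_per_hour
    return hours / HOURS_PER_WORK_YEAR


manis_years = person_years(14)  # MaNIS rate: about 15,500 years
ornis_years = person_years(30)  # ORNIS rate: about 7,230 years
```

Even at the faster ORNIS rate, the total is thousands of person-years, which is the motivation for the automation discussed next.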

Georeferencing

Collaborations

Automation

Automation

Combining the Best in Georeferencing

GeoLocate

GADM

MaNIS Georeferencing Calculator

GADM

Global Administrative Boundaries:

http://www.museum.tulane.edu/geolocate

Georeferencing Calculator: