community building and collaborative georeferencing using geolocate nelson e. rios & henry l....

40
Community Building Community Building and Collaborative and Collaborative Georeferencing using Georeferencing using GEOLocate GEOLocate Nelson E. Rios Nelson E. Rios & & Henry L. Bart Jr. Henry L. Bart Jr. Tulane University Museum of Tulane University Museum of Natural History Natural History

Upload: philip-lindsey

Post on 22-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

Community Building and Community Building and Collaborative Georeferencing Collaborative Georeferencing

using GEOLocateusing GEOLocate

Nelson E. RiosNelson E. Rios

&&

Henry L. Bart Jr.Henry L. Bart Jr.

Tulane University Museum of Natural HistoryTulane University Museum of Natural History

What is Georeferencing

• As applied to natural history collection data it is the process of assigning geographic coordinates to a textually described collecting event

• Traditional approaches laborious and time consuming

• Automated and collaborative processes have proven to improve efficiency

GEOLocate Overview

Desktop application for automated georeferencing of natural history collections data

Locality description analysis, coordinate generation, batch processing, geographic visualization, data correction and error determination

Initial release in 2002

Overview: Locality Visualization & Adjustment

Computed coordinates are displayed on digital maps

Manual verification of each record

Drag and drop correction of records

Overview:Multiple Result Handling

Caused by duplicate names, multiple names & multiple displacements

Results are ranked and most “accurate” result is recorded and used as primary result

All results are recorded and displayed as red arrows

Working on using specimen data to limit spread of results

Overview:Estimating Error

User-defined maximum extent described as a polygon that a given locality description can represent

Recorded as a comma delimited array of vertices using latitude and longitude

Current Lines of Development

• “Taxonomic Footprints”

• Collaborative georeferencing

• Global expansion

• Multilingual georeferencing with user-defined locality grammar

Taxonomic Footprint Validation

Taxa collected for a given locality

Uses point occurrence data from distributed museum databases to validate georeferenced data

Species A

Species B

Lepomis macrochirusLepomis macrochirus

Notropis chrosomusNotropis chrosomus

Notropis volucellusNotropis volucellus

Micropterus coosaeMicropterus coosae

LepomisLepomis cyanelluscyanellus

Cottus carolinaeCottus carolinae

Hypentelium etowanumHypentelium etowanum

Etheostoma ramseyiEtheostoma ramseyi

Footprint for specimens collected at Little Schultz Creek, off Co. Rd. 26 (Schultz Spring Road), approx. 5 mi Footprint for specimens collected at Little Schultz Creek, off Co. Rd. 26 (Schultz Spring Road), approx. 5 mi N of Centreville; Bibb County; White circles indicate results from automated georeferencing. Black circle N of Centreville; Bibb County; White circles indicate results from automated georeferencing. Black circle indicates actual collection locality based on GPS. This sample was conducted using data from UAIC & indicates actual collection locality based on GPS. This sample was conducted using data from UAIC & TUMNHTUMNH

Collaborative Georeferencing

• Distributed community effort increases efficiency

• Web based portal used to manage each community

• DiGIR used for data input (alternatives in development)

• Similar records from various institutions can be flagged and georeferenced at once

• Data returned to individual institutions via portal download as a comma delimited file

Collaborative GeoreferencingDiGIR Service

Record Processor

GEOLocate Desktop Application

Cache Update Web Service

Web Portal Application

Data Store

Georeferencing Web Service

Data Retrieval Web Service

Insert Correction Web Service

Remote Data Source

Global Georeferencing

Typically 1:1,000,000

Will work with users to improve resolution (examples: Australia250K & Spain200K)

Advanced features such as waterbody matching bridge crossing detection possible but requires extensive data compilation (example: Spain)

Multilingual Georeferencing

• Extensible architecture for adding languages via language libraries

• Language libraries are text files that define various locality types in a given language

• Current support for:– Spanish– Basque– Catalan– Galician

• May also be used to define custom locality types in English

Future Directions

• Collaboration with foreign participants to improve datasets and language libraries

• Cross platform Java client

• More web services integration

• Integration of WFS & WMS for mapping

• Alternatives to DiGIR

Final Thoughts

• Approximately 20% of DiGIR providers (excluding fish collections) tested have problems of stability and/or data quality resulting in the inability to cache records.

• Typical data quality issues:– Malformed data (common in “datelastmodified”)

– Catalog numbers missing and/or duplicated

• Resolution requires dedicated infrastructure for data providers or alternative means of serving data.

AcknowledgementsAcknowledgements

Djihbrihou Abibou Djihbrihou Abibou Andy BentleyAndy Bentley

Shwetha BelameShwetha BelamePaul FlemonsPaul Flemons

Demin HuDemin HuSophie HungSophie Hung

John JohansonJohn JohansonDavid Draper MuntDavid Draper Munt

Dave Vieglais Dave Vieglais Ed WileyEd Wiley

National Science FoundationNational Science Foundation