community building and collaborative georeferencing using geolocate nelson e. rios & henry l....
TRANSCRIPT
Community Building and Community Building and Collaborative Georeferencing Collaborative Georeferencing
using GEOLocateusing GEOLocate
Nelson E. RiosNelson E. Rios
&&
Henry L. Bart Jr.Henry L. Bart Jr.
Tulane University Museum of Natural HistoryTulane University Museum of Natural History
What is Georeferencing
• As applied to natural history collection data it is the process of assigning geographic coordinates to a textually described collecting event
• Traditional approaches laborious and time consuming
• Automated and collaborative processes have proven to improve efficiency
GEOLocate Overview
Desktop application for automated georeferencing of natural history collections data
Locality description analysis, coordinate generation, batch processing, geographic visualization, data correction and error determination
Initial release in 2002
Overview: Locality Visualization & Adjustment
Computed coordinates are displayed on digital maps
Manual verification of each record
Drag and drop correction of records
Overview:Multiple Result Handling
Caused by duplicate names, multiple names & multiple displacements
Results are ranked and most “accurate” result is recorded and used as primary result
All results are recorded and displayed as red arrows
Working on using specimen data to limit spread of results
Overview:Estimating Error
User-defined maximum extent described as a polygon that a given locality description can represent
Recorded as a comma delimited array of vertices using latitude and longitude
Current Lines of Development
• “Taxonomic Footprints”
• Collaborative georeferencing
• Global expansion
• Multilingual georeferencing with user-defined locality grammar
Taxonomic Footprint Validation
Taxa collected for a given locality
Uses point occurrence data from distributed museum databases to validate georeferenced data
Species A
Species B
Lepomis macrochirusLepomis macrochirus
Notropis chrosomusNotropis chrosomus
Notropis volucellusNotropis volucellus
Micropterus coosaeMicropterus coosae
LepomisLepomis cyanelluscyanellus
Cottus carolinaeCottus carolinae
Hypentelium etowanumHypentelium etowanum
Etheostoma ramseyiEtheostoma ramseyi
Footprint for specimens collected at Little Schultz Creek, off Co. Rd. 26 (Schultz Spring Road), approx. 5 mi Footprint for specimens collected at Little Schultz Creek, off Co. Rd. 26 (Schultz Spring Road), approx. 5 mi N of Centreville; Bibb County; White circles indicate results from automated georeferencing. Black circle N of Centreville; Bibb County; White circles indicate results from automated georeferencing. Black circle indicates actual collection locality based on GPS. This sample was conducted using data from UAIC & indicates actual collection locality based on GPS. This sample was conducted using data from UAIC & TUMNHTUMNH
Collaborative Georeferencing
• Distributed community effort increases efficiency
• Web based portal used to manage each community
• DiGIR used for data input (alternatives in development)
• Similar records from various institutions can be flagged and georeferenced at once
• Data returned to individual institutions via portal download as a comma delimited file
Collaborative GeoreferencingDiGIR Service
Record Processor
GEOLocate Desktop Application
Cache Update Web Service
Web Portal Application
Data Store
Georeferencing Web Service
Data Retrieval Web Service
Insert Correction Web Service
Remote Data Source
Global Georeferencing
Typically 1:1,000,000
Will work with users to improve resolution (examples: Australia250K & Spain200K)
Advanced features such as waterbody matching bridge crossing detection possible but requires extensive data compilation (example: Spain)
Multilingual Georeferencing
• Extensible architecture for adding languages via language libraries
• Language libraries are text files that define various locality types in a given language
• Current support for:– Spanish– Basque– Catalan– Galician
• May also be used to define custom locality types in English
Future Directions
• Collaboration with foreign participants to improve datasets and language libraries
• Cross platform Java client
• More web services integration
• Integration of WFS & WMS for mapping
• Alternatives to DiGIR
Final Thoughts
• Approximately 20% of DiGIR providers (excluding fish collections) tested have problems of stability and/or data quality resulting in the inability to cache records.
• Typical data quality issues:– Malformed data (common in “datelastmodified”)
– Catalog numbers missing and/or duplicated
• Resolution requires dedicated infrastructure for data providers or alternative means of serving data.
AcknowledgementsAcknowledgements
Djihbrihou Abibou Djihbrihou Abibou Andy BentleyAndy Bentley
Shwetha BelameShwetha BelamePaul FlemonsPaul Flemons
Demin HuDemin HuSophie HungSophie Hung
John JohansonJohn JohansonDavid Draper MuntDavid Draper Munt
Dave Vieglais Dave Vieglais Ed WileyEd Wiley
National Science FoundationNational Science Foundation