Post on 22-Dec-2015

Building A Global Invasive Species Information Network with a TAPIR Protocol

Jim Graham, Annie Simpson, Michael Browne, Bob Morris, Tom Stohlgren, Greg Newman, …

Research vs. Production

Attribute        Research                      Production
Quality          Accurate                      Robust
Number of Users  Few                           Lots
Technology       Latest and greatest           What works
Learning Curve   Typically long                Short
Support          None to informal              Must
Documentation    Minimal and techy is ok       Must be complete and easy to understand
Bottom Line      If it’s cool they will come   If it doesn’t work, they will go elsewhere

Software Lifecycle

Phases along the time axis: Investigation, Design, Implementation, Testing, Maintenance

The Tire Swing

• What the customer needed
• What was designed
• What marketing suggested
• What management approved
• What was delivered

Source: Alan Chapman, http://www.businessballs.com/treeswing.htm

Questions to Answer

• Who is the customer?
– Invasive species data providers
– Invasive species data consumers
– Stakeholders

• What are we selling/giving them?
– A network to allow the exchange of information on invasive species

• What do we need to do to get them to want to buy/use it?

Technology Adoption Lifecycle

Figure: technology adoption over time. Source: Bohlen, Joe M. & George M. Beal (May 1957), "The Diffusion Process", Special Report No. 18 1: 56-77

Survey & Interview Highlights

• At least 3 languages/frameworks important

• 1 hour to “as long as it takes” for commitment

• Minimal web service expertise

• Various installation scenarios

• DiGIR did not meet all needs
– Complex queries not needed
– Database problems

History

• National Biological Information Infrastructure (NBII)

• Global Invasive Species Information Network (GISIN)

• NISBase: Brian Steves and Shawn Dalton

• GIS standards (WMS)

• Common web services

• Invasive Alien Species Profile Schema - (IAS-PS)

Situation

• Need:
– Toolkits in 3 languages
– Documentation
– Support
– Registry/Directory
– Portal
– Provider test bed

• Have:
– Existing: Protocols, Schemas/Data Models, Toolkits, Portals, Registries, Databases
– Minimal funding for development
– No funding for support?

Complexity

• Complexity is a multiplier on:
– Development: more to implement
– Testing: more to test
– Support: more to document, train, and upgrade
– Performance: larger data transfers, longer parsing time

• Simpler means we can deliver tools with higher quality and better support that run faster and cost less

Architecture

Diagram components:
• GISIN Data Providers
• GISIN Consumers
• GISIN Portals
• GISIN Registry/Directory
• TCS Database
• GBIF Registry
• Other Providers
• Other Consumers
• Other Web Sites
• End-Users

Connection types: Web Services, Web Browser Communication

Protocol Design

• Approaches:
– TAPIR-Light: Key Value Pair only
– Flat data models
   Performance: 1 million records in 14 minutes
– Controlled vocabulary wherever possible

Required Data Models

• BioStatus: Indigenous, Harmful, etc.
• Occurrences: X, Y (DarwinCore)
• ProfileURLs: Language, URL
• ImpactStatus: Human, Agriculture, etc.
• ManagementStatus: Activity, etc.
• DistributionStatus: Growing, Stable, etc.

All have: Scientific Name, Location
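
A minimal sketch of how three of these flat models could be represented, assuming Python dataclasses. The class and field names are illustrative, not taken from the GISIN schemas, and the vocabulary holds only the two BioStatus terms the slide names because the full list is not given here.

```python
from dataclasses import dataclass, asdict

# Only the two terms named on the slide; the real vocabulary is longer ("etc.").
BIO_STATUS_TERMS = {"Indigenous", "Harmful"}


@dataclass(frozen=True)
class GisinRecord:
    """Fields the slide says every model shares."""
    scientific_name: str
    location: str


@dataclass(frozen=True)
class BioStatus(GisinRecord):
    status: str  # must come from the controlled vocabulary

    def __post_init__(self):
        if self.status not in BIO_STATUS_TERMS:
            raise ValueError(f"status not in controlled vocabulary: {self.status!r}")


@dataclass(frozen=True)
class Occurrence(GisinRecord):
    x: float  # hypothetical field names for the X, Y pair
    y: float


@dataclass(frozen=True)
class ProfileURL(GisinRecord):
    language: str
    url: str


# "sample location" is a placeholder value, not data from the presentation.
record = BioStatus("Tamarix chinensis", "sample location", "Harmful")
flat = asdict(record)  # a flat dict: one key per field, no nesting
```

Because every model is flat, each record maps directly onto key-value pairs, which is what keeps the protocol and the parsing simple.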

Implementation Requirements

1. Automatic installation
– Installer and DiGIR-like admin pages

2. Adapt toolkit to database, web server, security

3. Roll toolkit to another language (Perl, C++)

4. Do it themselves
– Just the documentation

• Existing toolkits/protocols are too complex and lack the development documentation to do 2 through 4 quickly

Protocol Transaction Diagram

Database tables: Locations, Observations, Organisms

SQL Query:
  SELECT Latitude, …
  FROM Locations
  JOIN Observation …
  JOIN Organisms …
  WHERE Genus = 'Tamarix'

Query result:
  Latitude  Longitude  Date       Scientific Name
  -105      40         10/2/2007  Tamarix aphylla
  -110      35         2/10/1999  Tamarix chinensis

Request:
  http://provider.org/GISIN.php?Op=Inventory&Model=Occurrences&Count=true&Genus=Tamarix&Concept=Latitude&Concept=Longitude&Concept=Date&Concept=ScientificName

Response:
  <response>
    <inventory>
      <records>
        <record>
          <Latitude>-105</Latitude>
          <Longitude>40</Longitude>
          <Date>10/12/2000</Date>
          <ScientificName>Tamarix aphylla</ScientificName>
        </record>
        …
      </records>
    </inventory>
  </response>
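
The request and response above can be sketched from the consumer side as follows. This is an illustration using only the Python standard library, not the GISIN toolkit. The function names are hypothetical, and the sample response is a hand-written string in the shape shown on the slide rather than output from a live provider.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_request(base_url, model, concepts, filters):
    """Build a key-value-pair inventory request like the one on the slide."""
    params = [("Op", "Inventory"), ("Model", model), ("Count", "true")]
    params += list(filters.items())
    # Each requested concept is sent as its own repeated "Concept" parameter.
    params += [("Concept", name) for name in concepts]
    return base_url + "?" + urlencode(params)


def parse_response(xml_text):
    """Turn each <record> element into a flat dict of tag -> text."""
    root = ET.fromstring(xml_text)
    return [
        {field.tag: (field.text or "").strip() for field in record}
        for record in root.iter("record")
    ]


url = build_request(
    "http://provider.org/GISIN.php",
    "Occurrences",
    ["Latitude", "Longitude", "Date", "ScientificName"],
    {"Genus": "Tamarix"},
)

# Hand-written sample in the shape of the slide's response.
sample = """<response><inventory><records>
  <record>
    <Latitude>-110</Latitude>
    <Longitude>35</Longitude>
    <Date>2/10/1999</Date>
    <ScientificName> Tamarix chinensis </ScientificName>
  </record>
</records></inventory></response>"""

records = parse_response(sample)
```

Passing the parameters as a list of pairs rather than a dict is what allows the repeated Concept key and keeps the parameter order stable.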

Toolkit Design: Data Flow

Diagram components:
• Internet
• GISIN Protocol
• Provider Web Service
• Service Manager
• Query Builder
• Database Connection
• Provider Database
• Web / Date Utilities
• Admin Web Site
• Configuration Files: Provider.xml, Metadata.xml, Capabilities.xml
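
One way a query builder of this kind could work is sketched below. It assumes a provider with a single flat table. The table name, column names, and the concept-to-column mapping are all hypothetical stand-ins for what the configuration files would supply, and none of this is the toolkit's actual code.

```python
# Hypothetical mapping from protocol concepts to a provider's own columns.
CONCEPT_COLUMNS = {
    "Latitude": "lat",
    "Longitude": "lon",
    "Date": "obs_date",
    "ScientificName": "sci_name",
    "Genus": "genus",
}


def build_query(table, concept_columns, concepts, filters):
    """Return (sql, params) for the requested concepts and equality filters."""
    unknown = [c for c in list(concepts) + list(filters) if c not in concept_columns]
    if unknown:
        # Only mapped concepts ever reach the SQL text.
        raise ValueError(f"unknown concepts: {unknown}")

    select_list = ", ".join(f'{concept_columns[c]} AS "{c}"' for c in concepts)
    sql = f"SELECT {select_list} FROM {table}"

    params = []
    if filters:
        clauses = []
        for concept, value in filters.items():
            clauses.append(f"{concept_columns[concept]} = ?")
            params.append(value)  # values are bound, never interpolated
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


sql, params = build_query(
    "occurrences",
    CONCEPT_COLUMNS,
    ["Latitude", "ScientificName"],
    {"Genus": "Tamarix"},
)
```

Keeping the mapping in configuration rather than in code is what would let the same service be adapted to a different database without edits to the query logic.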

Performance by Time Per Record

Figure: fetch time per record in milliseconds (y-axis, 0 to 0.9) against records in database (x-axis, 10 to 1,000,000).

Series:
7. Fetch rows with 1 field
8. Fetch rows with 10 fields
10. Fetch blocks of rows with limit query

1 million records: 14 hours -> 14 minutes
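
Going from 14 hours to 14 minutes is a 60-fold reduction, and 14 minutes for 1 million records works out to 0.84 milliseconds per record. The slides do not say which database was used or how the limit query was written. The sketch below assumes SQLite and a hypothetical occurrences table, and pages on the primary key so that each block starts where the previous one ended.

```python
import sqlite3


def fetch_in_blocks(conn, block_size):
    """Yield lists of rows, each fetched with its own LIMIT query."""
    last_id = 0
    while True:
        block = conn.execute(
            "SELECT id, scientific_name FROM occurrences "
            "WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, block_size),
        ).fetchall()
        if not block:
            return
        yield block
        last_id = block[-1][0]  # resume after the last row already seen


# Small in-memory table standing in for a provider database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE occurrences (id INTEGER PRIMARY KEY, scientific_name TEXT)"
)
conn.executemany(
    "INSERT INTO occurrences (scientific_name) VALUES (?)",
    [("Tamarix chinensis",)] * 2500,
)

block_sizes = [len(block) for block in fetch_in_blocks(conn, 1000)]
```

Fetching in blocks bounds the memory and response size of each round trip. Paging on the key rather than on a growing offset is a design choice made for this sketch, not something the slides specify.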

Products Mapped to Customer Needs

Diagram labels:
• Consumers
• Providers
• Complex Queries, RDF
• KVP, XML
• More Sophisticated Users
• Invasive Species Databases, Other Databases
• TAPIR/DarwinCore … GISIN

Adapted from Peter Fox, Debra McGuiness (personal communication)

Next Steps

• Resolve Issues

• Toolkit Development:
– Complete the design
– Roll to Java and ASP
– User’s Guide

• Testing:
– 2-4 more databases connected
– Automated tests
– Defect tracking

• Portal
– Incremental improvements

• Provider Meeting in November

Current Web Site

• GISIN Organization Site: www.GISINetwork.org

• GISIN Directory: www.niiss.org/GISIN
– Until end of September: www.niiss.org/GISS
– Browse Directory
– Search for data: BioStatus, Occurrences, ProfileURLs

• GISIN Technical Site
– Documentation
– For providers:
   Get Toolkit
   Sample Provider (based on the toolkit)
   Manual exercising of TAPIR-GISIN web services
   Automated tests are coming!

Acknowledgements

• Funded by NSF, NBII (USGS), GBIF, TDWG

• Thanks to: Renato de Giovanni, Roger Hyam, Donald Hobern, Markus Döring, Hannu Saarenmaa, Kevin Richards, Peter Fox, Debra McGuiness, Brian Steves, Pam Fuller, John Pickering, Shawn Dalton, Greg Ruiz, and the other members of GISIN

• Review: www.niiss.org/GISIN (or GISS)

• Contact: [email protected]