ABCD & BioCASe

A Quick Introduction

Motivation & Rationale – ABCD I

“Access to Biological Collection Data” v2.06 ratified by TDWG, v1.20 still in use

Comprehensive data exchange format to share detailed primary collection data of all organisms (living & preserved specimens and observations)

Lately “Extended For Geoscience”: ABCDEFG (fossils, rocks, minerals, meteorites)

Covers ~925 concepts (EFG adds ~825)

Motivation & Rationale – ABCD II

Monolithic XML schema with a single root, including ~15 subschemas defining reusable types (-> UBIF)

To ease processing & publishing, ABCD refrains from: keys (no normalisation), external standards, recursive XML structures

Contains xsd:any extension slots for ad-hoc additions through external schemas

Variable atomisation (polymorphism) allows provision of data in different degrees of detail & standardisation: simplifies publishing, but pushes work to consumers; priority is the mobilisation of rich data sources
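What "pushes work to consumers" can mean in practice: a consumer has to accept the same information at several levels of detail. The sketch below is only an illustration. The two fragments are simplified and hypothetical (no namespace, element names only loosely modelled on ABCD), and `display_name` is a made-up helper, not part of any BioCASe software.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragments: the same name, once as a single
# string and once atomised. Not complete or valid ABCD.
FULL_ONLY = (
    "<ScientificName>"
    "<FullScientificNameString>Abies alba Mill.</FullScientificNameString>"
    "</ScientificName>"
)
ATOMISED = """<ScientificName>
  <NameAtomised>
    <GenusOrMonomial>Abies</GenusOrMonomial>
    <FirstEpithet>alba</FirstEpithet>
    <AuthorTeam>Mill.</AuthorTeam>
  </NameAtomised>
</ScientificName>"""


def display_name(xml_text):
    """Return a printable name, whichever level of detail was published."""
    root = ET.fromstring(xml_text)
    # Least atomised form: use the full string if the publisher gave one.
    full = root.findtext("FullScientificNameString")
    if full:
        return full.strip()
    # Otherwise rebuild the name from whatever atomised parts exist.
    atom = root.find("NameAtomised")
    if atom is None:
        return None
    tags = ("GenusOrMonomial", "FirstEpithet", "AuthorTeam")
    parts = [atom.findtext(tag) for tag in tags]
    return " ".join(p.strip() for p in parts if p) or None


print(display_name(FULL_ONLY))
print(display_name(ATOMISED))
```

The publisher picks whichever form its database already holds; the branching ends up on the consumer side.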

Motivation & Rationale – BioCASe

XML-based protocol for discovery & retrieval in distributed, heterogeneous systems

Derived from DiGIR to share ABCD records; supports hierarchical, nested structures

Shared “ontology” defined in XML schema(s): concept = namespace + simple XPath to an element in a potential instance document
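A minimal sketch of "concept = namespace + simple XPath", using only Python's standard library. The namespace URI and the `/DataSets/DataSet/Units/Unit/UnitID` path are assumptions about ABCD 2.06 and should be checked against the schema. The instance document, the unit IDs and the `resolve` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for ABCD 2.06; verify against the actual schema.
ABCD_NS = "http://www.tdwg.org/schemas/abcd/2.06"

# A concept is a pair: schema namespace + simple path to an element.
CONCEPT_UNIT_ID = (ABCD_NS, "/DataSets/DataSet/Units/Unit/UnitID")

# Tiny, hypothetical instance document (not a complete ABCD record).
INSTANCE = f"""
<DataSets xmlns="{ABCD_NS}">
  <DataSet>
    <Units>
      <Unit><UnitID>EX-0001</UnitID></Unit>
      <Unit><UnitID>EX-0002</UnitID></Unit>
    </Units>
  </DataSet>
</DataSets>
"""


def resolve(concept, xml_text):
    """Return the text of every element the concept points to."""
    namespace, path = concept
    steps = path.strip("/").split("/")
    root = ET.fromstring(xml_text)
    # The first path step names the root element itself.
    if root.tag != f"{{{namespace}}}{steps[0]}":
        return []
    if len(steps) == 1:
        return [root.text]
    # Qualify the remaining steps with the namespace for ElementTree.
    relative = "/".join(f"{{{namespace}}}{step}" for step in steps[1:])
    return [element.text for element in root.findall(relative)]


unit_ids = resolve(CONCEPT_UNIT_ID, INSTANCE)
print(unit_ids)
```

Because a concept is just a string pair, it can be named in a request or a mapping file without either side having an actual instance document at hand.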

Publishing Software Implementations

PyWrapper implements BioCASe protocol

Security layer for BioCASe services: roles, signatures, access restriction, encryption, SSL

Publishing Software Deployments

PyWrapper

ABCD: ~65 installations, 85 databases, mainly in Europe

GCP Passport: ~15 installations, 22 DBs, worldwide

BioCASE Collection Metadata Profile: ~25 installations & DBs, Europe

SPICE / Species2000: 4 installations & DBs, Europe

+35 installations p.a. (ABCD, MCPDH)

Consuming Software Implementations

Simple UI, a simple but truly distributed portal

Querytool for a single BioCASE service: uses XSLTs (ABCD + EFG, DarwinCore)

Synthesys portal (in prep.): uses multilingual XSLTs + central cache

Germplasm clearing house mechanism for GCP Passport & ABCD services + cache

GBIF indexer/portal

Consuming Software Deployments

Simple UI: BioCASE, GBIF Germany (botany)

Querytool for a single BioCASE service: part of PyWrapper, layout customized ~5 times

Synthesys portal (in prep.): Synthesys, GBIF-DE, GBIF-FR

Germplasm clearing house mechanism portal

Potential Publishers

All biological & geoscience collections: thousands, probably > 20,000 collections; billions of records, 1.5-3 billion vouchered; digitisation currently very low (5%?)

Potential Customers

Researchers: biologists (taxonomy, genetics, ecology), historians

Policy makers, e.g. land planning

Public, education, ...

The more detailed the data (e.g. images) and the more exhaustive the coverage of a collection, the more customers

Success Factors

Comprehensiveness

Ease of use (software installation)

Personal free technical support (for Europe)

Allows for custom extensions

Hurdles to Adoption

Complexity, both for consumers and publishers

Lack of good documentation with examples

Big Picture

ABCD feeds into ontology modularisation (taxa, publications, ...) as UML, XML schema, OWL
