Names are not sufficient: the challenge of documenting organism identity
R.K. Peet, J.B. Kennedy, and N.M. Franz
and
The Ecological Society of America Vegetation Panel
The SEEK development team
Biodiversity data structure
[Diagram relating four database types (taxonomic, occurrence, co-occurrence, community type) to the entities Locality, Observation/Collection Event, Specimen or Object, Bio-Taxon, and Community Type.]
1. Biodiversity informatics depends on accurate and precise taxonomy
• Accurate identification and labelling of organisms is a critical part of collecting, recording and reporting biological data.
• Increasingly, research in biodiversity and ecology is based on the integration (and re-use) of multiple datasets.
• New tools are producing flawed results!
High-elevation fir trees of western North America
[Figure: distribution across AZ, NM, CO, WY, MT, AB, eBC, wBC, WA, OR under two treatments.]
USDA - ITIS: Abies lasiocarpa var. arizonica; Abies lasiocarpa var. lasiocarpa
Flora North America: Abies bifolia; Abies lasiocarpa
Multiple concepts of Rhynchospora plumosa s.l.
[Figure: three entities (1, 2, 3) as treated by Elliot 1816, Gray 1834, Chapman 1860, Kral 2003, and Peet 2006?. Names applied across these treatments: R. plumosa; R. plumosa v. plumosa; R. plumosa v. intermedia; R. plumosa v. interrupta; R. plumosa v. pineticola; R. intermedia; R. pineticola; R. sp. 1.]
Multiple concepts of Andropogon virginicus L. s.l.
The Taxonomic database challenge: Standardizing organisms and communities
The problem: Integration of data potentially representing different times, places, investigators and taxonomic standards.
The traditional solution: A standard list of organisms / communities.
Standardized taxon lists fail to allow dataset integration
The reasons include:
• Taxonomic concepts are not defined (just lists),
• Multiple party perspectives on taxonomic concepts and names cannot be supported or reconciled,
• The user cannot reconstruct the database as viewed at an arbitrary time in the past.
This is the single largest impediment to large-scale synthesis in biodiversity & ecology.
Taxonomic theory
[Diagram: Name, Reference, Concept]
A taxon concept represents a unique combination of a name and a reference.
Report: name sec. reference.
[Diagram: Name, Concept, Usage]
A usage represents an association of a concept with a name.
• The name used in defining the concept need not be the same name used in your work.
e.g. Carya alba = Carya tomentosa sec. Gleason & Cronquist 1991.
• Usages allow multiple name systems to be applied to a single concept.
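The distinction between a concept and a usage can be sketched as a small data model. This is only an illustration in Python, not the VegBank or TCS schema; the class names `Concept` and `Usage` and the `label` method are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A taxon concept: a name as circumscribed in one reference."""
    name: str
    reference: str

    def label(self) -> str:
        # Report as "name sec. reference".
        return f"{self.name} sec. {self.reference}"


@dataclass(frozen=True)
class Usage:
    """Associates a name (possibly a different one) with a concept."""
    name: str
    concept: Concept

    def label(self) -> str:
        # If the name in use matches the defining name, the concept label suffices.
        if self.name == self.concept.name:
            return self.concept.label()
        return f"{self.name} = {self.concept.label()}"


# The example from the slide: the name in use differs from the defining name.
tomentosa = Concept("Carya tomentosa", "Gleason & Cronquist 1991")
usage = Usage("Carya alba", tomentosa)
print(usage.label())
```

Because several `Usage` records can point at the same `Concept`, more than one name system can be attached to a single concept without duplicating it.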
2. Always report a taxon by reference to a concept
When reporting the identity of organisms in publications, data, or on specimens, provide not only the full scientific name of each kind of organism recognized, but also the reference that formed the basis of the taxonomic concept.
e.g., Abies lasiocarpa sec. Flora North America 1997.
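A data manager receiving labels in the "name sec. reference" form might split them as follows. This is a minimal sketch; the function name `parse_concept_label` is hypothetical, and it assumes "sec" (with or without a period) appears as a separate word between the name and the reference.

```python
import re
from typing import NamedTuple, Optional


class ConceptLabel(NamedTuple):
    name: str
    reference: str


# "sec" with or without a trailing period, set off by whitespace on both sides.
_SEC = re.compile(r"\s+sec\.?\s+")


def parse_concept_label(text: str) -> Optional[ConceptLabel]:
    """Split 'name sec. reference'; return None when no reference is given."""
    parts = _SEC.split(text.strip(), maxsplit=1)
    if len(parts) != 2 or not parts[0] or not parts[1]:
        return None
    return ConceptLabel(parts[0], parts[1])


parsed = parse_concept_label("Abies lasiocarpa sec. Flora North America 1997")
bare = parse_concept_label("Abies lasiocarpa")  # a name with no concept reference
```

A bare name returns `None`, which gives a simple way to flag records that report a name without the reference behind it.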
Choice of concepts
• Reference high-quality sources for taxon concepts such as a major compendium that provides its own defined concepts, or a source that references the concepts of others.
• Avoid checklists (e.g. ITIS) as they typically lack true taxonomic descriptions or circumscriptions.
SEEK & GBIF are working to provide standards for concept data
• Several data models incorporate taxon concepts. The IOPI, VegBank, and Taxonomer models are optimized for different uses.
• SEEK, GBIF, and TDWG developed TCS, which was adopted by TDWG in August 2005 and is being implemented by GBIF and SEEK.
3. Concepts and identifications are distinct.
• A name in a publication could be either a concept or an identification.
• Identifications should include linkage to at least one concept, but need not be limited to a single concept.
e.g. --< Potentilla sec. Cronquist 1991 +~ Potentilla simplex sec. Cronquist 1991 +~ Potentilla canadensis sec. Cronquist 1991
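An identification that links one observation to several concepts could be recorded along these lines. The sketch is illustrative only: the class `Identification`, its fields, and the observation id are hypothetical, and the qualifier strings ("<", "~") are borrowed loosely from the Potentilla example and stored without interpreting their exact meaning.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Concept:
    name: str
    reference: str

    def label(self) -> str:
        return f"{self.name} sec. {self.reference}"


@dataclass
class Identification:
    """An assertion that an observed organism fits one or more concepts."""
    observation_id: str
    # (qualifier, concept) pairs; qualifiers are kept as opaque strings.
    links: List[Tuple[str, Concept]] = field(default_factory=list)

    def add(self, qualifier: str, concept: Concept) -> None:
        self.links.append((qualifier, concept))

    def is_valid(self) -> bool:
        # An identification should link to at least one concept.
        return len(self.links) >= 1

    def label(self) -> str:
        return " + ".join(f"{q} {c.label()}" for q, c in self.links)


ref = "Cronquist 1991"
ident = Identification("obs-001")  # hypothetical observation id
ident.add("<", Concept("Potentilla", ref))
ident.add("~", Concept("Potentilla simplex", ref))
ident.add("~", Concept("Potentilla canadensis", ref))
print(ident.label())
```

Keeping the concept (a definition) separate from the identification (a claim about a particular organism) lets the same concept be reused across many observations.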
4. Biodiversity informatics depends on standards and connectivity
Darwin Core and EML are widely used and under continued development, but further standards are needed:
• Names (Linnean Core)
• Publications (Alexandrian core, etc.)
• Observations (proposed TDWG standard)
• Identifications (proposed EML extension)
• Taxonomic concepts (TCS)
• GUIDs (under development by GBIF)
Step 1: Adoption of minimum standards and best practices by high-quality journals, funding agencies, and professional organizations.
Distributed information systems - and the way ahead
Publishers, curators and data managers need to tag taxon interpretations with concepts
• Precedent exists with tagging literature citations and GenBank accessions
• Presses are linking scientific names in many ejournals to ITIS (e.g. Evolution, Ecology)
Step 2: Creation, availability, and maintenance of databases that document core sets of taxonomic concepts and the relationships of these concepts to each other.
The way ahead
Relationships among concepts
• Exactly equal (identification)
• Congruent, equal (=)
• Includes (>)
• Included in (<)
• Overlaps (><)
• Disjunct (|)
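These relationships behave like set comparisons, which a short sketch can make concrete. It assumes, purely for illustration, that each concept can be approximated by the set of specimens it includes; the names `Relation` and `relate` and the specimen ids are hypothetical. "Exactly equal (identification)" is left out because membership alone cannot distinguish it from congruence.

```python
from enum import Enum
from typing import FrozenSet


class Relation(Enum):
    CONGRUENT = "="
    INCLUDES = ">"
    INCLUDED_IN = "<"
    OVERLAPS = "><"
    DISJUNCT = "|"

    def inverse(self) -> "Relation":
        """The same relationship read from the other concept's side."""
        if self is Relation.INCLUDES:
            return Relation.INCLUDED_IN
        if self is Relation.INCLUDED_IN:
            return Relation.INCLUDES
        return self  # congruent, overlaps and disjunct are symmetric


def relate(a: FrozenSet[str], b: FrozenSet[str]) -> Relation:
    """Derive the relationship of concept a to concept b from their members."""
    if a == b:
        return Relation.CONGRUENT
    if a > b:  # proper superset
        return Relation.INCLUDES
    if a < b:  # proper subset
        return Relation.INCLUDED_IN
    if a & b:  # some, but not all, members shared
        return Relation.OVERLAPS
    return Relation.DISJUNCT


# Hypothetical specimen ids: a broad concept and a narrower one.
broad = frozenset({"s1", "s2", "s3"})
narrow = frozenset({"s1", "s2"})
rel = relate(broad, narrow)
print(rel.value, rel.inverse().value)
```

In practice the relationships are asserted by a taxonomist rather than computed, but storing them with a defined inverse means a mapping entered in one direction can be read in the other.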
True concept-based checklists
• Equivalent of ITIS but with concept documentation and including how other concepts map onto the concepts accepted by the party.
• Several are operative or in development including EuroMed, IOPI-GPC, Biotics, VegBank. Concept documentation planned for ITIS/USDA.
Registration system and standard identifiers for names, references, and concepts
• Essential for data exchange
• GBIF is hosting a set of international workshops to design the GUID infrastructure.
Step 3: Development and provision of tools to facilitate mark-up of data and manuscripts with taxonomic concepts
The way ahead
Tools to develop and map concepts
• Taxonomists need mapping and visualization tools for relating concepts of various authors. SEEK will build prototypes for review and possible adoption.
• Aggregators need tools for mapping relationships among concepts.
• Users need tools for entering legacy concepts. Several are in development.
The Opportunity
Build on the infrastructure provided by
1) The VegBank data model
2) The NVC peer review system
3) GBIF & TDWG standards
4) The Weakley concept dataset for the Southeast
Timeline showing taxonomic history (revisions and nomenclatural changes) pertaining to species comprising the imaginary genus Aus.
[Figure: five successive treatments of the genus: (i) Aus L.1758 in Linnaeus 1758; (ii) Aus L.1758 in Archer 1965; (iii) Aus L.1758 in Fry 1989; (iv) Aus L.1758 in Tucker 1991; (v) Aus L.1758 in Pargiter 2003, alongside Xus Pargiter 2003. Species names shown: Aus aus L.1758; Aus bea Archer 1965; Aus cea BFry 1989; Aus ceus BFry 1989; Xus beus (Archer) Pargiter 2003.]
A diligent nomenclaturist, Pyle (1990), notes that the species epithets of Aus bea and Aus cea are of the wrong gender and publishes the corrected names Aus beus corrig. Archer 1965 and Aus ceus corrig. BFry 1989.
Tucker publishes his revision without noting Pyle’s corrigendum of the name of Aus cea.
Pargiter publishes his revision using Pyle’s corrigendum of the epithet bea to beus and Aus cea to Aus ceus.