thomas ecn 2012
Entomology specimens develop dissociative identity disorder
Jennifer Thomas – Assistant Collection Manager, Division of Entomology
University of Kansas
Synopsis
Current common problems with digital specimen records
Common issues with disparate data, data integration
General types of unique identifiers and how they function
Specific digital IDs for Natural History objects and the need for a global solution
What to do until we have a solution – best practices
Current common problems for Entomology collection object records
Multiple barcodes for a single specimen/collection object
This happens when…
Specimens are gifted to another institution
Specimens are retained by another institution as a result of a revisionary work
Specimens are returned to an institution in accordance with permitting requirements for that particular country.
The digital records rarely accompany any of these specimen transactions.
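To picture the problem: the same physical specimen picks up a new barcode at each institution, and nothing in the records ties the barcodes together. The following is a minimal sketch, assuming hypothetical institution codes, barcodes, and a shared `specimen_guid` field (none of these values come from the talk), of how a shared identifier would let the scattered records be regrouped.

```python
from collections import defaultdict

# Hypothetical records: each institution catalogued the same specimen
# under its own barcode. All values are invented for illustration.
records = [
    {"institution": "INST-A", "barcode": "A0001234", "specimen_guid": "guid-001"},
    {"institution": "INST-B", "barcode": "B0098765", "specimen_guid": "guid-001"},
    {"institution": "INST-A", "barcode": "A0005555", "specimen_guid": "guid-002"},
]


def barcodes_by_specimen(rows):
    """Group institutional barcodes under the shared specimen identifier."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["specimen_guid"]].append((row["institution"], row["barcode"]))
    return dict(grouped)


grouped = barcodes_by_specimen(records)

# Specimens that carry more than one barcode across institutions.
multi = {guid: codes for guid, codes in grouped.items() if len(codes) > 1}
print(multi)
```

Without the shared field, the two barcodes for the first specimen would look like two unrelated objects.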
Current problems for all Natural History specimens
Example from Briefings in Bioinformatics: Roderic Page 2008. vol. 9 (5): 345-354. Biodiversity Informatics: The challenge of linking data and the role of shared identifiers.
Melissotarsus sp. BLF ml.
The discussion continues
Lots of literature out there.
Dr. Page's blog is great: http://iphylo.blogspot.com/
The NHColl listserv recently hosted a long discussion on identifiers.
Euglossa embera Hinojosa-Diaz, Nemesio, Engel 2012
We’ll only ever have more specimens, more associated data, more data portals, and more ways to share that data…
Natural History Specimens need Globally Unique Identifiers that really work!
Familiar Unique IDs
ISBN = International Standard Book Number.
ISSN: International Standard Serial Number.
SSN: Social Security Number.
First 3 digits = area number
Next 2 digits = group number
Last 4 digits = serial number
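The SSN is a familiar example of an identifier whose parts carry meaning. A small sketch of splitting one into the three fields listed above; the function name `split_ssn` and the sample number are made up for illustration.

```python
def split_ssn(ssn: str) -> dict:
    """Split a 9-digit SSN into its three positional fields."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not digits.isdigit():
        raise ValueError(f"expected 9 digits, got {ssn!r}")
    return {
        "area": digits[:3],     # first 3 digits
        "group": digits[3:5],   # next 2 digits
        "serial": digits[5:],   # last 4 digits
    }


# Invented sample value, not a real person's number.
parts = split_ssn("123-45-6789")
print(parts)
```

Identifiers with meaningful parts are easy to read but fragile: if the meaning of a part changes, the identifier becomes misleading. Opaque identifiers such as UUIDs avoid this.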
Where did Gooooids come from?
GUID 1 = Globally Unique Identifier (/Gooooid/). Unique reference number used as an identifier in computer hardware/software and based on the UUID standard. [128-bit values displayed as 32 hexadecimal digits separated by hyphens] Ex: 3F2504E0-4F89-11D3-9A0C-0305E82C3301
UUID = Universally Unique Identifier. An identifier standard used in software construction standardized by the Open Software Foundation.
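As a rough illustration of the format described above, Python's standard `uuid` module can parse the example value from the slide and generate new identifiers. The choice of a random (version 4) UUID for the new value is an assumption made for this sketch, not something the talk prescribes.

```python
import uuid

# Parse the example GUID shown on the slide.
example = uuid.UUID("3F2504E0-4F89-11D3-9A0C-0305E82C3301")

hex_digits = example.hex          # the 32 hexadecimal digits, hyphens removed
bit_width = len(hex_digits) * 4   # each hex digit encodes 4 bits

# Generate a fresh random UUID, e.g. for a newly catalogued specimen.
new_id = uuid.uuid4()

print(str(example))               # canonical lowercase form with hyphens
print(len(hex_digits), bit_width)
print(new_id)
```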
GUID 2 = RSS definition still Globally Unique ID. The <guid> element defines a unique identifier for the item. Aggregators must view the guid as a string. No rules for syntax. Up to the creator of the RSS document to establish uniqueness.
Persistent identifiers…
DOI = Digital Object Identifier. A character string used to uniquely identify an object. Used mostly by publishers, through registration agencies such as CrossRef and DataCite. A persistent name commonly assigned to scientific articles in their electronic form.
Managed by the International DOI Foundation (IDF), the governance body of the DOI system.
The IDF appoints registration agencies that provide services to DOI registrants, such as allocating DOI prefixes and registering DOI names.
Resolution uses the Handle System.
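A DOI name has two parts, a prefix allocated by a registration agency and a suffix chosen by the registrant, and it is usually resolved through the public proxy at `https://doi.org/`. The sketch below only splits a DOI and builds the resolver URL; it makes no network request. The helper names are hypothetical, and the sample value `10.1000/182` (the DOI Handbook) is used purely as an example.

```python
from urllib.parse import quote


def split_doi(doi: str) -> tuple:
    """Split a DOI name into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a DOI name: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Build the URL that asks the doi.org proxy to resolve the DOI."""
    prefix, suffix = split_doi(doi)
    # Percent-encode characters in the suffix that are unsafe in a URL path.
    return f"https://doi.org/{prefix}/{quote(suffix, safe='/')}"


prefix, suffix = split_doi("10.1000/182")
url = resolver_url("10.1000/182")
print(prefix, suffix, url)
```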
More…
ARK = Archival Resource Key. ARKs are URLs (Uniform Resource Locators) designed to support long-term access to information objects. Used extensively by university digital libraries/digital archives and Google!
Also requires a registry, maintained by the California Digital Library.
NAA = name assigning authority
NAAN = name assigning authority number!
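An ARK embeds the NAAN directly in the identifier, in the general shape `ark:/NAAN/Name`, so the same object can be recognized regardless of which host serves it. A minimal parsing sketch follows; the host names, the NAAN `12345`, and the name `x54xz321` are placeholder values, and the pattern is a simplification rather than a full implementation of the ARK syntax.

```python
import re

# Simplified pattern: "ark:", an optional slash, the NAAN, a slash, the name.
ARK_PATTERN = re.compile(r"ark:/?(?P<naan>[0-9a-z]+)/(?P<name>.+)$")


def parse_ark(text: str) -> tuple:
    """Return (naan, name) from a bare ARK or a URL containing one."""
    match = ARK_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no ARK found in {text!r}")
    return match.group("naan"), match.group("name")


# The same ARK served from two hypothetical hosts identifies the same object.
first = parse_ark("https://example.org/ark:/12345/x54xz321")
second = parse_ark("https://mirror.example.net/ark:/12345/x54xz321")
print(first, first == second)
```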
Everyone and everything wants a unique ID!
ASIN (Amazon Standard Identification Number, a proprietary product identifier)
CODEN (serial publication identifier currently used by libraries; replaced by the ISSN for new works)
DOI (Digital Object Identifier)
ETTN (Electronic Textbook Track Number)
ISAN (International Standard Audiovisual Number)
ISBN (International Standard Book Number)
ISMN (International Standard Music Number)
ISRC (International Standard Recording Code)
ISWC (International Standard Musical Work Code)
LCCN (Library of Congress Control Number)
OCLC (Online Computer Library Center)
Has the world gone identifier crazy?
YES!
Natural History Collections implementations
LSID = Life Science Identifiers (no funny pronunciation). It is a URN. Ex: Applied to species names in Species 2000 and ITIS Catalogue of Life Project.
Again, requires a registry. The governing body here is TDWG, “Biodiversity Information Standards” (formerly The International Working Group on Taxonomic Databases).
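Because an LSID is a URN, it has a fixed colon-separated layout: `urn:lsid:authority:namespace:object`, with an optional trailing revision. A short sketch of pulling those parts out; the class and function names are hypothetical, and the sample identifier uses the placeholder authority `example.org`, not a real registry entry.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str           # who issued the identifier (a domain name)
    namespace: str           # a grouping chosen by the authority
    object_id: str           # the object within that namespace
    revision: Optional[str]  # optional version of the object


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its named parts."""
    parts = text.split(":")
    if len(parts) not in (5, 6) or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Placeholder identifier for illustration only.
lsid = parse_lsid("urn:lsid:example.org:specimens:12345")
print(lsid)
```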
John Deck, University of California, Berkeley
Brian Stucky, University of Colorado, Boulder
Lukasz Ziemba, University of Florida, Gainesville
Nico Cellinese, University of Florida, Gainesville
Rob Guralnick, University of Colorado, Boulder
BiSciCol Team
Reed Beaman, Nico Cellinese, Jonathan Coddington, Neil Davies, John Deck, Rob Guralnick, Bryan P. Heidorn, Chris Meyer, Tom Orrell, Rich Pyle, Kate Rachwal, Brian Stucky, Rob Whitton, Lukasz Ziemba
Natural History Collections implementations
The Solution
The Museum community should implement an international system for distribution and maintenance of persistent unique identifiers for all of our biological objects.
Best Practices
BiSciCol Blog: http://biscicol.blogspot.com/
GUIDs must be globally unique. The “Darwin Core Triplet” might not be good enough.
GUIDs must be persistent.
GUIDs must be assigned as close to the source as possible.
GUIDs propagate downstream to other systems.
Don’t conflate GUIDs for physical material with GUIDs for metadata about the physical object.
GUIDs need to be attached in a meaningful way to semantic services.
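The Darwin Core Triplet joins the `institutionCode`, `collectionCode`, and `catalogNumber` terms into one string. The sketch below shows one reason it may fall short as a persistent identifier: when a specimen moves to another institution and is recatalogued, the triplet changes, while a GUID assigned at the source stays the same. The institution codes, catalog numbers, and the use of a random UUID as the GUID are all assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Specimen:
    institution_code: str
    collection_code: str
    catalog_number: str
    # GUID assigned once, when the record is first created (at the source).
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))

    def triplet(self) -> str:
        """Darwin Core Triplet: institutionCode:collectionCode:catalogNumber."""
        return f"{self.institution_code}:{self.collection_code}:{self.catalog_number}"


specimen = Specimen("INST-A", "ENT", "0001234")
guid_before, triplet_before = specimen.guid, specimen.triplet()

# The specimen is gifted to another institution and recatalogued there.
specimen.institution_code = "INST-B"
specimen.catalog_number = "0098765"

print(triplet_before, "->", specimen.triplet())  # the triplet has changed
print(guid_before == specimen.guid)              # the GUID has not
```

Keeping the GUID separate from the catalogue fields is what lets it propagate downstream unchanged.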
Acknowledgements
KU Bioinformatics
Andy Bentley
Rod Spears
Theresa Lammer
KU Division of Entomology
Zack Falin
Michael Engel - PI
NSF DBI-1057366: A specimen-level database of the world’s bees (Apoidea) at the University of Kansas