triplifier talk
DESCRIPTION
Connecting content with a tool that converts database and spreadsheet data for use on the semantic web.
TRANSCRIPT
![Page 1: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/1.jpg)
John Deck, University of California, Berkeley
Brian Stucky, University of Colorado, Boulder
Lukasz Ziemba, University of Florida, Gainesville
Nico Cellinese, University of Florida, Gainesville
Rob Guralnick, University of Colorado, Boulder
BiSciCol Team
Reed Beaman, Nico Cellinese, Jonathan Coddington, Neil Davies, John Deck, Rob Guralnick, Bryan P. Heidorn, Chris Meyer, Tom Orrell, Rich Pyle, Kate Rachwal, Brian Stucky, Rob Whitton, Lukasz Ziemba
Data Curation and Biodiversity Research: Lessons from BiSciCol and a look at the “Triplifier Simplifier”
![Page 2: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/2.jpg)
• BiSciCol is funded by the National Science Foundation, 2010 – 2014
• Infrastructure to tag & track specimens & derivatives in cyberspace
• Relies on globally unique identifiers (GUIDs) to track objects
• Implements a Linked Data approach
• Provides support for the Global Names Architecture
![Page 3: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/3.jpg)
A Biological Relationship Graph …
[Figure: graph of Specimens, Tissues, and Sequences, with a Taxonomic Type Filter and a Class Filter]
![Page 4: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/4.jpg)
Why Linked Data? Why BiSciCol?
(Prefers to collect stuff)
Generates Lots of Data…
Here is Gustav’s Problem
![Page 5: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/5.jpg)
Biodiversity Data Challenges
Data is Distributed
Rapidly Changing Technologies
Covers Multiple Domains
![Page 6: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/6.jpg)
Solving Biodiversity Data Challenges with BiSciCol and Linked Data
Assign identifiers.
Group data into classes. (Is a dwc:Event)
Link identifiers.
Publish.
[ ] Ocean Sampling Day
[X] Moorea Biocode
[X] SI MSNGR System
[+] Add My Data
![Page 7: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/7.jpg)
The Triplifier (Advanced Interface)
Powered by:
Naming and Identifying Objects
Linking Objects
Publishing
Loading Data
![Page 8: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/8.jpg)
Advanced Interface: Loading Data
MySQL
Darwin Core Archive
KEMU
Spreadsheets
![Page 9: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/9.jpg)
Advanced Interface: Entities
Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78.
From Gary Larson, adapted by Barry Smith in a Referent Tracking presentation at the Semantics of Biodiversity Workshop, 2012.
Result is identifiers assigned to Entities:
78 a door .
427 a cat .
<http://biocode.berkeley.edu/collectorspecimens/BMOO_2665> a <dwc:Occurrence> .
<http://biocode.berkeley.edu/collectorevents/MIB_25> a <dwc:Event> .
Tissue
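The entity statements on this slide follow a regular pattern: a namespace plus a local record ID, typed with a class. A minimal Python sketch of that pattern is shown here. The namespaces, IDs, and class names are copied from the slide; the function name and the input shape are illustrative assumptions, not the Triplifier's actual code or API.

```python
# Hypothetical sketch: build "<uri> a <class> ." statements from local IDs.
# entity_triples() is an illustrative helper, not part of the Triplifier.

def entity_triples(namespace, local_ids, rdf_class):
    """Return one '<uri> a <class> .' statement per local identifier."""
    return [f"<{namespace}{local_id}> a <{rdf_class}> ." for local_id in local_ids]


specimens = entity_triples(
    "http://biocode.berkeley.edu/collectorspecimens/",
    ["BMOO_2665"],
    "dwc:Occurrence",
)
events = entity_triples(
    "http://biocode.berkeley.edu/collectorevents/",
    ["MIB_25"],
    "dwc:Event",
)

for statement in specimens + events:
    print(statement)
```

The class is written inside angle brackets only to mirror the slide's notation.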
![Page 10: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/10.jpg)
Advanced Interface: Entity Relations
Relations as Triples:
<http://biocode.berkeley.edu/collectorevents/MIB_25> <ma:isSourceOf> <http://biocode.berkeley.edu/collectorspecimens/BMOO_2665> .
<http://biocode.berkeley.edu/collectorevents/MIB_37> <ma:isSourceOf> <http://biocode.berkeley.edu/collectorspecimens/BMOO_2667> .
<http://biocode.berkeley.edu/collectorspecimens/BMOO_2665> <ma:isSourceOf> <http://biocode.berkeley.edu/plate_well/Plate_M037F10> .
<http://biocode.berkeley.edu/collectorspecimens/BMOO_2667> <ma:isSourceOf> <http://biocode.berkeley.edu/plate_well/Plate_M028G5> .
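Relation triples like the four above could be derived from a flat table in which each row names an event, a specimen, and a plate well. The sketch below assumes that row layout. The `ROWS` structure and the `relation_triples` function are hypothetical; only the identifiers and the `ma:isSourceOf` predicate come from the slide.

```python
# Hypothetical sketch: derive isSourceOf triples from flattened rows.
BASE = "http://biocode.berkeley.edu/"

# Assumed row layout; the values are the identifiers shown on the slide.
ROWS = [
    {"event": "MIB_25", "specimen": "BMOO_2665", "well": "Plate_M037F10"},
    {"event": "MIB_37", "specimen": "BMOO_2667", "well": "Plate_M028G5"},
]


def relation_triples(rows):
    """Emit event -> specimen and specimen -> plate well links for each row."""
    triples = []
    for row in rows:
        event = f"<{BASE}collectorevents/{row['event']}>"
        specimen = f"<{BASE}collectorspecimens/{row['specimen']}>"
        well = f"<{BASE}plate_well/{row['well']}>"
        triples.append(f"{event} <ma:isSourceOf> {specimen} .")
        triples.append(f"{specimen} <ma:isSourceOf> {well} .")
    return triples


triples = relation_triples(ROWS)
for triple in triples:
    print(triple)
```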
![Page 11: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/11.jpg)
Triplify!: View graph based data
Query
Response
![Page 12: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/12.jpg)
The Triplifier (Simple Interface)
Publish
![Page 13: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/13.jpg)
What challenges are we facing now?
(for BiSciCol, Linked Data, and data integration in general)
![Page 14: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/14.jpg)
Identifier Issues
Persistence
Solutions:
• DOIs (http://doi.org/)
• EZIDs (http://ezid.net/)
Assignment at the source is difficult
The digestible RFID tag
Solutions:
• Calculated namespaces (e.g. geo:lat,lng) via PDAs
• UUIDs (randomly unique)
Semantic web requires URIs but many standards (including Darwin Core) do not require URIs for identifiers
scheme : string
URI
Solution:
• Promote use of URIs for identifiers in all Standards.
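Two of the points on this slide can be illustrated with the Python standard library: minting a random UUID at the source, and checking whether an identifier has a URI scheme at all. This is a loose sketch; the function names are hypothetical, and the scheme check is only a rough test, not full URI validation.

```python
import uuid
from urllib.parse import urlparse


def mint_uuid_urn():
    """Mint a random (version 4) UUID and return it in URN form."""
    return uuid.uuid4().urn  # e.g. 'urn:uuid:...'


def has_uri_scheme(identifier):
    """True if the identifier parses with a non-empty scheme such as 'http' or 'urn'."""
    return bool(urlparse(identifier).scheme)


# A bare catalog number has no scheme; the same record as an HTTP URI does.
print(has_uri_scheme("BMOO_2665"))
print(has_uri_scheme("http://biocode.berkeley.edu/collectorspecimens/BMOO_2665"))
print(has_uri_scheme(mint_uuid_urn()))
```

A random UUID is unique without any central registry, but it does not resolve to anything on its own. Resolvability is what services such as DOIs and EZIDs add.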
![Page 15: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/15.jpg)
Classification Issues
Confusion between representational units
“Sample, Specimen, Individual, Aggregation”
Inadequate representational units
“Occurrence”
Solutions:
• Continue working on clarity in term definitions
• Work from upper level ontologies (e.g. Basic Formal Ontology) to derive definitions.
![Page 16: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/16.jpg)
Relation Issues
Nonsensical conclusions are possible!
Solution:
• Apply directional links only where appropriate.
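The slide does not spell out which conclusions go wrong, so the following Python sketch shows one possible reading. If "source of" links are followed in both directions, a tissue can appear to derive from a specimen it never came from. The data, the node names, and the `reachable` function are all hypothetical.

```python
from collections import deque

# Hypothetical data: one collecting event is the source of two specimens,
# and each specimen is the source of one tissue.
SOURCE_OF = [
    ("event1", "specimenA"),
    ("event1", "specimenB"),
    ("specimenA", "tissueA"),
    ("specimenB", "tissueB"),
]


def reachable(start, edges, directed):
    """Breadth-first search from start; also follows reverse links when not directed."""
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        if not directed:
            neighbours.setdefault(dst, set()).add(src)
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in neighbours.get(node, ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


directed_result = reachable("specimenA", SOURCE_OF, directed=True)
undirected_result = reachable("specimenA", SOURCE_OF, directed=False)
print(sorted(directed_result))
print(sorted(undirected_result))
```

With directed links, only `tissueA` is reachable from `specimenA`. With undirected links, the search passes back through the shared event and also reaches `specimenB` and `tissueB`, which would wrongly suggest that `tissueB` is related to `specimenA` as a derivative.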
![Page 17: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/17.jpg)
Adoption Issues
Critical mass required for effective utilization
Reality is complicated
Solutions:
• Work collaboratively (e.g. BioPortal, hackathons, interdisciplinary workshops)
Solutions:
• Work with aggregators (GBIF, VertNet, NCBI).
• View Triples as a publishable unit
![Page 18: Triplifier talk](https://reader033.vdocuments.us/reader033/viewer/2022060201/559aa9181a28ab851c8b466c/html5/thumbnails/18.jpg)
The BiSciCol Mission
• BiSciCol tackles biodiversity data challenges:
• Tracking and integration of objects across disciplines
• Linking derivatives back to their source
• BiSciCol is about community, collaborative practice
• Commitment to standards, ontologies
• Agreement on permanent, resolvable identifiers
• Triplification of data sources to enhance linked data
http://biscicol.blogspot.com/
http://biscicol.org