TRANSCRIPT
![Page 1: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/1.jpg)
The Dryad Data Repository: Metadata Workflows and Processes
2nd Data Management Workshop
November 28th – 29th 2014
University of Cologne, Germany
Jane Greenberg, Professor, College of Computing & Informatics (CCI); Director, Metadata Research Center <MRC>
Erin Clary, Dryad Curator, CCI/MRC
![Page 2: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/2.jpg)
![Page 3: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/3.jpg)
![Page 5: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/5.jpg)
![Page 6: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/6.jpg)
Pre-populated metadata field
![Page 7: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/7.jpg)
![Page 8: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/8.jpg)
Elsevier’s ScienceDirect: EXAMPLE: Dryad Unmack, et al, Phylogeny and biogeography…, Molecular Phylogenetics and Evolution, http://dx.doi.org/10.1016/j.ympev.2012.12.019
![Page 9: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/9.jpg)
Elsevier’s ScienceDirect: EXAMPLE: Dryad Unmack, et al, Phylogeny and biogeography…, Molecular Phylogenetics and Evolution, http://dx.doi.org/10.1016/j.ympev.2012.12.019
![Page 10: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/10.jpg)
![Page 11: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/11.jpg)
Data downloads reuse citation
Observations, motivating study of metadata capital
1. Metadata generation costs money
2. Metadata reuse is a BIG part of Dryad’s workflow
3. Metadata reuse via OAI
4. Metadata reuse via data sharing, reuse, and repurposing
Downloaded 10678 times
![Page 12: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/12.jpg)
Greenberg J, Swauger S, Feinstein EM (2013) Data from: Metadata capital in a data repository. Proceedings of the International Conference on Dublin Core and Metadata Applications http://dx.doi.org/10.5061/dryad.8c1p6
![Page 13: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/13.jpg)
Journal Re.Wrkfl Blackout
AmNtrl N N
MBE N N
BioRisk Y N
BMJ Open Y N
…. Y
Type Total 30 days
Data packages 6867 198
Data files 21056 977
Journals 364 77
Authors 24500 3492
Downloads 639314 36006
• Journals (80+…PLOS): http://datadryad.org/pages/integratedJournals
• X >10GB = $15,$10+
![Page 14: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/14.jpg)
http://wiki.datadryad.org/Sample_Dryad_Content#Examples_by_file_type
![Page 15: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/15.jpg)
Technology
• DSpace
• DOIs via CDL/DataCite
• CC0 (<m> + data)
• Integration with specialized repositories and databases
• Federated searching with TreeBASE and KNB LTER
• TreeBASE submission (OAI-PMH)
• GenBank (currently in development)
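As a rough illustration of the OAI-PMH harvesting mentioned above, the sketch below only builds a `ListRecords` request URL. The `verb`, `metadataPrefix`, and `set` parameters and the `oai_dc` prefix come from the OAI-PMH protocol itself; the base URL and the function name `oai_list_records_url` are placeholders invented for this example and are not Dryad's or TreeBASE's actual endpoint.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        # Optional selective harvesting by set.
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint, used only for illustration.
url = oai_list_records_url("https://repository.example.org/oai/request")
print(url)
# https://repository.example.org/oai/request?verb=ListRecords&metadataPrefix=oai_dc
```

A harvester would fetch this URL and follow any resumption token in the response; that part is left out here because it depends on the specific repository.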
Governance
“non-profit status, 12 member Board of Directors”
Sets policy, goals
• science, journals, societies, OCLC, MS
2006 Dryad development – NESCent + <MRC>
• Stakeholders: journals, publishers and scientific societies, and researchers.
2009-2012: Interim Board
$ PAYMENT - Sept. 1, 2014
![Page 16: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/16.jpg)
Sustainability: Plan Comparison
| Payment Plan | Member | Non-member | Minimum purchase |
|---|---|---|---|
| 1. Voucher Plan | USD$65 per data package | USD$70 per data package | 25 vouchers |
| 2. Deferred Payment Plan | USD$70 per data package | USD$75 per data package | 1 yr contract |
| 3. Subscription Plan | Annual fee based on USD$25 per published research article | Annual fee based on USD$30 per published research article | 2 yr contract |
| For individuals: Pay on acceptance | NA | USD$80 per data package, payable by the submitter | 1 data package |
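To make the plan comparison above concrete, here is a minimal sketch that estimates one year's cost under each plan. The fees are copied from the comparison; the function name `annual_costs`, the example journal, and the reading of the voucher minimum (at least 25 vouchers are bought even when fewer packages are deposited) are assumptions made for illustration only.

```python
# Fees in USD, copied from the plan comparison above.
VOUCHER_FEE = {"member": 65, "non_member": 70}
DEFERRED_FEE = {"member": 70, "non_member": 75}
SUBSCRIPTION_FEE_PER_ARTICLE = {"member": 25, "non_member": 30}
INDIVIDUAL_FEE = 80   # pay on acceptance, payable by the submitter
VOUCHER_MINIMUM = 25  # minimum purchase listed for the voucher plan


def annual_costs(packages, articles, status="member"):
    """Estimate one year's cost in USD under each payment plan.

    Assumption: the voucher minimum means at least 25 vouchers are
    purchased even when fewer data packages are deposited.
    """
    return {
        "voucher": max(packages, VOUCHER_MINIMUM) * VOUCHER_FEE[status],
        "deferred": packages * DEFERRED_FEE[status],
        "subscription": articles * SUBSCRIPTION_FEE_PER_ARTICLE[status],
        # Paid by individual submitters; shown only for comparison.
        "pay_on_acceptance": packages * INDIVIDUAL_FEE,
    }


# Hypothetical member journal: 40 data packages, 120 published articles a year.
costs = annual_costs(packages=40, articles=120, status="member")
print(costs)
# {'voucher': 2600, 'deferred': 2800, 'subscription': 3000, 'pay_on_acceptance': 3200}
```

Because the subscription fee scales with published articles rather than deposited packages, which plan is cheapest depends on the share of articles that come with a data package.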
![Page 17: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/17.jpg)
More on growth and sustainability
Membership: http://datadryad.org/pages/membershipOverview
Pricing and sponsorship of deposits: http://datadryad.org/pages/pricing
Journal integration: http://datadryad.org/pages/journalIntegration
![Page 18: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/18.jpg)
![Page 19: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/19.jpg)
Metadata research & development
1. Curation workflow - cognitive walkthroughs
2. Dryad metadata scheme development - crosswalk analyses (Dube, et al, 2007; Carrier, et al, 2007; White et al., 2008, Greenberg, et al, 2010; Greenberg 2009; 2010)
3. Metadata reuse - content analysis (Greenberg, IDCC Research Summit, 2010)
4. Instantiation - multi-method study (comprehensions assessment) (Greenberg, RDAP, 2010, UNAM 2012)
5. Name-authority control - exploratory study (Haven, 2009, INLS 720)
6. KO/metadata community practices - Concurrent triangulation mixed methods (survey + simulation experiment) (White, 2010, ASIST, 2010 JLM)
7. Metadata functions - quantitative categorical analysis (Willis, Greenberg, and White, 2010, CODATA, 2012, JASIST)
8. Vocabulary needs (HIVE) – mapping study (Greenberg, 2009, CCQ; Scherle, 2010, Code4Lib)
9. Metadata theory – deductive analysis (Greenberg, 2009)
![Page 20: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/20.jpg)
Singapore Framework
Dryad DCAP, ver. 3.0
• bibo (The Bibliographic Ontology)
• dcterms (Dublin Core terms)
• dryad (Dryad)
• DwC (Darwin Core)
Vision
1. Simple: automatic metadata gen; heterogeneous datasets *Data-package centric
2. Interoperable: harvesting, cross-system searching
3. Semantic Web compatible: sustainable; supporting machine processing
Greenberg, et al, 2009, Metadata Best Practice for a Scientific Data Repository, JLM, DOI:10.1080/19386380903405090.
![Page 21: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/21.jpg)
Helping Interdisciplinary Vocabulary Engineering (HIVE)
![Page 22: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/22.jpg)
Amy
DATA
publication
![Page 23: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/23.jpg)
![Page 24: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/24.jpg)
![Page 25: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/25.jpg)
Package metadata harvested from email
Subj. 177 (gr. 97%, rd. 2%, bl. 1%)
Contr. 101 (gr. 99%, bl. 1%)
![Page 26: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/26.jpg)
Modified Capital-sigma notation
Reuse
$$R + \sum_{i=1}^{n} a_i = R + a_1 + a_2 + a_3 + \dots + a_n$$
R = value of the metadata record
i = number of usages
a = incremental increase in value
n = maximum number of reuse
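The sum above can be worked through with a small sketch. The function name `metadata_capital` and the numbers used are hypothetical and chosen only to show the arithmetic; the slides do not give units or measured values for R or the increments.

```python
def metadata_capital(initial_value, increments):
    """Return R + a_1 + a_2 + ... + a_n.

    initial_value is R, the value of the metadata record;
    increments holds the incremental increase a_i for each reuse.
    """
    return initial_value + sum(increments)


# Hypothetical record with value 10.0 that is reused three times.
R = 10.0
increments = [2.0, 1.5, 0.5]
print(metadata_capital(R, increments))  # 14.0
```

With no reuse the list of increments is empty and the result is just R, which matches the formula when n = 0.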
![Page 27: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/27.jpg)
Author/Submitter | Curator
100 metadata instantiations
• 8 of 12 metadata properties had reuse @ 50% or greater
• 5 of 8 confirmed reuse at 80% or higher
• Basic bib. vs. complex
![Page 28: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/28.jpg)
Author
Subject
Dcterms.spatial
DwC.ScientificName
![Page 29: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/29.jpg)
Conclusion…other Valuation Approaches
Market cap of Facebook per user: $40 – $300
Revenues per record per user: $4 – $7 per year
• Facebook
• Experian
Market prices of personal data:
• $0.50 for street address
• $2.00 for date of birth
• $8 for social security number
• $3 for driver’s license number
• $35 for military record
SOURCE: OECD. Exploring the Economics of Personal Data: A Survey of Methodologies for Measuring Monetary Value. OECD Digital Economy Papers. Organisation for Economic Co-operation and Development Publishing, 2013.
![Page 30: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/30.jpg)
Concluding comments
• Success story
• Contribution, have to start somewhere…
• Good timing, the right discipline
• Confirmed use, reuse
• Machine capabilities
• An educative commons, intellectually engaging
![Page 31: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/31.jpg)
http://wiki.datadryad.org/Sample_Dryad_Content
![Page 32: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/32.jpg)
Acknowledgments
• Dryad Consortium Board, journal partners, and data authors
• NESCent: Laura Wendell (Executive Director), Hilmar Lapp, Heather Piwowar, Peggy Schaeffer, Ryan Scherle, Todd Vision (PI)
• Drexel/UNC <Metadata Research Center>: Jose R. Pérez-Agüera, Sarah Carrier, Elena Feinstein, Lina Huang, Robert Losee, Hollie White, Craig Willis, Jane Smith, Shea Swauger, Liz Turner, Christine Mayo, Adrian Ogletree, Erin Clary
• U British Columbia: Michael Whitlock
• NCSU Digital Libraries: Kristin Antelman
• HIVE: Library of Congress, USGS, and The Getty Research Institute; and workshop hosts
• Yale/TreeBASE: Youjun Guo, Bill Piel
• DataONE: Rebecca Koskela, Bill Michener, Dave Vieglais, and many others
• British Library: Lee-Ann Coleman, Adam Farquhar, Brian Hole
• Oxford University: David Shotton
![Page 33: 1 The Dryad Data Repository: Metadata Workflows and Processes 2nd Data Management Workshop November 28th – 29th 2014 University of Cologne, Germany Jane](https://reader035.vdocuments.us/reader035/viewer/2022062721/56649f275503460f94c3ec98/html5/thumbnails/33.jpg)
http://datadryad.org http://blog.datadryad.org http://datadryad.org/wiki
http://code.google.com/p/[email protected]
Facebook: Dryad Twitter: @datadryad
http://ils.unc.edu/mrc/hive/ http://code.google.com/p/hive-mrc/
Metadata Research Center: http://cci.drexel.edu/mrc