Metadata for Data Rescue and Data at Risk
DESCRIPTION
A presentation I gave at PV 2011 in Toulouse, France, on behalf of CODATA's Data-at-Risk Task Group.
TRANSCRIPT
![Page 1: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/1.jpg)
Metadata for Data Rescue and Data at Risk
William L. Anderson, John L. Faundeen, Jane Greenberg, Fraser Taylor
PV2011, Toulouse, 17 November 2011
Presented by Nico Carver
In collaboration with the DARi SILS Student Learning Circle
![Page 2: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/2.jpg)
Outline
• Major Questions
• Metadata Scheme Design
• Case Study
• Next Steps
• Acknowledgements
• Questions/Comments
![Page 3: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/3.jpg)
Major Questions informing Research
Where is at-risk data?
How are scientists using historic data?
How do we define at-risk?
“8 inch floppy” Retrieved from: http://johnkingworld.com/aplus/images/storage-8inch-floppy.jpg
How do others define at-risk?
What must be done to rescue data-at-risk?
![Page 4: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/4.jpg)
Major Question informing Scheme Design
What is essential metadata for describing data-at-risk and aiding in data rescue?
![Page 5: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/5.jpg)
Metadata requirements
• Be applicable across a range of disciplines and scientific research areas.
• Sufficiently support the data rescue mission.
![Page 6: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/6.jpg)
Functions of the Inventory
Function: Describe data of scientific value that is at risk of being lost, unused, or destroyed.
Initial metadata properties:
1. Science area
2. Nature of data
3. Date or date-span
4. Location of original
5. Present location
Function: Act as a starting point for the data rescue mission.
Initial metadata properties:
6. Expected future
7. Risk level
![Page 7: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/7.jpg)
Metadata Frameworks Useful for Data-at-Risk
Metadata Property
1. Science area
2. Nature of data
3. Date or date-span
4. Location of original
5. Present location
6. Expected future
7. Risk level
Initial DARTG metadata properties proposed by DARTG Chair Elizabeth Griffin
![Page 8: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/8.jpg)
Metadata Frameworks Useful for Data-at-Risk
U.S. Geological Survey: “Create a Rescue Request”, URL: http://eros.usgs.gov/government/archive_rescue/archive_request.php
![Page 9: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/9.jpg)
Metadata Frameworks Useful for Data-at-Risk
“Growing the Vocabulary” http://dublincore.org/resources/training/frd_20091217/Tutorial_FRD_baker-1.pdf
![Page 10: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/10.jpg)
Metadata Frameworks Useful for Data-at-Risk
“The PREMIS Data Dictionary” http://www.loc.gov/standards/premis/v2/premis-dd-2-1.pdf
![Page 11: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/11.jpg)
Data-at-Risk Inventory (DARI) Metadata Scheme: guiding principles
• Simple
• Broadly applicable
• Extensible
![Page 12: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/12.jpg)
DARI Metadata Scheme (current)

| Metadata Element Name | Element Description |
| --- | --- |
| Research Area(s) | The domains represented by DARTG experts and the more general category of “Other”. |
| Title | The name associated with the collection. |
| Physical form of the data | Paper, photograph, specimen, record book, magnetic tape, etc. |
| Content and context of the data | History, topic, etc., if known. |
| Name of current holder | Institution, organization or individual. |
| Dates associated with data | Time period when data were collected. |
| Size | Extent, volume, size. |
| Data condition | Stable, deteriorating, etc. |
| Risk level | Poor storage conditions, limited storage time, etc. |
| Known access and restrictions | Public domain, private collection, etc. |
| Notes | Any additional information. |
| Contact information | Address or other contact information for the institution, organization or individual. |

DARTG DARI Metadata, Version 1.0
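As an illustration only, the twelve elements of the scheme could be held in a simple record type. The sketch below is not the inventory's actual implementation: the class name `DariRecord`, the field names, and the helper `missing_elements` are hypothetical, and the sample values are made up. The slides do not say which elements are mandatory, so the helper only reports which ones were left blank.

```python
from dataclasses import dataclass, asdict
from typing import List


@dataclass
class DariRecord:
    """One inventory entry; one hypothetical field per DARI v1.0 element."""
    research_areas: List[str]        # Research Area(s)
    title: str                       # Title
    physical_form: str               # Physical form of the data
    content_and_context: str         # Content and context of the data
    current_holder: str              # Name of current holder
    dates: str                       # Dates associated with data
    size: str                        # Size
    data_condition: str              # Data condition
    risk_level: str                  # Risk level
    access_and_restrictions: str     # Known access and restrictions
    contact_information: str         # Contact information
    notes: str = ""                  # Notes (listed last only because it has a default)

    def missing_elements(self) -> List[str]:
        """Return the names of elements left blank (empty string or empty list)."""
        return [name for name, value in asdict(self).items() if not value]


# Fictitious example values, for illustration only.
example = DariRecord(
    research_areas=["Other"],
    title="Example field notebooks",
    physical_form="Record book",
    content_and_context="",
    current_holder="Example Institute",
    dates="1950-1960",
    size="3 boxes",
    data_condition="Deteriorating",
    risk_level="Poor storage conditions",
    access_and_restrictions="Private collection",
    contact_information="archive@example.org",
)
print(example.missing_elements())
```

Keeping every element as free text mirrors the "simple" and "broadly applicable" principles; a controlled vocabulary for fields such as risk level would be a later design decision that the slides do not specify.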
![Page 13: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/13.jpg)
Case Study: introduction
![Page 14: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/14.jpg)
Case Study: implementation
![Page 15: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/15.jpg)
Case Study: Results
• 7 dataset descriptions total; 5 out of 7 were completed unassisted using the metadata template
• 13.5 out of 16 metadata elements considered useful on average (85%)
• 4 out of 5 scientists said they would use the inventory again
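The ratios above can be turned into percentages with a few lines of Python. This is only a worked check of the arithmetic, using the counts reported on the slide; the function name `percent` and the dictionary labels are illustrative. The 13.5-out-of-16 average works out to about 84.4%, which the slide reports as roughly 85%.

```python
def percent(part: float, whole: float) -> float:
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Counts taken from the results slide.
results = {
    "descriptions completed unassisted": (5, 7),
    "metadata elements considered useful (average)": (13.5, 16),
    "scientists who would use the inventory again": (4, 5),
}

for label, (part, whole) in results.items():
    print(f"{label}: {part}/{whole} = {percent(part, whole):.1f}%")
```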
![Page 16: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/16.jpg)
Case Study: conclusions
• The purpose of the inventory had to be more clearly stated on the website
• Instructions for filling out the web form had to be simple, but clear
• 3 metadata properties were determined unnecessary, 4 properties were altered for clarity
• The remaining metadata properties were successful in their ability to cut across scientific disciplines while fully describing data-at-risk
![Page 17: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/17.jpg)
Next Steps
• Complete focus groups and surveys at UNC-Chapel Hill and elsewhere to determine possible use cases
• Disseminate information and generate interest for the inventory and the Data-at-Risk project
• Finalize the inventory design and start populating it
![Page 18: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/18.jpg)
Submit a description: http://ibiblio.org/data-at-risk/contribution
![Page 19: Metadata for Data Rescue and Data at Risk](https://reader034.vdocuments.us/reader034/viewer/2022042607/559843b31a28ab0a328b474d/html5/thumbnails/19.jpg)
Questions/Comments?
Acknowledgements:
• The University of North Carolina Center for Global Initiatives’ support of the Data At Risk Inventory SILS Student Learning Circle
• The Council for Scientific and Technical Data
• And the following people for their leadership, guidance, and assistance: Bill Anderson, School of Information, University of Texas at Austin; Jane Greenberg, School of Information and Library Science; Elizabeth Griffin, Herzberg Institute of Astrophysics; Dav Robertson, National Institute of Environmental Health Sciences, NIH; and Paul Jones & John Reuning, ibiblio, University of North Carolina at Chapel Hill.