Data and Donuts: How to write a data management plan
TRANSCRIPT
![Page 1: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/1.jpg)
How to write a data management plan
C. Tobin Magle, PhD
Sept. 29, 2016
10:00-11:00 a.m.
Morgan Library Computer Classroom 173
*inspired by content from CU Boulder research computing
![Page 2: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/2.jpg)
What is research data?
• “The recorded factual material commonly accepted in the scientific community as necessary to validate research findings”
- White House Office of Management and Budget
• Reality: anything that is a (digital) product of your research
![Page 3: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/3.jpg)
What is a data management plan?
A description of how you plan to describe, preserve and share your research data.
Often required by funding agencies
![Page 4: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/4.jpg)
DMPTool
• Review requirements from different agencies
• https://dmptool.org/guidance
• Create new DMPs based on funding agency templates
• Search public DMPs
![Page 5: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/5.jpg)
Successful DMPs include
• A data inventory, including type(s) and size
• A strategy for describing the data
• A plan for preserving the data long term
• A method for access to the data
Always make sure to follow funder requirements
![Page 6: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/6.jpg)
Data inventory
• What type of data are you going to collect?
• What file type will be produced?
• What size will these files be? How many files?
• What other research outputs will be produced?
  • Code/Software?
  • Templates/protocols?
![Page 7: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/7.jpg)
Data inventory
• What type of data are you going to collect?
  • miRNA sequences
• What file type will be produced?
  • FASTQ files
• What size will these files be? How many files?
  • 1 GB per file x 64 strains x 3 replicates = ~200 GB
• What other research outputs will be produced?
  • Code/Software: R scripts for analysis and visualization
  • Templates/protocols: Data use tutorials
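The size estimate in this example is simple multiplication, so it is easy to script and rerun when the design changes. A minimal sketch, assuming every file is about the same size; the function name `estimate_total_gb` is made up for illustration.

```python
def estimate_total_gb(gb_per_file, *counts):
    """Multiply the per-file size by each count (strains, replicates, ...)."""
    total = gb_per_file
    for count in counts:
        total *= count
    return total


# Numbers from the slide: 1 GB per file, 64 strains, 3 replicates.
total_gb = estimate_total_gb(1, 64, 3)
print(f"{total_gb} GB")  # 192 GB, which the slide rounds to ~200 GB
```

Rounding up, as the slide does, leaves some headroom for metadata, scripts, and intermediate files.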
![Page 8: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/8.jpg)
Data formats
• Avoid proprietary formats
• Know what software can read your data

| Proprietary Format | Alternative Format |
| --- | --- |
| Excel (.xls, .xlsx) | Comma Separated Values (.csv) |
| Word (.doc, .docx) | plain text (.txt) |
| PowerPoint (.ppt, .pptx) | PDF/A (.pdf) |
| Photoshop (.psd) | TIFF (.tif, .tiff) |
| Quicktime (.mov) | MPEG-4 (.mp4) |
| MPEG 4 Protected audio (.m4p) | MP3 (.mp3) |
![Page 9: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/9.jpg)
Exercise: Data Inventory
What kind of data are you going to collect?
What file type will be produced?
What size will these files be? How many files?
What other research outputs will be produced?
![Page 10: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/10.jpg)
A strategy for describing the data
• Metadata: Relevant information for re-creation and re-use
  • Contact info
  • How data was collected
  • Details about collection
  • Date, location of collection
  • Units
• Can be as simple as a text file
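A plain-text metadata file like this can be generated from a short script, which also makes it easy to check that nothing was left out. A minimal sketch: the field names in `REQUIRED_FIELDS`, the function `render_readme`, and all example values are hypothetical, chosen only to mirror the bullets above.

```python
# Hypothetical field names mirroring the bullets on this slide.
REQUIRED_FIELDS = [
    "contact",
    "collection_method",
    "collection_date",
    "collection_location",
    "units",
]


def render_readme(metadata):
    """Return README text, refusing to proceed if a field is missing or empty."""
    missing = [field for field in REQUIRED_FIELDS if not metadata.get(field)]
    if missing:
        raise ValueError("missing metadata fields: " + ", ".join(missing))
    lines = [f"{field}: {metadata[field]}" for field in REQUIRED_FIELDS]
    return "\n".join(lines) + "\n"


# Placeholder values for illustration only.
example = {
    "contact": "Jane Researcher (jane@example.org)",
    "collection_method": "describe instruments and protocol here",
    "collection_date": "YYYY-MM-DD",
    "collection_location": "describe site or lab here",
    "units": "list the unit for each measured variable",
}
print(render_readme(example))
```

Saving the returned text as `README.txt` next to the data files keeps the description and the data together.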
![Page 11: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/11.jpg)
Genomics example (README)
This project contains next-generation miRNA sequencing data from 64 mouse strains.
Brain tissue from 10-week-old male mice was harvested and stored in RNAlater. RNA was extracted using an RNeasy kit, and miRNA libraries were produced using an Illumina kit. The libraries were run on an Illumina MiSeq sequencer. The FASTQ files produced were analyzed in R using Bioconductor.
The data and descriptive metadata will be made available on NCBI in the BioProject (PRJXXXX). The scripts used to analyze the data are available on GitHub (URL). Tutorials for data use will be made available in the Digital Collections of Colorado (handle).
Contact Tobin Magle ([email protected]) for more information. http://orcid.org/0000-0003-3185-7034
![Page 12: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/12.jpg)
Metadata standards
• Dublin Core: http://dublincore.org/documents/dcmi-terms/
  • Can be applied to anything
• Many discipline-specific metadata standards
  • EML: https://knb.ecoinformatics.org/#external//emlparser/docs/index.html
  • MIAME: http://fged.org/projects/miame/
• Search for other standards:
  • http://www.dcc.ac.uk/resources/metadata-standards
  • https://biosharing.org/standards/
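To show what a Dublin Core description might look like in practice, here is a minimal sketch that writes a few elements as XML using Python's standard library. It uses the classic Dublin Core element namespace (`http://purl.org/dc/elements/1.1/`); the wrapper element name `metadata` and the function name `dublin_core_record` are assumptions for illustration, and the example values come from the genomics README in this deck (the identifier is still the `PRJXXXX` placeholder).

```python
import xml.etree.ElementTree as ET

# Namespace of the 15 classic Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build an XML string with one dc:<name> element per field."""
    root = ET.Element("metadata")  # wrapper name is an arbitrary choice
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record(
    {
        "title": "miRNA sequencing data from 64 mouse strains",
        "creator": "Tobin Magle",
        "format": "FASTQ",
        "identifier": "PRJXXXX",  # placeholder, as in the README example
    }
)
print(record)
```

A repository will usually have its own submission form or template for these fields, so a script like this is mainly useful for keeping a local copy of the description alongside the data.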
![Page 13: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/13.jpg)
Genomics example (NCBI template)
![Page 14: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/14.jpg)
Exercise: Describe your data
What do people need to know to reuse your data?
Are there any discipline-specific metadata standards?
What format will you describe your data in (text, XML, tabular)?
What fields will you include (author, date, format, identifier)?
![Page 15: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/15.jpg)
A plan for preserving the data long term
• What will you do to ensure data are properly stored and preserved?
• Include metadata and other products needed for reuse
• Might change over course of the project
![Page 16: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/16.jpg)
Preservation questions
• What will you store?
• Who will be in charge?
• How long will you store it?
• Where will you store it?
  • Multiple copies
![Page 17: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/17.jpg)
Recommendations for backing up data
• Store in geographically distinct locations
• Automation: Will you remember to do it manually?
• Security: Are you working with PHI?
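The automation point can be illustrated with a short script that copies a data folder to a second location and verifies each copy with a checksum. This is a minimal sketch, not a full backup tool: the function names `sha256_of` and `back_up` and the paths in the usage comment are hypothetical, and it assumes the backup folder is not inside the source folder. It does not encrypt anything, so it is not sufficient on its own for PHI.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source_dir, backup_dir):
    """Copy every file under source_dir into backup_dir and verify each copy."""
    copied = []
    for source in sorted(source_dir.rglob("*")):
        if not source.is_file():
            continue
        target = backup_dir / source.relative_to(source_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(source) != sha256_of(target):
            raise IOError(f"checksum mismatch for {source}")
        copied.append(target)
    return copied


# Example call (hypothetical paths):
# back_up(Path("project/data"), Path("/mnt/offsite/project-backup"))
```

Running a script like this from a scheduler (for example cron or Windows Task Scheduler) removes the need to remember it, and pointing `backup_dir` at storage in another building or region covers the geographic point.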
![Page 18: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/18.jpg)
Exercise: Preservation plan
What will you store?
Who will be responsible for the data (person or position)?
How long will you store it?
Where will you store it?
How will you back it up?
![Page 19: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/19.jpg)
A method to access the data
• Important to funding agencies
  • Reproduce existing research
  • Promote further research
• Must be easily available:
  • No “by request only”
  • Embargoes are “ok”
• Data security: consider privacy and IP issues before sharing
![Page 20: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/20.jpg)
Data access and sharing best practices
• Non-proprietary formats
• Include metadata
• Proper storage
• Stable identifier
• Licensing: conditions for reuse
![Page 21: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/21.jpg)
Trusted Repositories: store and share
• Discipline-specific repositories
  • Search: http://service.re3data.org/browse/by-subject/
• Generic:
  • Figshare - https://figshare.com/
  • Dryad - http://datadryad.org/
• CSU Digital Repository:
  • http://lib.colostate.edu/digital-collections/
![Page 22: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/22.jpg)
CSU Digital repository
• http://lib.colostate.edu/services/data-management/the-digital-repository
• Data Collection: https://dspace.library.colostate.edu/handle/10217/172830
• No cost for data under 1 TB
![Page 23: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/23.jpg)
Stable identifiers
• URLs break
• Stable identifiers are permanent in a database
• Some provide linking capabilities
  • DOI: https://doi.org/10.1109/5.771073
  • Handle: http://hdl.handle.net/10217/177356
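Both examples work the same way: a fixed resolver address followed by the identifier. A minimal sketch of turning a DOI written in any of a few common forms into its resolver link; the function name `doi_to_url` and the list of accepted prefixes are assumptions for illustration.

```python
# Prefixes people commonly write in front of a bare DOI.
DOI_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def doi_to_url(doi):
    """Strip any known prefix and return the https://doi.org/ resolver link."""
    doi = doi.strip()
    for prefix in DOI_PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return f"https://doi.org/{doi}"


print(doi_to_url("doi:10.1109/5.771073"))  # https://doi.org/10.1109/5.771073
```

Citing the resolver link rather than the landing-page URL means the citation keeps working if the repository reorganizes its website.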
![Page 24: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/24.jpg)
Licensing
• State your conditions for reuse
  • Paper citation?
• Disclaimers
• Must justify limitations, describe how you’ll advertise them
• Creative Commons licenses are a good starting point
![Page 25: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/25.jpg)
Exercise: Access methods
Where will people be able to access the data?
Does your discipline have a repository?
What kind of stable identifier will it have?
What are the conditions for reuse?
Are there any limitations to use of these data? Why?
![Page 26: Data and Donuts: How to write a data management plan](https://reader036.vdocuments.us/reader036/viewer/2022081604/58738b691a28ab272d8b6ae7/html5/thumbnails/26.jpg)
Need help?
• Email: [email protected]
• DMPTool: http://dmptool.org/
• Data Management Services website: http://lib.colostate.edu/services/data-management
• Being updated