Data Management Tools
David Wallom
YOUR DATA DOES NOT BELONG TO YOU!
IT BELONGS TO YOUR EMPLOYING INSTITUTION!
The RDM Lifecycle
Slide borrowed with permission from Anthony Beitz, Monash University. Presented at OR 2012, Edinburgh
Conceive, Design, Experiment, Analyse, Collaborate, Publish, Expose
The RDM Lifecycle
Conceive Design
Conceive and Design
• Collaborative
  – Multiple investigators, multiple institutions
• Process driven
  – RCUK, EC, a.n.other funder all have processes that must be followed
• Time limited
  – Calls come with deadlines and everyone leaves it until the last moment…
Tools
• Shared document development tools
  – E.g. Google Docs, Office 365
• Shared document/project management tools
  – SharePoint: locally managed and normally connected to an Exchange system
  – Online
    • Teambox
    • Projecturf
    • Apollo
    • Basecamp
    • Huddle
• Data Management Planning
  – DMPOnline
The RDM Lifecycle
Experiment, Analyse, Collaborate
Backup
• What data do you need to back up?
  – Criteria?
• How many versions should you retain?
  – Current, raw, processed?
• How often do you intend to back up?
  – After every change, daily, weekly, monthly?
• How many copies should you retain, and in what locations?
  – 2, 3, onsite, offsite?
• How do you intend to store backup data?
  – Media, online, cloud?
• What software will you use to manage backups?
  – Operating system based, 3rd party, criteria to choose?
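The answers to these questions can be written down as a small script, so that the number of versions and the set of locations are explicit rather than remembered. The following is a minimal sketch only, not a recommended backup tool: the function name `backup_file`, the timestamped naming scheme and the default of three versions are all assumptions made for illustration.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_file(source: Path, destinations: list[Path], keep_versions: int = 3,
                now: Optional[datetime] = None) -> list[Path]:
    """Copy `source` into every destination as a timestamped version,
    then delete the oldest versions beyond `keep_versions`.

    Hypothetical helper for illustration; returns the copies it created.
    """
    if keep_versions < 1:
        raise ValueError("keep_versions must be at least 1")
    stamp = (now or datetime.now()).strftime("%Y%m%dT%H%M%S")
    created = []
    for dest in destinations:
        dest.mkdir(parents=True, exist_ok=True)
        target = dest / f"{source.stem}.{stamp}{source.suffix}"
        shutil.copy2(source, target)  # copy2 also preserves file timestamps
        created.append(target)
        # Timestamped names sort chronologically, so the oldest come first.
        versions = sorted(dest.glob(f"{source.stem}.*{source.suffix}"))
        for old in versions[:-keep_versions]:
            old.unlink()
    return created
```

Calling `backup_file(Path("results.csv"), [Path("/mnt/usb/backup"), Path("/mnt/offsite/backup")])` (both paths hypothetical) would keep the three most recent copies in each of the two locations.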
Tools
• Desktop Drive (USB)
• RO Media
• Online
  – ‘Institutional Dropbox’
  – SkyDrive
  – iCloud
  – Dropbox
  – Humyo
  – Memopal
  – Mozy
  – ZumoDrive
Data Security
• The available skills and expertise required to ensure an adequate level of data security;
• Risk assessment to determine the value of data
  – the level of confidentiality required
  – applicable statutory requirements
  – impact of unauthorised access to, or loss of, the data
  – steps required to provide appropriate data protection.
• The prevention of unauthorised and malicious access to buildings and rooms where computers and other devices holding data may be housed.
• How access to data is managed, authorised and logged.
• How data is protected from loss or damage, for example by regular backups, implementing version control and installing anti-malware software.
• The means to access data from both within Oxford and from outside the Oxford network; and the transmission of data from one computer to another (e.g. via email, ftp, Web server).
• The storage and encryption of data taken offsite (whether, for example, on an external drive, laptop, mobile device).
• The process to verify the deletion of confidential data (for example, when equipment is redeployed or in line with a project's exit strategy)
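One way to check that data has not been lost or damaged, whether in a backup or after transmission to another computer, is to compare checksums of the original files against the copy. The sketch below assumes SHA-256 and a simple in-memory manifest; the function names are hypothetical and the slides do not prescribe any particular method.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to `root`) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(original: dict[str, str], copy: dict[str, str]) -> dict[str, list[str]]:
    """Report files that are missing from the copy or whose content differs."""
    shared = original.keys() & copy.keys()
    return {
        "missing": sorted(set(original) - set(copy)),
        "changed": sorted(name for name in shared if original[name] != copy[name]),
    }
```

An empty report (no missing and no changed files) suggests the copy matches the original at the time of the check; it says nothing about who has had access to either copy.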
Active Research Data Storage
– DataStage
– DataVerse
– NeuroHub/Hub
http://www.dcc.ac.uk/resources/external/category/active-data-storage
Workflow and Lab Books
• LabTrove
• cRUNCH
• NeuroHub/Hub
http://www.dcc.ac.uk/resources/external/category/workflow-and-lab-notebook-management
The RDM Lifecycle
Publish, Expose
Sharing Your Data
• Enables that data to be validated and tested, improving the scientific record.
• Meets funding body requirements obliging award holders to share their data to avoid duplication of effort and to cut costs.
• Is in the public interest, where research data has been publicly funded.
• Can facilitate its rediscovery and its preservation as technology becomes obsolete.
• Means that data can be reused for scientific and educational purposes.
When should I consider not sharing my data?
• There may be occasions when you should consider not sharing your data:
  – If your research data is potentially commercially valuable or exploitable by your employer.
  – If there are ethical issues, legal issues, time constraints or other issues which could limit data sharing opportunities.
  – If there are conditions of confidentiality (e.g. through industrial sponsors) attached to the funding of your research.
• Often, sensitive and confidential data can be shared ethically if informed consent for data sharing has been given, or by anonymising research data.
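As a rough illustration of one step towards anonymising data before sharing, the sketch below drops directly identifying fields and replaces a participant identifier with a keyed hash. This is pseudonymisation rather than full anonymisation, and it does not by itself make a dataset safe to share. The function name, the field names and the 16-character token length are assumptions for the example.

```python
import hashlib
import hmac


def pseudonymise(records, id_field, secret_key, drop_fields=()):
    """Return copies of `records` with `drop_fields` removed and `id_field`
    replaced by a keyed hash, so rows from the same participant still link.

    `secret_key` (bytes) must be kept private and never shared with the data.
    """
    result = []
    for record in records:
        cleaned = {k: v for k, v in record.items() if k not in drop_fields}
        token = hmac.new(secret_key, str(record[id_field]).encode("utf-8"),
                         hashlib.sha256).hexdigest()
        cleaned[id_field] = token[:16]  # shortened token, an arbitrary choice
        result.append(cleaned)
    return result


# Hypothetical records, for illustration only.
rows = [
    {"participant_id": "P001", "name": "Alice Example", "score": 7},
    {"participant_id": "P002", "name": "Bob Example", "score": 5},
]
shareable = pseudonymise(rows, "participant_id", b"keep-this-key-private",
                         drop_fields=("name",))
```

Using a keyed hash rather than a plain hash means that someone who knows the identifier format cannot rebuild the mapping without the key.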
Community Data Repositories