An Introduction to Data Management
University of the Arts London, 19 March 2014
Jonathan Rans
Digital Curation Centre, University of Edinburgh
Overview
1. Definitions
2. National drivers
3. Institutions’ responses
4. What support is available?
The Digital Curation Centre
The DCC (est. 2004) is…
A UK centre of expertise in digital preservation, with a particular focus on research data management (RDM)
Based across three sites: Universities of Edinburgh, Glasgow and Bath
Working with a number of UK universities to identify gaps in RDM provision and raise capabilities across the sector
Also involved in a variety of national and international collaborations…
1. Definitions
What is research data management?
“the active management and appraisal of data over the lifecycle of scholarly and scientific interest”
Plan
Collect
Assure
Describe
Preserve
Discover
Integrate
Analyze
SHARE
…and RE-USE
The DataONE lifecycle model
Data management is a part of good research practice
- RCUK
Developments in sensor technology, networking and digital storage enable new research and scientific paradigms
As costs also fall, possibilities for data sharing, citation and re-use become much more widespread
Research funders and publishers recognise the value of this and now tend to have greater expectations of the research that they support…
Why is it a growing concern?
What are the benefits?
PRESERVATION: Lots of data is unique, and can only be captured once. If lost, it’s irreplaceable.
EFFICIENCY: Data collection can be funded once, and used many times for a variety of purposes
TRANSPARENCY: The data that underpins research can be made open for anyone to scrutinise, and attempt to replicate findings
RISK MANAGEMENT: A pro-active approach to data management reduces the risk of inappropriate disclosure of sensitive data, whether commercial or personal
Definitions vary from discipline to discipline, and from funder to funder…
Here’s a science-centric definition: “The recorded factual material commonly accepted in the scientific community as necessary to validate research findings.” (US Office of Management and Budget, Circular A-110)
And another from the visual arts: “Evidence which is used or created to generate new knowledge and interpretations. ‘Evidence’ may be intersubjective or subjective; physical or emotional; persistent or ephemeral; personal or public; explicit or tacit; and is consciously or unconsciously referenced by the researcher at some point during the course of their research.”
(Leigh Garrett, KAPTUR project: see http://kaptur.wordpress.com/2013/01/23/what-is-visual-arts-research-data-revisited/)
So what is ‘data’ exactly?
Some characteristics of Arts and Humanities data are likely to require a different kind of handling from that given to other disciplines
They are often personal and may not be factual in nature. Furthermore, they may be quite valuable or precious to their creator.
The digital data in the Arts is as likely to be an outcome of the creative research process as an input to a workflow
Event resources…
http://www.dcc.ac.uk/events/research-data-management-forum-rdmf/rdmf10-research-data-management-arts-and-humanities
http://www.digital.hss.ed.ac.uk/archive-events/201314-events/managing-humanities-research-data/
Data in the Arts and Humanities
2. The national view
Nature, 09/08; Economist, 02/10
Popular Science, 11/11; Science, 02/11
Nature, 09/09; ACM, 12/08
InformationWeek, 08/10; Computerworld, 11/12
Five years of front pages…
Open Data
Open Data is a philosophy, underpinned by pragmatism… transparency + utility.
“Open data is the idea that certain data should be freely available to everyone to use and republish as they wish, without restrictions from copyright, patents or other mechanisms of control.” – Wikipedia
Governments, cities etc. are all getting on board
Open Knowledge Foundation is basically the political / activist wing: http://okfn.org/
From the government / industry side, we have the Open Data Institute: http://theodi.org/
What do funders have to say? (i)
Seven “Common Principles on Data Policy” – Data as a public good; Preservation; Discovery; Confidentiality; Right of first use; Recognition; Public funding for RDM
Six of the seven RCUK councils require data management plans, or equivalent, at the application stage
The seventh (EPSRC) requires nothing short of an institutional data infrastructure
What do funders have to say? (ii)
AHRC requires that significant electronic resources or datasets are made available in an accessible repository for at least three years after the end of the grant
Applicants submit a statement on data sharing in the relevant section of the Je-S form, and provide a two-page data management and sharing plan addressing 9 distinct themes
Datasets must be offered to the UK Data Archive on conclusion of the project
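The three-year minimum mentioned above is simple date arithmetic, and a short sketch can make it concrete. This is only an illustration: the function name, the constant and the handling of a 29 February end date are assumptions, not part of any AHRC guidance.

```python
from datetime import date

# "At least three years after the end of the grant", as stated above.
MIN_RETENTION_YEARS = 3


def earliest_withdrawal_date(grant_end: date, years: int = MIN_RETENTION_YEARS) -> date:
    """Return the first date on which the minimum availability period has elapsed."""
    try:
        return grant_end.replace(year=grant_end.year + years)
    except ValueError:
        # Assumption: a grant ending on 29 February rolls forward to 1 March
        # when the target year is not a leap year.
        return date(grant_end.year + years, 3, 1)


# Hypothetical grant end date, for illustration only.
print(earliest_withdrawal_date(date(2014, 3, 31)))  # 2017-03-31
```

Since the requirement is a minimum, the result is the earliest date to consider, not a deadline for removing anything.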
3. How are institutions responding?
Components of an RDM service
http://www.dcc.ac.uk/resources/how-guides/how-develop-rdm-services
Data management planning support
Requirements written into institutional policy
Research office, IT and Library provide support
DCC’s DMPonline
Free for researchers
Institutions can use the tool to deliver local support
UAL has an institutional template
http://www.dcc.ac.uk/resources/developing-rdm-services/dmps-arts-and-humanities
Active storage
Institutions investing in managed storage for active data are making substantial amounts available free
Institutional collaborative platforms
5 TB, 1 TB, 0.5 TB
http://www.dcc.ac.uk/blog/defining-institutional-data-storage-requirements
Selection and deposit
1. Relevance to Mission – including any legal/funder requirement to retain the data beyond its immediate use.
2. Scientific or Historical Value – significance and relationship to publications etc.
3. Uniqueness – can it be found elsewhere / if we don’t preserve it, who will?
4. Potential for Redistribution – quality / IP / ethical concerns are addressed.
5. Non-Replicability – either impossible to replicate (e.g. atmospheric or social science data) or not financially viable.
6. Economic Case – costs of managing and preserving the resource stack up well against potential future benefits.
7. Full Documentation – surrounding / contextual information necessary to facilitate future discovery, access, and reuse is adequate.
How to Appraise & Select Research Data for Curation. Angus Whyte, Digital Curation Centre, and Andrew Wilson, Australian National Data Service (2010)
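As a rough illustration, the seven criteria above could be recorded as a simple checklist when reviewing a dataset for deposit. This is a sketch and not part of the DCC guide: the key names and the yes/no scoring are assumptions made for the example, and real appraisal calls for judgement, not a mechanical score.

```python
# The seven criteria listed above; the short key names are hypothetical labels.
CRITERIA = (
    "relevance_to_mission",
    "scientific_or_historical_value",
    "uniqueness",
    "potential_for_redistribution",
    "non_replicability",
    "economic_case",
    "full_documentation",
)


def unmet_criteria(assessment: dict[str, bool]) -> list[str]:
    """Return the criteria an assessment does not satisfy, in list order.

    A criterion missing from the assessment is treated as unmet.
    """
    return [name for name in CRITERIA if not assessment.get(name, False)]


# Hypothetical dataset that meets every criterion except documentation.
example = {name: True for name in CRITERIA}
example["full_documentation"] = False
print(unmet_criteria(example))  # ['full_documentation']
```

Listing the unmet criteria, instead of returning a single pass/fail, keeps the output useful as a prompt for discussion with the researcher.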
Data repositories
http://datashare.is.ed.ac.uk
www.dspace.cam.ac.uk/
Essex-RDR and DataPool at Southampton
Not intended to replace national, subject or other established data collections
Acknowledge hybrid environment
http://www.researchdata.arts.ac.uk/
Data Catalogues
DataFinder (Oxford)
Researcher Dashboard (Lincoln)
UK Research Data Registry (DCC and Jisc)
Disciplinary support services
There may be scope for centres with a specific disciplinary focus to provide tailored support
4. SUPPORT
i. DCC resources
Publications: Briefing Papers and How-To Guides
Training: e.g. DC101 events and Curation Reference Manual
Advice: e.g. Disciplinary metadata, www.dcc.ac.uk/resources/metadata-standards
Events: International Digital Curation Conference (next one in London, February 2015); Research Data Management Forum (next one TBC, but always held in UK)
Tools: DMPonline, CARDIO, Data Asset Framework, DRAMBORA
ii. UAL resources
DCC and UAL ran an institutional engagement between 2011 and 2013, which developed…
A data management guidance web area:
http://www.arts.ac.uk/research/research-environment/research-management/data-management/
An institutional policy: http://www.arts.ac.uk/media/research/documents/UAL-Research-Data-Management-Policy.pdf
A UAL data management planning template: http://dmponline.dcc.ac.uk
UAL was involved in the KAPTUR project: http://kaptur.wordpress.com
Thank you
Any questions?
This work is licensed under the Creative Commons Attribution 2.5 UK: Scotland License.
For more about DCC services see www.dcc.ac.uk or follow us on twitter @digitalcuration and #ukdcc
Jonathan Rans
Digital Curation Centre
University of Edinburgh
[email protected]
@JNRans