board on research data and information national academy of sciences 30 november 2010
Post on 03-Jan-2016
15 Views
Preview:
DESCRIPTION
TRANSCRIPT
Research Data:Who will share what, with whom, when, and why?
Christine L. BorgmanProfessor & Presidential Chair in Information Studies
University of California, Los Angeles
Board on Research Data and InformationNational Academy of Sciences
30 November 2010
Deluge!!!
Data!!Data!!
ScientistsScientists
Social ScientistsSocial Scientists
Funding agenciesFunding agencies Policy makersPolicy makers
HumanistsHumanists
LibrariansLibrarians
http://www.guzer.com/pictures/suprise_suprise.jpg
Dissemination and Sharing of Research Results
NSF Data Sharing Policy• Investigators are expected to share with other researchers, at no more
than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants. Grantees are expected to encourage and facilitate such sharing. See Award & Administration Guide (AAG) Chapter VI.D.4.
NSF Data Management Plan Requirements• Beginning January 18, 2011, proposals submitted to NSF must include a
supplementary document of no more than two pages labeled “Data Management Plan”. This supplementary document should describe how the proposal will conform to NSF policy on the dissemination and sharing of research results. See Grant Proposal Guide (GPG) Chapter II.C.2.j for full policy implementation.
4
What are data?Categories of data* • Observational• Computational• Experimental• Records
What data to keep?• Why?• Who cares?
http://datalib.ed.ac.uk/GRAPHICS/blue_data.gif *Long-Lived Data, NSF, 20055
Some purposes of data-driven research
Hypothesis-driven Synoptic survey
Model system
Experimental Theoretical
Long term
Describe phenomena
Short term
6
Some methods of data-driven research
Hand-collect samples
Collaborative teamsIndividual investigator
Machine-collect samples
Hand markup Machine markup
Community repositoriesLocal control of data
7
Researchers’ incentives to share data
• Open science, scholarship• Recognition• Collaboration • Reciprocity• Coercion
8
Image source:www.buffaloworks.us/ images/sharing%20orangs.jpg
Researchers’ incentives not to share
• Lack of rewards
• Labor to document data
• Competition, priority of claims
• Intellectual property
• Control over data and sources
• Access to data and sources
9
Image source: www.buildingsrus.co.uk/.../ target1.htm
Arguments for sharing research data
• Motivations– Means to advance scientific research– Promote the public good
• Interests served– Producers of scientific data– Users of scientific data
10projects.kmi.open.ac.uk
athensacademy.net
toposytropos.com.ar
Public good / user arguments
1. Public monies should serve the public good
2. With data, anyone can be a scientist
11http://digitalassetmanagement.org.uk/2010/02/01/the-winds-of-change-are-blowing-in-the-clouds-favor/
data discovery
http://annualreport.ucdavis.edu/2008/images/photos/discovery.jpg
Scientific good / producers arguments
3. Data curation advances science
4. Www5. With data, results can be reproduced
12
http://chemistry.curtin.edu.au/research/index.cfm
http://serc.carleton.edu/cismi/broadaccess/groupwork.html
WISE image Worldwide Telescope
Motivations and interests in sharing data
Interests of data producersInterests of data producers
Science-driven motivationsScience-driven motivations
Interests of data usersInterests of data users
4. Reproducibility
2. Ask new questions2. Ask new questions
1. Public goods1. Public goodsPublic-driven motivationsPublic-driven motivations
3. Advance science
13
Enabling Virtual Conversations
Curators Consumers
Science-Ready Data
Products
Screening and
Processingby Data Valets
Initial quarantine
data results
Screening by
Curators
Quarantine data
results
Quarantine Database and Data CubeScience Database and Data Cubes
Authors
Original Data
Archive
Data Valets
PublishersData
Curators ConsumersAuthors
Data Valets
Publishers
Reports, Hosted Excel PivotTables and Charts
Excel and MatLab Interface
Blog, maps, data explanations, protected data download and web parts for each role
Quarantine Data Science Data
Sharepoint Collaboration Web Portal
Collaboration- Centric View
Data-Centric View
Slide courtesy of Catherine van Ingen, Microsoft Research
Why openness matters• Interoperability trumps all
• Import and export in open formats• Mixup and mashup• Add value• Avoid lock in
• Discoverability of related• Documents• Data• Assorted digital objects
• Usability and reusability• For research• For learning
15http://pzwart.wdka.hro.nl/mdr/research/lliang/mdr/mdr_images/opencontent.jpg/
• Scholarly information infrastructure– Enable and promote new kinds of scholarship– Distributed, collaborative, open access to scholarly work
• Lack of clear guidelines for sharing data– What are considered to be data?– What is “incremental cost”?– What is “reasonable period of time”?
Implications for scholarship - 1
http://serc.carleton.edu/cismi/researchonlearning/
• Who is responsible for implementation, costs?– PI, grad students, department, university, library?– Curate for duration of grant or to the end of time?
• How to assign credit for new forms of scholarly contributions?
Implications for scholarship - 2
http://serc.carleton.edu/cismi/researchonlearning/
• Clear guidelines for sharing scholarly products– Based on practices within and between fields– Flexible and innovative– Avoid lowest common denominator
• Identify stakeholders and costs– Investigators, students, post-docs…– Universities, libraries, research institutes …
• Develop policy, technology, and practice– Ownership, access, and control of scholarly products– Credit for scholarly contributions– Value chain of scholarly artifacts
Implications for regulation
Lessig, Free Culture, 2004, p125
Conclusions• Data sharing scenarios
– Release all of the data, all of the time, to anyone– Release none of the data, at any time, to anyone– Release some of the data, under certain conditions, to some of
the people • Science-driven data curation
– Examine policy arguments– Recognize data diversity– Identify stakeholders
• Motivations• Interests
– Engage stakeholders
http://plus.maths.org/content/text-bytes-and-videotape
19
Acknowledgements• Paper comments: CENS Data Practices team at UCLA – David Fearon, Matthew Mayernik,
Katie Shilton, Jillian Wallis, and Laura Wynholds; Paul Uhlir of the National Academies.
• Audience comments on prior versions of this talk– China-North America Library Conference, Beijing, September, 2010– Santa Fe Institute, November, 2010
• Research funding:– National Science Foundation
• CENS: Cooperative Agreement #CCR-0120778, D.L. Estrin, UCLA, PI. • CENS Education Infrastructure: #ESI- 0352572, W.A. Sandoval, PI; C.L. Borgman, co-PI.• Towards a Virtual Organization for Data Cyberinfrastructure, #OCI-0750529, C.L. Borgman,
UCLA, PI; G. Bowker, Santa Clara University, Co-PI; T. Finholt, University of Michigan, Co-PI.• Monitoring, Modeling & Memory: Dynamics of Data and Knowledge in Scientific
Cyberinfrastructures: #0827322, P.N. Edwards, UM, PI; Co-PIs C.L. Borgman, UCLA; G. Bowker, SCU; T. Finholt, UM; S. Jackson, UM; D. Ribes, Georgetown; S.L. Star, SCU)
• Data Conservancy: OCI0830976, Sayeed Choudhury, PI, Johns Hopkins University. – Microsoft External Research: Tony Hey, Lee Dirks, Catherine van Ingen, Catherine Marshall
20
top related