measuring the adoption of open science

Measuring the adoption of open science Heather Piwowar Department of Biomedical Informatics University of Pittsburgh PSB workshop on Open Science, January 2008

Upload: heather-piwowar

Post on 01-Nov-2014


DESCRIPTION

Measuring the Adoption of Open Science, presented at PSB 2009 Open Science workshop

TRANSCRIPT

Page 1: Measuring the Adoption of Open Science

Measuring the adoption of open science

Heather Piwowar
Department of Biomedical Informatics

University of Pittsburgh

PSB workshop on Open Science, January 2008

Page 2: Measuring the Adoption of Open Science

you cannot manage what you do not measure

Page 3: Measuring the Adoption of Open Science

Measuring the adoption of open science: sharing data

Page 4: Measuring the Adoption of Open Science

lots of data sharing!

http://www.genome.jp/en/db_growth.html

Page 5: Measuring the Adoption of Open Science

but how much isn’t shared?

what isn’t shared?

who isn’t sharing it? why not?

what can we do about it?

how much does it matter?

Page 6: Measuring the Adoption of Open Science

I’ll be highlighting the results of a number of studies:

surveys, manual reviews, citation analyses

Page 7: Measuring the Adoption of Open Science

Although some scientists voluntarily share their research data, many don’t.

Data withholding correlates with the usual suspects.

Feedback on incentives may surprise you.

Much room for continued research, including several ways that you can help.

Preview

Page 8: Measuring the Adoption of Open Science

How much data gets shared?

Page 9: Measuring the Adoption of Open Science

Data sharing frequency depends on data type

[Chart, axis 0% to 100%: DNA sequences; gene expression microarrays; proteomics spectra]

Noor et al. PLoS Biology 2006. Ochsner et al. Nature Methods 2008. Piwowar et al. PLoS ONE 2007. Editorial. Nature Biotech 2007.

Page 10: Measuring the Adoption of Open Science

Data sharing frequency depends on who you ask

[Chart, axis 0% to 40%: self-reported denying a request in last 3 years; trainees self-reported denying a request; been denied access to data, materials, code; authors “not able to retrieve raw data”; not willing to release data]

Campbell et al. JAMA. 2002. Kyzas et al. J Natl Cancer Inst. 2005. Vogeli et al. Acad Med. 2006. Reidpath et al. Bioethics 2001.

Page 11: Measuring the Adoption of Open Science

Are the outcomes of data sharing positive or negative?

Page 12: Measuring the Adoption of Open Science

80% of scientists report positive experiences from data sharing

[Chart, axis 0% to 50%: positive only; mixed experiences; negative only]

Positive experiences: collaboration, new research, etc.
Negative: scooping, preventing publishing, IP, or $$ benefit, etc.

Blumenthal et al. Acad Med. 2006

Page 13: Measuring the Adoption of Open Science

Why is data withheld?

Page 14: Measuring the Adoption of Open Science

Withholding is associated with industry links, competitiveness

[Chart, axis 0 to 3: industry involvement; perceived competitiveness of field; male; sharing discouraged in training; human participants; academic productivity]

40% of surveyed scientists said data sharing was discouraged during their training!

Blumenthal et al. Acad Med. 2006

Page 15: Measuring the Adoption of Open Science

Withhold because too much effort, desire for continued publishing

[Chart, axis 0% to 80%: sharing is too much effort; want student or jr faculty to publish more; they themselves want to publish more; cost; industrial sponsor; confidentiality; commercial value of results]

Campbell et al. JAMA 2002.

Page 16: Measuring the Adoption of Open Science

Obstacles for sharing: publishing, control, cost

[Chart, axis 0% to 50%: want to publish more papers first; want exclusive use; ensure data confidentiality; control; avoid cost of preparation]

Hedstrom. Society of Am Archivists Ann Meeting. 2008.

Page 17: Measuring the Adoption of Open Science

Comments show desire for control

"Before I send you the data could I ask what you want it for?"

"Can you be more explicit, please, about the analyses you have in mind and what you plan to do with them?"

"We'll have to discuss your request with the other coauthors. Before we do that, I'd like to know your proposed analysis plan."

"We are not finished using the data, but when we are finished with it, we would be open to requests for the data."

"Any use of the data other than for the specific purpose laid down in the contract of collaboration is effectively ruled out."

Reidpath et al. Bioethics 2001.

Page 18: Measuring the Adoption of Open Science

What are the perceived and measured benefits?

Page 19: Measuring the Adoption of Open Science

Benefits both societal and personal

[Chart, axis 0% to 80%: saves other people effort; for the public good; will be cited and enhance my reputation; saves me effort in answering questions; saves me effort in managing my data]

Hedstrom et al. IASSIST 2006.

Page 20: Measuring the Adoption of Open Science

Measuring societal benefit

- assume each database hit saves $0.10, or a fraction of data collection costs

- assume the value is approximated by the (idealized) funding target for data maintenance: 20-25% of the cost of generating the data

Remembering, moreover, the indirect benefits are much higher than the direct ones.

Ball et al. Nature Biotechnol. 2004.
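The two estimation approaches on this slide amount to simple arithmetic. A minimal sketch in Python follows, assuming the $0.10-per-hit and 20-25% figures quoted on the slide; the function names, the hit count, and the generation cost are hypothetical values chosen only for illustration, not numbers from Ball et al.

```python
def benefit_from_hits(hits, saving_per_hit=0.10):
    """Estimate benefit by assuming each database hit saves a fixed amount.

    The default of $0.10 per hit is the assumption quoted on the slide.
    """
    return hits * saving_per_hit


def benefit_from_generation_cost(generation_cost, low=0.20, high=0.25):
    """Estimate benefit as a fraction of the cost of generating the data.

    Uses the (idealized) 20-25% maintenance funding target from the slide
    and returns the (low, high) ends of that range.
    """
    return generation_cost * low, generation_cost * high


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    hits = 1_000_000
    generation_cost = 500_000.0

    by_hits = benefit_from_hits(hits)
    low, high = benefit_from_generation_cost(generation_cost)

    print(f"Per-hit estimate: ${by_hits:,.2f}")
    print(f"Cost-fraction estimate: ${low:,.2f} to ${high:,.2f}")
```

Both functions cover direct benefit only; the slide notes that indirect benefits are much higher, and this sketch does not attempt to model them.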

Page 21: Measuring the Adoption of Open Science

Measuring personal benefit: increased citations

Gleditsch et al. Int Studies Perspectives. 2003. Piwowar et al. PLoS ONE. 2007.

Page 22: Measuring the Adoption of Open Science

What incentives are valued?

Page 23: Measuring the Adoption of Open Science

Incentives to share: perceived value, mandates, recognition as publication

[Chart, axis 0% to 75%: if I thought it would really benefit others; if required for future funding; if required for publication; if deposits counted as a publication; if citations to data were valued; if monetary compensation]

Hedstrom. Society of Am Archivists Ann Meeting. 2008.

Page 24: Measuring the Adoption of Open Science

What would make it easier? help and straightforward guidelines

[Chart, axis 0% to 75%: more funder time and money; help with confidentiality issues; on-site help; more training; better guidelines; better tools; simpler requirements; less staff turn-over]

Hedstrom et al. IASSIST 2006.

Page 25: Measuring the Adoption of Open Science

Incentives for quality and docs: help, visibility, and nagging

[Chart, axis 0% to 40%: if I had help; if quality was visible to others; if I noticed that others had higher quality; if the archivists nagged me; if data users nagged me; if I had released it sooner]

Hedstrom et al. IASSIST 2006.

Page 26: Measuring the Adoption of Open Science

Do journal mandates work?

Page 27: Measuring the Adoption of Open Science

Journals with enforceable policies have more shared datasets

[Chart, axis 0 to 4: sharing rate when no policy (baseline); unenforceable policy; enforceable policy]

Piwowar, Chapman. A review of journal policies for sharing research data. ELPUB 2008.

Page 28: Measuring the Adoption of Open Science

Once shared, always there?

Page 29: Measuring the Adoption of Open Science

Data contacts and storage decay with time

[Figure panels: URL decay; email decay]

Supplementary information in 6 top journals: 5% unavailable after 2 years, 10% unavailable after 5 years

Evangelou et al. FASEB J. 2006. Wren. Bioinformatics 2008. Wren et al. EMBO Rep 2006.

Page 30: Measuring the Adoption of Open Science

Anything else?

Page 31: Measuring the Adoption of Open Science

data completeness? replicability?

theoretical models of info behaviour?

Good questions. Out of time.

Ask or see online bibliography for more info.

Page 32: Measuring the Adoption of Open Science

Do funder mandates work?

Which subdisciplines have best practices? particular weaknesses?

why?

Good questions. Research underway...

Page 33: Measuring the Adoption of Open Science

NIH: Haga, S. Exploring Attitudes About Data Disclosure and Data-Sharing in Genomics Research.

NSF: Hedstrom, M. Incentives for Data Producers to Create Archive-Ready Data Sets.

National Inst of Nursing Research: Pienta, A. Barriers and Opportunities for Sharing Research Data.

NLM training grant: Piwowar, H. Impact, prevalence, and patterns of shared biomedical data.

+ others

Page 34: Measuring the Adoption of Open Science

In some cases do the costs outweigh the benefits?

Do mandates decrease quality of shared data?

What is the prevalence of data reuse?

What would facilitate reuse?

Good questions. We don’t know. Future research!

Page 35: Measuring the Adoption of Open Science

Conclusions

Page 36: Measuring the Adoption of Open Science

Although some researchers voluntarily share data, many don’t.

The frequency of sharing depends on data type, who you ask, how you ask, what you plan to do with the data, what journal it is published in...

Take home #1

Page 37: Measuring the Adoption of Open Science

Withholding is correlated with the usual suspects: desire to publish more, avoid effort, maintain control, industry relationships.

Relative value of incentives is surprising: demonstrated value, visibility, help, straightforward guidelines, effective mandates, and nagging :)

Each of us can make a difference here: write letters to the editor about journal policies, blog a how-to guide in plain English, get involved in data standards, offer help to colleagues, communicate instances of value.

Take home #2

Page 38: Measuring the Adoption of Open Science

Much room for future research: costs and benefits, data quality, reuse...

Opportunities for traditional large-scale grants across a range of disciplines and agencies

But also opportunity for impact in less formal channels: you can help communicate anecdotes, evaluations, and visualizations via blogs, published research notes, perspectives, letters to the editor, and water-cooler conversations.

Take home #3

Page 39: Measuring the Adoption of Open Science

If we measure current behaviour, we’ll learn how to facilitate the adoption of open science, and we’ll know what and when to celebrate!

you cannot manage what you do not measure

‐>

Page 40: Measuring the Adoption of Open Science

Thanks to Wendy Chapman + the Dept of Biomedical Informatics at U of Pittsburgh

NLM for training grant funding: 5T15LM007059-22 (U of Pitt DBMI)

NIH for research and travel funding: 1R01LM009427-01 (Wendy Chapman)

PSB, Shirley, and Cameron for organizing this workshop

Study references available at http://www.citeulike.org/user/hpiwowar/tag/psb-talk

[email protected]

My shared data: www.dbmi.pitt.edu/piwowar

Share your research data too!