empowering data in scholarly publishing
DESCRIPTION
2014 Charleston Conference, Friday, Nov 7, 12:45 PM
TRANSCRIPT
EMPOWERING DATA IN SCHOLARLY PUBLISHING
The 60,000-Foot View
Cathy Giffi, Director, Strategic Market Analysis
November 7, 2014
Outline
• Definitions
• What We Mean By “Data”
• Wiley Researcher Data Insights Survey
  • Methodology
  • Data Sharing Behavior
  • Data Sharing by Field
  • Data Sharing by Geography
• The Role of Publishers
• The Future
What Comes to Mind
What We Mean By Data
Research data is data that is collected, observed, or created for purposes of analysis to produce original research results.
• Text or Word documents, spreadsheets
• Laboratory notebooks, field notebooks, diaries
• Questionnaires, transcripts, codebooks
• Audiotapes, videotapes
• Photographs, films
• Test responses
• Slides, artifacts, specimens, samples
• Collection of digital objects acquired and generated during the process of research
• Data files
• Database contents including video, audio, text, images
• Models, algorithms, scripts
• Contents of an application such as input, output, log files for analysis software, simulation software, schemas
• Methodologies and workflows
• Standard operating procedures and protocols
Source: Boston University Libraries
WILEY RESEARCHER DATA INSIGHTS SURVEY
March 2014
Our objective was to establish a baseline view of data sharing practices, attitudes, and motivations globally, with participation from researchers in every scholarly field.
In March 2014, more than 90,000 researchers around the world were invited to participate in Wiley’s Researcher Data Insights Survey. Participants were researchers who had published at least one journal article in the past year with any publisher.
We received an overwhelming 2,886 responses from around the world.
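As a rough illustration of the scale of the sample, the response rate implied by the two figures above can be worked out directly. This is only a sketch: the invitation count is given as "more than 90,000", so the variable `invited_at_least` (a name chosen here for illustration) is a lower bound and the resulting rate is an upper bound.

```python
# Figures quoted in the survey methodology above.
invited_at_least = 90_000  # "more than 90,000" invited, so this is a lower bound
responses = 2_886

# Because the invited count is a lower bound, this rate is an upper bound.
max_response_rate = responses / invited_at_least
print(f"Response rate: at most {max_response_rate:.1%}")  # Response rate: at most 3.2%
```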
Key Findings
• Most researchers are sharing their data.
• Those not sharing have a variety of reasons.
• Data that’s being shared typically is <10 GB.
• The most common type of data that is being shared is flat, tabular data (.csv, .txt, .xl).
• Data is usually saved on hard drives.
Why Researchers Do Not Share
• Intellectual property or confidentiality issues (59%)
• Concerned research might be “scooped” (39%)
• Concerns about misinterpretation or misuse (32%)
• Concerns about attribution/citation credit (31%)
• Ethical concerns (24%)
• Insufficient time/resources (19%)
• Funder/institution does not require sharing (13%)
• Lack of funding (13%)
• Not sure where to share (5%)
• Not sure how to share (3%)
Why do you share your data?
• Sharing is standard practice within their research communities (59%)
• Sharing increases the impact and visibility of their research (56%)
• Sharing benefits the public (51%)

Have made data publicly available: 52%
Have not made data publicly available: 48%
While only 52% have made their data publicly available, 66% of researchers share their data.
File types                                                         # of Respondents   % Total
Tabular (flat) data (CSV, spreadsheet, txt)                        1,767              83%
Images 2-D                                                         807                38%
Executable code/Models                                             460                22%
Interview transcripts (or data generated from interview scripts)   298                14%
Relational Databases (SQL, Oracle, Access, etc.)                   254                12%
Images 3-D                                                         254                12%
Video/Audio                                                        228                11%
Other                                                              227                11%

Data Produced by Research: File Sizes
File size        # of Respondents
>100 TB          10
I don't know     65
2-50 TB          72
101-500 GB       89
501 GB-1 TB      96
51-100 GB        127
21-50 GB         141
11-20 GB         206
<1 GB            606
1-10 GB          648
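The key finding that data is typically under 10 GB can be checked against the file-size counts on this slide. The sketch below assumes that the size labels and the counts pair up one-to-one in the order they appear in the chart; the dictionary name `size_counts` and the grouping of the two smallest buckets are illustrative choices, not part of the survey report.

```python
# Counts per file-size bucket, paired with labels in the order they appear
# on the slide (assumption: labels and counts line up one-to-one).
size_counts = {
    ">100 TB": 10,
    "I don't know": 65,
    "2-50 TB": 72,
    "101-500 GB": 89,
    "501 GB-1 TB": 96,
    "51-100 GB": 127,
    "21-50 GB": 141,
    "11-20 GB": 206,
    "<1 GB": 606,
    "1-10 GB": 648,
}

total = sum(size_counts.values())
# The two smallest buckets together cover everything up to 10 GB.
up_to_10_gb = size_counts["<1 GB"] + size_counts["1-10 GB"]
share = up_to_10_gb / total
print(f"{up_to_10_gb} of {total} responses ({share:.0%}) report 10 GB or less")
# 1254 of 2060 responses (61%) report 10 GB or less
```

Under that pairing, roughly six in ten responses fall at or below 10 GB, which is consistent with the "typically <10 GB" finding.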
Where do you store your tabular data once a project is complete?
Computer hard drive           24%
External hard drive           22%
Shared/networked drive        11.5%
USB/flash drive               10.5%
Web service, e.g. Dropbox     9%
Non-digital lab notebooks     8.5%
Institutional repository      6%
Email                         6%
General purpose repository    1.5%
Other                         1%
Regional Differences in Data Sharing (% Overall that Share Data)
Germany      55%
Japan        44%
Australia    41%
US           46%
UK           43%
• US researchers are highly likely to be sharing data as supplementary material in journals. The majority of US researchers say sharing data is standard practice within their communities.
• UK and Australian researchers are more comfortable sharing data at conferences than on publicly and permanently accessible platforms. Their primary motivation for sharing data is to increase the impact and visibility of their work.
• Australian and German researchers are more driven than their global counterparts to share their data to ensure preservation as well as to allow for transparency and reuse.
• Japanese researchers are significantly more likely to be using discipline-specific repositories (44% compared with 26% for the full pool).
Data Sharing in Developing Markets
India     65%
China     36%
Brazil    52%
Data sharing practices vary across developing markets, aligning largely with the presence (or lack thereof) of funder mandates.
• Chinese researchers share their data when they are required to (by journals or funders) but are less likely overall to share their work because they don’t believe it is their personal responsibility.
• Researchers in India are significantly more likely to utilize institutional (46%) and discipline-specific (41%) repositories compared to the global pool of respondents.
Life Sciences
Field               I've shared my data publicly   I haven't shared my data publicly
Average             52%                            48%
Life Science        66%                            34%
Physical Science    45%                            55%
Health Science      48%                            52%
Social Science      36%                            64%
The majority of Life Science researchers that share data are doing so as supplementary material in a journal. Four in ten are utilizing institutional data repositories while 29% are sharing via personal/institutional/lab webpages.
Top Motivations to Share
• Standard practice within their research community (64%)
• Journal requirement (56%)
• To increase the impact and visibility of their research (55%)
Top Reasons Not to Share
• Concerns that their research will be scooped (56%)
• Intellectual property or confidentiality issues (54%)
• Concerns about misinterpretation or misuse (43%)
Health Sciences
The majority of Health Science researchers that share data are doing so as supplementary material in a journal (68%). About one in three researchers are utilizing institutional data repositories or personal/institutional/lab webpages, while 21% are depositing into discipline-specific repositories to share and archive their data.
Top Motivations to Share
• Data sharing is standard practice within their research community (57%)
• To increase the impact and visibility of their research (52%)
• For the public’s benefit (49%)
Top Reasons Not to Share
• Intellectual property or confidentiality issues (68%)
• Ethical concerns (36%)
• Concerns about misinterpretation or misuse (36%)
Physical Sciences
The majority of Physical Science researchers that share data are doing so as supplementary material in a journal (69%), while four in ten are sharing via personal/institutional/lab webpages. Just under a third utilize institutional data repositories (28%). Compared with the global average, physical science researchers are significantly less likely to be utilizing discipline-specific repositories (10%) or general purpose repositories (3%) to share and archive their data.
Top Motivations to Share
• Standard practice within their research community (61%)
• To increase the impact and visibility of their research (59%)
• For the public’s benefit (52%)
Top Reasons Not to Share
• Intellectual property or confidentiality issues (47%)
• No funder or institutional requirement (29%)
• Concerns that their research will be scooped (27%)
Social Science and Humanities (SSH)
The majority of SSH researchers that share data are doing so as supplementary material in a journal (52%), or on personal, institutional or project websites (51%). A quarter are utilizing institutional data repositories (25%) while, cumulatively, only 5% are sharing in general purpose or discipline-specific repositories.
Top Motivations to Share
• To increase the impact and visibility of their research (53%)
• Data sharing is standard practice within their research community (53%)
• For the public’s benefit (46%)
Top Reasons Not to Share
• Intellectual property or confidentiality issues (47%)
• Concerns about being scooped (30%)
• No funder or institutional requirement (28%)
The Role of Publishers
The Future