eosc wortking slides b...chair of dtl-data head of elixir node nl ec member of open phacts chair of...
TRANSCRIPT
Barend Mons
Biosemantics Group LUMC and EMC
LS integrator Netherlands eScience Center
Chair of DTL-dataHead of ELIXIR node NL
EC member of Open PHACTS
Chair of High Level Expert Group EOSC
The opinions expressed in this presentation are my personal opinions and do not necessarily reflect the draft report of the High Level Expert Group for the European Open Science Cloud.
F.A.I.R
Cloudhttps://ec.europa.eu/research/participants/portal/desktop/en/opportunities/h2020/topics/2054-infradev-04-2016.html
Press Conference today
https://vimeo.com/162062013
First GO-FAIR lab in DTL (Health-RI) (2016-17)
Stated aim to create GO-FAIR labs in other
domains (2017-20)
First (sister) GO-FAIR lab in SD
(NIH/NHS/NSF) (2016)
Second (sister) FOIL in Sao Paulo (Scielo)(2016)
Third (sister) GO-FAIR lab in Arusha (2017)
Stated aim to create GO-FAIR labs in other MS
(DE, SWE, UK, FI, CZ) 2017
https://vimeo.com/162062013
Data loss is real and significant, while data growth is staggering
Nature news, 19 December 2013 • Computer speed and storage capacity is doubling every 18 months and this rate is steady
• DNA sequence data is doubling every 6-8 months over the last 3 years and looks to continue for this decade‘Oops, that link was the laptop of my PhD student’
Science1.0
90% of world's data generated over last two years (http://www.sciencedaily.com/releases/2013/05/130522085217.htm)
US$28B/year (50%) spent on preclinical research is not reproducibleFreedman et al. http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002165
Only 12% of NIH funded datasets are demonstrably deposited in recognized repositories.
So: approximately 200,000 to 235,000 invisible datasets cannot be effectively reused
Read et al. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0132735
What about US? (as in us, MS)?
The Data Stewardship Cycle
9
5%
something to refer tohttp://www.nature.com/articles/sdata201618
European
Open
Science
Cloud
Cloud
European Open Science Cloud
Cloud
o Public consultation and validation workshops on Open Science (July-December 2014);
o Final report on Open Science (February 2015); o HLEG EOSC stakeholder workshop (30 November 2015); o DSM Consultation on platforms, data and cloud (closed
January 2016); o Research funders' workshop (15 March 2016).
+ PC meetings, EAG meetings, e-IRG meetings, concertation meetings, info days, conferences, events, ...
Strong stakeholder support
Community emerging standardsEOSC
Cloud
stop driving on the left
Trusted access to services & systems
Re-use of shared data
Across disciplinary, social and geographical borders
Federated environment, across Member States
EOSC: FramingCloud
Minimal international guidance and governance
Maximum freedom to implement.
Globally interoperable and accessible
Globally embedded in a ‘Commons’
EOSC: ‘Internet approach’Cloud
Human expertise
Core resources
Standards, Best Practices
Underpinning technical infrastructures
A web of Data and Services
EOSC: ScopeCloud
Open Science
Open Innovation
Systematic and professional data management
Long term data stewardship
EOSC: SupportsCloud
The majority of the challenges are social rather than technical
Not just the size of data, but in particular complex data and analytics across domains.
Shortage of data experts globally and in the European Union
Archaic system of rewards and funding of science and innovation
‘Valley of death’ between (e-)infrastructure providers and domain specialists.
Short funding cycles of core research infrastructures are not fit for purpose
Fragmentation between domains causes repetitive and isolated solutions
Distributed data sets increasingly do not move (size & privacy reasons)
Centralised HPC is insufficient to support distributed meta-analysis and learning.
However, the major components for a first generation EOSC are largely ‘there’
But ‘lost in fragmentation’ and spread over 28 Member States.
EOSC: Challenges and ObservationsCloud
EOSC: Key requirementsCloud
New modes [4] of scholarly communication Modern reward and recognition practices [3] need to support data sharing and re-use Innovative, fit for purpose funding schemes for sustainable underpinning infrastructures Core data experts [7] need to be trained and their career perspective significantly improved
Cross-disciplinary collaboration-specific measures for review, funding and infrastructure
Support for the transition from scientific insights towards societal innovation [8] The EOSC needs to be developed as an eco-system of infrastructures [2]
Key Performance Indicators should be developed for the EOSC [2]
The EOSC should enable automation of data processing: machine actionability [1]is key.
FAIR principles [1, 6] (http://www.nature.com/articles/sdata201618)
Research Integrity [6]
EOSC: Policy RecommendationsCloud
P1: Take immediate, affirmative action in close concert with Member States
P2: Close discussions about the ‘perceived need’
P3: Build on existing capacity and expertise where possible
P4: Frame the EOSC as supporting Internet based protocols & applications
EOSC: Governance RecommendationsCloud
G1: Aim at the lightest possible, internationally effective governance
G2: Guidance only where guidance is due
G3: Define Rules of Engagement for formal participation in the EOSC
G4: Federate the Gems across Member States
EOSC: Implementation RecommendationsCloud
I1: Turn this report into an EC approved document to guide EOSC initiative
I2: Develop, Endorse and implement a Rules of Engagement scheme
I3: Fund a concentrated effort to locate and develop Data Expertise in Europe
I4: Install a highly innovative guided funding scheme for the preparatory phase
I5: Make adequate data stewardship mandatory for all research proposals
I6: Install an executive team to deal with international coherence of the EOSC
I7: Install an executive team to deal with the preparatory phase of the EOSC
EOSC: 7 Immediate actions based on feed backCloud
II1: Publish the report (final draft available)
II2: Develop, Pilot and implement a Rules of Engagement scheme
II3: Detail the transition and sustainability model (and pilot it)
II4: Train the data experts to bridge between ‘e-INFRA’ and ‘ESFRI’
II5: Assist data stewardship planning and exec. tools for all researchers
II6 Develop the plan for what ‘minimal essential governance’ means in practice
II7 Federate Interoperability standards and best practices (key RDA-role)
executive group(s)
deliverables
II2 II2II3II3II3II3II3 II3II4 II4 II4II5
II5II5II5
II6
II7
II7RDA
II7FDG’sII3 Impactstory
ORCID, FORCE 11
Approves DSPlan before
submission of grant
core resource
EOSC
In DS plan
Review panel(DS = Tickbox)
FUNDERCloud Coin Authority
Monthly invoice to Cloud Coin Authority?
endorsed Repository
Data Publisher
HPA service
GO-FAIR-
Applications &
e-Infra providers
Other Peoples’ Data
FAIR exchange
FAIR Research Data
FAIR Workflows
FOILS
Open Science
Open Access
FAIR data