Barend Mons slides from #ISMB 2014: Trends in data publishing

Trends in Datapublishing Barend Mons

Upload: gigascience-bgi-hong-kong

Posted on 28-Jan-2015

DESCRIPTION

Barend Mons slides from #ISMB 2014: Trends in data publishing. Talk 3 in the "What Bioinformaticians need to know about digital publishing beyond the PDF2" workshop at ISMB 2014, Boston, 16th July 2014

TRANSCRIPT

Page 1: Barend Mons slides from #ISMB 2014: Trends in data publishing

Trends in Datapublishing

Barend Mons

Page 2: Barend Mons slides from #ISMB 2014: Trends in data publishing

Simplified eScience

RO’s

All Core Legacy Information + WorkFlows

User

New dataset

New Insights

Page 3: Barend Mons slides from #ISMB 2014: Trends in data publishing

X

AREAL SURVEY

DEEP EXCAVATION

‘Why would I believe this association’???

Page 4: Barend Mons slides from #ISMB 2014: Trends in data publishing

2005, 6:142

Why I gave up spending most of my energy on text mining years ago

Page 5: Barend Mons slides from #ISMB 2014: Trends in data publishing

Data loss is real and significant…

Nature news, 19 December 2013

…and so is Data growth

Page 6: Barend Mons slides from #ISMB 2014: Trends in data publishing

The Data cycle in eScience


Page 7: Barend Mons slides from #ISMB 2014: Trends in data publishing

Prof. Carole Goble

Page 8: Barend Mons slides from #ISMB 2014: Trends in data publishing

www.datafairport.org

Page 9: Barend Mons slides from #ISMB 2014: Trends in data publishing

FAIR

Findable:
- PID for each concept used
- PID assignment (authorities)
- ARTA-service

Accessible:
- Machines can map (IMS-service)
- License on data elements
- Authentication/Authorization

Interoperable:
- Machines understand data
- Download-link-data formats
- Workflows (Research Objects)

Re-usable:
- Functionally interlinked
- Harmonized
- Citable and available for...

Page 10: Barend Mons slides from #ISMB 2014: Trends in data publishing
Page 11: Barend Mons slides from #ISMB 2014: Trends in data publishing

PID

'provenance' (user defined)

Data (elements)

Metadata (intrinsic)

A simplified diagram of a Digital (data) Object irrespective of technological choices and naming
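
The four boxes of the diagram can be read as the fields of a single record. What follows is a minimal sketch in Python, not something shown on the slides: the class name `DigitalObject`, the helper `missing_parts`, the field types, and the example identifier are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DigitalObject:
    """Hypothetical container mirroring the four boxes of the diagram."""

    pid: str                                                   # persistent identifier (PID)
    metadata: Dict[str, Any] = field(default_factory=dict)     # intrinsic metadata
    provenance: Dict[str, Any] = field(default_factory=dict)   # user-defined 'provenance'
    data: List[Any] = field(default_factory=list)              # data elements

    def missing_parts(self) -> List[str]:
        """Return the names of the diagram's parts that are still empty."""
        parts = {
            "pid": self.pid,
            "metadata": self.metadata,
            "provenance": self.provenance,
            "data": self.data,
        }
        return [name for name, value in parts.items() if not value]


# Made-up example object: it has a PID, metadata and data, but no provenance.
example = DigitalObject(
    pid="example:0001",
    metadata={"title": "Example dataset"},
    data=[1, 2, 3],
)
print(example.missing_parts())  # ['provenance']
```

Keeping the PID as the only required field reflects the diagram's ordering, where the identifier sits on top of the other parts; that reading is an interpretation, not a statement from the talk.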

Page 12: Barend Mons slides from #ISMB 2014: Trends in data publishing

PID

'provenance' (user defined)

Data (elements)

Metadata (intrinsic)

Digital Object Architecture s are Digital Objects

Nanopublications are Research Objects

Some Research Objects are

Page 13: Barend Mons slides from #ISMB 2014: Trends in data publishing

PID

Metadata (intrinsic)

'provenance' (user defined)

Data (elements)

Totally UNFAIR

PID

Metadata (intrinsic)

'provenance' (user defined)

Data (elements)

Findable

Usable for Humans

PID

Metadata (intrinsic)

'provenance' (user defined)

Data (elements)

FAIR metadata

PID

Metadata (intrinsic)

'provenance' (user defined)

Data (elements)

FAIR data-restricted access

PID

Metadata (intrinsic)

'provenance' (user defined)

Data (elements)

FAIR data-Open Access

PID

Metadata (intrinsic)

'provenance' (user defined)

Data (elements)

FAIR data-Open Access/Functionally Linked

Data as increasingly FAIR Digital Objects
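
The six labels on this slide form an ordered scale, from "Totally UNFAIR" up to open and functionally linked FAIR data. One hedged way to make that ordering explicit is an ordered enumeration. The names `FairLevel` and `at_least` and the numeric values 0 to 5 are assumptions for illustration only; the slide gives labels, not numbers.

```python
from enum import IntEnum


class FairLevel(IntEnum):
    """Hypothetical ordering of the six labels shown on the slide."""

    TOTALLY_UNFAIR = 0
    FINDABLE_USABLE_FOR_HUMANS = 1
    FAIR_METADATA = 2
    FAIR_DATA_RESTRICTED_ACCESS = 3
    FAIR_DATA_OPEN_ACCESS = 4
    FAIR_DATA_OPEN_ACCESS_FUNCTIONALLY_LINKED = 5


def at_least(level: FairLevel, required: FairLevel) -> bool:
    """Return True when `level` sits at or above `required` on the scale."""
    return level >= required


print(at_least(FairLevel.FAIR_DATA_OPEN_ACCESS, FairLevel.FAIR_METADATA))  # True
print(at_least(FairLevel.TOTALLY_UNFAIR, FairLevel.FAIR_METADATA))         # False
```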

Page 14: Barend Mons slides from #ISMB 2014: Trends in data publishing

BYOD-01

Page 15: Barend Mons slides from #ISMB 2014: Trends in data publishing


Page 16: Barend Mons slides from #ISMB 2014: Trends in data publishing


Page 17: Barend Mons slides from #ISMB 2014: Trends in data publishing


Page 18: Barend Mons slides from #ISMB 2014: Trends in data publishing

Combine FANTOM5 & LOVD

Page 19: Barend Mons slides from #ISMB 2014: Trends in data publishing

16 TSS 10 Tissues

Page 20: Barend Mons slides from #ISMB 2014: Trends in data publishing

3 tissues:
- Heart muscle
- Skeletal muscle
- Cerebral Cortex

RNA detected in many more tissues

25 TSS???

Page 21: Barend Mons slides from #ISMB 2014: Trends in data publishing

Repositories

Data Owners

(supp)data

Databases

ELIXIR FAIR Data Search Index

End-users

FAIR L2

ELIXIR semantic data repository

ELIXIR Data FAIRPort

ELIXIR federated data

FAIR L1

Search for datasets

Download data (sub)sets in many formats (XML, RDF, JSON, etc.)

FAIR L3

FAIR L4

ASPs, in-house IT, Bioinformatics, etc.

Tools & Applications

Elixir Fin., Elixir Esp., Elixir Nor., Elixir UK, Elixir SWE, Elixir NL, ...

Page 22: Barend Mons slides from #ISMB 2014: Trends in data publishing

Develop ELIXIR-NL

Engage key data owners

Training & Education

Develop Infrastructure & service

Sustainable funding

Policy Alignment
