BioSharing - EUDAT semantic workshop

Connecting standards, databases and data policies

Susanna-Assunta Sansone, Associate Director, Oxford e-Research Centre, University of Oxford

Upload: susanna-assunta-sansone

Post on 13-Apr-2017



• Domain-level descriptors that are essential for interpretation, verification, reproducibility and reusability of datasets

• The depth and breadth of descriptors vary according to the domain broadly covering the what, who, when, how and why

Content standards

Formats, Terminologies, Guidelines

Content standards: three categories

Minimum information reporting requirements, checklists

o Report the same core, essential information

o e.g. MIAME guidelines

Controlled vocabularies, taxonomies, thesauri, ontologies etc.

o Unambiguous identification and definition of concepts

o e.g. Gene Ontology

Conceptual model, schema, exchange formats etc.

o Define the structure and interrelation of information, and the transmission format

o e.g. FASTA
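To illustrate how a format standard fixes the structure of information, here is a minimal sketch of reading FASTA-style text: a header line starting with ">" followed by one or more sequence lines. The function name `parse_fasta` and the sample records are hypothetical and chosen only for illustration; real-world files may need more careful handling.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its joined sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            # sequence lines belong to the most recent header
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


# Hypothetical example input, not taken from any real dataset
example = """>seq1 hypothetical example record
ACGTAC
GTTA
>seq2 another hypothetical record
GGCC
"""

parsed = parse_fasta(example)
print(parsed)
```

Because the format defines where the identifier ends and the data begins, any tool that follows the same convention can exchange these records without further negotiation.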


Community-driven initiatives

de jure: standard organizations

de facto: grass-roots groups

Nanotechnology Working Group

Content standards in numbers

Formats, Terminologies, Guidelines

883 -> ~1000

220+

115+

548

Guidelines: MIAME, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, REMARK, CONSORT, MIAPPE, …

Formats: SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML, GELML, ISA, CML, MITAB, Sample-Tab, …

Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, TEDDY, PRO, XAO, DO, VO, …

Content standards

Data policies by funders, journals and other organizations

Databases, tools and services

Formats, Terminologies, Guidelines

Mapping this evolving landscape


a resource of the ELIXIR Interoperability Platform

• A web-based, curated and searchable portal that monitors their development and evolution to inform and educate

Not just quantity but quality: rich, curated and community-vetted descriptions

Indicators to describe the status of standards and databases

Ready for use, implementation, or recommendation

In development

Status uncertain

Deprecated as subsumed or superseded

Manually curated and verified by the community behind each resource
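The four status indicators can be sketched as a small data model. This is an illustrative sketch only: the class names `Status` and `Record`, the helper `recommendable`, and the sample entries are assumptions made here, not the portal's actual schema or data.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Labels follow the four indicators listed on the slide
    READY = "Ready for use, implementation, or recommendation"
    IN_DEVELOPMENT = "In development"
    UNCERTAIN = "Status uncertain"
    DEPRECATED = "Deprecated as subsumed or superseded"


@dataclass
class Record:
    name: str
    kind: str  # e.g. "guideline", "terminology", "format", "database"
    status: Status


def recommendable(records):
    """Return the names of records whose status is READY."""
    return [r.name for r in records if r.status is Status.READY]


# Hypothetical entries, for illustration only
records = [
    Record("ExampleGuideline", "guideline", Status.READY),
    Record("ExampleFormat", "format", Status.IN_DEVELOPMENT),
    Record("OldVocabulary", "terminology", Status.DEPRECATED),
]

print(recommendable(records))
```

Keeping the status as an explicit, closed set of values makes it straightforward to filter out deprecated or uncertain resources before recommending them in a data policy.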

Tracking evolution, e.g.:

Visualizing relations, e.g.:

Data policy: list of their recommended databases and standards

…to inform and educate on existing and new resources

Working with/for the community and our ‘adopters’, e.g.:

Standard developing groups:

Journal, publishers:

Cross-links, data exchange:

Societies and organisations:

Institutional RDM services:

Projects, programmes:

533 responders

Progressively cross-linking with other ELIXIR resources


• Increase discoverability (e.g. by search engines), aggregation (e.g. by indices) and analysis of content in different websites and services

• Use of schema.org structured semantic markup (for web pages’ content) by Google, Bing, Yahoo, Yandex

• Coordinate its extension, where needed, in the life science area
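As a rough illustration of schema.org structured markup, the sketch below builds a JSON-LD description using the schema.org `Dataset` type and wraps it in the script tag that web pages typically embed for search engines. The function names, the example values and the `example.org` URL are hypothetical; which types and properties a life-science extension would actually use is not stated in the slides.

```python
import json


def dataset_jsonld(name, description, url, keywords):
    """Build a schema.org Dataset description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "keywords": keywords,
    }


def as_script_tag(doc):
    """Wrap a JSON-LD dictionary in the script element embedded in a web page."""
    body = json.dumps(doc, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Hypothetical record, for illustration only
doc = dataset_jsonld(
    name="Example dataset",
    description="A hypothetical dataset used to illustrate structured markup.",
    url="https://example.org/datasets/1",
    keywords=["life science", "example"],
)

tag = as_script_tag(doc)
print(tag)
```

A crawler that understands schema.org can read the embedded block without parsing the visible page layout, which is what makes content easier to discover and aggregate.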

Gaining traction and support by:

Acknowledgements