The Diversity of Biomedical Data, Databases and Standards (Research Data Alliance (RDA) 8th Plenary)

Post on 13-Apr-2017


The Diversity of Biomedical Data, Databases and Standards

Peter McQuilton

BioSharing Content Lead

https://www.biosharing.org

@biosharing

IG Elixir Bridging Force, WG Biosharing Registry, WG Data Type Registries, WG Metadata Standards Catalog

International Data Week, RDA, Denver, 15th September, 2016

A growth in data, a growth in databases, a growth in standards

Number of databases in the NAR database issue, up to 2015 (from @AlexBateman1)

• Data/content standards:

• Structure, enrich and report the description of the datasets and the experimental context under which they were produced

• Facilitate the discovery, sharing, understanding and reuse of datasets

• Ensure all digital research outputs are Findable, Accessible, Interoperable and Reusable (FAIR)

Data has to be structured for sharing – we need standards

Content standards – enablers

• Formats: conceptual model, conceptual schema, exchange formats etc.
  o Allow data to flow from one system to another
  o e.g. FASTA

• Terminologies: controlled vocabularies, taxonomies, thesauri, ontologies etc.
  o Use the same word and refer to the same ‘thing’
  o e.g. Gene Ontology

• Guidelines: minimum information reporting requirements, checklists
  o Report the same core, essential information
  o e.g. MIAME guidelines
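
As a rough illustration of how an exchange format lets data flow from one system to another, the sketch below reads FASTA text into a plain Python dictionary. It is a minimal example and not a full implementation of the format: the function name `parse_fasta` and the two example records are invented here, and it assumes only that header lines start with `>` and that sequence lines follow.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into a mapping of header line -> sequence."""
    records: Dict[str, str] = {}
    header = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts; drop the leading ">" from the header.
            header = line[1:].strip()
            records[header] = ""
        elif header is not None:
            # Sequence data may span several lines; join them.
            records[header] += line
    return records


# Hypothetical records, invented for illustration only.
example = """>seq1 hypothetical example record
ACGTAC
GTTA
>seq2 another hypothetical record
GGCC
"""

records = parse_fasta(example.splitlines())
for name, sequence in records.items():
    print(name, len(sequence))
```

Because both the producing and the consuming system agree on the same simple layout, neither needs to know anything about the other's internals.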

de jure: standard organizations

de facto: grass-roots groups

Nanotechnology Working Group

Over 700 content standards in biomedical sciences

Guidelines: MIAME, MIAPA, MIRIAM, MIQAS, MIX, MIGEN, ARRIVE, MIAPE, MIASE, MIQE, MISFISHIE, REMARK, CONSORT, …

Formats: MAGE-Tab, GCDML, SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML, GELML, ISA-Tab, CML, MITAB, …

Terminologies: AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO, TEDDY, PRO, XAO, DO, VO, …

Technologically-focused content standards: transcriptomics, proteomics, metabolomics

[Figure labels: Arrays, Scanning, Arrays & Scanning, Columns, Gels, MS, MS, FTIR, NMR]

Biologically-focused content standards: plant biology, epidemiology, microbiology

Even if common features exist, e.g.:
- description of source biomaterial
- experimental design components
these are inconsistently duplicated

Diversity in Standards

What is BioSharing?

A web-based, curated and searchable portal that monitors the development and evolution of standards, their use in databases and the adoption of both in data policies, to inform and educate the user community.

Standards are digital objects too and we make them FAIR

• Data policies by funders, journals and other organizations (>100)

• Databases, tools and services (>1000)

• Content standards (>700)

Complex and evolving landscape

Formats Terminologies Guidelines

Working with and for the community

NCBI Taxon

• ~1400 tags
• Some hierarchy
• Synonyms
• 4 axes:
  - Process
  - Material
  - Datatype
  - Property
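
One way to picture a tag vocabulary with synonyms, some hierarchy and four axes is the small data structure below. This is only a sketch under stated assumptions: the `Tag` class, the `build_index` helper and the two example tags are hypothetical and are not taken from BioSharing itself; only the four axis names come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The four axes named on the slide.
AXES = ("Process", "Material", "Datatype", "Property")


@dataclass
class Tag:
    label: str
    axis: str
    synonyms: List[str] = field(default_factory=list)
    parent: Optional[str] = None  # "some hierarchy": optional broader tag

    def __post_init__(self) -> None:
        # Reject tags that do not sit on one of the four axes.
        if self.axis not in AXES:
            raise ValueError(f"unknown axis: {self.axis}")


def build_index(tags: List[Tag]) -> Dict[str, Tag]:
    """Map every label and synonym (lower-cased) to its tag."""
    index: Dict[str, Tag] = {}
    for tag in tags:
        for name in [tag.label, *tag.synonyms]:
            index[name.lower()] = tag
    return index


# Hypothetical tags, invented for illustration only.
tags = [
    Tag("sequencing", "Process", synonyms=["DNA sequencing"]),
    Tag("protein", "Material", synonyms=["polypeptide"]),
]
index = build_index(tags)
print(index["dna sequencing"].label)  # sequencing
```

Resolving synonyms to a single tag is what lets two records described with different words be found by the same search.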

What data do we capture?

Collections group together one or more types of resource by domain, project or organization.

Recommendations are a core set of resources that are selected and recommended by a funder or journal data policy.
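
The two groupings can be sketched as simple record types. This is an illustrative model only, not BioSharing's actual schema: the class names, the `not_recommended` helper, the collection name and the policy name are all assumptions made for the example, while MIAME, Gene Ontology and FASTA are standards named earlier in the talk.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Resource:
    name: str
    kind: str  # e.g. "standard", "database" or "policy"


@dataclass
class Collection:
    name: str
    grouped_by: str  # "domain", "project" or "organization"
    resources: List[Resource] = field(default_factory=list)


@dataclass
class Recommendation:
    policy: str  # the funder or journal data policy making the recommendation
    resources: List[Resource] = field(default_factory=list)


def not_recommended(collection: Collection, recommendation: Recommendation) -> List[Resource]:
    """Resources in the collection that the policy does not yet recommend."""
    recommended = set(recommendation.resources)
    return [r for r in collection.resources if r not in recommended]


miame = Resource("MIAME", "standard")
go = Resource("Gene Ontology", "standard")
fasta = Resource("FASTA", "standard")

# Hypothetical collection and policy, invented for illustration only.
collection = Collection("Example domain collection", "domain", [miame, go, fasta])
recommendation = Recommendation("Example journal data policy", [miame, go])

missing = not_recommended(collection, recommendation)
print([r.name for r in missing])  # ['FASTA']
```

A comparison of this kind is one plausible way a journal or funder could check which resources in a domain are not yet covered by its policy.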

Grouping records for different use cases

“BioSharing and its interactive browser will allow us to discover which databases and standards are not currently included in our author guidelines, enabling us to regularly monitor and refine our policies as appropriate, in support of our mission to help our authors enhance the reproducibility of their work.” – Holly Murray, F1000Research

Advisory Board

Operational Team
