NIH BD2K bioCADDIE DataMed: Data Discovery Index

Post on 13-Apr-2017


Consultant, Honorary Academic Editor

Associate Director, Principal Investigator

Susanna-Assunta Sansone, PhD

Alan Turing Institute Symposium Oxford, 6-7 April, 2016

A Data Discovery Index prototype that:
•  Helps users find and access shared data
•  Interoperates in the NIH Commons (biomedical digital assets)

Figure: prototype architecture. Data sources: repositories, online datasets, publishers, funding agencies, data producers. Stages shown: ingestion, indexing, searching. Components shown: Metadata Ingestion, ElasticSearch, Terminology server, User Interface.
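The three stages named in the architecture diagram (ingestion, indexing, searching) can be illustrated with a minimal sketch. This is not the prototype's code: a plain in-memory inverted index stands in for ElasticSearch, the terminology server is not modelled, and the records, field names and function names (`ingest`, `build_index`, `search`) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records standing in for metadata harvested from repositories.
RECORDS = [
    {"id": "ds1", "title": "Breast cancer RNA-seq", "repository": "RepoA"},
    {"id": "ds2", "title": "Mouse brain proteomics", "repository": "RepoB"},
    {"id": "ds3", "title": "Cancer proteomics survey", "repository": "RepoB"},
]


def ingest(records):
    """Ingestion: keep only records that carry both an id and a title."""
    return [r for r in records if r.get("id") and r.get("title")]


def build_index(records):
    """Indexing: map each lowercased title token to the ids that contain it."""
    index = defaultdict(set)
    for record in records:
        for token in record["title"].lower().split():
            index[token].add(record["id"])
    return index


def search(index, query):
    """Searching: return the ids whose titles contain every query term."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)


index = build_index(ingest(RECORDS))
print(search(index, "cancer"))
print(search(index, "cancer proteomics"))
```

A real deployment would delegate tokenising, ranking and storage to the search engine; the sketch only shows how the three stages hand data to one another.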

Figure: the Data Discovery Index in relation to aggregators (A, B, C) and data.

Organizing framework and portal for data

•  Dashed lines: mapping of metadata standards, links to aggregators, data
•  Aggregators: repositories or various indices
•  Data: digital research objects

Pilot projects* Core development team

* There is work for everyone (and more)

Designed as an element of the ecosystem

Use cases: community-driven effort

The ‘right’ level of metadata elements

Examples of competency questions, derived from the use cases

The ‘appropriate’ metadata standards

Mapping the landscape of standards and databases in the life sciences

Mapped a variety of metadata standards and database schemas

Generic schemas:
•  schema.org
•  DataCite
•  RIF-CS
•  DCAT
•  PROV
•  VoID
•  Dublin Core
•  etc…

Life/biomedical schemas:
•  BioProject
•  BioSample
•  MINiML
•  PRIDE-ml
•  GA4GH metadata schema
•  SRA XML
•  CDISC SDM / BRIDG model
•  etc…
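Mapping several schemas onto one set of elements amounts to renaming source fields into a shared vocabulary. The sketch below illustrates the idea with two of the generic schemas listed above. The source field names (`name`, `title`, `description`, `identifier`) exist in schema.org and Dublin Core, but the common element names and the `to_common` helper are assumptions made for illustration, not the bioCADDIE mappings.

```python
# Illustrative mappings only: source field -> hypothetical common element.
FIELD_MAPS = {
    "schema.org": {
        "name": "title",
        "description": "description",
        "identifier": "identifier",
    },
    "Dublin Core": {
        "title": "title",
        "description": "description",
        "identifier": "identifier",
    },
}


def to_common(record, schema):
    """Rename the fields of `record` using the mapping registered for `schema`.

    Fields without a mapping are dropped; values are passed through unchanged.
    """
    mapping = FIELD_MAPS[schema]
    return {
        common: record[source]
        for source, common in mapping.items()
        if source in record
    }


schema_org_record = {"name": "Gene expression atlas", "identifier": "doi:10.0000/example"}
dublin_core_record = {"title": "Gene expression atlas", "identifier": "doi:10.0000/example"}

# Both records end up with the same keys after mapping.
print(to_common(schema_org_record, "schema.org"))
print(to_common(dublin_core_record, "Dublin Core"))
```

Real mappings also have to reconcile value formats and cardinality (for example, repeated titles or structured identifiers), which this key-renaming sketch leaves out.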

We aimed for maximum coverage of the use cases with a minimal number of data elements.

We foresee that not every question can be answered in full.
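The trade-off between coverage and the number of elements can be made concrete by checking which competency questions a candidate element set can answer. The sketch below is one way to run that check, not the method the project used; the questions, element names and the `coverage` function are hypothetical.

```python
# Hypothetical competency questions, each with the elements needed to answer it.
QUESTIONS = {
    "Find datasets about a given disease": {"disease"},
    "Find datasets of a given data type for an organism": {"data_type", "organism"},
    "Find datasets funded by a given grant": {"grant"},
}


def coverage(elements, questions):
    """Split questions into those fully answerable with `elements` and the rest."""
    answerable = [q for q, needed in questions.items() if needed <= elements]
    unanswered = [q for q in questions if q not in answerable]
    return answerable, unanswered


# A candidate core set that deliberately omits the "grant" element.
core = {"disease", "data_type", "organism"}
answerable, unanswered = coverage(core, QUESTIONS)
print(f"{len(answerable)}/{len(QUESTIONS)} questions answerable")
print("Not answerable in full:", unanswered)
```

Adding an element to the core set raises coverage at the cost of a larger model, which is the balance described above.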


Prototype, model, mappings, documentation and more at https://biocaddie.org and https://github.com/biocaddie

Supported by the NIH grant 1U24 AI117966-01 to the University of California, San Diego
