www.openrisknet.org

OpenRiskNet: Open e-Infrastructure to Support Data Sharing, Knowledge Integration and in silico Analysis and Modelling in Risk Assessment

Project Number 731075

DataCure: Data curation and creation of pre-reasoned datasets and searching

Noffisat Oki, Tim Dudgeon, Marc Jacobs, Danyel Jennen, Thomas Exner


Post on 30-May-2020




Case Study objective

● Data curation and merging

● Text mining


CYP450 data curation with Squonk

● Merges multiple datasets from ChEMBL into a single set

● Uses ChEMBL identifiers to identify common structures

● Generates a dataset that can be used for machine learning

● See on GitHub
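The merge step listed above can be sketched in plain Python. This is a minimal illustration, not the Squonk workflow itself: the record layout (`chembl_id`, `value`), the dataset names and the identifiers are all made-up placeholders, and the real curation almost certainly involves more steps.

```python
def merge_by_chembl_id(datasets):
    """Merge several datasets into one row per ChEMBL identifier.

    `datasets` maps a dataset name to a list of records; each record is a
    dict with a 'chembl_id' and a 'value' (hypothetical layout).
    """
    merged = {}
    for name, records in datasets.items():
        for record in records:
            cid = record["chembl_id"]
            # Records sharing an identifier land in the same row.
            row = merged.setdefault(cid, {"chembl_id": cid})
            row[name] = record["value"]
    columns = sorted(datasets)
    # Fill gaps with None so every row has the same columns,
    # which is the shape a machine-learning table needs.
    return [
        {"chembl_id": cid, **{col: merged[cid].get(col) for col in columns}}
        for cid in sorted(merged)
    ]


# Placeholder data: identifiers and values are invented for illustration.
datasets = {
    "assay_a": [
        {"chembl_id": "CHEMBL_X1", "value": 1.0},
        {"chembl_id": "CHEMBL_X2", "value": 0.0},
    ],
    "assay_b": [
        {"chembl_id": "CHEMBL_X2", "value": 1.0},
        {"chembl_id": "CHEMBL_X3", "value": 0.0},
    ],
}
table = merge_by_chembl_id(datasets)
```

Keying on the identifier rather than on the structure string keeps the join exact; structures that appear in only one source survive with `None` in the other columns, so the caller can decide whether to drop or impute them.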


Data merging via data APIs


OpenAPI + JSON-LD

Subject or object

Predicate
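The subject, predicate and object labels correspond to the triple structure that a JSON-LD annotation gives to otherwise plain JSON. The sketch below shows that reading for a single flat node. It is a deliberate simplification and not a full JSON-LD processor, and the `ex:` vocabulary, the property names and the dataset identifier are hypothetical, not taken from the OpenRiskNet annotations.

```python
doc = {
    "@context": {"ex": "http://example.org/vocab#"},  # hypothetical vocabulary
    "@id": "ex:dataset-1",                            # the subject
    "ex:hasFormat": {"@id": "ex:json"},               # object is a resource
    "ex:title": "Example dataset",                    # object is a literal
}


def expand(term, context):
    """Expand a compact IRI such as 'ex:title' using the @context prefixes."""
    prefix, sep, local = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + local
    return term


def triples(node):
    """Yield (subject, predicate, object) for one flat JSON-LD node."""
    context = node.get("@context", {})
    subject = expand(node["@id"], context)
    for key, value in node.items():
        if key.startswith("@"):
            continue  # keywords such as @context and @id are not predicates
        if isinstance(value, dict) and "@id" in value:
            obj = expand(value["@id"], context)
        else:
            obj = value
        yield subject, expand(key, context), obj


result = list(triples(doc))
```

Each JSON key becomes a predicate and each value becomes an object, while the node's `@id` is the subject. A value that carries its own `@id` can in turn act as the subject of further statements.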


Finding datasets


OpenRiskNet example workflow

Task:

● Identify the concept of acetaminophen (definition, identifiers, synonyms)

● Find all relevant documents in the context of acetaminophen and carcinogenicity

● What are the most relevant statements?

Technology:

● Semantic index of PubMed/PMC (> 20 terminologies)

● Solr index + OLS index + UIMA pipeline
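The document search step of this workflow could look roughly like the following sketch, which expands each concept into its synonyms and builds a request URL for Solr's standard select handler. The host, the core name `pubmed` and the field name `text` are assumptions for illustration; the actual OpenRiskNet index layout is not described here. Nothing is sent over the network.

```python
from urllib.parse import urlencode


def build_query_string(concepts, field="text"):
    """Combine synonyms with OR inside a concept and concepts with AND."""
    clauses = []
    for synonyms in concepts:
        alternatives = " OR ".join(f'{field}:"{term}"' for term in synonyms)
        clauses.append(f"({alternatives})")
    return " AND ".join(clauses)


def build_solr_url(base_url, core, concepts, rows=10):
    """Build a URL for the Solr select handler (q, rows and wt parameters)."""
    params = urlencode(
        {"q": build_query_string(concepts), "rows": rows, "wt": "json"}
    )
    return f"{base_url}/solr/{core}/select?{params}"


# Paracetamol is a common synonym of acetaminophen; a terminology lookup
# (the first task above) would normally supply this list.
concepts = [["acetaminophen", "paracetamol"], ["carcinogenicity"]]
query = build_query_string(concepts)
url = build_solr_url("http://localhost:8983", "pubmed", concepts)
```

Resolving synonyms before querying means a document that only mentions "paracetamol" is still found, which is the practical reason for identifying the concept first.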