CEFIC Long-range Research Initiative, CEFIC LRI Project EEM9.4-IC

Workshop on CEFIC LRI Project EEM9.4

LRI AMBIT with IUCLID6 support and extended search capabilities

IUCLID Substance Data

Nikolay Kochev

Ideaconsult Ltd.

Sofia, Bulgaria

Chemical structure vs. Substance

A chemical structure describes a well-defined molecule.

Chemicals synthesized in reality are not pure substances. In fact, such substances are mixtures of several components. Therefore, a real substance cannot be associated with a unique structure. In contrast, each component (i.e. constituent, impurity or additive) can be clearly characterized by a defined structure.

Under REACH, the concept of substance is clearly described. This definition is implemented in the IUCLID database.

Public, IUCLID Substance Data

Example structure: 1,2-dimethoxyethane

Substances under REACH

Under REACH, a chemical substance is composed of:

• Constituents (n >= 1)

• Impurities (n >= 0)

• Additives (n >= 0)

Under REACH, a chemical substance can have several compositions, e.g. crude, distilled, etc.

Under REACH, the type of a chemical substance can be:

• Mono-constituent: a substance, defined by its composition, in which one main constituent is present to at least 80% (w/w).

• Multi-constituent: a substance, defined by its composition, in which more than one main constituent is present in a concentration >= 10% (w/w) and < 80% (w/w).

• UVCB: substance of Unknown or Variable composition, Complex reaction products or Biological materials.
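The thresholds above can be sketched as a small classification function. This is only an illustration of the 80% and 10% rules stated on the slide, not the way IUCLID or AMBIT assigns substance types; the `Constituent` class, the `well_defined` flag and the "unclassified" fallback are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Constituent:
    name: str
    concentration: float  # typical concentration in % (w/w)


def classify_substance(constituents: List[Constituent], well_defined: bool = True) -> str:
    """Apply the slide's thresholds to a list of constituents.

    `well_defined` is a hypothetical flag: False stands for a substance of
    unknown or variable composition, which is treated as UVCB.
    """
    if not well_defined:
        return "UVCB"
    if not constituents:
        raise ValueError("a substance needs at least one constituent (n >= 1)")
    # Mono-constituent: one main constituent at 80% (w/w) or more.
    if any(c.concentration >= 80.0 for c in constituents):
        return "mono-constituent"
    # Multi-constituent: more than one main constituent in [10%, 80%).
    main = [c for c in constituents if 10.0 <= c.concentration < 80.0]
    if len(main) > 1:
        return "multi-constituent"
    # Not covered by the simple thresholds; left to expert judgement.
    return "unclassified"
```

For example, a single constituent at 95% (w/w) is reported as mono-constituent, while three constituents at 45%, 40% and 15% are reported as multi-constituent.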

REACH substance definition implemented in IUCLID

Example: mono-constituent substance

Three different compositions

REACH substance definition implemented in IUCLID

Example: UVCB

N,N-dimethyl-C12-14-(even numbered)-alkyl-1-amines

REACH substance definition implemented in IUCLID

Example: multi-constituent substance

The substance has 3 constituents and 3 impurities characterized by different structures.

IUCLID6 support in AMBIT

• Given: a completely new XML schema for all objects

  • 372 schema files, 111 endpoint study record files

  • Different approach to linking between objects (compared to IUCLID5)

• Implementation

  • Java classes generated from the XML schema (via JAXB)

  • AMBIT code converts the generated classes to the internal data model so that they can be stored in the database

  • Existing code is used for writing into the database

  • The existing UI is used to show the data

• Transparent from the user's point of view: select .i6z or .i5z
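The "transparent" file selection can be pictured as a dispatch on the file extension. The sketch below is in Python for brevity, whereas the slide describes a Java implementation; the function names and the `importers` mapping are hypothetical and do not come from the AMBIT code base.

```python
from pathlib import Path
from typing import Callable, Dict

# Mapping taken from the slide: .i5z -> IUCLID5, .i6z -> IUCLID6.
IUCLID_FORMATS = {".i5z": "IUCLID5", ".i6z": "IUCLID6"}


def detect_iuclid_format(filename: str) -> str:
    """Return 'IUCLID5' or 'IUCLID6' based on the file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in IUCLID_FORMATS:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return IUCLID_FORMATS[suffix]


def import_file(filename: str, importers: Dict[str, Callable[[str], object]]) -> object:
    """Call the importer registered for the detected format.

    `importers` is a hypothetical registry, e.g.
    {"IUCLID5": read_i5z, "IUCLID6": read_i6z}; both importers are expected
    to return the same internal data model, so the caller need not care
    which format was supplied.
    """
    return importers[detect_iuclid_format(filename)](filename)
```

Because both importers feed the same internal model, the code for database storage and for the UI does not need to know which IUCLID version produced the file.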

IUCLID6 support in AMBIT

• Files (both IUCLID5 and IUCLID6)

  • Transparent from the user's point of view: select .i6z or .i5z

• Web services

  • IUCLID5 (SOAP) and IUCLID6 (REST)

• All endpoint study records supported previously (and more)

  • Potential to support all endpoint study records

• The “Test material” is no longer a checkbox

  • Each study record links to a test material (a substance, identified by UUID)

• Substance and compositions

• Reference substances
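The link from a study record to its test material can be illustrated with a minimal data model. The class and field names below are assumptions chosen for the example and are not taken from the IUCLID6 schema or from AMBIT.

```python
from dataclasses import dataclass
from typing import Dict, Optional
from uuid import UUID, uuid4


@dataclass
class Substance:
    uuid: UUID
    name: str


@dataclass
class StudyRecord:
    endpoint: str
    # A reference to a substance by UUID, replacing the former checkbox.
    test_material_uuid: Optional[UUID] = None


def resolve_test_material(record: StudyRecord,
                          substances: Dict[UUID, Substance]) -> Optional[Substance]:
    """Follow the UUID link; return None if it is missing or unknown."""
    if record.test_material_uuid is None:
        return None
    return substances.get(record.test_material_uuid)


# Usage with made-up data.
material = Substance(uuid4(), "example test material")
known = {material.uuid: material}
record = StudyRecord("example endpoint", material.uuid)
```

With this layout, `resolve_test_material(record, known)` returns the `material` object, and a record pointing to an unknown UUID resolves to `None`.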

IUCLID6 new composition types

• Legal entity composition of the substance (default)

• Boundary composition of the substance

• Composition of the substance generated upon use

• Other

• The IUCLID5 composition is migrated to “Legal entity composition”

• The composition record includes study information

• Introduced mostly because of nanomaterials, as a REACH substance is defined by its main constituent

  • E.g. all TiO2 materials, regardless of their coatings, are one substance

  • All different nanoforms are described as different compositions of the same substance

  • They have different shape, size, etc. (i.e. characterisation)
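One way to picture these composition types is an enumeration plus a substance that owns several compositions. This is a rough sketch: the class names, the `characterisation` dictionary and the two example nanoforms are invented for illustration and are not taken from IUCLID6.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class CompositionType(Enum):
    LEGAL_ENTITY = "legal entity composition of the substance"
    BOUNDARY = "boundary composition of the substance"
    GENERATED_UPON_USE = "composition of the substance generated upon use"
    OTHER = "other"


@dataclass
class Composition:
    name: str
    # "Legal entity composition" is the default type named on the slide.
    type: CompositionType = CompositionType.LEGAL_ENTITY
    characterisation: Dict[str, str] = field(default_factory=dict)


@dataclass
class Substance:
    main_constituent: str
    compositions: List[Composition] = field(default_factory=list)


def migrate_iuclid5_composition(name: str) -> Composition:
    """IUCLID5 compositions become 'legal entity' compositions."""
    return Composition(name=name, type=CompositionType.LEGAL_ENTITY)


# One substance (defined by its main constituent), two hypothetical nanoforms.
tio2 = Substance("TiO2", [
    Composition("nanoform A", characterisation={"shape": "spherical"}),
    Composition("nanoform B", characterisation={"shape": "rod-like"}),
])
```

The point of the layout is that both nanoforms hang off a single `Substance`, while their differences live in the per-composition characterisation.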

Detailed information: Composition (1)

Every constituent, impurity and additive is described in detail by a “Reference substance” with several identifiers.

Detailed information: Composition (2)

The structure associated with the reference substance is stored in IUCLID only as a picture, which is normally not searchable.

InChI notation could be used for structure identification.

SMILES notation could be used for structure identification only if unique SMILES strings are used both on data import and in the query definition.
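The requirement for unique SMILES strings can be demonstrated with two valid SMILES for the same molecule. In the sketch below the canonicalisation step is passed in as a function; `toy_canonical` is a hand-written stand-in covering only these two strings, where a real system would call a cheminformatics toolkit. All function names are assumptions for the example.

```python
from typing import Callable, Dict, Optional

# Two valid SMILES strings for the same molecule, 1,2-dimethoxyethane.
SMILES_ON_IMPORT = "COCCOC"
SMILES_IN_QUERY = "C(COC)OC"


def build_index(records: Dict[str, str],
                canonical: Callable[[str], str]) -> Dict[str, str]:
    """Map the canonical form of each imported SMILES to a substance name."""
    return {canonical(smiles): name for smiles, name in records.items()}


def lookup(index: Dict[str, str], query_smiles: str,
           canonical: Callable[[str], str]) -> Optional[str]:
    """Canonicalise the query the same way as the imported data."""
    return index.get(canonical(query_smiles))


def identity(smiles: str) -> str:
    """No canonicalisation: strings are compared as written."""
    return smiles


# Stand-in for a real canonicaliser; it only knows the two strings above.
_TOY_TABLE = {"COCCOC": "COCCOC", "C(COC)OC": "COCCOC"}


def toy_canonical(smiles: str) -> str:
    return _TOY_TABLE[smiles]


records = {SMILES_ON_IMPORT: "1,2-dimethoxyethane"}
```

With `identity`, the query `C(COC)OC` finds nothing, because the strings differ even though the molecule is the same; with the same canonicalisation applied on import and on query, the lookup succeeds.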

Full structure support in AMBIT for all substance components

Various chemoinformatics approaches for handling chemical structures

Motivation to transfer IUCLID data to Ambit chemoinformatic system

IUCLID limitation:

IUCLID allows queries on the substance data but has no functionality to search chemical structures (exact, similarity or substructure search). Only queries on the SMILES and InChI notations are possible.

In addition, IUCLID describes endpoints in great detail, so extracting the key information relevant for substance evaluation is not convenient.

The IUCLID substance composition and IUCLID endpoint data can be transferred to the Ambit system and updated there. During this process, structures are assigned automatically to the constituents, impurities and additives of the substance.

In contrast to IUCLID, Ambit allows both structure and data search.

Motivation to transfer IUCLID data to Ambit chemoinformatic system

Ambit advantages:

• Chemical structure searching: exact, similarity and substructure search

• Read-across workflow

• Flexible faceted and free-text searching for structure and data

• Export to various data formats preferred by industry and the scientific community

• Modelling, data analysis and visualization utilities

• Support for chemical substances, including nanomaterials

• Programmatic access via REST API

• User-friendly web interface
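As a rough illustration of what a similarity search involves, the sketch below ranks fingerprints by the Tanimoto coefficient, a commonly used measure. The slides do not say which fingerprints, measure or thresholds Ambit uses, so the bit sets, the 0.5 threshold and the function names are all assumptions.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Shared bits divided by all bits set in either fingerprint."""
    union = a | b
    if not union:
        return 0.0  # choice made here: two empty fingerprints do not match
    return len(a & b) / len(union)


def similarity_search(query: FrozenSet[int],
                      library: Dict[str, FrozenSet[int]],
                      threshold: float = 0.5) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up fingerprints: each integer is the index of a set bit.
library = {
    "structure A": frozenset({1, 2, 3}),
    "structure B": frozenset({7, 8}),
}
```

An exact search corresponds to requiring identical structures, while a substructure search needs graph matching and is not covered by this sketch.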

Extracting data from IUCLID

Substances that should be transferred to AMBIT have to be flagged in IUCLID.

In the IUCLID chapter “1.3 Identifiers”, company-specific flags can be added.

Company-specific flags, examples:

• TRA number to identify trade products in the SAP system

• Substances will be transferred to Ambit (CompTox – Ambit Transfer)

All flags will be transferred to Ambit and are searchable in Ambit.
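The flag-based selection can be sketched as a simple filter. The dictionary layout and the use of "CompTox – Ambit Transfer" as the name of the transfer flag are assumptions read off the slide callouts; the real IUCLID export structure is not shown in the source.

```python
from typing import Dict, List

# Assumed flag name, taken from the slide callout.
TRANSFER_FLAG = "CompTox – Ambit Transfer"


def substances_to_transfer(substances: List[Dict],
                           flag: str = TRANSFER_FLAG) -> List[Dict]:
    """Keep only substances carrying the given company-specific flag.

    Each substance is a hypothetical dict with a "flags" mapping
    (flag name -> value) mirroring chapter "1.3 Identifiers".
    """
    return [s for s in substances if flag in s.get("flags", {})]


# Made-up records; the TRA value is a placeholder, not a real number.
substances = [
    {"name": "substance 1", "flags": {TRANSFER_FLAG: "yes", "TRA number": "TRA-0000"}},
    {"name": "substance 2", "flags": {"TRA number": "TRA-0001"}},
    {"name": "substance 3"},
]
```

Here only "substance 1" is selected; its other flags, such as the TRA number, would travel with it and stay searchable.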

Import criteria to specify which studies will be imported into AMBIT

Where can I find these fields in IUCLID?

In each endpoint study record, the relevant fields are located in:

• Administrative Data

• Data source

Public, LRI Project EEM9.3, IUCLID Substance Data

Why is a selection reasonable?

Only high-quality study records of the IUCLID substance itself should be imported into AMBIT. We therefore recommend selecting only:

• Key studies and supporting studies (field: Adequacy of study / Purpose flag). Records flagged as weight of evidence or disregarded study are not high-quality information.

• Reliability 1 and 2 (field: Reliability). Reliability 3 (not reliable) and 4 (not assignable) are not helpful for characterizing the relevant endpoint information.

• Experimental results (field: Study result type). Read-across information should not be selected, because this information will be transferred to AMBIT with the original IUCLID substance.

• Study reports, publications and review articles (field: Reference type). Secondary sources and grey literature should not be imported.
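The recommended selection can be expressed as a predicate over study records. The field names and the lower-case string values below are assumptions made for this sketch; the actual IUCLID pick-list values and the AMBIT import options may be spelled differently.

```python
from dataclasses import dataclass

# Accepted values per field, following the recommendations above.
ACCEPTED_PURPOSE = {"key study", "supporting study"}
ACCEPTED_RELIABILITY = {1, 2}
ACCEPTED_RESULT_TYPE = {"experimental result"}
ACCEPTED_REFERENCE_TYPE = {"study report", "publication", "review article"}


@dataclass
class StudyRecord:
    purpose_flag: str     # Adequacy of study / Purpose flag
    reliability: int      # 1 to 4
    result_type: str      # Study result type
    reference_type: str   # Reference type


def should_import(record: StudyRecord) -> bool:
    """True only if the record passes all four recommended criteria."""
    return (
        record.purpose_flag.lower() in ACCEPTED_PURPOSE
        and record.reliability in ACCEPTED_RELIABILITY
        and record.result_type.lower() in ACCEPTED_RESULT_TYPE
        and record.reference_type.lower() in ACCEPTED_REFERENCE_TYPE
    )
```

A key study with reliability 1, an experimental result and a study report as reference passes; changing any one field to an excluded value, such as reliability 3 or a read-across result, rejects the record.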

Import IUCLID files in AMBIT

In Ambit, some import filters can be selected.

Retrieve substances in AMBIT from IUCLID server

In Ambit, some import filters can be selected.
