

Page 1: Exploration of Chemical Space by Molecular Morphing

Exploration of Chemical Space by

Molecular Morphing

David Hoksza1, Daniel Svozil2

1 SIRET Research Group

Department of Software Engineering, FMP, Charles University in Prague, Czech Republic

2 Laboratory of Informatics and Chemistry

Institute of Chemical Technology, Prague, Czech Republic

Page 2: Exploration of Chemical Space by Molecular Morphing

Outline

• Overview and Motivation

• Chemical Space Exploration

o morphing operators

o molecule representation

o distance definition

o space exploration

• Experimental Evaluation

October 26, 2011 BIBE 2011 2

Page 3: Exploration of Chemical Space by Molecular Morphing

Chemical Space

• All possible organic compounds comprise a “chemical space”

• Can be viewed as being analogous to the cosmological universe in its vastness, with chemical compounds populating space instead of stars

• Size

o Estimated size of the chemical space: 10^100 to 10^200 (SciFinder ~ 6 × 10^7)

o Around one sextillion (10^21) stars in the observable universe

o For example, there are more than 10^29 possible derivatives of n-hexane

o Chemical space is infinite for our purposes

• Not all theoretically postulated compounds fall within the limits of what is synthetically feasible


Page 4: Exploration of Chemical Space by Molecular Morphing

Chemical Space Exploration - Motivation

• Motivation

o 2 ligands


Page 5: Exploration of Chemical Space by Molecular Morphing

General Algorithm

1. Generate n morphs from MS

2. Accept each morph with a probability given by its distance to MT

3. Accepted morphs form generation M1

4. For each morph Mi from M1, repeat from step 1 using MS = Mi

5. Finish when one of the morphs is identical to MT
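
The five steps above can be sketched on a toy model. Everything in this sketch is an assumption made for illustration and not the authors' implementation: a molecule is stood in for by a set of integer fragment ids, the only morphing operator toggles one id, distance is 1 minus the Tanimoto similarity, a morph is accepted with probability 1 minus its distance to MT, and the generation size is capped. The names `explore`, `tanimoto_distance`, `n_morphs`, `max_iterations` and `max_kept` are hypothetical.

```python
import random


def tanimoto_distance(a, b):
    """Distance between two fragment-id sets: 1 - Tanimoto similarity."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


def explore(start, target, n_morphs=20, max_iterations=200, max_kept=10, seed=0):
    """Return a path (list of frozensets) from start to target, or None."""
    start, target = frozenset(start), frozenset(target)
    if start == target:
        return [start]
    rng = random.Random(seed)
    universe = sorted(start | target)  # toy assumption: only these ids are toggled
    generation = [[start]]             # each entry is a path ending in a current MS
    for iteration in range(max_iterations):
        accepted = []
        for path in generation:
            for _ in range(n_morphs):
                # Step 1: toy morphing operator, toggle one fragment id.
                morph = path[-1] ^ {rng.choice(universe)}
                # Step 5: stop as soon as a morph is identical to the target.
                if morph == target:
                    return path + [morph]
                # Step 2: accept with probability 1 - distance (assumed form).
                if rng.random() < 1.0 - tanimoto_distance(morph, target):
                    accepted.append(path + [morph])
        if accepted:
            # Steps 3 and 4: accepted morphs seed the next generation. The cap
            # on its size is an assumption that keeps the toy search bounded.
            accepted.sort(key=lambda p: tanimoto_distance(p[-1], target))
            generation = accepted[:max_kept]
    return None


path = explore({1, 2, 3}, {3, 4, 5})
```

If a path is found, each consecutive pair of entries differs by exactly one fragment id, which mirrors the idea that every morph is a small structural change of its parent.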


Page 6: Exploration of Chemical Space by Molecular Morphing

Molecular Structure Representation

• Fragment-based representation

o The fragments present in a structure can be represented as a sequence of 0s and 1s

00010100010101000101010011110100

• 0 means the fragment is not present in the structure

• 1 means the fragment is present in the structure (perhaps multiple times)

o structural keys – a fixed dictionary of fragments (1:1 relationship between bit and fragment; problem: a structure may contain no fragments from the dictionary)

o hashed fingerprints – the fragment description (e.g. C-C-N-C-O) is hashed to a bit position, e.g. in the range 1-1024, and that bit is set (problems: collisions, and how to work back from a bit position to the fragment?)
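
A minimal sketch of both representations, assuming the fragments of a structure have already been enumerated as strings. The function names are hypothetical, SHA-1 is used only because it gives a stable hash across runs, and bit positions are counted from 0 instead of 1.

```python
import hashlib

N_BITS = 1024  # fingerprint length, matching the 1-1024 range on the slide


def bit_position(fragment, n_bits=N_BITS):
    """Map a fragment description to a bit index in [0, n_bits)."""
    digest = hashlib.sha1(fragment.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n_bits


def hashed_fingerprint(fragments, n_bits=N_BITS):
    """Return the set of 'on' bit indices for an iterable of fragment strings."""
    return {bit_position(f, n_bits) for f in fragments}


def structural_keys(fragments, dictionary):
    """Fixed-dictionary keys: bit i is 1 if dictionary[i] occurs in the structure."""
    present = set(fragments)
    return [1 if key in present else 0 for key in dictionary]


# A fragment that occurs twice still sets only one bit.
fp = hashed_fingerprint(["C-C-N-C-O", "C-C-N", "C-C-N-C-O"])

# Illustrative four-entry dictionary (not a real key set).
keys = structural_keys(["C-C-N", "C-O"], ["C-C", "C-C-N", "C-O", "N-O"])
```

Both problems named above are visible here: a structure with no dictionary fragments yields an all-zero key vector, and `bit_position` cannot be inverted to recover the fragment that set a bit.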


Page 7: Exploration of Chemical Space by Molecular Morphing

Molecular Structure Similarity

• Count the “on” bits in each molecule (A and B)

• Count the “on” bits shared by both molecules (C)

struct A: 00010100010101000101010011110100 13 bits on (A)

struct B: 00000000100101001001000011100000 8 bits on (B)

A AND B: 00000000000101000001000011100000 6 bits on (C)

• Tanimoto similarity coefficient

$$\text{similarity} = \frac{C}{A + B - C} = \frac{6}{13 + 8 - 6} = 0.4$$
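
The worked example above can be reproduced in a few lines. This is a sketch that assumes fingerprints are given as strings of 0s and 1s of equal length. The function name `tanimoto` and the convention of returning 0.0 for two all-zero fingerprints are assumptions.

```python
def count_on(bits):
    """Number of 'on' bits in an integer-encoded fingerprint."""
    return bin(bits).count("1")


def tanimoto(bits_a, bits_b):
    """Tanimoto similarity C / (A + B - C) of two bit strings."""
    a, b = int(bits_a, 2), int(bits_b, 2)
    on_a, on_b, on_c = count_on(a), count_on(b), count_on(a & b)
    denominator = on_a + on_b - on_c
    # Assumed convention: two empty fingerprints have similarity 0.0.
    return on_c / denominator if denominator else 0.0


struct_a = "00010100010101000101010011110100"  # 13 bits on
struct_b = "00000000100101001001000011100000"  # 8 bits on

similarity = tanimoto(struct_a, struct_b)  # 6 / (13 + 8 - 6)
```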


Page 8: Exploration of Chemical Space by Molecular Morphing

Morphing Operators

[Figure: path example; labels: MS, MT]


Page 9: Exploration of Chemical Space by Molecular Morphing

Exploration Parameters

• cnt_max_iterations

• cnt_morphs

• cnt_morphs_det

• dist_det

• cnt_accept

• cnt_accept_max

• cnt_it_prune

• cnt_morphs_max


Page 10: Exploration of Chemical Space by Molecular Morphing

Evaluation - Datasets

• 3 datasets of start/target pairs from PubChem

• 20 pairs in each set

• 3 difficulty levels based on pair similarity

o start and target structures are represented by their PubChem substructure fingerprints

o similarity is quantified as the Tanimoto score

• D1 … 0.7 – 0.8 similarity

• D2 … 0.5 – 0.6 similarity

• D3 … 0.3 – 0.4 similarity

• time constraint – 8h
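
The difficulty levels above amount to a lookup on the Tanimoto score of a start/target pair. The following is a small sketch. The inclusive bounds and the choice to return `None` for pairs that fall between the ranges are assumptions, and `assign_dataset` is a hypothetical name.

```python
# Similarity ranges as listed on the slide.
DATASET_RANGES = {
    "D1": (0.7, 0.8),
    "D2": (0.5, 0.6),
    "D3": (0.3, 0.4),
}


def assign_dataset(similarity):
    """Return the dataset label for a pair's Tanimoto score, or None."""
    for name, (low, high) in DATASET_RANGES.items():
        if low <= similarity <= high:  # inclusive bounds are an assumption
            return name
    return None  # the pair falls outside all three ranges


labels = [assign_dataset(s) for s in (0.75, 0.55, 0.35, 0.65)]
```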


Page 11: Exploration of Chemical Space by Molecular Morphing

Evaluation - Results

[Figure: results chart; values shown: 75%, 60%, 35%]

Page 12: Exploration of Chemical Space by Molecular Morphing

Molpher Student Project

• To start at the end of 2011

• Algorithm optimization

• Parallel processing

• Visualization

• Extensive Logging


Page 13: Exploration of Chemical Space by Molecular Morphing

Questions?
