Building innovative drug discovery alliances Classification and analysis of 22 million commercially available compounds ChemAxon EUGM May 2013


Page 1: Classification and analysis of 21 million commercially available … · 2017. 6. 27. · Classification and analysis of 22 million commercially available compounds ... ian.berry@evotec.com

Building innovative drug discovery alliances

Classification and analysis of 22 million commercially available compounds

ChemAxon EUGM May 2013


Agenda

About Evotec

About EVOsource

Calculating Properties

Analysis of the data

Final Thoughts


A global company with a complete offering

Evotec worldwide operations

Map legend: sales representation (Boston, Tokyo); operations & sales representation

San Francisco, US (~30 employees)
– Compound procurement
– Compound QC and storage

Abingdon, UK (~215 employees)
– Med Chem
– Comp Chem
– DMPK
– Structural biology

Munich, Germany (~30 employees)
– Phospho-proteomics
– Chemical proteomics

Göttingen, Germany (~50 employees)
– Metabolics
– Regenerative Medicine

Thane, India (~130 employees)
– Library synthesis & mgmt.
– Development chemistry

Hamburg, Germany (~200 employees), Manfred Eigen 1) Campus
– Screening
– HTS, NMR
– in vitro & in vivo biology

1) Manfred Eigen (*1927), German biophysical chemist and one of the world's leading pioneers in biotechnology. In 1967 he won the Nobel Prize in Chemistry for his work on a method for measuring fast chemical reactions, which until then had been considered immeasurable. He initiated the foundation of Evotec AG.


EVOsource


Compound Sourcing

Multiple search terms

Single or combination

Category            Count
Suppliers           218
Catalogues          539
Unique compounds    22,156,696
Parts               92,490,159


EVOsource


Compound selection

Can order from stores

Can see if ordered

Can order from supplier

Can request a quote

Can see if available in another lab (or site)

Additional information displayed

JChem Cartridge accessed through Java Persistence API

Marvin Applet

Structure to Name

Standardizer


The challenges of loading supplier catalogues

Integrated cyclic process: Contact, Receive, Prepare, Load, Process

Contact suppliers
– Existing suppliers
– Preferred suppliers
– New suppliers

Receive catalogues
– New catalogues
– Catalogue updates

Prepare catalogues
– Convert to SD file
– Structure Checker
– Structure to Name
– Name to Structure

Load catalogue data
– Multiple parallel processes

Process
– Fix errors
– Expire old data


Drug likeness categories

Classification of compounds

Drug like
– No SS fails
– Lipinski fails ≤ 1
– Additional property constraints: MOE LogP ≤ 6; MW ≤ 600; Rot bonds ≤ 10; EVO LogS ≥ -7; TPSA ≤ 180

Lead like
– No SS fails
– No Lipinski fails
– Additional property constraints: MOE LogP ≤ 3.5; MW ≤ 350; Rot bonds ≤ 6; EVO LogS ≥ -5; TPSA ≤ 140

Fragments
– No SS fails
– No FSS fails
– Lip. donors 1-3
– Lip. acc. 0-4
– MOE LogP ≤ 3.0; MW 150-350; Rot bonds ≤ 5; EVO LogS ≥ -3; TPSA ≤ 70
– Feature count 4-7

Amber
– No red SS fails and Lipinski fails ≤ 1, or
– No SS fails and Lipinski fails ≤ 2

Red
– Everything else!

Feature count is defined as: # of 5- and 6-membered aromatic rings + # of Lipinski acceptors + # of Lipinski donors
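The category rules on this slide can be sketched as a small Python function. This is only an illustration under stated assumptions: the field names are hypothetical, "SS" and "FSS" are read as counts of failed structural alert and fragment filters, and the order in which classes are tested (most restrictive first) is a guess, since the slide does not say how overlapping classes are resolved.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Compound:
    # All field names are hypothetical; values come from upstream calculators.
    ss_fails: int          # structural alert (SS) filters failed, red ones included
    red_ss_fails: int      # subset of ss_fails that are flagged "red"
    fss_fails: int         # fragment filter (FSS) fails
    lipinski_fails: int
    donors: int            # Lipinski donors
    acceptors: int         # Lipinski acceptors
    aromatic_rings: int    # 5- and 6-membered aromatic rings
    logp: float            # MOE LogP
    mw: float
    rot_bonds: int
    logs: float            # EVO LogS
    tpsa: float


def feature_count(c: Compound) -> int:
    # Aromatic rings + Lipinski acceptors + Lipinski donors, as defined on the slide.
    return c.aromatic_rings + c.acceptors + c.donors


def classify(c: Compound) -> str:
    # Assumed precedence: the most restrictive class is tested first.
    if (c.ss_fails == 0 and c.fss_fails == 0
            and 1 <= c.donors <= 3 and 0 <= c.acceptors <= 4
            and c.logp <= 3.0 and 150 <= c.mw <= 350
            and c.rot_bonds <= 5 and c.logs >= -3 and c.tpsa <= 70
            and 4 <= feature_count(c) <= 7):
        return "fragment"
    if (c.ss_fails == 0 and c.lipinski_fails == 0
            and c.logp <= 3.5 and c.mw <= 350 and c.rot_bonds <= 6
            and c.logs >= -5 and c.tpsa <= 140):
        return "lead-like"
    if (c.ss_fails == 0 and c.lipinski_fails <= 1
            and c.logp <= 6 and c.mw <= 600 and c.rot_bonds <= 10
            and c.logs >= -7 and c.tpsa <= 180):
        return "drug-like"
    if ((c.red_ss_fails == 0 and c.lipinski_fails <= 1)
            or (c.ss_fails == 0 and c.lipinski_fails <= 2)):
        return "amber"
    return "red"


# Made-up property values, for illustration only.
example = Compound(ss_fails=0, red_ss_fails=0, fss_fails=0, lipinski_fails=0,
                   donors=1, acceptors=2, aromatic_rings=1, logp=1.5, mw=200.0,
                   rot_bonds=2, logs=-2.0, tpsa=50.0)
print(classify(example))
```

With a different precedence (for example, drug-like tested first) the same compound would land in a broader class, so the ordering is worth confirming before reusing the sketch.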


Property calculations

Historical process
– Export all structures
– Calculate properties using MOE
– Structural alerts using MOE
– Stored in a MOE database

Problems
– Only available to Comp Chem Group
– Difficult to update and time consuming
– Different values using different tools – chemists use ChemAxon tools to calculate properties


Property calculations

New process stage 1

Calculate simple properties

Use ChemAxon JChem cartridge calculations where available

Weekly job that calculates properties for new structures

LogS is calculated using MOE

LogS manually updated every 2 months
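A weekly job of this kind only needs to touch structures that have no stored properties yet. The sketch below shows that idea with Python's built-in sqlite3 module. It does not use the JChem cartridge: the table names, column names and the `calc` callback are all hypothetical stand-ins for the real schema and property calculator.

```python
import sqlite3


def update_new_structures(conn, calc):
    """Compute and store a property only for structures that lack one."""
    rows = conn.execute(
        "SELECT s.id, s.smiles FROM structure s "
        "LEFT JOIN property p ON p.structure_id = s.id "
        "WHERE p.structure_id IS NULL"
    ).fetchall()
    for structure_id, smiles in rows:
        conn.execute(
            "INSERT INTO property (structure_id, value) VALUES (?, ?)",
            (structure_id, calc(smiles)),
        )
    conn.commit()
    return len(rows)  # number of structures processed in this run


# Hypothetical in-memory schema with two newly loaded structures.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE structure (id INTEGER PRIMARY KEY, smiles TEXT)")
conn.execute("CREATE TABLE property (structure_id INTEGER PRIMARY KEY, value REAL)")
conn.executemany("INSERT INTO structure (smiles) VALUES (?)",
                 [("CCO",), ("c1ccccc1",)])

# len() is a placeholder calculator, not a real chemical property.
first_run = update_new_structures(conn, calc=len)
second_run = update_new_structures(conn, calc=len)
print(first_run, second_run)
```

The second run finds nothing to do, which is what keeps a recurring job cheap once the backlog has been cleared.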


Calculated Results

Breakdown of chemical space by simple descriptors

[Figure: distributions of MW, HBD, HBA, logP and RotB]


Property calculations

New process stage 2

Structural alerts using two types of filter
– 121 SMARTS structural alerts – general screening compounds
– 19 SMARTS fragment filters – fragment screening

MOE SMARTS converted to ChemAxon format

Assign compounds to category – fragment, lead like etc.

Problems
– The substructure search took 6 months to calculate, as it had to be broken into small chunks
– MOE SMARTS not automatically converted to ChemAxon format
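Breaking a large substructure search into small chunks, as described for this stage, can be sketched generically in Python. The `matches` callback is a hypothetical stand-in for a real SMARTS matcher; the toy version used in the demo is a plain substring test and has no chemical meaning.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def count_alert_fails(structures, alerts, matches, chunk_size=1000):
    """Return {structure_id: number of alerts matched}, one chunk at a time."""
    results = {}
    for batch in chunks(structures, chunk_size):
        for structure_id, mol in batch:
            results[structure_id] = sum(1 for a in alerts if matches(mol, a))
        # A real job could commit or checkpoint here after each chunk.
    return results


def toy_matches(mol, alert):
    # Placeholder only: substring test instead of SMARTS matching.
    return alert in mol


structures = [(1, "CC(=O)Cl"), (2, "CCO"), (3, "CN=NC")]
alerts = ["C(=O)Cl", "N=N"]
fails = count_alert_fails(structures, alerts, toy_matches, chunk_size=2)
print(fails)
```

Processing in chunks bounds the memory and run time of each step and lets an interrupted run resume from the last completed chunk.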


Quantitative Estimate of Drug-likeness (QED)

New process stage 3

Quantifying the chemical beauty of drugs, A. L. Hopkins et al, Nature Chemistry 2012, 4, 90–98

The QED calculation is based on parameters similar to those used to assign compounds to the Evotec categories; each parameter is weighted according to a fitted model

A proof of concept was carried out on a subset of our screening collection

Literature weights were modified with a bias in favour of structural alerts

LogD was substituted for ALogP to consider ionisation

QED calculation runs as a weekly job
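In the cited Nature Chemistry paper, QED is the weighted geometric mean of per-property desirability scores d_i with weights w_i, that is QED = exp(Σ w_i ln d_i / Σ w_i). The Python sketch below shows only that combination step. The property names, desirability values and weights are made up for illustration; they are neither the literature weights nor the modified Evotec weights, and the fitted desirability functions themselves are not reproduced. Raising the alert weight is just one possible reading of "a bias in favour of structural alerts".

```python
import math


def weighted_qed(desirabilities, weights):
    """Weighted geometric mean of desirability scores, each in (0, 1]."""
    total_weight = sum(weights[name] for name in desirabilities)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    log_sum = sum(weights[name] * math.log(d)
                  for name, d in desirabilities.items())
    return math.exp(log_sum / total_weight)


# Hypothetical desirability scores for one compound.
d = {"MW": 0.9, "LogD": 0.8, "HBD": 1.0, "ALERTS": 0.25}

equal = {name: 1.0 for name in d}        # unweighted variant
alert_biased = dict(equal, ALERTS=3.0)   # assumed extra weight on alerts

qed_equal = weighted_qed(d, equal)
qed_biased = weighted_qed(d, alert_biased)
print(round(qed_equal, 3), round(qed_biased, 3))
```

Because the compound scores poorly on the alert term, giving that term more weight pulls its overall score down.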

[Figure: relative frequency versus weighted QED, with 0.49 and 0.67 marked on the QED axis]


What is a beautiful molecule?

Evotec weighted QED

[Figure: example molecules labelled QED = low, QED = medium and QED = high]
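The slides mark 0.49 and 0.67 on the weighted QED axis and label molecules as low, medium or high. Assuming those two values are the band boundaries, a banding function might look like the Python sketch below. Whether a score exactly on a boundary falls into the lower or upper band is not stated in the slides; the choice here is arbitrary.

```python
def qed_band(qed, low_cut=0.49, high_cut=0.67):
    """Map a weighted QED score in [0, 1] to 'low', 'medium' or 'high'."""
    if not 0.0 <= qed <= 1.0:
        raise ValueError("QED must lie between 0 and 1")
    if qed < low_cut:
        return "low"
    if qed < high_cut:   # assumption: the boundary value goes to the upper band
        return "medium"
    return "high"


for score in (0.30, 0.55, 0.80):
    print(score, qed_band(score))
```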


EVOsource composition

EVOsource composition for Evotec drug-likeness classes

Class            Compounds
Lead-like        3.1M
Fragment-like    1M
Red              2.1M
Amber            4.2M
Drug-like        8.1M

EVOsource composition for QED index

[Figure: relative frequency versus weighted QED, split at 0.49 and 0.67]

The three QED segments contain 15.3% (2.9M), 26.9% (5.3M) and 57.8% (11.1M) of the compounds.


Excellent agreement of QED and Evotec flags to discriminate between different drug-like classes

[Figure: QED index profile of each Evotec drug-likeness class; relative frequency versus weighted QED, with 0.49 and 0.67 marked]


EVOsource QED distribution shows large overlap with orally available drugs

[Figure: relative frequency versus weighted QED for EVOsource and for orally available drugs (682 analyzed out of 770), with 0.49 and 0.67 marked]


Final thoughts and Acknowledgements

EVOsource provides:
– A useful tool for our chemists in ordering compounds for our clients
– A tool for our computational chemists to analyse chemical space, to better support our clients in library design

The QED calculation provides a useful filter for quickly building virtual libraries

QED has been added to the latest version of ChEMBL

Thanks to:

Dr Oliver Barker

Dr Mirco Meniconi

Catherine Reisser

Dr Dan Warner


Your contact:

Ian Berry
Manager, Informatics
+44.(0).1235.441451 Office
+44.(0) 7802.438044 Mobile
+44.(0).1235.441503 Fax
ian.berry@evotec.com

Bob Marmon
Senior Applications Developer
+44.(0).1235.441402 Office
+44.(0).1235.861561 Switchboard
+44.(0).1235.441503 Fax
[email protected]

Building innovative drug discovery alliances