Post on 01-May-2018

Prediction Model for Absorptivity of Drug-like Compounds Based on Structural Features and Interfacial Properties

Chihae Yang, Glenn Myatt, Paul Blower
LeadScope, Inc.

Jim Rathman
The Ohio State University

Objectives

Develop Prediction Model for Absorptivity:

§ Compound Description by Structural Features
– Selection of features

§ Compound Description by Physical Properties
– Molecular parameters
– Interfacial parameters

§ Prediction Models
– Structure feature based
– Property based
– Structure feature and property based

Structural Description of Dataset

Classes            Sub classification               # of compounds
Heterocycles       N-containing heterocycles        70
                   5 membered ring                  37
                   Pyrrolidone                      9
                   Naphthalenes                     2
Natural Products   Steroids                         11
                   5,6 or 6,6-fused rings           6
                   Pyridine (partially saturated)   21
                   3,4 ring system                  14

Total compounds ~100

Data sources:
L.G. Martini, et al., European Journal of Pharmaceutics and Biopharmaceutics 48 (1999) 259-263
K. Palm, K. Lutman, et al., J. Med. Chem. 1998, 41, 5382-5392
W.L. Chiou, Pharmaceutical Research, Vol 17, No 2, 135-140, 2000
P. Stenberg, U. Norinder, et al., J. Med. Chem. 2001, 44, 1927-1937

Structural Description of Dataset

Classes            Sub classification                   # of compounds
                   Amino acids                          14
                   Bases, nucleosides                   4
Benzenes           1,2 substitution                     55
                   1,3 substitution                     45
                   1,4 substitution                     50
                   Any 1-substitution                   80
Functional Group   Alcohol                              56
                   Amines                               91
                   Carboxylate and carboxylic acid      18
                   Quinone                              2
                   Halide                               38
                   Sulfide                              15
                   Ether                                37

Distribution of % Fraction Absorbed Data

Fraction Absorbed after oral administration to humans*

* P. Stenberg, U. Norinder, et al., J. Med. Chem. 2001, 44, 1927-1937
  K. Palm et al., Pharm. Research 1997, 14, 568-571
  K. Palm, K. Lutman, et al., J. Med. Chem. 1998, 41, 5382-5392

Clustering of Compounds Against %FA

[Figure: compounds clustered by %FA. Cluster labels: 0-10 %, 25-44 %, 50-70 %, 71-89 %, 90-99 %, 100 % (removed from model)]

Factors Affecting Absorption

§ Physical Properties:
- Solubility
- Dissolution rate
- Molecular size
- Partition coefficient

§ Physiological Properties:
- Regional pH
- Intestinal permeability

§ Not considered:
- Active transport, binding, complexation, etc.
- Paracellular transport
- Metabolism
- Gastric and intestinal transit

Compound Description By Physical Properties

§ Molecular weight

§ Hydrogen bond acceptors and donors

§ Log P

§ Log D (calculated at pH 1, pH 4, pH 7, pH 8)

§ pKa and solubility (at pH 1, 4, 7, 8)

§ Polar surface area

§ Thermodynamic solution/interfacial property
- Activity coefficients at infinite dilution

Property Distributions of Dataset

[Figure: distributions of molecular weight, rotatable bonds, hydrogen bond acceptors, hydrogen bond donors, polar surface area, and aLogP]

Relationships (or Lack of) between FA and Properties

Prediction Based on Properties using NIPALS*

[Figure: predicted_FA vs. actual_FA, both axes from 0 to 1]

Properties: MW, HBD, HBA, PSA, LogP, Log D (at pH 1, 4, 7, 8), solubility (pH 1, 4, 7), pKa

Compounds: 93 compounds ranging in %FA from 0 to 100

R2 = 0.40
R2 = 0.49 (if 100 % absorption is excluded)

* Nonlinear iterative partial least squares algorithm from Geladi and Kowalski, Analytica Chimica Acta, 185 (1986) 1-17.
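The slides do not show the implementation, so the following is only a minimal sketch of NIPALS for a single response (PLS1) in Python with NumPy. The function names `nipals_pls1`, `predict`, and `r_squared` are hypothetical, and centering without scaling is an assumption; the original work may have preprocessed the property table differently.

```python
import numpy as np

def nipals_pls1(X, y, n_factors):
    """PLS1 regression via NIPALS. Returns (coef, x_mean, y_mean)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).ravel()
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean            # centered residual blocks
    W, P, q = [], [], []
    for _ in range(n_factors):
        w = E.T @ f                           # direction of maximum covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                      # nothing left to explain
            break
        w = w / norm
        t = E @ w                             # scores for this factor
        tt = t @ t
        p = E.T @ t / tt                      # X loadings
        c = (f @ t) / tt                      # y loading
        E = E - np.outer(t, p)                # deflate X
        f = f - c * t                         # deflate y
        W.append(w); P.append(p); q.append(c)
    if not W:                                 # constant y: predict the mean
        return np.zeros(X.shape[1]), x_mean, y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)    # regression vector in the original X space
    return coef, x_mean, y_mean

def predict(model, X):
    coef, x_mean, y_mean = model
    return (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean

def r_squared(y, y_hat):
    y = np.asarray(y, dtype=float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot
```

With `X` as the compounds-by-properties table and `y` as the fraction absorbed, `r_squared(y, predict(nipals_pls1(X, y, 7), X))` would give a fitted R2 comparable in kind to the values quoted on the slide. The number of factors is the main tuning choice.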

Prediction Based on Structural Features

§ Selection of representative features from the dataset
– global
– local neighbors

§ Scoring or extraction criteria

§ Reduction of dimensionality

Feature Selection by Scoring Criteria

§ Method 1: Scoring of features (select 25 from ∼1600)
- Coverage of atoms: maximize
- Partition of compound set
  • prioritize features to partition the compound set to ~50:50
- Complementarity of features
  • minimize the overlap between features

§ Method 2: Extraction from principal components
- diagnostic: influence function*

§ Compared features from methods 1 & 2 for selection

§ Used 25 feature counts per compound as fingerprint values

* Brooks, S.P. The Statistician (1994), 43, 483-494
* Pack, P.; Jolliffe, I.T.; Morgan, B.J.T. Journal of Applied Statistics (1988), 15, 39-52.
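The slides name the three scoring criteria of Method 1 but not the formula that combines them. The sketch below is one hypothetical way to do it in Python: a greedy selection whose score multiplies an optional atom-coverage weight, a balance term that peaks at a 50:50 partition, and a complementarity term based on Jaccard overlap with features already chosen. The function name, the multiplicative form, and the use of Jaccard overlap are all assumptions made for illustration.

```python
import numpy as np

def select_features(presence, n_select, atom_coverage=None):
    """Greedy pick of feature columns from a compounds x features 0/1 matrix."""
    presence = np.asarray(presence, dtype=bool)
    n_features = presence.shape[1]
    frac = presence.mean(axis=0)
    # 1.0 when a feature splits the compound set 50:50, 0.0 when it is in none or all
    balance = 1.0 - 2.0 * np.abs(frac - 0.5)
    if atom_coverage is None:
        coverage = np.ones(n_features)        # no atom-level data: treat all features equally
    else:
        coverage = np.asarray(atom_coverage, dtype=float)
    selected = []
    for _ in range(min(n_select, n_features)):
        best, best_score = None, -1.0
        for j in range(n_features):
            if j in selected:
                continue
            # largest Jaccard overlap with any feature already selected
            overlap = 0.0
            for k in selected:
                union = np.sum(presence[:, j] | presence[:, k])
                inter = np.sum(presence[:, j] & presence[:, k])
                if union:
                    overlap = max(overlap, inter / union)
            score = coverage[j] * balance[j] * (1.0 - overlap)
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
    return selected
```

Called as `select_features(presence, 25)` on a roughly 100 x 1600 presence matrix, this returns 25 column indices. Ties go to the lower column index, which is an arbitrary choice.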

Selected Features According to Criteria

Features                                                   Counts in the data set
[benzene, 1-amino-] + [benzene, 1-amino-]                  14
[amine, alkyl, acyc-] + [tert-amine, p-alkyl-]             27
[alcohol] + [ether, p-alkyl-]                              21
[benzene, 1-(alkyl, cyc)-] + [benzene, 1-(alkyl, cyc)-]    17
[carbonyl, alkyl, acyc-] + [carboxamide]                   39
[carbonyl] + [methane, 1-aryl-,1-carbonyl-]                58
[amine, alkyl, cyc-] + [carboxamide(NHR), alkyl-]          33
carboxamide                                                39
benzene, 1-chloro-                                         20
amine(NR), diphenyl                                        11
benzene, 1-(alkyl, acyc)-                                  31
ether                                                      37
alcohol, alkyl-                                            51
benzene, 1-oxy-                                            33
benzene, 1-(alkyl, cyc)-                                   22
pyridine(H)                                                21
tert-amine                                                 48
amine, alkyl, cyc-                                         47
alkene                                                     36
benzene, 1-amino-                                          31
carbonyl, alkyl, acyc-                                     40

Fingerprint Table: Features and Counts

TemplateName                    Acebutolol  Acetazolamide  Alprazolam  Amiodarone  Amitriptyline  ...
carbonyl, alkyl, acyc-          2           1              0           0           0
benzene, 1-alkylamino-          0           0              0           0           0
benzene, 1-amino-               1           0              1           0           0
alkene                          0           0              0           0           1
amine, alkyl, cyc-              0           0              0           0           0
tert-amine                      0           0              0           1           1
benzene, 1-alkoxy-              1           0              0           1           0
benzene, 1-(alkyl, cyc)-        0           0              0           0           2
benzene, 1-oxy-                 1           0              0           1           0
alcohol                         1           0              0           0           0
alcohol, alkyl-                 1           0              0           0           0
ether                           1           0              0           1           0
benzene, 1-(alkyl, acyc)-       0           0              0           0           0
amine(NR), diphenyl             0           0              0           0           0
benzene, 1-chloro-              0           0              1           0           0
carboxamide                     1           1              0           0           0
[tert-amine] + [pyridine(H)]    0           0              0           0           0
...
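As an illustration of how such a table feeds a regression, the sketch below loads a subset of the rows shown on the slide into a compounds-by-features count matrix. The counts are copied from the fingerprint table; the variable names and the helper `fingerprint` are hypothetical.

```python
import numpy as np

compounds = ["Acebutolol", "Acetazolamide", "Alprazolam", "Amiodarone", "Amitriptyline"]

# Feature counts per compound, in the column order of `compounds` (subset of the slide's rows)
feature_counts = {
    "carbonyl, alkyl, acyc-":   [2, 1, 0, 0, 0],
    "benzene, 1-amino-":        [1, 0, 1, 0, 0],
    "alkene":                   [0, 0, 0, 0, 1],
    "tert-amine":               [0, 0, 0, 1, 1],
    "benzene, 1-(alkyl, cyc)-": [0, 0, 0, 0, 2],
    "ether":                    [1, 0, 0, 1, 0],
    "benzene, 1-chloro-":       [0, 0, 1, 0, 0],
    "carboxamide":              [1, 1, 0, 0, 0],
}

features = list(feature_counts)
# Transpose so that each row is one compound's fingerprint
X = np.array([feature_counts[f] for f in features]).T

def fingerprint(name):
    """Return the feature counts for one compound as a dict."""
    return dict(zip(features, X[compounds.index(name)].tolist()))

print(X.shape)
print(fingerprint("Amitriptyline"))
```

Each row of `X` is one compound, so the matrix can be passed directly as the descriptor block of a PLS model, optionally with property columns appended.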

Introduction of Solution/Interfacial Properties

§ Factors important for passive diffusion through lipid bilayer
- Headgroup interaction
- Hydrophobic tail interaction
- Hydrophilic to lipophilic balance (HLB)

§ Partition model of drug molecules in lipid layer:

$$\text{Drug}_{\text{bulk}} \rightleftharpoons \text{Drug}_{\text{lipid}}$$

at equilibrium:

$$a_{\text{Drug}}^{\text{bulk}} = a_{\text{Drug}}^{\text{lipid}}$$

partition coefficient:

$$K \approx \frac{x_{\text{Drug}}^{\text{lipid}}}{x_{\text{Drug}}^{\text{bulk}}} = \frac{\gamma_{\text{Drug}}^{\text{bulk}}}{\gamma_{\text{Drug}}^{\text{lipid}}}$$

γ: activity coefficient

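Partition and Activity Coefficients

Partition coefficient (in dilute solution):

$$K \approx \frac{\gamma_{\text{Drug}}^{\text{bulk}}}{\gamma_{\text{Drug}}^{\text{lipid}}}$$

$$\log K \approx \log \gamma_{\text{Drug}}^{\text{bulk}} - \log \gamma_{\text{Drug}}^{\text{lipid}}$$

Compare with LogP (octanol-water):

$$P = \frac{C_{\text{Drug}}^{\text{octanol}}}{C_{\text{Drug}}^{\text{water}}} \approx \frac{C_{\text{pure}}^{\text{octanol}}}{C_{\text{pure}}^{\text{water}}} \cdot \frac{\gamma_{\text{Drug}}^{\text{water}}}{\gamma_{\text{Drug}}^{\text{octanol}}}$$

$$\log P \propto \log \gamma_{\text{Drug}}^{\text{water}} - \log \gamma_{\text{Drug}}^{\text{octanol}}$$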

UNIFAC Activity Coefficient Model

$$\ln \gamma_i = \ln \gamma_i^C + \ln \gamma_i^R$$

“combinatorial” term ($\ln \gamma_i^C$): molecular volume and surface area effects (size, shape, packing)

“residual” term ($\ln \gamma_i^R$): intermolecular energy effects (interaction)

$$\ln \gamma_i^C = \ln \frac{\phi_i}{x_i} + \frac{z}{2}\, q_i \ln \frac{\theta_i}{\phi_i} + l_i - \frac{\phi_i}{x_i} \sum_j x_j l_j$$

where:

$$l_i = \frac{z}{2}\,(r_i - q_i) - (r_i - 1)$$

Combinatorial Term

ln γi^C is calculated using a group contribution approach:

• The drug and solvent molecules are decomposed into simple fragments.

• Volume (r) and surface area (q) parameters are computed for each molecule by summing values for the appropriate fragments.

• At a given mole fraction xi, the fraction of the total volume (φi) and total surface area (θi) due to compound i are calculated.
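Once r and q are known for each molecule, the combinatorial term can be evaluated directly. The sketch below is a minimal Python version, assuming the usual UNIFAC convention z = 10 and the standard definitions φi = xi·ri / Σ xj·rj and θi = xi·qi / Σ xj·qj. The function name is hypothetical. Writing the ratios φi/xi and θi/φi in terms of r and q keeps the expression finite at infinite dilution (xi = 0).

```python
import math

Z = 10.0  # lattice coordination number conventionally used in UNIFAC (assumed here)

def ln_gamma_combinatorial(x, r, q, z=Z):
    """Combinatorial ln(gamma_i) for every component of a mixture.

    x: mole fractions, r: molecular volume parameters, q: molecular surface
    area parameters (each a sequence with one entry per component).
    """
    sum_xr = sum(xi * ri for xi, ri in zip(x, r))
    sum_xq = sum(xi * qi for xi, qi in zip(x, q))
    l = [z / 2.0 * (ri - qi) - (ri - 1.0) for ri, qi in zip(r, q)]
    sum_xl = sum(xi * li for xi, li in zip(x, l))
    result = []
    for ri, qi, li in zip(r, q, l):
        phi_over_x = ri / sum_xr                         # phi_i / x_i
        theta_over_phi = (qi / sum_xq) / (ri / sum_xr)   # theta_i / phi_i
        result.append(
            math.log(phi_over_x)
            + z / 2.0 * qi * math.log(theta_over_phi)
            + li
            - phi_over_x * sum_xl
        )
    return result

# Illustrative placeholder parameters, not fitted values for any real drug
print(ln_gamma_combinatorial(x=[0.0, 1.0], r=[5.0, 0.92], q=[4.0, 1.40]))
```

The usage line evaluates a solute at infinite dilution (x = 0) in a pure solvent. The solvent's own term is zero, as it must be for a pure component.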

Residual Term

ln γi^R is calculated using the same fragments:

• Pairwise interaction terms (Ψmn and Ψnm) are available for the fragments.

• Ψ values are directly related to intermolecular potentials:

  • Ψmn = exp[(unn – umn)/RT], Ψnm = exp[(umm – umn)/RT]

• Although in theory these can be calculated from intermolecular potential functions, in practice they are based on experimental data (from primarily petrochemical and polymer databases).

$$\ln \gamma_i^R \propto q_k \left[ 1 - \ln \left( \sum_m \theta_m \Psi_{mk} \right) - \sum_m \frac{\theta_m \Psi_{km}}{\sum_n \theta_n \Psi_{nm}} \right]$$

UNIFAC Group Contribution

CH3, CH2, CH, C, CH2=CH, CH=CH, CH2=C, C=C, ArH, ArC, ArCH3, ArCH2, ArCH

OH, CH3OH, H2O, ArOH, CH3C(O), CH2C(O), CH(O), CH3C(O)O, CH2C(O)O, HC(O)O, CH3O, CH2O, CH-O, Ring-CH2O

CH3NH2, CH2NH2, CHNH, CH3N, CH2N, ArNH2, C5H5N, C5H4N, C5H3N, CH3CN, CH2CN, COOH, HCOOH

CH2Cl, CHCl, CCl, CH2Cl2, CHCl2, CCl2, CHCl3, CCl3, CCl4, ArCl, CH3NO2, CH2NO2, CHNO2, ArNO2

CS2, CH3SH, CH2SH, CF3, CF2, CF, (CH2OH)2, Furfural, Cl(C=C), Me2SO, C(O)N(Me)2, C(O)N(Me)CH2, C(O)N(CH2)2

The Properties of Gases & Liquids, 4th ed., R. Reid, J. Prausnitz, B. Poling, McGraw Hill, 1987

Lipid As A Solvent Phase

[Figure: chemical structure drawings of the lipid molecules (atom labels P, N, O); not recoverable from the transcript]

Example of Activity Coefficients in Various Environments

[Figure: structure of the example compound; not recoverable from the transcript]

Solvent        Log10 γ∞
Water          5.23
Octanol        0.05
Lipid tail     -0.40
Glycolipid     0.12
Hexadecane     0.73

Due to its origin in petrochemical applications, standard UNIFAC tables do not include a few of the basic drug-like fragments present in this preliminary study.
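To connect the example values to the partition model, the dilute-solution relation log K ≈ log γ(bulk) − log γ(lipid) can be applied to the tabulated log10 γ∞ values, taking water as the bulk phase. The sketch below does this in Python; the function name is hypothetical, and the result is a mole-fraction-based estimate rather than a measured partition coefficient.

```python
# Log10 of the infinite-dilution activity coefficient, as tabulated on the slide
log10_gamma_inf = {
    "water": 5.23,
    "octanol": 0.05,
    "lipid tail": -0.40,
    "glycolipid": 0.12,
    "hexadecane": 0.73,
}

def log10_partition(bulk, phase, table=log10_gamma_inf):
    """Dilute-solution estimate: log K ~ log gamma(bulk) - log gamma(phase)."""
    return table[bulk] - table[phase]

for phase in ("octanol", "lipid tail", "glycolipid", "hexadecane"):
    print(f"water -> {phase}: log10 K ~ {log10_partition('water', phase):.2f}")
```

For this compound the estimate for the lipid tail region (about 5.63) differs from the octanol estimate (about 5.18) by roughly 0.45 log units, which is the kind of gap the conclusions point to when comparing octanol with a lipid environment.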

Pairwise Correlations of Variables

[Figure: scatterplot matrix of FA and the activity coefficient variables for water, octanol, glycolipid, tails, and hexadecane]

Model Comparisons

Model                                                                  R2 (7 factors)
Structure feature + activity coefficients + HBA                        0.68
Structure features + PSA                                               0.68
Structure feature + activity coefficients + PSA                        0.70
Structure features + HBA                                               0.66
Structure features + activity coefficients                             0.66
Structural features only                                               0.65

Structure features + PSA + HBA: R2 = 0.67 and 0.69 (7 and 20 factors; column assignment not recoverable from the transcript)

R2 (20 factors), in transcript order, row assignment not recoverable: 0.72, 0.73, 0.69, 0.70, 0.69, 0.67

R2 (other):
11 Properties only (LogP, PSA, LogD, pKa, MW, HBA, HBD, etc.)          0.40
PSA only(1), Activity coefficients only(1)                             0.34, 0.32 (row assignment not recoverable)

(1) By a simple linear regression; all others by nonlinear iterative partial least squares (NIPALS).

Order of importance: Features > activity coefficient ≈ PSA > H-bond acceptors

Feature and Interfacial Property Based Prediction Model

[Figure: predicted vs. actual %FA, both axes from 0 to 100]

Model: Structural features, Activity coefficients, PSA
Method: nonlinear iterative partial least squares (NIPALS) with 7 factors

R2 = 0.70

Preliminary Prediction

§ Test set: 5 compounds were randomly selected (one from each cluster of the FA values) and were not included in the model building

§ Training set: 66 compounds were used to build the model with the NIPALS method and 7 factors. The model was based on structural features, PSA, and 5 activity coefficients

Drug name       Actual (%)   Predicted (%)
Acebutolol      50           44
Doxorubicine    5            8
Metoprolol      95           83
Mianserine      70           71
Penicillin-G    20           59

Conclusions

§ Assuming passive diffusion to be the most critical factor for small molecule absorption in the GI tract, structural features extracted from the compound dataset described %FA much better than any properties.

§ Activity coefficient calculations may explain why LogP does not correlate well with absorption: partitioning into a highly hydrophobic environment (lipid tail region) is not modeled properly using octanol.

§ This preliminary study shows that models based on structural features may be further improved by addition of interfacial properties such as activity coefficients and polar surface area.

Next Steps

§ Apply to larger dataset

§ Further elaborate the scoring function for feature selection

§ Method refinement of UNIFAC to model drug-like compounds
– Calculate R and Q values for the selected features from this dataset
– Calculate activity coefficients at infinite dilution
– Explore activity coefficients in multicomponent environments

§ Model can be applied to Caco-2 cell permeability studies
– Human or animal absorption data may be too complicated to model with predictive accuracy

§ The model will also be adjusted to account for transport phenomena.

Acknowledgement

§ Julie Roberts, LeadScope, Inc. – building structures

§ Kevin Cross, LeadScope, Inc. – calculation of LogP and PSA

§ Tim Sötherlund, Kibron, Inc. – application of surface (air-liquid) properties to ADME properties
