
Comparative Evaluation of 11 Scoring Functions for Molecular Docking

Authors: Renxiao Wang, Yipin Lu and Shaomeng Wang

Presented by Florian Lenz

Today’s Docking Programs

• 1. Sampling

• 2. Selecting

• Scoring functions are needed for both!
– Guiding the sampling
– Evaluating the results

Previous Studies

• Compared combinations of docking programs / scoring functions
– If one combination fails: blame the scoring function, the docking program, or the combination?
– Even if all the functions are tested under the same conditions, an unmonitored sampling process could yield inadequate samples

Solution

• Only use ONE docking program, and a wide range of parameters

• Monitor the sampling results

• 100 different complexes

• Three kinds of tests:
– Reproduce experimentally determined structures
– Reproduce experimentally determined binding affinities
– Describe a funnel-shaped energy surface

Selecting the test cases

• Starting point: 230 complexes

• Only those with a resolution better than 2.5 Å are used (172)

• Creating a diverse ensemble (100)

Sampling

• AutoDock using genetic algorithms

• Protein conformation is fixed

• Ligand:
– Every rotatable single bond may rotate
– Flexibility of the cyclic part is neglected
– Translation: 0.5 Å, Rotation: 15°, Torsion: 15°

• Docking box: 30 x 30 x 30 Å around the observed binding position

• For each complex: 100 sampled conformations and the “real” conformation

Monitoring

• Repetition: Aim is not to find the energy minimum, but to create a diverse test set
– RMSD must cover a wide range (0 to 15 Å)
– Number of clusters between 30 and 70
– Enough results near the “real” position and meaningful conformations

• Key parameter: length of the GA runs
– Too short -> results are too close to the initial position
– Too long -> results concentrate in very few clusters
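The monitoring criteria above can be sketched as a simple acceptance check. This is only an illustration: the function names and the thresholds `near_cutoff`, `min_near` and `min_max_rmsd` are assumptions, since the slides give the cluster bounds (30 to 70) and the RMSD range (0 to 15 Å) but do not say how "enough" or "wide" were quantified. The RMSD is computed without superposition because the protein is held fixed.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two conformations given as equally ordered (x, y, z) lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for xa, xb in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def ensemble_is_acceptable(rmsds, n_clusters,
                           near_cutoff=2.0, min_near=5, min_max_rmsd=10.0):
    """Check one sampled ensemble against the monitoring criteria.

    near_cutoff, min_near and min_max_rmsd are hypothetical thresholds;
    only the 30..70 cluster bounds come from the slides.
    """
    enough_near = sum(r <= near_cutoff for r in rmsds) >= min_near
    wide_range = max(rmsds) >= min_max_rmsd
    clusters_ok = 30 <= n_clusters <= 70
    return enough_near and wide_range and clusters_ok
```

An ensemble that fails this check would be resampled with a different GA run length rather than scored.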

Problems with too long/short runs

• For every complex, the number of generations has to be determined separately

• If even 200 generations don’t lead to a satisfactory result, the complex is discarded
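A minimal sketch of this per-complex search, assuming the run length is tried in increasing steps. The `schedule` values are invented for illustration (only the upper limit of 200 generations comes from the slides), and `sample` and `is_acceptable` are hypothetical callables standing in for the AutoDock run and the monitoring check.

```python
def tune_generations(sample, is_acceptable, schedule=(25, 50, 100, 150, 200)):
    """Return (generations, ensemble) for the first acceptable run, else None.

    sample(generations) -> ensemble     (hypothetical wrapper around the sampler)
    is_acceptable(ensemble) -> bool     (hypothetical monitoring check)
    """
    for generations in schedule:
        ensemble = sample(generations)
        if is_acceptable(ensemble):
            return generations, ensemble
    return None  # no run length worked: the complex is discarded
```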

Example of a monitored ensemble

The 11 scoring functions

• 3 force-field based: AutoDock, G-Score and D-Score

• 6 empirical: LigScore, PLP, LUDI, F-Score, ChemScore and X-Score

• 2 knowledge-based: PMF and DrugScore

First Test: Docking Accuracy

• “How close is the ligand in the best-scored solution to its ‘real’ position?”
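One way to turn this question into a number is a success rate over all complexes: the fraction of ensembles in which the best-scored conformation lies within an RMSD cutoff of the observed position. The sketch below assumes a 2.0 Å cutoff (the value mentioned later for the outliers) and a hypothetical data layout of `(score, rmsd)` pairs. Whether a lower or a higher score is better depends on the scoring function, so it is a parameter.

```python
def best_scored_rmsd(conformations, lower_is_better=True):
    """RMSD of the top-ranked conformation; conformations is a list of (score, rmsd)."""
    pick = min if lower_is_better else max
    return pick(conformations, key=lambda c: c[0])[1]


def success_rate(ensembles, cutoff=2.0, lower_is_better=True):
    """Fraction of ensembles whose best-scored conformation has RMSD <= cutoff."""
    hits = sum(
        best_scored_rmsd(ensemble, lower_is_better) <= cutoff
        for ensemble in ensembles
    )
    return hits / len(ensembles)


# Made-up example data: two complexes, two conformations each.
example = [
    [(-9.1, 0.8), (-7.0, 5.2)],  # best score belongs to the near-native pose
    [(-6.5, 7.9), (-6.0, 1.1)],  # best score belongs to a wrong pose
]
```

With this toy data, `success_rate(example)` counts one hit out of two complexes.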

1st Test: Docking Accuracy

Type of Interaction vs. Docking Accuracy

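$C_{\mathrm{VDW}} \cdot \mathrm{VDW} + C_{\mathrm{H\text{-}bond}} \cdot \mathrm{HB} + C_{\mathrm{hydrophobic}} \cdot \mathrm{HS} + C_{\mathrm{rotor}} \cdot \mathrm{RT} + C_0$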

Consensus Scoring

Example: a conformation ranked 1st by X-Score and 7th by LigScore gets the consensus rank (1 + 7) / 2 = 4 for X-Score + LigScore
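The rank averaging in this example can be sketched as follows. The function names and the toy scores are invented for illustration, and the code assumes that for each scoring function it is known whether a higher or a lower score is better; ties are broken by input order.

```python
def ranks(scores, higher_is_better):
    """Rank conformations by score; rank 1 is the best."""
    order = sorted(range(len(scores)), key=lambda i: scores[i],
                   reverse=higher_is_better)
    result = [0] * len(scores)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def consensus_ranks(score_table, higher_is_better):
    """Average the per-function ranks of every conformation.

    score_table:      {function name: [score per conformation]}
    higher_is_better: {function name: bool}
    """
    per_function = [
        ranks(scores, higher_is_better[name])
        for name, scores in score_table.items()
    ]
    n_conformations = len(per_function[0])
    return [
        sum(r[i] for r in per_function) / len(per_function)
        for i in range(n_conformations)
    ]


# Made-up scores for seven conformations; conformation 0 is ranked
# 1st by "X-Score" and 7th by "LigScore" in this toy data.
toy_scores = {
    "X-Score": [9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0],
    "LigScore": [1.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0],
}
toy_direction = {"X-Score": True, "LigScore": True}
consensus = consensus_ranks(toy_scores, toy_direction)
```

Here `consensus[0]` is (1 + 7) / 2 = 4.0, matching the example; the conformation with the lowest averaged rank is the consensus pick.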

2nd Test: Binding Affinity Prediction

• Compare the ranking by scores with the ranking of the free energies.

• Using Spearman correlation:

$R_s = 1 - \frac{6 \sum_{j=1}^{n} d_j^2}{n(n^2 - 1)}$

• $d_j$ is the difference between the rank by score and the rank by free energy for complex number $j$; $n$ is the number of complexes

• $R_s = 1$ corresponds to a perfect correlation

• $R_s = -1$ corresponds to a perfect inverse correlation

• $R_s = 0$ corresponds to complete disorder
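As a small illustration, the Spearman coefficient can be computed from the two rankings with the standard no-ties form, Rs = 1 - 6 Σ dj² / (n (n² - 1)). The helper names are hypothetical, and the sketch assumes no tied values; with ties, average ranks and a Pearson correlation on the ranks would be needed instead.

```python
def rank_positions(values):
    """Rank of each value in ascending order, starting at 1 (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    positions = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        positions[index] = rank
    return positions


def spearman(scores, free_energies):
    """Spearman rank correlation between scores and measured free energies."""
    n = len(scores)
    if n != len(free_energies) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    score_ranks = rank_positions(scores)
    energy_ranks = rank_positions(free_energies)
    d_squared = sum((a - b) ** 2 for a, b in zip(score_ranks, energy_ranks))
    return 1 - 6 * d_squared / (n * (n * n - 1))
```

Identical orderings give 1.0 and exactly reversed orderings give -1.0.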


Best result: X-Score (Rs = 0.660)

4th best result: G-Score (Rs = 0.569)

3rd Test: Funnel-Shaped Energy Surface

• Theory stems from Protein Folding

• Ligand is guided by decreasing free energy

• Scoring functions should show a correlation between RMSD Value and score

• How does the Ligand reach the binding pocket of the Protein?


Example: PDB Entry 1cbx (Carboxypeptidase with Benzylsuccinate)

X-Score (Rs: 0.877) LigScore (Rs: 0.135)

Side Result: The Outliers

• In seven ensembles, none of the 11 functions was able to pick a conformation with an RMSD below 2.0 Å

• Analysis of these shows the general problems of today’s scoring functions
– Indirect interactions (1CLA, 2CLA, 3CLA)
– Very shallow groove instead of binding pocket (1THA, 1RGL, 1TET)

Indirect Interactions

• In samples, water molecules are not included

• F-Score predicted that the ligand binds on the surface

• DrugScore, LigScore and PLP found another small hole in the protein to put the ligand in

Very shallow groove

• Correct “binding pocket”

• But only partial overlap and wrong orientation

Most important results

• Empirical functions worked best in docking accuracy

• Consensus scoring of the six best functions greatly improves the success rate (above 80%)

• Prediction of Binding Affinities was less encouraging

• There are examples for which no function could find a good solution

Thank You
