bioexcel webinar series #1: integrative modelling of biomolecular complexes with haddock

50
bioexcel.eu Partners Funding Interactive Modeling with HADDOCK 2.2 Presenter: Alexandre Bonvin Host: Rossen Apostolov BioExcel Educational Webinar Series 29 April, 2016

Upload: bioexcel

Post on 19-Jan-2017

35 views

Category:

Science


0 download

TRANSCRIPT

Page 1: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

Partners Funding

Interactive Modeling with HADDOCK 2.2

Presenter: Alexandre BonvinHost: Rossen Apostolov

BioExcel Educational Webinar Series

29 April, 2016

Page 2: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

This seminar is being recorded

Page 3: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

Objectives of BioExcelExcellence in Biomolecular ScienceImprove the performance, efficiency and scalability of key codes

• GROMACS (Molecular Dynamics Simulations)• HADDOCK (Integrative modeling of macro-

assemblies)• CPMD (hybrid QM/MM code for enzymatic reactions,

photochemistry and electron transfer processes)

Page 4: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

Objectives of BioExcelExcellence in Usability

• Make ICT technologieseasier to use bybiomolecular researchers,both in academia andindustry

• Devise efficient workflowenvironments withassociated data integration

4

Page 5: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

Objectives of BioExcel

Competence-building among academia and industryPromote best practices and train end users to make best use of both software and computational infrastructure• academic and non-profit users• industrial users• independent software

vendors (ISVs) andacademic code providersof related software

• academic and commercialresource providers

Page 6: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

Interest Groups

• Integrative Modeling• Free Energy Calculations• Best practices for performance tuning• Hybrid methods for biomolecular systems• Biomolecular simulations entry level users • Practical applications for industry

Support platformshttp://bioexcel.eu/contact

Forums Code Repositories Chat channel Video Channel

Page 7: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

The presenter

Prof. Alexandre Bonvin is a full professor of computational structural biology in Utrecht University since 2009. Research within his computational structural biology group focuses on the development of reliable bioinformatics and computational approaches to predict, model and dissect biomolecular interactions at atomic level, as implemented in the widely used HADDOCK software for the modelling of biomolecular complexes (http://bonvinlab.org/software). His work has resulted in over 190 peer-reviewed publications (Research ID A-5420-2009, h-index 48 (ISI), http://www.uu.nl/staff/AMJJBonvin )

Page 8: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

Integrative modelling of biomolecular complexes using HADDOCK2.2

Prof. Alexandre M.J.J. BonvinBijvoet Center for Biomolecular ResearchFaculty of Science, Utrecht Universitythe [email protected]

Page 9: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

9

The protein-protein interaction Cosmos

Page 10: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Unique interactions in interactomes

E.coli H.sapiens

with complete structureswith partial (domain-domain) or complete models

with structures for the interactors (suitable for docking)

without structural data

• ~7,500 binary interactions in E.coli

• ~44,900 binary interactions in H.sapiens

Structural coverage of interactomes

Page 11: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Molecular Docking

Page 12: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

What is Integrative Modeling?

Page 13: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Related reviews• Halperin et al. (2002) Principles of docking: an overview of search

algorithms and a guide to scoring functions. PROTEINS: Struc. Funct. & Genetics 47, 409-443.

• Special issues of PROTEINS: (2003) (2005) (2007) (2010) and (2103), which are dedicated to CAPRI.

• de Vries SJ and Bonvin AMJJ (2008). How proteins get in touch: Interface prediction in the study of biomolecular complexes. Curr. Pept. and Prot. Research 9, 394-406.

• Melquiond ASJ, Karaca E, Kastritis PL and Bonvin AMJJ (2012). Next challenges in protein-protein docking: From proteome to interactome and beyond. WIREs Computational Molecular Science 2, 642-651 (2012).

• Karaca E and Bonvin AMJJ (2013). Advances in integrated modelling of biomolecular complexes. Methods, 59, 372-381 (2013).

• Rodrigues JPGLM and Bonvin AMJJ (2014). Integrative computational modelling of protein interactions. FEBS J., 281, 1988-2003 (2014).

Page 14: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

Overview

HADDOCK machinery

Page 15: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Incorporates ambiguous and low-resolution data to aid the docking

Capable of docking up to 6 molecules

Symmetries can be leveraged

Powerful algorithms to handle flexibility at the interface

Final flexible refinement in explicit solvent

One of the best performing software in CAPRI

HADDOCK:An integrative modeling platform

Dominguez, Boelens & Bonvin. JACS 125, 173 (2003).

Page 16: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Searching the interaction space in HADDOCK

• Experimental and/or predicted information is combined with an empirical force field into an energy function whose minimum is searched for

• Vpotential = Vbonds + Vangles

+ Vtorsion

+ Vnon-bonded

+ Vexp

• Search is performed by a combination of gradient driven energy minimization and molecular dynamics simulations

Van der Waals electrostatic

Page 17: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Succession of energy minimization and molecular dynamics protocolsreminiscent of NMR structure calculations

it1 itwit0

HADDOCK docking protocol

Page 18: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Rigid-body energy minimization guided by restraints for fast samplingin the absence of data, define restraints between centers of mass

it0

Rigid-body Energy Minimization

Rigid-body protocol allows generation of several thousand of models in a short period of time.

Simultaneous docking of max. 6 molecules, resembling in vivo complex assembly (vs. sequential docking)

Typically, 10.000 conformations are sampled but only the best 1.000 are written to disk.

Rotational and translational optimization of the interacting partners, guided by the data-driven energy function.

HADDOCK docking protocol

Page 19: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Flexible simulated annealing in torsion angle space at the interface regionthorough optimization reproduces small conformational changes

it1

Semi-flexible simulated annealing3-step process that increasingly allows more flexibility at the interface: rigid-body, side-chain, backbone + side-chain.

Flexibility reproduces conformation changes up to 2Å, typical of small induced fit.

Typically, the 200 best models of it0 undergo refinement.

Torsion angle dynamics allows for faster integration time steps, while sampling relevant motions.

HADDOCK docking protocol

Page 20: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Refinement in explicit solvent to optimize the contacts at the interfacecan be used in isolation to refine and score existing models

itw

Refinement in explicit solvent

Short molecular dynamics simulation in explicit solvent to refine residue-residue contacts, mainly electrostatics, at the interface.

Position restraints on backbone heavy atoms ensure conformation remains largely the same.

Explicit solvent models include TIP3P water and DMSO (membrane mimic).

Typically, all models of it1 are refined, i.e. there is no selection between it1 and itw.

HADDOCK docking protocol

Page 21: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

HADDOCK & Flexibility

• Several levels of flexibility:• Implicit:

– docking from ensembles of structures– Scaling down of intermolecular interactions

• Explicit: – semi-flexible refinement stage with both side-

chain and backbone flexibility during in torsion angle dynamics

– Final refinement in explicit solvent

Page 22: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Energetics & Scoring• OPLS non-bonded parameters (Jorgensen, JACS 110, 1657 (1988))

• 8.5Å non-bonded cutoff, switching function, ε=10• Clustering of solutions

• Ranking of based on cluster-based HADDOCK score:

– Eair: ambiguous interaction restraint energy

– Edesolv: desolvation energy using Atomic Solvation Parameters (Fernandez-Recio et al JMB 335, 843 (2004))

– BSA: buried surface area

Rigid: Score = 0.01 Eair + 0.01 EvdW + 1.0 Eelec + 1.0 Edesolv – 0.01 BSAFlexible: Score = 0.1 Eair + 1.0 EvdW + 1.0 Eelec + 1.0 Edesolv – 0.01 BSAWater: Score = 0.1 Eair + 1.0 EvdW + 0.2 Eelec + 1.0 Edesolv

Scor

e

Page 23: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

Overview

HADDOCK machinery One application example

Page 24: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Circadian clock controlled by the Kai system consisting of three proteins: KaiA, KaiB and KaiC

Interactions define the phosphorylation status of KaiC and control the phase of the cycle

Information from MS:• From native MS: Stochiometry of the KaiB-KaiC complex

(6:1)

• From HD exchange: Binding interface and allosteric effects upon binding

Insight into cyanobacterial circadian timing: the KaiB-KaiC interaction

Snijder et al. PNAS 111, 1379 (2014)

Page 25: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

The KaiB-KaiC interaction: HDX

KaiB

Page 26: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

The KaiB-KaiC interaction: HDX

KaiC

Page 27: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Collision cross section from MS allows to filter the HADDOCKing solutions

The KaiB-KaiC interaction: CCS

HADDOCK best scoring/most populated solution of CII

Page 28: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Can deal with complex molecules

Page 29: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

Overview

HADDOCK machinery One application example The web server

Page 30: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Haddockweb portal

• > 7000 registered users• > 120000 served runs

since June 2008• > 33% on the GRID

De Vries et al. Nature Prot. 2010

Van Zundert et al. J.Mol.Biol. 2016

Page 31: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

HADDOCK’s user base

Page 32: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Distribution of resources

Currently (April 2015) ~ 111’000 CPU cores

41 sites

Page 33: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

What is happening behind the scenes?

Page 34: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

What does the server do for you compared to a manual run?

• PDB file validation:– PDB format– Check for duplication of residues– Check for proper definition of ions (see Nature Protocols 2010, Box 3)– Check for multiple occupancies– Runs reduce from MolProbity to correct issues with the

structure and automatically define the protonation state of Histidine residues

• Ambiguous interaction restraints:– Automatic definition of surface neighbors of active residues

• Gaps– Automatic generation of restraints to keep fragments

together

Page 35: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

What does the server do for you compared to a manual run?

• Restraint files validation– Syntax check (e.g. the correct number of selections,

numbers... – but NOT if the selection actually exists)– See Nature Protocol 2010 Box 4 for format (Xplor/CNS

format)

• Small ligands / co-factors– Automatic topology/parameter files definition from PRODRG

• Post-analysis of docking results– Cluster analysis and statistics– Graphical representation of docking results

Page 36: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Page 37: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

Overview

HADDOCK machinery One application example The web server Running locally

Page 38: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Local run: setup

• Prepare your PDBs – Remove double occupancies– Renumber if necessary to remove duplications– Split an ensemble of structures into single files and create a

listing of them with directory, between double quotes– Define the protonation state of Histidines (defined in run.cns)

Note: useful tools to manipulate/check PDBs available from the HADDOCKing GitHub repo:

– https://github.com/haddocking/pdb-tools

Page 39: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Local run: setup• Prepare your restraint files

– Define manually the surface neighbors (passive residues)– Generate the AIR file (can be done using the GenTBL interface

of the server)• Generate the initial new.html file to setup your HADDOCK

run• Create manually if needed the ligand topology and

parameter files using e.g:– PRODRG server:

http://davapc1.bioch.dundee.ac.uk/programs/prodrg/prodrg.html– ACPYPE: http://www.ccpn.ac.uk/software/ACPYPE-folder– Automatic Topology Builder: http://compbio.biosci.uq.edu.au/atb

• Edit the run.cns parameter file– Manually or online at

http://www.bonvinlab.org/software/haddock2.2/haddock-start.html

Page 40: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Local run: summary1. Prepare your new.html file, manually or online at

http://www.bonvinlab.org/software/haddock2.2/haddock-start.html

2. Give the haddock command in the dir where new.html was saved3. Go into the created run directory and edit the run.cns parameter

file to modify the parameter settings where needed (manually or online – same URL as above)

4. If needed copy the ligand param/topo files into the toppar directory

5. Launch haddock in the run dir6. For post-processing and analysis refer to the online HADDOCK

manual athttp://www.bonvinlab.org/software/haddock2.2/manual.html

Page 41: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

Overview

HADDOCK machinery One application example The web server Running locally Additional information

Page 42: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

bonvinlab.org/software

Page 43: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

bonvinlab.org/education

Page 44: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

http://www.wenmr.eu/wenmr/support/documentation/nmr-services/haddock

Page 45: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

ask.bioexcel.eu

Page 46: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

[Faculty of ScienceChemistry]

Some relevant papers• C. Dominguez, R. Boelens and A.M.J.J. Bonvin HADDOCK: A protein-protein

docking approach based on biochemical or biophysical information. J. Am. Chem. Soc., 125, 1731-1737 (2003).

• S.J. de Vries, M. van Dijk and A.M.J.J. Bonvin The HADDOCK web server for data-driven biomolecular docking. Nature Protocols, 5, 883-897 (2010)

• G.C.P. van Zundert and A.M.J.J. Bonvin. Modeling protein-protein complexes using the HADDOCK webserver. Methods in Molecular Biology: Protein Structure Prediction. Ed. Daisuke Kihara. Humana Press Inc., 163-179 (2014).

• J.P.G.L.M Rodrigues, E. Karaca and A.M.J.J. Bonvin. Information-driven structural modelling of protein-protein interactions. Methods in Molecular Biology: Molecular Modelling of Proteins. Ed. Andreas Kokul. Humana Press Inc. 399-424 (2015).

• G.C.P van Zundert, J.P.G.L.M. Rodrigues, M. Trellet, C. Schmitz, P.L. Kastritis, E. Karaca, A.S.J. Melquiond, M. van Dijk, S.J. de Vries and A.M.J.J. Bonvin. The HADDOCK2.2 webserver: User-friendly integrative modeling of biomolecular complexes. J. Mol. Biol. 428, 720-725

Page 47: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

Acknowledgments

VICITOP-PUNT

BioNMRWeNMR

West-LifeEGI-Engage

INDIGOBioExcel

€€

Thanks to all former and current group members

bonvinlab.org/people

Page 48: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

HADDOCK online: • http://haddock.science.uu.nl• http://bonvinlab.org/software• http://ask.bioexcel.eu

Thank you for your attention!

Page 49: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

Audience Q&A sessionPlease use the Chat or Question functions in GoToWebinar

Page 50: BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK

bioexcel.eu

www.bioexcel.eu/contact

NEXT WEBINAR

“Performance Tuning and Optimization of GROMACS”with Mark Abraham

11 May 201615:00-16:00 CET