BioExcel Webinar Series #1: Integrative Modelling of Biomolecular Complexes with HADDOCK
TRANSCRIPT
bioexcel.eu
Integrative Modeling with HADDOCK 2.2
Presenter: Alexandre Bonvin
Host: Rossen Apostolov
BioExcel Educational Webinar Series
29 April, 2016
This seminar is being recorded
Objectives of BioExcel
Excellence in Biomolecular Science
Improve the performance, efficiency and scalability of key codes
• GROMACS (Molecular Dynamics Simulations)
• HADDOCK (Integrative modeling of macro-assemblies)
• CPMD (hybrid QM/MM code for enzymatic reactions, photochemistry and electron transfer processes)
Objectives of BioExcel
Excellence in Usability
• Make ICT technologies easier to use by biomolecular researchers, both in academia and industry
• Devise efficient workflow environments with associated data integration
Objectives of BioExcel
Competence-building among academia and industry
Promote best practices and train end users to make best use of both software and computational infrastructure
• academic and non-profit users
• industrial users
• independent software vendors (ISVs) and academic code providers of related software
• academic and commercial resource providers
Interest Groups
• Integrative Modeling
• Free Energy Calculations
• Best practices for performance tuning
• Hybrid methods for biomolecular systems
• Biomolecular simulations entry level users
• Practical applications for industry
Support platforms
http://bioexcel.eu/contact
• Forums
• Code Repositories
• Chat channel
• Video Channel
The presenter
Prof. Alexandre Bonvin has been a full professor of computational structural biology at Utrecht University since 2009. Research within his computational structural biology group focuses on the development of reliable bioinformatics and computational approaches to predict, model and dissect biomolecular interactions at the atomic level, as implemented in the widely used HADDOCK software for the modelling of biomolecular complexes (http://bonvinlab.org/software). His work has resulted in over 190 peer-reviewed publications (Research ID A-5420-2009, h-index 48 (ISI), http://www.uu.nl/staff/AMJJBonvin).
Integrative modelling of biomolecular complexes using HADDOCK2.2
Prof. Alexandre M.J.J. Bonvin
Bijvoet Center for Biomolecular Research
Faculty of Science, Utrecht University
the [email protected]
The protein-protein interaction Cosmos
[Faculty of ScienceChemistry]
Unique interactions in interactomes
E. coli / H. sapiens (figure panels)
Figure legend:
– with complete structures
– with partial (domain-domain) or complete models
– with structures for the interactors (suitable for docking)
– without structural data
• ~7,500 binary interactions in E. coli
• ~44,900 binary interactions in H. sapiens
Structural coverage of interactomes
Molecular Docking
What is Integrative Modeling?
Related reviews
• Halperin et al. (2002). Principles of docking: an overview of search algorithms and a guide to scoring functions. PROTEINS: Struc. Funct. & Genetics 47, 409-443.
• Special issues of PROTEINS (2003), (2005), (2007), (2010) and (2013), which are dedicated to CAPRI.
• de Vries SJ and Bonvin AMJJ (2008). How proteins get in touch: Interface prediction in the study of biomolecular complexes. Curr. Pept. and Prot. Research 9, 394-406.
• Melquiond ASJ, Karaca E, Kastritis PL and Bonvin AMJJ (2012). Next challenges in protein-protein docking: From proteome to interactome and beyond. WIREs Computational Molecular Science 2, 642-651.
• Karaca E and Bonvin AMJJ (2013). Advances in integrated modelling of biomolecular complexes. Methods 59, 372-381.
• Rodrigues JPGLM and Bonvin AMJJ (2014). Integrative computational modelling of protein interactions. FEBS J. 281, 1988-2003.
Overview
HADDOCK machinery
Incorporates ambiguous and low-resolution data to aid the docking
Capable of docking up to 6 molecules
Symmetries can be leveraged
Powerful algorithms to handle flexibility at the interface
Final flexible refinement in explicit solvent
One of the best-performing software packages in CAPRI
HADDOCK: An integrative modeling platform
Dominguez, Boelens & Bonvin. JACS 125, 1731 (2003).
Searching the interaction space in HADDOCK
• Experimental and/or predicted information is combined with an empirical force field into an energy function whose minimum is searched for
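• $V_{\text{potential}} = V_{\text{bonds}} + V_{\text{angles}} + V_{\text{torsion}} + V_{\text{non-bonded}} + V_{\text{exp}}$, where $V_{\text{non-bonded}}$ comprises the van der Waals and electrostatic terms
• Search is performed by a combination of gradient-driven energy minimization and molecular dynamics simulations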
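To illustrate how an experimental term can be added to a force-field term and the sum minimized by following the gradient, here is a minimal one-dimensional sketch. It is not HADDOCK's implementation: the Lennard-Jones term stands in for the non-bonded energy, the flat-bottom harmonic term stands in for the experimental restraint energy, and all parameter values (`epsilon`, `sigma`, `upper`, `k`, `step`) are hypothetical.

```python
def lennard_jones(r, epsilon=1.0, sigma=3.5):
    """Toy van der Waals term standing in for the non-bonded energy."""
    ratio6 = (sigma / r) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)


def restraint(r, upper=5.0, k=1.0):
    """Toy experimental term: zero below the upper bound, harmonic above it."""
    excess = max(0.0, r - upper)
    return k * excess ** 2


def total_energy(r):
    """Sum of the force-field-like term and the restraint term."""
    return lennard_jones(r) + restraint(r)


def minimize(r, step=0.01, n_steps=5000, h=1e-5):
    """Plain gradient descent using a central finite-difference gradient."""
    for _ in range(n_steps):
        gradient = (total_energy(r + h) - total_energy(r - h)) / (2.0 * h)
        r -= step * gradient
    return r


if __name__ == "__main__":
    start = 10.0  # two particles far apart (hypothetical starting distance)
    end = minimize(start)
    print(f"distance {start:.2f} -> {end:.2f}")
    print(f"energy   {total_energy(start):.2f} -> {total_energy(end):.2f}")
```

The restraint pulls the two particles together from a distance where the van der Waals term alone has almost no gradient; once the restraint is satisfied, the force-field term takes over. This is the role the data-driven term plays in the search, in a much simplified form.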
Succession of energy minimization and molecular dynamics protocols, reminiscent of NMR structure calculations
it0 → it1 → itw
HADDOCK docking protocol
Rigid-body energy minimization guided by restraints for fast sampling
In the absence of data, define restraints between centers of mass
it0
Rigid-body Energy Minimization
The rigid-body protocol allows generation of several thousand models in a short period of time.
Simultaneous docking of max. 6 molecules, resembling in vivo complex assembly (vs. sequential docking)
Typically, 10,000 conformations are sampled but only the best 1,000 are written to disk.
Rotational and translational optimization of the interacting partners, guided by the data-driven energy function.
HADDOCK docking protocol
Flexible simulated annealing in torsion angle space at the interface region
Thorough optimization reproduces small conformational changes
it1
Semi-flexible simulated annealing
3-step process that progressively allows more flexibility at the interface: rigid-body, side-chain, backbone + side-chain.
Flexibility reproduces conformational changes of up to 2 Å, typical of small induced fit.
Typically, the 200 best models of it0 undergo refinement.
Torsion angle dynamics allows for faster integration time steps, while sampling relevant motions.
HADDOCK docking protocol
Refinement in explicit solvent to optimize the contacts at the interface
Can be used in isolation to refine and score existing models
itw
Refinement in explicit solvent
Short molecular dynamics simulation in explicit solvent to refine residue-residue contacts, mainly electrostatics, at the interface.
Position restraints on backbone heavy atoms ensure conformation remains largely the same.
Explicit solvent models include TIP3P water and DMSO (membrane mimic).
Typically, all models of it1 are refined, i.e. there is no selection between it1 and itw.
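The model counts quoted for the three stages (10,000 sampled and 1,000 written at it0, the 200 best passed to it1, all of them refined in itw) form a simple selection funnel. The sketch below only illustrates that bookkeeping; the function name, the random scores, and the convention that a lower score is better are assumptions made for the example, and no docking is performed.

```python
import random


def staged_selection(scores_it0, n_written=1000, n_refined=200):
    """Return model indices kept at each stage, ranked by ascending score.

    scores_it0: one score per rigid-body model (lower is assumed better).
    """
    ranked = sorted(range(len(scores_it0)), key=lambda i: scores_it0[i])
    written_it0 = ranked[:n_written]    # best models written to disk after it0
    models_it1 = written_it0[:n_refined]  # best of those go to semi-flexible refinement
    models_itw = list(models_it1)       # no selection between it1 and itw
    return written_it0, models_it1, models_itw


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical scores for 10,000 sampled rigid-body conformations.
    scores = [random.uniform(-200.0, 50.0) for _ in range(10000)]
    written, it1, itw = staged_selection(scores)
    print(len(scores), "sampled ->", len(written), "written ->",
          len(it1), "in it1 ->", len(itw), "in itw")
```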
HADDOCK docking protocol
HADDOCK & Flexibility
• Several levels of flexibility:
• Implicit:
– Docking from ensembles of structures
– Scaling down of intermolecular interactions
• Explicit:
– Semi-flexible refinement stage with both side-chain and backbone flexibility during torsion angle dynamics
– Final refinement in explicit solvent
Energetics & Scoring
• OPLS non-bonded parameters (Jorgensen, JACS 110, 1657 (1988))
• 8.5 Å non-bonded cutoff, switching function, ε=10
• Clustering of solutions
• Ranking based on the cluster-based HADDOCK score:
– Eair: ambiguous interaction restraint energy
– EvdW, Eelec: van der Waals and electrostatic energies
– Edesolv: desolvation energy using Atomic Solvation Parameters (Fernandez-Recio et al. JMB 335, 843 (2004))
– BSA: buried surface area
Rigid: Score = 0.01 Eair + 0.01 EvdW + 1.0 Eelec + 1.0 Edesolv – 0.01 BSA
Flexible: Score = 0.1 Eair + 1.0 EvdW + 1.0 Eelec + 1.0 Edesolv – 0.01 BSA
Water: Score = 0.1 Eair + 1.0 EvdW + 0.2 Eelec + 1.0 Edesolv
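The three stage-specific score expressions are weighted sums, so they can be written as a small lookup table. The sketch below uses the weights exactly as listed on the slide; the function name, the dictionary layout, and the energy values in the usage example are hypothetical and serve only to show the arithmetic.

```python
# Weights per stage, copied from the three score expressions on the slide.
# The BSA weight is stored with its sign; the water stage has no BSA term.
WEIGHTS = {
    "rigid":    {"air": 0.01, "vdw": 0.01, "elec": 1.0, "desolv": 1.0, "bsa": -0.01},
    "flexible": {"air": 0.1,  "vdw": 1.0,  "elec": 1.0, "desolv": 1.0, "bsa": -0.01},
    "water":    {"air": 0.1,  "vdw": 1.0,  "elec": 0.2, "desolv": 1.0, "bsa": 0.0},
}


def haddock_score(stage, air, vdw, elec, desolv, bsa):
    """Weighted sum of the energy terms for one model at the given stage."""
    w = WEIGHTS[stage]
    return (w["air"] * air + w["vdw"] * vdw + w["elec"] * elec
            + w["desolv"] * desolv + w["bsa"] * bsa)


if __name__ == "__main__":
    # Hypothetical term values for a single model (not real results).
    terms = {"air": 100.0, "vdw": -50.0, "elec": -200.0, "desolv": 10.0, "bsa": 1500.0}
    for stage in ("rigid", "flexible", "water"):
        print(f"{stage:8s} {haddock_score(stage, **terms):8.1f}")
```

With these made-up inputs the same model scores about -204.5, -245.0 and -70.0 at the rigid, flexible and water stages, which shows how strongly the stage-dependent weights on the van der Waals and electrostatic terms change the ranking quantity.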
Overview
• HADDOCK machinery
• One application example
Circadian clock controlled by the Kai system consisting of three proteins: KaiA, KaiB and KaiC
Interactions define the phosphorylation status of KaiC and control the phase of the cycle
Information from MS:
• From native MS: Stoichiometry of the KaiB-KaiC complex (6:1)
• From HD exchange: Binding interface and allosteric effects upon binding
Insight into cyanobacterial circadian timing: the KaiB-KaiC interaction
Snijder et al. PNAS 111, 1379 (2014)
The KaiB-KaiC interaction: HDX
KaiB
The KaiB-KaiC interaction: HDX
KaiC
The collision cross section (CCS) from MS makes it possible to filter the HADDOCK solutions
The KaiB-KaiC interaction: CCS
HADDOCK best scoring/most populated solution of CII
Can deal with complex molecules
Overview
• HADDOCK machinery
• One application example
• The web server
HADDOCK web portal
• > 7,000 registered users
• > 120,000 served runs since June 2008
• > 33% on the GRID
De Vries et al. Nature Prot. 2010
Van Zundert et al. J.Mol.Biol. 2016
HADDOCK’s user base
Distribution of resources
Currently (April 2015) ~ 111’000 CPU cores
41 sites
What is happening behind the scenes?
What does the server do for you compared to a manual run?
• PDB file validation:
– PDB format
– Check for duplication of residues
– Check for proper definition of ions (see Nature Protocols 2010, Box 3)
– Check for multiple occupancies
– Runs reduce from MolProbity to correct issues with the structure and automatically define the protonation state of histidine residues
• Ambiguous interaction restraints:
– Automatic definition of surface neighbors of active residues
• Gaps:
– Automatic generation of restraints to keep fragments together
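Two of the validation checks listed above, duplicated residues and multiple occupancies, can be approximated with a few lines of code that read the fixed columns of ATOM/HETATM records. This is a rough sketch of the idea, not the server's actual validation code; the function names and the small in-memory example are hypothetical.

```python
from collections import defaultdict


def format_atom_line(serial, name, altloc, res_name, chain, res_seq):
    """Build a minimal fixed-column ATOM record (coordinates set to zero)."""
    return "ATOM  %5d %-4s%1s%3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f" % (
        serial, name, altloc, res_name, chain, res_seq, 0.0, 0.0, 0.0, 1.0, 0.0)


def check_pdb_lines(lines):
    """Report alternate locations, conflicting residue names and repeated atoms."""
    altloc_atoms = []
    residue_names = defaultdict(set)
    seen_atoms = set()
    duplicate_atoms = []
    for raw in lines:
        line = raw.rstrip("\n").ljust(80)
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_name = line[12:16].strip()   # columns 13-16
        altloc = line[16]                 # column 17
        res_name = line[17:20].strip()    # columns 18-20
        res_key = (line[21], line[22:26].strip(), line[26])  # chain, number, insertion code
        if altloc != " ":
            altloc_atoms.append((res_key, atom_name, altloc))
        residue_names[res_key].add(res_name)
        atom_key = res_key + (atom_name, altloc)
        if atom_key in seen_atoms:
            duplicate_atoms.append(atom_key)
        seen_atoms.add(atom_key)
    conflicting = {key: sorted(names) for key, names in residue_names.items()
                   if len(names) > 1}
    return {"altloc_atoms": altloc_atoms,
            "conflicting_residues": conflicting,
            "duplicate_atoms": duplicate_atoms}


if __name__ == "__main__":
    example = [
        format_atom_line(1, "CA", " ", "ALA", "A", 1),
        format_atom_line(2, "CA", "A", "SER", "A", 2),  # alternate location A
        format_atom_line(3, "CA", "B", "SER", "A", 2),  # alternate location B
        format_atom_line(4, "CA", " ", "GLY", "A", 3),
        format_atom_line(5, "CA", " ", "LYS", "A", 3),  # same residue number, other name
    ]
    for key, value in check_pdb_lines(example).items():
        print(key, value)
```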
What does the server do for you compared to a manual run?
• Restraint files validation
– Syntax check (e.g. the correct number of selections, numbers... but NOT whether the selection actually exists)
– See Nature Protocols 2010, Box 4 for the format (Xplor/CNS format)
• Small ligands / co-factors
– Automatic topology/parameter files definition from PRODRG
• Post-analysis of docking results
– Cluster analysis and statistics
– Graphical representation of docking results
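A syntax check of the kind described for restraint files (right number of selections and numbers, without testing whether the selected atoms exist) might look like the sketch below. It assumes that each restraint is a CNS-style `assign` statement with two parenthesized selections followed by three numbers; the function names are hypothetical and this is not the server's validator.

```python
import re


def _is_number(token):
    """True if the token parses as a float."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def check_restraint_syntax(text):
    """Return a list of (statement number, problem) tuples; empty if none found."""
    # Drop comments, assumed to start with '!' and run to the end of the line.
    text = "\n".join(line.split("!", 1)[0] for line in text.splitlines())
    problems = []
    statements = re.split(r"\bassign\b", text, flags=re.IGNORECASE)[1:]
    for number, statement in enumerate(statements, start=1):
        depth, selections, tail_start = 0, 0, 0
        for position, char in enumerate(statement):
            if char == "(":
                depth += 1
            elif char == ")":
                depth -= 1
                if depth < 0:
                    break  # closing parenthesis without a matching opening one
                if depth == 0:
                    selections += 1  # one complete top-level selection
                    tail_start = position + 1
        if depth != 0:
            problems.append((number, "unbalanced parentheses"))
            continue
        if selections != 2:
            problems.append((number, "expected 2 selections, found %d" % selections))
            continue
        tokens = statement[tail_start:].split()
        if len(tokens) != 3 or not all(_is_number(t) for t in tokens):
            problems.append((number, "expected 3 numbers after the selections"))
    return problems


if __name__ == "__main__":
    good = ("assign (resid 1 and segid A) "
            "((resid 5 and segid B) or (resid 6 and segid B)) 2.0 2.0 0.0")
    bad = "assign (resid 1 and segid A) 2.0 2.0 0.0"
    print(check_restraint_syntax(good))
    print(check_restraint_syntax(bad))
```

As on the server, a restraint that passes this check can still refer to a residue that is absent from the structure, so the selections need to be verified separately.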
Overview
• HADDOCK machinery
• One application example
• The web server
• Running locally
Local run: setup
• Prepare your PDBs
– Remove double occupancies
– Renumber if necessary to remove duplications
– Split an ensemble of structures into single files and create a list of them, including the directory, with each entry between double quotes
– Define the protonation state of histidines (defined in run.cns)
Note: useful tools to manipulate/check PDBs are available from the HADDOCKing GitHub repo:
– https://github.com/haddocking/pdb-tools
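The ensemble-splitting step can be sketched in a few lines. The code below is a stand-alone illustration and does not use or reproduce the pdb-tools scripts; the function name, the output file names and the `.list` extension are assumptions. It writes one file per MODEL/ENDMDL block and a list file with one full path per line between double quotes.

```python
from pathlib import Path


def split_ensemble(pdb_text, out_dir, prefix="model"):
    """Write each MODEL block to its own PDB file and list the files.

    Returns (list of model paths, path of the list file).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    current = None  # lines of the model being read, or None outside a model
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            current = []
        elif record == "ENDMDL":
            if current is not None:
                path = out_dir / f"{prefix}_{len(paths) + 1}.pdb"
                path.write_text("\n".join(current + ["END"]) + "\n")
                paths.append(path)
            current = None
        elif current is not None:
            current.append(line)
    list_path = out_dir / f"{prefix}.list"
    # One entry per line: full path including the directory, in double quotes.
    list_path.write_text("".join(f'"{p.resolve()}"\n' for p in paths))
    return paths, list_path


if __name__ == "__main__":
    import tempfile

    ensemble = "\n".join([
        "MODEL        1",
        "ATOM      1  CA  ALA A   1       0.000   0.000   0.000  1.00  0.00",
        "ENDMDL",
        "MODEL        2",
        "ATOM      1  CA  ALA A   1       1.000   0.000   0.000  1.00  0.00",
        "ENDMDL",
    ])
    with tempfile.TemporaryDirectory() as tmp:
        models, listing = split_ensemble(ensemble, tmp)
        print(listing.read_text())
```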
Local run: setup
• Prepare your restraint files
– Define manually the surface neighbors (passive residues)
– Generate the AIR file (can be done using the GenTBL interface of the server)
• Generate the initial new.html file to set up your HADDOCK run
• Create manually, if needed, the ligand topology and parameter files using e.g.:
– PRODRG server: http://davapc1.bioch.dundee.ac.uk/programs/prodrg/prodrg.html
– ACPYPE: http://www.ccpn.ac.uk/software/ACPYPE-folder
– Automatic Topology Builder: http://compbio.biosci.uq.edu.au/atb
• Edit the run.cns parameter file
– Manually or online at http://www.bonvinlab.org/software/haddock2.2/haddock-start.html
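For readers who want to see what an AIR file contains before using GenTBL, the sketch below writes ambiguous restraints from lists of active and passive residues. It is an illustration only: the function name is hypothetical, the CNS-style `assign` layout and the `2.0 2.0 0.0` distance values are assumptions based on the commonly used convention, and the output should be checked against a GenTBL-generated file before use.

```python
def make_air(active_a, active_b, passive_b, segid_a="A", segid_b="B", distance=2.0):
    """One ambiguous restraint per active residue of molecule A.

    Each restraint pairs an active residue of A with ANY active or passive
    residue of B (the alternatives are joined with 'or').
    """
    targets = sorted(set(active_b) | set(passive_b))
    if not targets:
        raise ValueError("molecule B needs at least one active or passive residue")
    lines = []
    for residue in active_a:
        lines.append(f"assign (resid {residue} and segid {segid_a})")
        lines.append("       (")
        for index, target in enumerate(targets):
            if index > 0:
                lines.append("     or")
            lines.append(f"        (resid {target} and segid {segid_b})")
        lines.append(f"       ) {distance:.1f} {distance:.1f} 0.0")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical residue numbers, for illustration only.
    print(make_air(active_a=[10, 11], active_b=[45], passive_b=[44, 46]))
```

A complete restraint set would also need the reverse direction, which can be produced by calling the function a second time with the roles of the two molecules swapped.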
Local run: summary
1. Prepare your new.html file, manually or online at http://www.bonvinlab.org/software/haddock2.2/haddock-start.html
2. Give the haddock command in the directory where new.html was saved
3. Go into the created run directory and edit the run.cns parameter file to modify the parameter settings where needed (manually or online, same URL as above)
4. If needed, copy the ligand param/topo files into the toppar directory
5. Launch haddock in the run directory
6. For post-processing and analysis, refer to the online HADDOCK manual at http://www.bonvinlab.org/software/haddock2.2/manual.html
Overview
• HADDOCK machinery
• One application example
• The web server
• Running locally
• Additional information
bonvinlab.org/software
bonvinlab.org/education
http://www.wenmr.eu/wenmr/support/documentation/nmr-services/haddock
ask.bioexcel.eu
Some relevant papers
• C. Dominguez, R. Boelens and A.M.J.J. Bonvin. HADDOCK: A protein-protein docking approach based on biochemical or biophysical information. J. Am. Chem. Soc., 125, 1731-1737 (2003).
• S.J. de Vries, M. van Dijk and A.M.J.J. Bonvin. The HADDOCK web server for data-driven biomolecular docking. Nature Protocols, 5, 883-897 (2010).
• G.C.P. van Zundert and A.M.J.J. Bonvin. Modeling protein-protein complexes using the HADDOCK webserver. Methods in Molecular Biology: Protein Structure Prediction. Ed. Daisuke Kihara. Humana Press Inc., 163-179 (2014).
• J.P.G.L.M. Rodrigues, E. Karaca and A.M.J.J. Bonvin. Information-driven structural modelling of protein-protein interactions. Methods in Molecular Biology: Molecular Modelling of Proteins. Ed. Andreas Kukol. Humana Press Inc., 399-424 (2015).
• G.C.P. van Zundert, J.P.G.L.M. Rodrigues, M. Trellet, C. Schmitz, P.L. Kastritis, E. Karaca, A.S.J. Melquiond, M. van Dijk, S.J. de Vries and A.M.J.J. Bonvin. The HADDOCK2.2 webserver: User-friendly integrative modeling of biomolecular complexes. J. Mol. Biol., 428, 720-725 (2016).
Acknowledgments
• VICI
• TOP-PUNT
• BioNMR
• WeNMR
• West-Life
• EGI-Engage
• INDIGO
• BioExcel
Thanks to all former and current group members
bonvinlab.org/people
HADDOCK online:
• http://haddock.science.uu.nl
• http://bonvinlab.org/software
• http://ask.bioexcel.eu
Thank you for your attention!
Audience Q&A session
Please use the Chat or Question functions in GoToWebinar
www.bioexcel.eu/contact
NEXT WEBINAR
“Performance Tuning and Optimization of GROMACS” with Mark Abraham
11 May 2016, 15:00-16:00 CET