https://engage.cpc.wmin.ac.uk
Parameter Sweep Workflows for Modelling Carbohydrate Recognition
ProSim Project
Tamas Kiss, Gabor Terstyanszky, Noam Weingarten, Pamela Greenwell, Hans Heindl
AHM’09, Oxford, UK, 07-09 December 2009
The research interest
• The motivation:
  • Understanding how sugars interact with their protein partners may lead to development of new treatment methods for many diseases.
• The obstacle:
  • Investigation of the binding of proteins to sugars in “wet laboratory” (in vitro) experiments is expensive and time consuming
    • Expensive substrates
    • Sophisticated machinery
• The solution:
  • Use “in silico” tools (computer simulation) to select best binding candidates
  • In vitro work only on selected candidates
The research task
[Figure labels: binding pocket, sugar (ligand), protein (receptor)]
The research interest
• Advantages of in silico methods:
  • Better focusing wet laboratory resources:
    • Better planning of experiments by selecting best molecules to investigate in vitro
    • Reduced time and cost
    • Increased number of molecules screened
• Problems of in silico experiments:
  • Time consuming
    • Weeks or months on a single computer
  • Simulation tools are too complex for bio-scientists
    • Unix command line interfaces + software packages (Amber, GROMACS)
  • Bio-molecular simulation tools are not widely tested and validated
    • Are the results really useful and accurate?
What can we gain via the simulation?
1. Validation and refinement of in-silico modelling tools
2. Filter potential scenarios for wet lab experiments
The researcher’s interest
• What does the researcher want?
  • Run the simulations faster
    • Use compute resources – National Grid Service (NGS)
  • Run the simulations
    • Using seamless access to compute resources → web based interface
    • Combining many simulation, analysis and visualisation tools → workflows
    • Running multiple docking experiments to investigate different protein and sugar combinations → parameter study
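A parameter study of this kind can be pictured as enumerating every protein and sugar pairing and creating one docking job per pair. The sketch below is illustrative only: the file names and the `docking_jobs` helper are hypothetical and are not part of ProSim.

```python
from itertools import product

def docking_jobs(receptors, ligands):
    """Return one job description per (receptor, ligand) combination."""
    return [
        {"job_id": i, "receptor": receptor, "ligand": ligand}
        for i, (receptor, ligand) in enumerate(product(receptors, ligands))
    ]

# Hypothetical input files; a real study would use actual PDB structures.
receptors = ["receptor_a.pdb", "receptor_b.pdb"]
ligands = ["sugar_1.pdb", "sugar_2.pdb", "sugar_3.pdb"]

jobs = docking_jobs(receptors, ligands)
for job in jobs:
    print(job["job_id"], job["receptor"], job["ligand"])
```

Two receptors and three ligands give six independent jobs, which is what makes the study a natural fit for running on many Grid resources at once.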
Westminster Grid Application Support Service (W-GRASS)
Applications Ported by W-GRASS
Bio- and Life Science
- Molecular Dynamics Simulation using CHARMm
- Patient Readmission Analysis with R
- GAMESS-UK - ab initio molecular electronic structure program
- MultiBayes - program for analysing DNA sequences of genes
- ProSim - Modelling Protein Carbohydrate Recognition in-silico – application
- In silico Modelling Using AutoDock
Engineering
- DASP - Digital Alias-free Signal Processing
- Extraction of X-RAY Diffraction Profiles
- Cellular Automata-Based Laser Dynamics
Multi-media
- Rendering portal - Grid-based on-line rendering service
Physics
- VisIVO – Visualisation Interface to the Virtual Observatory
ProSim – Protein Molecule Simulation on the Grid
• Funded by the JISC ENGAGE program
  • Engaging Research with e-Infrastructure
  • Promote the greater engagement of academic researchers in the UK with the UK's e-Infrastructure
• ProSim objectives:
  – Define user requirements and user scenarios of protein molecule simulation
  – Identify, test and select software packages for protein molecule simulation
  – Automate the protein molecule simulation by creating workflows and parameter study support
  – Develop application specific graphical user interfaces
  – Run protein molecule simulation on the UK National Grid Service and make it available to the bioscience research community
The User Scenario
[Workflow diagram, Phases 1 to 4. Inputs: PDB file 1 (Receptor), PDB file 2 (Ligand). Steps: Energy Minimization (Gromacs), Validate (Molprobity), Check (Molprobity), Perform docking (AutoDock), Molecular Dynamics (Gromacs).]
The User Scenario in detail
Phase 1 – selection and preparation of receptor
• Receptor source: public repository, local database or user provided
• Preparation and standardisation
• Solvation and charge neutralization
• Energy minimisation
• Validation
Phase 2 – selection and preparation of ligand
• Ligand source: built using SMILES, public repository, local database or user provided
• Solvation
• Energy minimisation
The User Scenario
Phase 3 – docking ligand to receptor
• Prepare docking: docking parameters and grid-space (AutoGrid)
• Docking and selection of best results according to total energy (AutoDock)
• 10 AutoDock executions, 100 genetic algorithm runs each
Phase 4 – refining the ligand-receptor molecule (performed on 10 best results of the AutoDock simulation)
• Solvation of the ligand-receptor structure
• Energy minimisation (GROMACS)
• Molecular dynamics (GROMACS MPI version)
• Molecule trajectory data analysis
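The phase 3 sweep and the hand-over to phase 4 can be sketched as follows. This is a minimal illustration under stated assumptions: `fake_docking` is a hypothetical stand-in that returns random energies instead of calling AutoDock, a lower total energy is taken to mean a better result, and the 10 best results are assumed to be chosen across all runs (the slides do not say whether selection is global or per execution).

```python
import random

N_EXECUTIONS = 10  # AutoDock executions in the sweep (from the slide)
GA_RUNS = 100      # genetic algorithm runs per execution (from the slide)
N_BEST = 10        # results refined in phase 4 (from the slide)

def fake_docking(execution, seed):
    """Hypothetical stand-in for one AutoDock execution.

    Returns GA_RUNS tuples of (execution, run, total_energy); the energies
    are random placeholders, not real docking output.
    """
    rng = random.Random(seed)
    return [(execution, run, rng.uniform(-10.0, 0.0)) for run in range(GA_RUNS)]

def select_best(results, n=N_BEST):
    """Keep the n results with the lowest total energy (assumed to be best)."""
    return sorted(results, key=lambda result: result[2])[:n]

all_results = []
for execution in range(N_EXECUTIONS):
    all_results.extend(fake_docking(execution, seed=execution))

best = select_best(all_results)
print(len(all_results), "docking results,", len(best), "passed to phase 4")
```

Each of the selected results would then start its own phase 4 refinement, which is why phase 4 is also a parameter sweep.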
The Workflow in g-USE
• a combination of GEMLCA and standard g-USE jobs
• Executed on 5 different sites of the UK NGS
• Parameter sweeps in phases 3 and 4
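The ordering of the four phases can be illustrated as a small dependency graph. The sketch below is not g-USE or GEMLCA code: the phase names are hypothetical labels, and the dependencies (receptor and ligand preparation are independent, docking needs both, refinement needs docking) are inferred from the user scenario rather than stated on the slides.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each phase maps to the set of phases that must finish before it starts.
# These dependencies are an assumption inferred from the user scenario.
dependencies = {
    "phase1_receptor_preparation": set(),
    "phase2_ligand_preparation": set(),
    "phase3_docking": {"phase1_receptor_preparation", "phase2_ligand_preparation"},
    "phase4_refinement": {"phase3_docking"},
}

# Phases that the slides describe as parameter sweeps.
sweep_phases = {"phase3_docking", "phase4_refinement"}

# static_order() yields the phases so that every dependency comes first.
order = list(TopologicalSorter(dependencies).static_order())
for phase in order:
    print(f"{phase}: parameter sweep = {phase in sweep_phases}")
```

Phases 1 and 2 may come out in either order because neither depends on the other; phases 3 and 4 always follow them.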
Running simulations
• Set input parameters
• Upload input files
• Select executor sites
• Follow execution progress
• Typical execution time: 24 hours
User views
• Researchers (or End-Users)
  • Minimal computer, Grid and portal skills
  • Only interested in running their own research
  • Import, parameterize, execute and visualise workflows
• Application Developers (and/or Expert Users)
  • Computer literate researcher or software engineer
  • Define user scenarios and design new experiments
  • Create, test, deploy and modify workflows
  • Communicate with end-users and consider their requirements
The ProSim visualiser
• Visualisation in a newly developed portlet
• Allows visualisation of receptor, ligand and docked molecules at any phase during and after simulation (if the necessary files have already been generated)
• Allows two molecules to be visualised and compared at a time
• Energy, pressure, temperature and other important statistics are also displayed
• Uses the KiNG (Kinemage, Next Generation) visualisation tool
Lessons learned
• Communication between scientists and Grid experts is extremely difficult
  • More than 50% of the total project time was spent on communication and on describing/understanding user requests/requirements
• Novice Grid users require totally transparent access to Grid resources
  • Users are interested in their research and not in Globus, MPI or WMS
Future plans
• Make workflow more flexible to accommodate numerous different user scenarios
• Investigate further scenarios such as virtual screening of many ligands to one selected receptor
Thank you for your attention!
Any questions?