https://engage.cpc.wmin.ac.uk
Using the WS-PGRADE Portal in the ProSim Project
Protein Molecule Simulation on the Grid
Tamas Kiss, Gabor Terstyanszky, Noam Weingarten, Zsolt Lichtenberger
1st P-GRADE Portal User Community Workshop, June 10-11, Zurich, Switzerland
The biological interest
• The motivation:
  • Understanding how sugars interact with their protein partners may lead to the development of new treatment methods for many diseases.
• The obstacle:
  • Investigating the binding of proteins to sugars in “wet laboratory” (in vitro) experiments is expensive and time consuming
    • Expensive substrates
    • Sophisticated machinery
• The solution:
  • Use “in silico” tools (computer simulation) to select the best binding candidates
  • In vitro work only on selected candidates
The biological interest
[Figure labels: binding pocket, sugar (ligand), protein (receptor)]
The biological interest
• Advantages of in silico methods:
  • Better focusing of wet laboratory resources:
    • Better planning of experiments by selecting the best molecules to investigate in vitro
    • Reduced time and cost
    • Increased number of molecules screened
• Problems of in silico experiments:
  • Time consuming
    • Weeks or months on a single computer
  • Simulation tools are too complex for an average bio-scientist
    • Unix command line interfaces
  • Bio-molecular simulation tools are not widely tested and validated
    • Are the results really useful and accurate?
What can we gain via the simulation?
1. Validation and refinement of in silico modelling tools
2. Filter potential scenarios for wet lab experiments
The researcher’s interest
• What does the researcher want?
  • Run the simulations faster
    • use Grid resources – National Grid Service (NGS) and EGEE
  • Run the simulations
    • using seamless access to compute resources
    • web based interface
  • combining many simulation, analysis and visualisation tools
    • workflows
  • running multiple docking experiments and molecular dynamics analysis to investigate different scenarios
    • parameter study
ProSim – Protein Molecule Simulation on the Grid
• Funded by the JISC ENGAGE program
  • Engaging Research with e-Infrastructure
  • promote the greater engagement of academic researchers in the UK with the UK's e-Infrastructure
• ProSim objectives:
  – define user requirements and user scenarios of protein molecule simulation
  – identify, test and select software packages for protein molecule simulation
  – automate the protein molecule simulation by creating workflows and parameter study support
  – develop application-specific graphical user interfaces
  – run protein molecule simulation on the UK National Grid Service and make it available to the bioscience research community
The User Scenario
[Workflow diagram. Inputs: PDB file 1 (Receptor), PDB file 2 (Ligand). Steps: Energy Minimization (Gromacs), Validate (Molprobity), Check (Molprobity), Perform docking (AutoDock), Molecular Dynamics (Gromacs). The steps are grouped into Phase 1, Phase 2, Phase 3 and Phase 4.]
The User Scenario in detail
Phase 1 – selection and preparation of receptor
• Receptor source: public repository, local database or user provided
• Preparation and standardisation
• Solvation and charge neutralization
• Energy minimization
• Validation
Phase 2 – selection and preparation of ligand
• Ligand source: built using SMILES, public repository, local database or user provided
• Solvation
• Energy minimization
The User Scenario
Phase 3 – docking ligand to receptor
• Prepare docking: docking parameters and grid-space – AutoGrid
• Docking and selection of best results according to total energy – AutoDock
• 10 AutoDock executions, 100 genetic algorithm runs each
Phase 4 – refining the ligand-receptor molecule (performed on 10 best results of the AutoDock simulation)
• Solvation of the ligand-receptor structure
• Energy minimisation – GROMACS
• Molecular dynamics – GROMACS MPI version
• Molecule trajectory data analysis
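The hand-over from phase 3 to phase 4 (10 AutoDock executions of 100 genetic algorithm runs each, of which the 10 best results by total energy go on to refinement) might be sketched in Python as below. This is only an illustration, not ProSim's code: the `DockingResult` record, its field names and the randomly generated energies are hypothetical stand-ins for parsed AutoDock output, and the sketch assumes that a lower total energy means a better result.

```python
import heapq
import random
from dataclasses import dataclass

EXECUTIONS = 10            # AutoDock executions (from the slide)
RUNS_PER_EXECUTION = 100   # genetic algorithm runs per execution (from the slide)
BEST_N = 10                # results passed on to phase 4 (from the slide)


@dataclass(frozen=True)
class DockingResult:
    """Hypothetical record for one genetic algorithm run."""
    execution: int
    run: int
    total_energy: float    # assumption: lower is more favourable


def fake_docking_results(seed=0):
    """Stand-in for parsed AutoDock output; the energies are random placeholders."""
    rng = random.Random(seed)
    return [
        DockingResult(execution, run, rng.uniform(-12.0, 0.0))
        for execution in range(EXECUTIONS)
        for run in range(RUNS_PER_EXECUTION)
    ]


def select_best(results, n=BEST_N):
    """Return the n results with the lowest total energy, best first."""
    return heapq.nsmallest(n, results, key=lambda res: res.total_energy)


results = fake_docking_results()   # 10 x 100 = 1000 candidate poses
best = select_best(results)        # the 10 candidates refined in phase 4
```

Each entry of `best` would then seed one phase 4 job (solvation, energy minimisation, molecular dynamics).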
The Workflow in WS-PGRADE
• a combination of GEMLCA and standard g-USE jobs
• Executed on 5 different sites of the UK NGS
• Parameter sweeps in phases 3 (via a script) and 4 (via WS-PGRADE parameter sweep)
[Figure: workflow graph in WS-PGRADE with nodes grouped into Phase 1, Phase 2, Phase 3 and Phase 4]
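The ordering of the four phases and the parameter sweep fan-out can be sketched with Python's standard `graphlib` module. The dependency map is read off the user scenario (phases 1 and 2 prepare the receptor and the ligand, phase 3 docks them, phase 4 refines the best docking results); the phase identifiers and job names are hypothetical, and this is not how WS-PGRADE represents workflows internally.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases that must finish before it can start.
dependencies = {
    "phase1_receptor": set(),
    "phase2_ligand": set(),
    "phase3_docking": {"phase1_receptor", "phase2_ligand"},
    "phase4_refinement": {"phase3_docking"},
}


def execution_waves(deps):
    """Group phases into waves; phases in the same wave can run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


def sweep_jobs(phase, count):
    """Expand one phase into `count` independent parameter sweep jobs."""
    return [f"{phase}[{i}]" for i in range(count)]


waves = execution_waves(dependencies)
phase3_jobs = sweep_jobs("phase3_docking", 10)      # 10 AutoDock executions
phase4_jobs = sweep_jobs("phase4_refinement", 10)   # 10 best docking results
```

Phases 1 and 2 fall into the same wave because neither depends on the other, so receptor and ligand preparation could run side by side.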
Running simulations
• Set input parameters
• Upload input files
• Select executor sites
• Follow execution progress
• Typical execution time: 24 hours
User views
• Biologist end-user
  • Minimal computer and g-USE skills
  • Only interested in running her own research
  • Import, parameterize, execute and visualise workflows only
• Expert user
  • g-USE and computer literate biologist
  • Modify workflows
  • Design new experiments
  • Communicate end-user requests to the IT team
The ProSim visualiser
• Visualisation in a newly developed portlet
• Allows visualisation of receptor, ligand and docked molecules at any phase during and after simulation (if the necessary files have already been generated)
• Allows two molecules to be visualised and compared at a time
• Energy, pressure, temperature and other important statistics are also displayed
• Uses the KiNG (Kinemage, Next Generation) visualisation tool
Lessons learned
• Communication between scientists and Grid experts is extremely difficult
  • More than 50% of the total project time was spent on communication and on describing/understanding user requests/requirements
• Novice Grid users require totally transparent access to Grid resources
  • The user is interested in her science and not in MPI, Globus or WMS
Future plans
• Make workflow more flexible to accommodate numerous different user scenarios
• Investigate further scenarios such as virtual screening of many ligands to one selected receptor
• Led to follow-up co-operation with Imperial College London: High-throughput Molecular Dynamics Modelling of Molecular Machines - proposal under review
Thank you for your attention! Any questions?
Contact and more information: https://engage.cpc.wmin.ac.uk
Paper: Tamas Kiss, Pamela Greenwell, Hans Heindl, Gabor Terstyanszky and Noam Weingarten, Parameter Sweep Workflows for Modelling Carbohydrate Recognition, submitted to the Journal of Grid Computing, currently under review.