Grid Computing for Computational Chemistry and Beyond

Wibke Sudholt¹ and Kim Baldridge¹,²

¹ Institute of Organic Chemistry, University of Zurich, Switzerland
² San Diego Supercomputer Center, University of California, San Diego, USA

wibke@oci.unizh.ch, kimb@oci.unizh.ch

Swiss Grid Day, Geneva, 28.09.2006



Overview

• Why do we use Grid Computing to answer scientific questions?

• How are we involved in Grid Computing?

• What can we do to help users apply Grid Computing?

• How does Grid Computing help us to solve real-world problems?


Towards a Global Cyberinfrastructure

• Workstations and PCs
• Supercomputers and clusters
• Internet and grids

• Computing
• Storage
• Instruments
• Networking
• Collaboration
• Information


Interdisciplinary Research

Mathematics

Physics

Chemistry

Biology

Computer Science

Computational Chemistry


Bridging Scientific Gaps

• Length scales:
- Nuclei: 10^-14 m
- Atoms: 10^-10 m
- Molecules: 10^-9 m
- Proteins: 10^-8 m
- Cells: 10^-4 m
- Organs: 10^-1 m
- Organisms: 10^0 m

• Figure labels: Accuracy, Complexity, Quantum mechanics, Classical mechanics


Grid Computing

“A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities.”

Foster/Kesselman, 1999

• Grid types:
- Computational Grids
- Desktop Grids
- Data Grids
- Knowledge Grids

• Grid opportunities:
- Performance
- Throughput
- Scalability
- Fairness
- Collaboration
- Knowledge exchange and dissemination

• Grid challenges:
- Heterogeneity
- Network speed
- Fault tolerance
- Distribution algorithms
- Job scheduling
- Legacy codes
- Policy issues


Grid Infrastructure Layers

• Applications:
- Scientific or business software
- Scientific or business data
- Visualization

• Upper-level middleware:
- User interfaces
- Web portals
- Resource brokers
- Workflow systems

• Lower-level middleware:
- Security infrastructure
- Resource management
- Information services
- Data management

• Resources:
- Network
- Job managers
- Operating systems
- Hardware


Baldridge Group Hardware

• Mac laptops
• Matterhorn cluster at UniZH
• Chempossible cluster at SDSC
• Individual resources
• Grid resources

Y. Potier, M. Packard, W. Sudholt, UniZH, et al.


Grid Middleware Experience

• Cluster management:
- Rocks
- SGE

• Open source or freeware:
- Globus
- Nimrod
- ProActive
- UNICORE
- Condor
- BOINC
- GridPort
- SRB
- Web services

• Commercial products:
- DataSynapse
- United Devices

• Workflow infrastructures:
- Kepler
- Informnet

• Security:
- CA
- GAMA


Virtual Organizations

• Swiss Grid Initiative: http://www.gridinitiative.ch/

• Swiss Bio Grid (SBG): http://www.swissbiogrid.org/

• Southern European Partnership for Advanced Computing (SEPAC): http://www.sepac-grid.org/

• Pacific Rim Applications and Grid Middleware Assembly (PRAGMA): http://www.pragma-grid.net/

• Chemomentum: http://www.chemomentum.org/

• I2CAM: http://www.i2cam.org/


Computational Chemistry Grid User Interfaces

• Molecular visualization and remote execution (K. Baldridge, J. Greenberg, SDSC):
- QMView

• GridPort web portals (J. Greenberg, SDSC, et al.):
- GAMESS
- APBS
- Euler
- AMBER
- CE

• Workflow and integrated infrastructure projects (UniZH/SDSC):
- Resurgence
- Gemstone


The Resurgence Project

• RESearch sURGe ENabled by CyberinfrastructurE

• http://www.baldridge.unizh.ch/resurgence/

• Description:
- Workflow tool for computational chemistry
- Allow researchers to easily combine existing computational chemistry tools in innovative ways
- Exploit the possibilities of the growing web and grid infrastructure
- Focus on high-throughput calculations
- Based on and included in the collaborative Kepler scientific workflow system

• Interfaced programs:
- GAMESS (quantum chemistry), Babel, Open Babel (file format conversion), QMView (molecular visualization)
- In preparation: Nimrod/G (grid distribution)
- Planned: APBS (biomolecular continuum electrostatics)

• Participants:
- UniZH/ETHZ: W. Sudholt et al.
- SDSC: I. Altintas et al.

• Status:
- Project has ended
- Many ideas to be taken over into the Gemstone framework


The Gemstone Project

• Grid Enabled Molecular Science Through Online Networked Environments

• http://gemstone.mozdev.org/

• Goals:
- Integrated framework for accessing grid resources
- Support scientific exploration, workflow capture and replay, and a dynamic services oriented architecture
- Provide researchers in the molecular sciences with a tool to discover and compose remote grid application services

• Components:
- Dynamic rich-client desktop interface (Firefox extension, XUL, registry lookup)
- Strongly typed data schemas (XML Schema, CML, GamessXML)
- Molecular visualization tools (Flash, Garnet, interface to QMView)
- Optional: Authenticated interaction (GAMA)
- Planned: Workflow integration (Informnet)

• Application web services:
- APBS, GAMESS, Autodock, SIESTA codes
- MolPrep, Babel, PDB2PQR, Psize utilities

• Related and interfaced projects:
- Opal: Web services wrapping toolkit
- Topaz: GridFTP Firefox extension

• Status:
- Ongoing NSF- and NBCR-supported project
- Version 1.0 released on 26.09.2006

• Participants:
- UniZH: K. Baldridge, C. Amoreira, A. Bowen, Y. Potier et al.
- SDSC: K. Bhatia, J. Greenberg, S. Krishnan, S. Mock, B. Stearn et al.
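
To give a feel for the structured molecular data that a schema such as CML carries between services, here is a minimal Python sketch. It is not taken from Gemstone: the helper names `molecule_to_cml` and `read_elements` and the water geometry are illustrative assumptions, and the sketch only builds and reads the XML; it performs no XML Schema validation.

```python
import xml.etree.ElementTree as ET

# Namespace used by CML documents.
CML_NS = "http://www.xml-cml.org/schema"


def molecule_to_cml(mol_id, atoms):
    """Serialize (element, x, y, z) tuples as a CML <molecule> string.

    Hypothetical helper for illustration; coordinates are in angstroms.
    """
    ET.register_namespace("", CML_NS)
    molecule = ET.Element(f"{{{CML_NS}}}molecule", {"id": mol_id})
    atom_array = ET.SubElement(molecule, f"{{{CML_NS}}}atomArray")
    for index, (element, x, y, z) in enumerate(atoms, start=1):
        ET.SubElement(
            atom_array,
            f"{{{CML_NS}}}atom",
            {
                "id": f"a{index}",
                "elementType": element,
                "x3": f"{x:.4f}",
                "y3": f"{y:.4f}",
                "z3": f"{z:.4f}",
            },
        )
    return ET.tostring(molecule, encoding="unicode")


def read_elements(cml_text):
    """Return the element symbols of all atoms in a CML string, in order."""
    root = ET.fromstring(cml_text)
    return [atom.get("elementType") for atom in root.iter(f"{{{CML_NS}}}atom")]


# Illustrative geometry only, not data from the talk.
water = [("O", 0.0, 0.0, 0.0), ("H", 0.757, 0.586, 0.0), ("H", -0.757, 0.586, 0.0)]
cml_text = molecule_to_cml("water", water)
```

A service that receives `cml_text` can recover the atoms with `read_elements(cml_text)` without knowing which program produced the structure, which is the point of agreeing on a typed exchange format.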


Example Computational Chemistry Grid Projects and Collaborations

• Parameterization of a Group Difference Pseudopotential for QM/MM calculations using GAMESS and the Nimrod distributed parametric modeling tool (W. Sudholt, UCSD/UniZH/ETHZ, D. Abramson, Monash, et al.)

• Coupling of the GAMESS quantum chemical code with the BOINC desktop grid platform (M. Taufer, UCSD/UTEP, et al.)

• Investigation of protein-ligand interactions based on a GAMESS and APBS pipeline using Nimrod and Gemstone on the PRAGMA testbed (C. Amoreira, UniZH, et al.)

• Integration of the materials science application SIESTA into the Gemstone framework (A. Garcia, UPV/ICMAB, Spain)


Parameterization of a Group Difference Pseudopotential

• Challenge:
- Parameterization of a pseudopotential for QM/MM calculations
- Embarrassingly parallel parameter sweeps and optimizations

• Setup:
- GAMESS quantum chemistry code (pre-deployed)
- Globus and Nimrod grid middleware
- PRAGMA testbed resources (mainly at the PRAGMA 4 and Supercomputing 2003 conferences)

• Results:
- Up to about 60’000 jobs
- More than 200 days of computing time in less than 48 hours real time
- Parameterized group difference potential
- http://www.baldridge.unizh.ch/~wibke/personal/pubs.html

• Participants:
- UCSD/UniZH/ETHZ: W. Sudholt
- Monash University, Australia: D. Abramson, C. Enticott, S. Garic
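
The shape of such an embarrassingly parallel sweep, and the arithmetic behind the reported throughput, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: `evaluate` is a hypothetical stand-in for one independent GAMESS job, the parameter grid is invented for illustration, and local threads take the place of the grid nodes that Nimrod would manage.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def evaluate(params):
    """Hypothetical stand-in for one independent job (e.g. one GAMESS run).

    Returns a toy error value; a real job would run the chemistry code.
    """
    a, b = params
    return (a - 1.0) ** 2 + (b + 0.5) ** 2


def sweep(a_values, b_values, workers=4):
    """Evaluate every parameter combination independently; return the best one."""
    grid = list(product(a_values, b_values))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        errors = list(pool.map(evaluate, grid))
    best_error, best_params = min(zip(errors, grid))
    return best_params, best_error


def effective_speedup(cpu_days, wall_hours):
    """Ratio of accumulated computing time to elapsed real time."""
    return cpu_days * 24.0 / wall_hours


def mean_job_minutes(cpu_days, n_jobs):
    """Average computing time per job, in minutes."""
    return cpu_days * 24.0 * 60.0 / n_jobs
```

Because no job depends on another, the sweep scales with the number of workers. Plugging in the figures from the slide, `effective_speedup(200, 48)` gives 100 and `mean_job_minutes(200, 60000)` gives 4.8; since the slide reports "more than" 200 days and "less than" 48 hours, the first value is a lower bound and the second is only an order-of-magnitude estimate.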

http://www.baldridge.unizh.ch/nsf/ITR_RTIGNS/


Collaborative Research Project with Swiss Re

• NatCat application:
- Probabilistic modeling of losses for insurance portfolios from natural catastrophes (earthquakes, tropical cyclones etc.) based on pre-simulated events at specific locations
- Java sources, Oracle database, test cases

• Goals:
- Distribution of main Rate process over a computational grid
- Improvement of performance, scalability, stability, and fairness
- Testing of the DataSynapse GridServer and INRIA ProActive grid middleware tools

• Results:
- Distribution over a number of Linux machines by an event set-based algorithm
- Performance considerably improved
- Database access represents bottleneck
- Some results already in production version
- Currently working on improving the distribution algorithm

• Participants:
- Swiss Re: M. Spühler, P. Pfister et al.
- UniZH: W. Sudholt, M. Packard, H. Mahmood, M. Dänzer, M. Monroe, K. Baldridge

[Chart: Rate time/s (axis from 0 to 14000) for the test cases medium_alm_no_inuring, medium_dlm_with_inuring, large_dlm_no_inuring_1, large_dlm_no_inuring_2, large_dlm_no_inuring_5, and large_dlm_with_inuring_4, run locally and on 2, 4, and 8 nodes.]
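
One way to picture an event set-based distribution is to partition the pre-simulated events into chunks, compute losses per chunk on separate workers, and sum the partial results. The Python sketch below is an illustration under stated assumptions, not the NatCat implementation (which is Java-based): the event and portfolio structures, the `loss_for_event` model, and the round-robin chunking are all hypothetical, and local threads stand in for grid nodes.

```python
from concurrent.futures import ThreadPoolExecutor


def split_event_set(events, n_nodes):
    """Partition events into n_nodes chunks of near-equal size (round-robin)."""
    return [events[i::n_nodes] for i in range(n_nodes)]


def loss_for_event(event, portfolio):
    """Hypothetical loss model: event intensity times insured value at its location."""
    return event["intensity"] * portfolio.get(event["location"], 0.0)


def rate_chunk(chunk, portfolio):
    """Work done by one node: total loss over its share of the event set."""
    return sum(loss_for_event(event, portfolio) for event in chunk)


def rate(events, portfolio, n_nodes=4):
    """Distribute the event set over n_nodes workers and combine the partial sums."""
    chunks = split_event_set(events, n_nodes)
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partial_sums = pool.map(lambda chunk: rate_chunk(chunk, portfolio), chunks)
        return sum(partial_sums)


# Illustrative data only.
portfolio = {"A": 100.0, "B": 200.0}
events = [
    {"location": "A", "intensity": 0.5},
    {"location": "B", "intensity": 0.25},
    {"location": "C", "intensity": 1.0},
    {"location": "A", "intensity": 0.25},
]
total_loss = rate(events, portfolio, n_nodes=2)
```

In this sketch every worker reads the same portfolio data. If that data had to be fetched from a shared database for each chunk, the fetches would serialize, which is consistent with the database bottleneck noted in the results.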


Summary

• Conclusions:
- Grid computing is important for our domain-specific as well as our computer science research and has helped us to establish new local and international collaborations.
- Our group and coworkers now have a lot of experience in grid computing, including project participation and organization, infrastructure setup, software development, and application to scientific problems.
- By developing grid user interfaces, we try to make grids more easily accessible to domain scientists.
- We are building a record of grid projects in computational chemistry and are also reaching out to new fields, concepts, and collaborations.
- Grid computing still requires a lot of effort, but the future is bright if we learn from the successes and failures, are aware of the limits, have clearly defined needs and goals, and do not reinvent the wheel.

• Thanks for funding:
- UniZH, UCSD, SDSC, NSF, DAAD, ETHZ, EU, Swiss Re and others

• Swiss Grid Initiative:
- This is an important initiative, and we are interested in participating.
- We could provide our open source software and expertise.
- We would contribute personnel and hardware resources only for well-defined, collaborative, and financed projects.
- We expect knowledge exchange and dissemination, new collaborations, access to resources and funding, and not much organizational overhead.
- This has to be a win-win situation with mutual trust, clear goals, and freedom for research.