grid computing development at aei - dfn
CeBIT 13 March 2005
Grid Computing Development At AEI
CeBIT
Albert Einstein Institute
Talk prepared by Michael Russell michael.russell@aei.mpg.de
With (many) contributors, including:
Gabrielle Allen gabrielle.allen@aei.mpg.de
Kelly Davis kelly.davis@aei.mpg.de
Robert Engel robert.engel@aei.mpg.de
Hartmut Kaiser hartmut.kaiser@aei.mpg.de
Jason Novotny jason.novotny@aei.mpg.de
Thomas Radke thomas.radke@aei.mpg.de
Ed Seidel ed.seidel@aei.mpg.de
Oliver Wehrens oliver.wehrens@aei.mpg.de
and
Jarek Nabrzyski naber@poznan.man.pl
THE GRID: Dependable, consistent, pervasive access to high-end resources
CACTUS is a freely available, modular, portable and manageable environment for collaboratively developing parallel, high-performance multi-dimensional simulations
www.CactusCode.org
Cactus: Parallel, Collaborative, Modular Application Framework
http://www.CactusCode.org
Open source PSE for scientists and engineers ... USER DRIVEN ... easy parallelism, no new paradigms, flexible, Fortran, legacy codes.
Flesh (ANSI C) provides code infrastructure (parameter, variable, scheduling databases, error handling, APIs, make, parameter parsing).
Thorns (F77/F90/C/C++) are plug-in and swappable modules or collections of subroutines providing both the computational infrastructure and the physical application. Well-defined interface through 3 config files.
Everything implemented as a swappable thorn ... use best available infrastructure without changing application thorns.
Collaborative, remote and Grid tools.
Computational Toolkit: existing thorns for (Parallel) IO, elliptic, MPI unigrid driver, coordinates, interpolations, and more.
Integrate other common packages and tools, HDF5, PETSc, GrACE ...
Modularity of Cactus...
User selects desired functionality… Code created...
[Diagram labels: Cactus Flesh; Application 1; Application 2; Sub-app; Legacy App 2; Symbolic Manip App; AMR (GrACE, etc); Unstructured...; MPI layer 3; I/O layer 2; Remote Steer 2; MDS/Remote Spawn; Globus Metacomputing Services; Abstractions...]
Grid-Enabled Cactus
Cactus and its ancestor codes have been using Grid infrastructure since 1993 ... motivated by simulation requirements ...
Support for Grid computing was part of the design requirements for Cactus 4.0 (experiences with Cactus 3)
Cactus compiles out-of-the-box with Globus [using globus device of MPICH-G(2)]
Design of Cactus means that applications are unaware of the underlying machine/s that the simulation is running on … applications become trivially Grid-enabled
Infrastructure thorns (I/O, driver layers) can be enhanced to make most effective use of the underlying Grid architecture
Involved in lots of ongoing Grid projects ....
Why Grid Computing?
AEI Numerical Relativity Group has access to high-end resources in over ten centers in Europe/USA
They want:
  Bigger simulations, more simulations and faster throughput
  Intuitive IO at local workstation
  No new systems/techniques to master!!
How to make best use of these resources?
  Provide easier access … no one can remember ten usernames, passwords, batch systems, file systems, … great start!!!
  Combine resources for larger production runs (more resolution badly needed!)
  Dynamic scenarios … automatically use what is available
  Remote/collaborative visualization, steering, monitoring
Many other motivations for Grid computing ...
Grids, The Main Idea
The idea is to make computational resources (clusters, data servers, applications, scientific instruments, etc.) as readily available as electrical power….
And to provide computational resources transparently to users of differing levels of expertise and application backgrounds.
Computational services should interact to perform specified tasks efficiently, securely and with minimal human intervention…
This is what we’re trying to build… we’re a long way from this vision!
Grand Picture
[Diagram labels]
Simulations launched from Cactus Portal
Grid enabled Cactus runs on distributed machines
Origin: NCSA
T3E: Garching
Remote steering and monitoring from airport
Remote Viz in St Louis
Remote Viz and steering from Berlin
Viz of data from previous simulations in SF café
DataGrid/DPSS Downsampling
Globus, http, HDF5, IsoSurfaces
Computing On Demand!
[Diagram labels]
Go!
Find best resources
Look for horizon
Found a horizon, try out excision
Calculate/Output Invariants
Calculate/Output Grav. Waves
Add more resources
Free CPUs!!
Clone job with steered parameter
Queue time over, find new machine
Archive data
Sites: NCSA, SDSC, RZG, LRZ
From User’s Point Of View
Cactus Grid Projects
User Portal (KDI Astrophysics Simulation Collaboratory)
  Efficient, easy, access to resources … interfaces to everything else
Collaborative Working Methods (KDI ASC)
Large Scale Distributed Computing (Globus)
  Only way to get the kind of resolution we really need
Remote Monitoring (TiKSL/GriKSL)
  Direct access to simulation from anywhere
Remote Visualization (Live/Offline) (TiKSL/GriKSL)
  Collaborative analysis during simulations/Viz of large datasets
Remote Steering (TiKSL/GriKSL)
  Live collaborative interaction with simulation (eg IO/Analysis)
Dynamic, Adaptive Scenarios (GridLab/GrADs)
  Simulation adapts to changing Grid environment
Make Grid Computing useable/accessible for application users !!
GridLab: Grid Application Toolkit
GridLab Project
Funded by the EU (5+ M€), January 2002 – December 2004
Application and Testbed oriented
  Cactus Code, Triana Workflow, all the other applications that want to be Grid-enabled
Main goal: to develop a Grid Application Toolkit (GAT) and set of grid services and tools...:
  resource management (GRMS),
  data management,
  monitoring,
  adaptive components,
  mobile user support,
  security services,
  portals,
... and test them on a real testbed with real applications
GridLab Is An Architecture
And A Global Effort
PSNC (Poznan) - coordination
AEI (Potsdam)
ZIB (Berlin)
Univ. of Lecce
Cardiff University
Vrije Univ. (Amsterdam)
SZTAKI (Budapest)
Masaryk Univ. (Brno)
NTUA (Athens)
Sun Microsystems
Compaq (HP)
ANL (Chicago, I. Foster)
ISI (LA, C. Kesselman)
UoWisconsin (M. Livny)
collaborating with:
  Users!
    EU Astrophysics Network,
    DFN TiKSL/GriKSL
    NSF ASC Project
  other Grid projects
    Globus, Condor,
    GrADS,
    PROGRESS,
    GriPhyn/iVDGL,
    Most of the other European Grid Projects (GRIDSTART)
    GWEN
GridLab Testbed Snapshot
GridLab Goals
Get Computational Scientists using the “Grid” and Grid services for real, everyday, production work (AEI Relativists, EU Network, Grav Wave Data Analysis, Cactus User Community), all the other potential grid apps
Make it easier for applications to make flexible, efficient, robust, use of the resources available to their virtual organizations
Dream up, prototype, and test new application scenarios which make adaptive, dynamic, wild, and futuristic uses of resources.
What Do Our Users Need?
Application oriented environment
Flexible, easy-to-use, simple interfaces
Efficient and effective use of resources
Robustness, fail-safety, adaptability
The ability to work in distributed teams
Support for mobile working environments
What Do Our Users Want?
Larger computational resources
  Memory/CPU
Faster throughput
  Cleverer scheduling, configurable scheduling, co-scheduling, exploitation of un-used cycles
Easier use of resources
  Portals, grid application frameworks, information services, mobile devices
Remote interaction with simulations and data
  Notification, steering, visualization, data management
Collaborative tools
  Notification, visualization, video conferencing, portals
Dynamic applications, New scenarios
  Grid application frameworks connecting to services
Many Application Scenarios!
Dynamic Staging
  move to faster/cheaper/bigger machine
Multiple Universe
  create clone to investigate steered parameter
Automatic Convergence Testing
  from initial data or initiated during simulation
Look Ahead
  spawn off and run coarser resolution to predict likely future
Spawn Independent/Asynchronous Tasks
  send to cheaper machine, main simulation carries on
Application Profiling
  best machine/queue
  choose resolution parameters based on queue
Dynamic Load Balancing
  inhomogeneous loads
  multiple grids
Portal
  User/virtual organisation interface to the grid.
Intelligent Parameter Surveys
  farm out to different machines
Make use of
  Running with management tools such as Condor, Entropia, etc.
  Scripting thorns (management, launching new jobs, etc)
  Dynamic use of eg MDS for finding available resources
Our Role in GridLab
Development of the Grid Application Toolkit
Development of application scenarios using GAT and GridLab technologies
Numerical relativists are the target user group for GridLab.
Development of GridSphere Portal Framework
Requirements and design for data management tools and visualization services
General support of GridLab services on our production resources
Grid Application Toolkit
The GAT provides functionality through a carefully constructed set of generic high-level APIs, through which an application will be able to call the underlying grid services,
Set of application developer APIs for Grid tools, services and software libraries, (and example implementations) that support the development of grid-enabled applications (open source!)
Usable from any high level “application” (any generic code, Cactus, Triana, Portals, Scripts, …)
GAT Goals
The GAT provides an API and an associated set of tools which enable end-users and application developers to make easy and flexible use of the Grid,
The infrastructure, and in particular the GAT, must allow developers to develop their applications independently of the deployment of grid services,
Users must be able to make use of such applications in the absence of a fully-deployed infrastructure.
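One way to picture the last goal (applications keep working when grid services are missing) is an ordered list of interchangeable backends, where the first one that succeeds handles the request. The sketch below is only an illustration of that idea under our own assumptions: CopyBackend and copy_with_fallback are hypothetical names and are not part of the GAT API.

```cpp
#include <functional>
#include <string>
#include <vector>

// Hypothetical names throughout; this is not the GAT API.
// A backend wraps one way of copying a file (a grid service, plain local copy, ...).
struct CopyBackend {
    std::string name;
    // Returns true on success, false if the backend is unavailable or fails.
    std::function<bool(const std::string&, const std::string&)> copy;
};

// Try each backend in order and return the name of the one that handled
// the request. An empty string means every backend failed.
std::string copy_with_fallback(const std::vector<CopyBackend>& backends,
                               const std::string& source,
                               const std::string& target) {
    for (const CopyBackend& backend : backends) {
        if (backend.copy && backend.copy(source, target)) {
            return backend.name;
        }
    }
    return std::string();
}
```

With a grid backend listed first and a local copy listed last, the same application call works on a fully deployed testbed and on a laptop without network access; only the list of backends changes.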
The Grid is complex …
[Diagram labels]
Application: “Is there a better resource I could be using?”
Services: Monitoring, Resource Management, Information, Security, Data Management, Application Manager, Logging, Notification, Migration, Profiling
Infrastructure: GLOBUS, UNICORE, Other Grid Infrastructure?
Interfaces: SOAP, WSDL, Corba, OGSA, Other
…need to make it easier to use
[Diagram labels]
Application: “Is there a better resource I could be using?”
GAT: GAT_FindResource( )
The Grid
The Same Application …
[Diagram: three identical Application / GAT stacks, labelled Laptop, The Grid, Super Computer]
No network! Firewall issues!
Why Another Grid-API?
The situation today:
  Grids: everywhere
    Supposedly. At least many projects ☺
  Grid applications: nowhere
    Almost. At least our experience that this is difficult, GGF APPS group
Why is this?
  Application programmers accept the Grid as a computing paradigm only very slowly.
Problems: (multifold and often cited - amongst others)
  Interfaces are NOT simple (see next slides...)
    Typical Globus code... ahem... ☺
  Different and evolving interfaces to the 'Grid'
    Versions, new services, new implementations, WSDL does not solve all problems at all
  Environment changes in many ways
    Globus, grid members, services, network, applications, ...
Dynamic Middleware
Globus, Unicore, my_service, your_service, ...
The same functionality has different interfaces all over the place.
  But you don't want to recompile your app every time, not to speak of recoding...
  WSDL does not mean the end of all problems (see CoG code), but the beginning of new ones... - on application level, WSDL is not trivial enough
Restricting yourself to Globus does not help either: version changes every couple of months (2.4.x, 3.2.y, 4.a.b)
  and gets bug fixes. Changes often are MAJOR - we have seen a number of them over the last couple of years...
The application that runs today will fail tomorrow!
  Right now, it is basically impossible for a programmer to focus on the science, not on IT (i.e. Grid) problems.
Dynamic Grids
Services (and interfaces) get exchanged (“upgraded”) on a regular basis
  That is related to the point above, but also a social problem!
Institutions (resources, services, applications) join/leave YOUR grid without (much) notice.
  The grid is designed to ease and simplify that kind of fluctuation - it's not a bug, it's a feature!
  But the applications are not able to make use of that feature right now …
The Grid changes AT RUNTIME – services go down, resources get busy/free, disks and storage nodes are empty/full, ... THINGS CONSTANTLY CHANGE.
  Today Grid middleware allows applications to cope with that, but utilizing that in an intelligent way is a major programming effort, and bloats the application with code that needs constant maintenance...
Applications need LOTS of code for handling transient problems.
Most applications share most of these problems, but code reuse is difficult/impossible.
  We can reuse the Globus libraries, right, but isn't every project re-inventing its own abstraction layer for these?
  In our experience/projects: they do!
Aren’t we all re-inventing abstraction layers for this?
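A small example of the kind of transient-problem handling that every application ends up writing for itself is a retry wrapper. The sketch below is an assumption-laden illustration, not code from Globus or GAT: retry_transient is a hypothetical helper, and a real version would likely also wait between attempts and distinguish transient from permanent errors.

```cpp
#include <functional>

// Hypothetical helper, not part of Globus or GAT.
// Calls `operation` until it succeeds or `max_attempts` is exhausted.
// Returns the number of attempts used on success, or 0 if all attempts failed.
int retry_transient(const std::function<bool()>& operation, int max_attempts) {
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        if (operation()) {
            return attempt;
        }
    }
    return 0;
}
```

Keeping this logic in one shared layer, rather than repeating it around every service call, is the sort of reuse the slide argues applications are currently missing.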
Copy a File: Globus GASS
int RemoteFile::GetFile (char const* source, char const* target) {
    globus_url_t                        source_url;
    globus_io_handle_t                  dest_io_handle;
    globus_ftp_client_operationattr_t   source_ftp_attr;
    globus_result_t                     result;
    globus_gass_transfer_requestattr_t  source_gass_attr;
    globus_gass_copy_attr_t             source_gass_copy_attr;
    globus_gass_copy_handle_t           gass_copy_handle;
    globus_gass_copy_handleattr_t       gass_copy_handleattr;
    globus_ftp_client_handleattr_t      ftp_handleattr;
    globus_io_attr_t                    io_attr;
    int                                 output_file = -1;

    /* parse the source URL and check that the protocol is supported */
    if ( globus_url_parse (source, &source_url) != GLOBUS_SUCCESS ) {
        printf ("can not parse source_URL \"%s\"\n", source);
        return (-1);
    }
    if ( source_url.scheme_type != GLOBUS_URL_SCHEME_GSIFTP &&
         source_url.scheme_type != GLOBUS_URL_SCHEME_FTP &&
         source_url.scheme_type != GLOBUS_URL_SCHEME_HTTP &&
         source_url.scheme_type != GLOBUS_URL_SCHEME_HTTPS ) {
        printf ("can not copy from %s - wrong prot\n", source);
        return (-1);
    }

    /* initialise the attribute and handle structures */
    globus_gass_copy_handleattr_init (&gass_copy_handleattr);
    globus_gass_copy_attr_init (&source_gass_copy_attr);
    globus_ftp_client_handleattr_init (&ftp_handleattr);
    globus_io_fileattr_init (&io_attr);
    globus_gass_copy_attr_set_io (&source_gass_copy_attr, &io_attr);
    globus_gass_copy_handleattr_set_ftp_attr
                                (&gass_copy_handleattr,
                                 &ftp_handleattr);
    globus_gass_copy_handle_init (&gass_copy_handle,
                                  &gass_copy_handleattr);

    /* protocol-specific attributes: (GSI)FTP or GASS */
    if (source_url.scheme_type == GLOBUS_URL_SCHEME_GSIFTP ||
        source_url.scheme_type == GLOBUS_URL_SCHEME_FTP ) {
        globus_ftp_client_operationattr_init (&source_ftp_attr);
        globus_gass_copy_attr_set_ftp (&source_gass_copy_attr,
                                       &source_ftp_attr);
    }
    else {
        globus_gass_transfer_requestattr_init (&source_gass_attr,
                                               source_url.scheme);
        globus_gass_copy_attr_set_gass(&source_gass_copy_attr,
                                       &source_gass_attr);
    }

    /* open the local target file */
    output_file = globus_libc_open ((char*) target,
                                    O_WRONLY | O_TRUNC | O_CREAT,
                                    S_IRUSR | S_IWUSR | S_IRGRP |
                                    S_IWGRP);
    if ( output_file == -1 ) {
        printf ("could not open the file \"%s\"\n", target);
        return (-1);
    }

    /* convert the output file descriptor to a globus_io_handle */
    if ( globus_io_file_posix_convert (output_file, 0,
                                       &dest_io_handle)
         != GLOBUS_SUCCESS) {
        printf ("Error converting the file handle\n");
        return (-1);
    }

    /* start the transfer */
    result = globus_gass_copy_register_url_to_handle (
                 &gass_copy_handle, (char*)source,
                 &source_gass_copy_attr, &dest_io_handle,
                 my_callback, NULL);
    if ( result != GLOBUS_SUCCESS ) {
        printf ("error: %s\n", globus_object_printable_to_string
                                   (globus_error_get (result)));
        return (-1);
    }

    globus_url_destroy (&source_url);
    return (0);
}
Copy a File: CoG/RFT
package org.globus.ogsa.gui;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.net.URL;
import java.util.Date;
import java.util.Vector;
import javax.xml.rpc.Stub;
import org.apache.axis.message.MessageElement;
import org.apache.axis.utils.XMLUtils;
import org.globus.*;
import org.gridforum.ogsi.*;
import org.gridforum.ogsi.holders.TerminationTimeTypeHolder;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class RFTClient {
    public static void copy (String source_url, String target_url) {
        try {
            File requestFile = new File (source_url);
            BufferedReader reader = null;
            try {
                reader = new BufferedReader (new FileReader (requestFile));
            } catch (java.io.FileNotFoundException fnfe) { }

            Vector requestData = new Vector ();
            requestData.add (target_url);

            TransferType[] transfers1 = new TransferType[transferCount];

            // transfer options are read positionally from the request data
            RFTOptionsType multirftOptions = new RFTOptionsType ();
            multirftOptions.setBinary (Boolean.valueOf (
                (String)requestData.elementAt (0)).booleanValue ());
            multirftOptions.setBlockSize (Integer.valueOf (
                (String)requestData.elementAt (1)).intValue ());
            multirftOptions.setTcpBufferSize (Integer.valueOf (
                (String)requestData.elementAt (2)).intValue ());
            multirftOptions.setNotpt (Boolean.valueOf (
                (String)requestData.elementAt (3)).booleanValue ());
            multirftOptions.setParallelStreams (Integer.valueOf (
                (String)requestData.elementAt (4)).intValue ());
            multirftOptions.setDcau(Boolean.valueOf(
                (String)requestData.elementAt (5)).booleanValue ());

            // one source/destination URL pair per transfer
            int i = 7;
            for (int j = 0; j < transfers1.length; j++)
            {
                transfers1[j] = new TransferType ();
                transfers1[j].setTransferId (j);
                transfers1[j].setSourceUrl ((String)requestData.elementAt (i++));
                transfers1[j].setDestinationUrl ((String)requestData.elementAt (i++));
                transfers1[j].setRftOptions (multirftOptions);
            }

            TransferRequestType transferRequest = new TransferRequestType ();
            transferRequest.setTransferArray (transfers1);

            int concurrency = Integer.valueOf
                ((String)requestData.elementAt(6)).intValue();
            if (concurrency > transfers1.length)
            {
                System.out.println ("Concurrency should be less than the number "
                                    + "of transfers in the request");
                System.exit (0);
            }
            transferRequest.setConcurrency (concurrency);

            TransferRequestElement requestElement = new TransferRequestElement ();
            requestElement.setTransferRequest (transferRequest);

            ExtensibilityType extension = new ExtensibilityType ();
            extension = AnyHelper.getExtensibility (requestElement);

            // create the RFT service instance through its factory
            OGSIServiceGridLocator factoryService = new OGSIServiceGridLocator ();
            Factory factory = factoryService.getFactoryPort (new URL (source_url));
            GridServiceFactory gridFactory = new GridServiceFactory (factory);
            LocatorType locator = gridFactory.createService (extension);
            System.out.println ("Created an instance of Multi-RFT");

            MultiFileRFTDefinitionServiceGridLocator loc
                = new MultiFileRFTDefinitionServiceGridLocator();
            RFTPortType rftPort = loc.getMultiFileRFTDefinitionPort (locator);

            // security settings on the service stub
            ((Stub)rftPort)._setProperty (Constants.AUTHORIZATION,
                                          NoAuthorization.getInstance());
            ((Stub)rftPort)._setProperty (GSIConstants.GSI_MODE,
                                          GSIConstants.GSI_MODE_FULL_DELEG);
            ((Stub)rftPort)._setProperty (Constants.GSI_SEC_CONV,
                                          Constants.SIGNATURE);
            ((Stub)rftPort)._setProperty (Constants.GRIM_POLICY_HANDLER,
                                          new IgnoreProxyPolicyHandler ());

            int requestid = rftPort.start ();
            System.out.println ("Request id: " + requestid);
        }
        catch (Exception e)
        {
            System.err.println (MessageUtils.toString (e));
        }
    }
}
Copy a File: GAT/C
#include <GAT.h>

GATResult RemoteFile_GetFile (GATContext context, char const* source_url, char const* target_url)
{
    GATStatus status = 0;
    GATLocation source = GATLocation_Create (source_url);
    GATLocation target = GATLocation_Create (target_url);
    GATFile file = GATFile_Create (context, source, 0);

    if (source == 0 || target == 0 || file == 0) {
        return GAT_MEMORYFAILURE;
    }

    if ( GATFile_Copy (file, target, GATFileMode_Overwrite) != GAT_SUCCESS ) {
        GATContext_GetCurrentStatus (context, &status);
        return GATStatus_GetStatusCode (status);
    }

    GATFile_Destroy (&file);
    GATLocation_Destroy (&target);
    GATLocation_Destroy (&source);

    return GATStatus_GetStatusCode (status);
}
Copy a File: GAT/C++
#include <GAT++.hpp>

GAT::Result RemoteFile::GetFile (GAT::Context context, std::string source_url, std::string target_url)
{
    try {
        GAT::File file (context, source_url);
        file.Copy (target_url);
    }
    catch (GAT::Exception const &e) {
        std::cerr << "Some error: " << e.what() << std::endl;
        return e.Result();
    }
    return GAT_SUCCESS;
}
Code Statistics
Code          GASS   CoG   C GAT   C++ GAT
Lines total    100    80      20        15
Init            30    30       5         2
Action          30    30       1         1
Cleanup         20    10       9         2
Language        10     5       5         5
GridLab & Cactus
Cactus/GAT Integration
[Diagram labels]
Cactus Flesh
Thorns: Physics and Computational Infrastructure Modules
CGAT Thorn: Cactus GAT wrappers, Additional functionality, Build system
GAT Library
GridLab Services
Task Farming on the Grid
[Diagram: one TFM connected to several further TFMs]
TFM implemented in Cactus
GAT (GRAM, GRMS) used for starting remote TFMs
Designed for the Grid
Tasks can be anything
fork/exec
Task Farming Motivation
Requested by local physics group
  Parameter surveys, e.g. looking for critical phenomena in gravitational wave collapse by varying amplitude, testing different formalisms of Einstein Equations for evolving same initial data
Scenario is inherently quite robust and fault tolerant
Good migration path to the Grid
  Start easy (not too much Grid!), task farm across local homogeneous workstations and on single supercomputers.
  Use public keys first, then test standard Grid infrastructure
  Use of GAT then means users can start testing GridLab services (should still work for them if services not ready)
  CGAT team can then test real physics runs using wider Grid and GridLab services.
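The fault tolerance mentioned above comes from the tasks being independent: a task that fails on one machine can simply be handed to another. The following is a minimal sketch of that idea, not the Cactus TFM thorn; Worker and farm_tasks are hypothetical names, and the workers here are plain in-process callbacks rather than remote jobs.

```cpp
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch; not the Cactus TFM thorn.
// A worker takes a task id and returns true if the task completed.
using Worker = std::function<bool(int)>;

// Hands tasks to workers round-robin. A failed task goes back on the queue
// and is offered to the next worker, up to `max_tries` attempts per task.
// Returns the ids of the tasks that completed, in completion order.
std::vector<int> farm_tasks(const std::vector<int>& tasks,
                            const std::vector<Worker>& workers,
                            int max_tries) {
    std::vector<int> done;
    if (workers.empty()) {
        return done;
    }
    std::deque<std::pair<int, int>> queue;  // (task id, tries so far)
    for (int id : tasks) {
        queue.emplace_back(id, 0);
    }
    std::size_t next = 0;
    while (!queue.empty()) {
        std::pair<int, int> item = queue.front();
        queue.pop_front();
        const Worker& worker = workers[next];
        next = (next + 1) % workers.size();
        if (worker(item.first)) {
            done.push_back(item.first);
        } else if (item.second + 1 < max_tries) {
            queue.emplace_back(item.first, item.second + 1);
        }
    }
    return done;
}
```

In a parameter survey each task id would stand for one parameter set (for example one amplitude), and each worker for one machine or queue.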
GridSphere Portal
What is a Grid Portal?
“A portal is a web based application that commonly provides personalization, single sign on, content aggregation from different sources and hosts the presentation layer of Information Systems” (JSR 168)
Grid Portals build upon the familiar Web portal model, such as Yahoo or Amazon, to deliver the benefits of Grid computing to virtual communities of users, providing a single access point to Grid services and resources.
Developing Grid Portals
Grid web application development still remains a tedious task with little in the way of reusable components, forcing developers to constantly “re-invent” the wheel.
Often difficult and hard to maintain glue code must be written connecting the portal to Grid services, due to lack of/evolving standards.
Lack of real usability has made it difficult to test and evaluate user interfaces.
A Portal is only as good as the underlying deployed infrastructure…. Portal development often involves debugging underlying middleware
Early Grid Portal Projects
Grid-Port:
  Perl based framework developed by Mary Thomas and Steve Mock at San Diego Supercomputer Center (SDSC)
Grid Portal Development Toolkit (GPDK):
  Developed by Jason Novotny at Lawrence Berkeley National Laboratory (LBNL)
Astrophysics Simulation Collaboratory (ASC):
  Developed by Michael Russell at University of Chicago
GridSphere 2.0
Personalized Environment
Single Sign-On Capabilities
Submit Jobs
Perform File Transfers
Manage Resources
Value Added Services
Data Mgmt And Viz Tools
Some Achievements…
Many successful demos and awards…
Our software is being adopted by many groups and large-scale projects around the world, including the D-Grid Initiative and HPC Europa!
Our ideas and technologies are becoming topics of research at conferences like the Global Grid Forum.
But we’re only now in the process of putting our technologies to use here at AEI.
Demo at GGF10!
Prepared a demo at GGF announcing kickoff of the “Deutsche-Grid”
Migrating a test application that was put together by our partners at AEI to help us build solutions tailored to their needs.
Yet the lessons learned here apply to a large class of applications!
SC2002, Baltimore
Varied applications deployed on the GGTC testbed
  Cactus Black Hole Simulations
  ASC GridLab Portal
  Smith-Waterman
  Nimrod-G
  GridLab Task Farming scenario
  Visapult
Highlights
  GGTC won 2 of the 3 HPC Awards
  Won (with Visapult/LBL group) Bandwidth Challenge
  $2000 prize money to UNICEF children's fund
Global Grid Testbed Collaboration (GGTC)
Driven by GGF APPS and GridLab testbed and applications
Whole testbed constructed very swiftly (few weeks)
5 continents: North America, Europe, Asia, Africa, Australia
Over 14 countries, including:
  China, Japan, Singapore, S.Korea, Egypt, Australia, Canada, Germany, UK, Netherlands, Czech, Hungary, Poland, USA
About 70 machines, with thousands of processors (~7500)
Many hardware types, including PS2, IA32, IA64, MIPS, IBM Power, Alpha, Hitachi/PPC, Sparc
Many OSs, including Linux, Irix, AIX, OSF, Tru64, Solaris, Hitachi
Many different organizations (big centers/individuals)
All ran same Grid infrastructure! (Globus)
Global Grid Testbed Collaboration Map
Bandwidth Challenge: Highest Performing Application
Distributed simulations using Cactus, Globus and Visapult
With John Shalf/LBL and others
16.8 Gigabits/second
scinet.supercomp.org/bwc
Six sites: USA/Dutch/Czech/Poland
Grid-xclock
Simple application for testing and debugging.
xclock is standard X utility, run on any machine with X installed
Requires:
o xclock binary
o X libraries
o To display remotely, need to open outgoing ports from machine it is running on to machine displaying
Preparing for Production
Now we have some basic tools and APIs available for alpha / beta testing.
While there are many enhancements we have planned for GAT, GridSphere, etc, we are turning our attention to the needs of our own users at AEI and members of the GridLab Virtual Organization.
GridLab technologies are beginning to mature, and this means we can start building real solutions for the physicists at AEI, the reason why we are here in the first place.
Building a Production Grid
Constructing a Grid that includes HPC resources from LSU-AEI-KISTI.
Going to require that users access this Grid with our software; this encourages both better software design and new ways of thinking about how best to exploit this Grid.
This effort will build upon the technologies and expertise we’ve been developing here at AEI and with our partners in the GridLab Project.
The Cactus Portal…
Goal is to build a production Grid portal to support the use of Cactus applications on Grid.
Support for job submission and tracking.
Data management tools.
Higher-level visualization services.
Automated software deployment.
Notification services (e.g. AIM, Email, SMS).
SSH access to resources from portal.
Improved credential management.
And whatever else our users want!
Conclusion…
We have a long way to go, but we’ve made real progress and this is the year we get to test our work.
We’re looking not just to support the scientists at this institute, but to get input and collaboration from communities around the world, from varying application backgrounds.
Visit www.gridlab.org for more info!