AcrossGrids Conference – Santiago 13 of February 2003
First Prototype of the CrossGrid Testbed
Jorge Gomes (LIP), on behalf of X# WP4
CrossGrid testbed
– The CrossGrid international testbed includes 16 sites across 9 European countries.
– The site list includes:
  • Small computing facilities in universities and research centers.
  • Large computing centers.
– CrossGrid offers an ideal mixture of sites to test the possibilities of the grid technologies.
– It is expected that more sites will join in.
CrossGrid testbed sites
The testbed sites seen from the CrossGrid Mapcenter.
Mapcenter was developed by DataGrid.
The CrossGrid Mapcenter is maintained by LIP.
http://mapcenter.lip.pt
Testbed middleware
– One of the aims of CrossGrid is to extend the grid coverage in Europe, hence it needs to:
  • be compatible with other testbeds, namely DataGrid.
  • develop and build upon existing middleware such as Globus and EDG.
– Since the CrossGrid middleware is still being developed, the initial testbed was based entirely on EDG and Globus middleware.
CrossGrid testbeds
– The CrossGrid initial testbed deployment started in May 2002 with 4 sites.
– The testbed was set up in the context of the quality assurance activities and was used to test and validate EDG 1.2.x.
– The initial testbed has grown and was recently divided in two.
Initial testbed: EDG 1.2.2/3
Production testbed: EDG 1.2.2/3
Validation testbed: EDG 1.4.3
CrossGrid testbeds
– Production testbed
  • Used to run applications.
– Validation testbed
  • Used to test new production middleware.
– Development testbed
  • Used to support the development of middleware, applications and integration of new testbed releases.
– In the future three testbeds will coexist.
[Diagram: release process linking the development, validation and production testbeds]
CrossGrid testbed resources
                    Production testbed  Validation testbed
Computing Elements  14                  3
Worker Nodes        69                  4
CPUs                110                 5
Storage Elements    14                  3
Storage capacity    2.7 TB              1.2 TB
– Of the 16 sites foreseen:
  • 10 are fully available.
  • 2 are deployed and being tested.
  • 2 are currently in the validation testbed.
  • 2 are deployed but not yet available (not tested).
– The X# testbeds already offer considerable computing and storage resources.
Networking
• Spain
• Portugal
• Poland
• Germany
• Greece
• Cyprus
• Ireland
• Slovakia
• Netherlands
• Austria
– CrossGrid uses the Géant backbone for international connectivity.
Networking
– Site connectivity is provided by the National Research Networks.
[Map: the sites below, connected through Géant]
• UCY Nikosia
• DEMO Athens
• Auth Thessaloniki
• CYFRONET Cracow
• ICM & IPJ Warsaw
• PSNC Poznan
• CSIC IFIC Valencia
• UAB Barcelona
• CSIC-UC IFCA Santander
• CSIC RedIris Madrid
• LIP Lisbon
• USC Santiago
• TCD Dublin
• UvA Amsterdam
• FZK Karlsruhe
• II SAS Bratislava
X# Production sites status
Site        Location      Status
CYFRONET    Cracow        Running 1.2.2.
ICM         Warsaw        Running 1.2.2.
INS         Warsaw        Running 1.2.2.
UvA         Amsterdam     Deployed 1.2.2.
FZK         Karlsruhe     Running 1.4.4, temporarily in the test and validation testbed.
IISAS       Bratislava    Running 1.2.3.
PSNC        Poznan        Running 1.2.2.
UCY         Nikosia       Deployed 1.2.2, being configured.
TCD         Dublin        Running 1.2.3.
IFIC        Valencia      Running 1.2.3.
IFCA        Santander     Deployed 1.2.2.
UAB         Barcelona     Deployed 1.2.2, under tests.
USC/CESGA   Santiago      Running 1.2.2.
Demokritos  Athens        Running 1.4.4, temporarily in the test and validation testbed.
AUTH        Thessaloniki  Running 1.2.2.
LIP         Lisbon        Running 1.2.3.
The X# production Resource Broker
1. Job requests (JDL) are submitted from remote UIs.
2. Jobs are sent to the RB located at LIP.
3. The RB uses site information in the matchmaking.
4. The RB submits the job to a CE using GRAM.
[Diagram: any CrossGrid User Interface sends a JDL job request to the Resource Broker and JSS at the central site in Lisbon; the II supplies site information, and jobs go to any X# remote site.]
Remote sites: Cracow, Lisbon, Valencia, Bratislava, Poznan, Barcelona, Warsaw, Thessaloniki, Santiago, Dublin, Karlsruhe (not available), Athens (not available).
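The four numbered steps on this slide can be illustrated with a toy simulation. This is only a minimal sketch, not the EDG Resource Broker: the `Site` record, the `match` and `broker` functions, the ranking rule (most free CPUs) and all site figures are hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical snapshot of what a site publishes about its CE."""
    name: str
    available: bool
    free_cpus: int


def match(sites, cpus_needed=1):
    """Step 3: pick the available site with the most free CPUs (toy ranking)."""
    candidates = [s for s in sites if s.available and s.free_cpus >= cpus_needed]
    if not candidates:
        return None  # no good match: the job is not submitted
    return max(candidates, key=lambda s: s.free_cpus)


def broker(job, sites):
    """Steps 2-4: the RB receives the job, matches it, and hands it to a CE."""
    site = match(sites)
    if site is None:
        return f"{job}: no matching CE"
    # In the testbed this hand-off is a GRAM submission; here it is only a string.
    return f"{job}: submitted to CE at {site.name}"


# Illustrative numbers only; they are not taken from the slides.
sites = [
    Site("Lisbon", True, 4),
    Site("Cracow", True, 9),
    Site("Karlsruhe", False, 20),  # marked "not available" in the diagram
]
print(broker("hostname-job", sites))
```

Sites marked as not available are skipped even when they advertise free CPUs, which mirrors how Karlsruhe and Athens are excluded from the production broker in the diagram.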
Central X# production services
The CrossGrid production central services are located in Lisbon and maintained by LIP.
Central services: Monitoring, LCFG, VO, RC, RB, MyProxy
Networks shown: LIP network, Portuguese Research Network, GÉANT network
Legend:
  MyProxy: Certification proxy
  RB: Resource broker
  RC: Replica catalogue
  VO: Virtual organisation server
  UI: User interface
  Monitoring: Grid monitoring
The X# production Replica Catalogue
– The production RC:
  • Basically an LDAP server.
  • Hosted at lngrid08.lip.pt port 9980.
  • Used by the RB and RM.
VO         Collection  Description
crossgrid  cgtst0      CrossGrid collection
wpsix      wpsixtst0   Used in tests
atlas      atlastst0   Used in tests
cms        cmstst0     Not used
URL: ldap://lngrid08.lip.pt:9980/rc=CrossGridReplicaCatalogue,dc=lngrid08,dc=lip,dc=pt
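The replica catalogue URL quoted on this slide packs the host, the port and the LDAP base DN into one string. As a minimal sketch using only the Python standard library, it can be taken apart as follows; the helper name `parse_rc_url` is an assumption for illustration, and no connection to the server is attempted.

```python
from urllib.parse import urlsplit

# The URL from the slide, joined into a single string.
RC_URL = "ldap://lngrid08.lip.pt:9980/rc=CrossGridReplicaCatalogue,dc=lngrid08,dc=lip,dc=pt"


def parse_rc_url(url):
    """Split a replica catalogue LDAP URL into host, port, base DN and its parts."""
    parts = urlsplit(url)
    base_dn = parts.path.lstrip("/")
    # A DN is a comma-separated list of attribute=value pairs.
    rdns = [tuple(rdn.split("=", 1)) for rdn in base_dn.split(",")]
    return parts.hostname, parts.port, base_dn, rdns


host, port, base_dn, rdns = parse_rc_url(RC_URL)
print(host, port)
print(rdns[0])
```

The host and port agree with the "Hosted at lngrid08.lip.pt port 9980" bullet, and the first component of the DN names the catalogue itself.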
X# validation sites status
Site        Location   Status
FZK         Karlsruhe  Running 1.4.3, registered into the test and validation RB. Shared with production.
Demokritos  Athens     Running 1.4.3, registered into the test and validation RB. Shared with production.
LIP         Lisbon     Running 1.4.3, registered into the test and validation RB. Dedicated infrastructure.
– The validation testbed was created in the context of task 4.4, testbed quality assurance.
– Currently EDG 1.4.3 is being tested.
– All three sites have been successfully deployed.
– The central services for the validation testbed have been successfully deployed at LIP.
The X# validation Resource Broker
1. Job requests (JDL) are submitted from remote UIs.
2. Jobs are sent to the RB located at LIP.
3. The RB uses site information in the matchmaking.
4. The RB submits the job to a CE using GRAM.
[Diagram: any CrossGrid User Interface sends a JDL job request to the Resource Broker and JSS at the central site in Lisbon; jobs go to any X# remote site. The Information Index is labelled "Self registration" and "New server".]
Remote sites: Lisbon, Karlsruhe, Athens.
Central X# validation services
The CrossGrid validation central services are located in Lisbon and maintained by LIP.
Central services: Monitoring, LCFG, VO, RC, RB, MyProxy, II
Networks shown: LIP network, Portuguese Research Network, GÉANT network
Legend:
  MyProxy: Certification proxy
  RB: Resource broker
  RC: Replica catalogue
  VO: Virtual organisation server
  UI: User interface
  Monitoring: Grid monitoring
  II: Information Index
The X# validation Replica Catalogue
– The validation RC:
  • Basically an LDAP server.
  • Hosted at rc01.lip.pt port 9980.
  • Used by the RB and RM.
VO         Collection  Description
crossgrid  cg          CrossGrid collection
URL: ldap://rc01.lip.pt:9980/rc=CG Replica Catalog,dc=rc01,dc=lip,dc=pt
Production and validation systems hosted at LIP
Production:
  Local resources: Gatekeeper (lngrid02), WN (...), SE (lngrid03), UI (lngrid05)
  Central services: RB (lngrid06), RC (lngrid08)
Test and validation:
  Central services: RB (rb01), RC (rc01), II (ii01)
  Local resources: Gatekeeper (ce01), WN (...), SE (se01), UI (ui01)
Shared: LCFG (lngrid01), CA (OFFLINE), MyProxy (lngrid07), VO (lnnet05), Monitoring (lnnet07)
Virtual Organizations
– CrossGrid has a dedicated VO server
  • The VO server is used to build the authorization databases of the X# testbed systems.
  • Currently it is an LDAP server.
  • Hosted at grid-vo.lip.pt port 9990.
  • 43 users are registered in the crossgrid VO.
VO           Group     Description
crossgrid    testbed1  All CrossGrid users
cgTV         alpha     Test and validation experts
cgTV         beta      Test and validation users
gdmpservers  apptb     All production GDMP servers
gdmpservers  tvtb      All validation GDMP servers
gdmpservers  devtb     Not used
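On Globus-based systems one common form of authorization database is a grid-mapfile, in which each line maps a quoted certificate subject name to a local account. The sketch below assumes that format and shows how a member list could be turned into such a file. The subject names, the account name `cguser` and the function `render_gridmap` are hypothetical; a real site would obtain the member list from the VO server rather than from a hard-coded list.

```python
def render_gridmap(members, account):
    """Return grid-mapfile text: one '"<subject DN>" <local account>' line per member."""
    lines = []
    for dn in sorted(set(members)):  # de-duplicate and keep the output stable
        lines.append(f'"{dn}" {account}')
    return "\n".join(lines) + "\n"


# Hypothetical subject names standing in for entries of the crossgrid VO.
members = [
    "/C=PT/O=ExampleCA/CN=Example User One",
    "/C=ES/O=ExampleCA/CN=Example User Two",
    "/C=PT/O=ExampleCA/CN=Example User One",  # duplicate on purpose
]
print(render_gridmap(members, "cguser"), end="")
```

Regenerating the file from the VO membership, instead of editing it by hand at each site, is what keeps the authorization data consistent across the testbed.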
Certification Authorities
– Five new CAs were created and are now recognized by CrossGrid.
– All CAs are operational, issuing certs and CRLs.
– All CAs are recognized by the DataGrid Certification Authorities Task force (Cyprus is currently finishing the acceptance process).
Country      CrossGrid              DataGrid
Poland       Deployed and accepted  Accepted in October
Netherlands  Accepted               Already in DataGrid
Germany      Deployed and accepted  Accepted in June
Slovakia     Deployed and accepted  Accepted in October
Ireland      Accepted               Already in DataGrid
Spain        Accepted               Already in DataGrid
Greece       Deployed and accepted  Accepted in October
Portugal     Accepted               Already in DataGrid
Cyprus       Deployed and accepted  Acceptance in progress
Testbed support
– The CrossGrid helpdesk application is being tested and is almost ready.
– The current sources of support are:
  • [email protected]
  • http://grid.ifca.unican.es/crossgrid/wp4
– The support for the central services is currently provided by LIP.
  • [email protected]
  • http://www.lip.pt/computing/cg-services
  • http://www.lip.pt/computing/cg-tv-services
Testbed support (2)
– A support knowledge database with solutions for common problems has been adapted and is being tested:
  • Web access (PHP + SQL).
  • Users will be able to send questions.
  • Questions will be routed to the right expert.
  • User and administrator level almost finished.
– Unified CrossGrid/DataGrid helpdesk
  • One support DB for both projects.
  • Extend the user support team.
  • Some helpdesk guidelines already agreed.
Software repository
– A CVS repository has been established at FZK in Karlsruhe.
– A web portal based on GNU Savannah was deployed to interface with the repository.
– Savannah is based on SourceForge 2.0.
– Savannah was customized to the X# needs.
– The repository is now in production and is being used by the whole project.
http://gridportal.fzk.de
Software repository (2)
Testbed monitoring
Mapcenter: grid monitoring framework.
Mapcenter was developed by DataGrid.
The CrossGrid Mapcenter is maintained by LIP.
Excellent tool to monitor the availability of sites and services
http://mapcenter.lip.pt
X# host check tool
Host Check: grid host checker.
Host Check was developed to support the CrossGrid testbed deployment.
Host Check produces a detailed report for each testbed CE and SE.
http://www.lip.pt/computing/cg-services/site_check
Production RB statistics
– The peak usage of the RB was between last November and December.
– Since the RB doesn’t support parallel jobs, most job submissions go unnoticed by the RB.
Total users            33
Jobs submitted         1943
Jobs accepted          1904
Jobs with good match   1799
Jobs submitted by JSS  1781
Jobs run               1620
Jobs done              1070
Data from end of January
Validation RB statistics
– The test and validation RB has been established recently.
– The validation RB also doesn’t support parallel applications.
Total users            8
Jobs submitted         4173
Jobs accepted          4173
Jobs with good match   4010  (163 matching failures)
Jobs submitted by JSS  4007  (3 not submitted)
Jobs run               3964  (43 jobs aborted)
Jobs done              3954
Overall: 219 jobs lost, 94.8% success
Data from end of January
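The failure annotations on the validation figures follow from the differences between consecutive rows of the table. The sketch below reproduces that arithmetic using only the counts shown on the two RB statistics slides; the stage labels and the helper functions `losses` and `success_rate` are illustrative names, not part of any CrossGrid tool.

```python
# Counts copied from the two RB statistics slides (data from end of January).
STAGES = ["submitted", "accepted", "good match", "submitted by JSS", "run", "done"]
PRODUCTION = [1943, 1904, 1799, 1781, 1620, 1070]
VALIDATION = [4173, 4173, 4010, 4007, 3964, 3954]


def losses(counts):
    """Jobs dropped between each pair of consecutive stages."""
    return [before - after for before, after in zip(counts, counts[1:])]


def success_rate(counts):
    """Percentage of submitted jobs that reached the final 'done' stage."""
    return 100.0 * counts[-1] / counts[0]


for name, counts in (("production", PRODUCTION), ("validation", VALIDATION)):
    print(name, dict(zip(STAGES[1:], losses(counts))), f"{success_rate(counts):.1f}%")
```

For the validation RB the stage losses are 0, 163, 3, 43 and 10, which sum to the 219 lost jobs and give the quoted 94.8% success. The same calculation on the production figures gives about 55.1%.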
Production CEs statistics
                                       Failed jobs
Sites     Connections  Pings  Jobs OK  LCAS  CRL exp  Jobman  GSS
LIP       6556         462    2836     50    17       92      3099
IFIC      5326         655    2649     100   97       45      1780
Cyfronet  4516         306    2522     0     20       111     1557
II SAS    1404         6      1185     0     15       99      99
FZK       1799         11     1112     118   7        123     428
Demo      9481         5      1111     36    0        51      8278
ICM       705          34     604      8     24       2       33
CESGA     7321         1      544      78    28       13      6657
UAB       600          14     519      0     9        14      44
INS       592          2      517      6     20       20      27
PSNC      582          0      496      15    14       11      46
TCD       145          0      131      0     0        2       12
AUTH      141          0      127      0     3        0       11
TOTAL     39168        1496   14353    411   254      583     22071
Validation CEs statistics
– The validation testbed has been heavily exercised.
– More than 80,000 jobs have been submitted since the end of November.
                                    Failed jobs
Sites  Connections  Pings  Jobs OK  LCAS  CRL exp  Jobman  GSS
LIP    67365        2319   64995    21    0        4       26
FZK    8883         64     8671     38    12       50      48
Demo   10665        0      6170     4     6        2       4483
TOTAL  86913        2383   79836    63    18       56      4557
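In the validation CE table each site's connection count equals the sum of its pings, successful jobs and the four failure columns, so the rows can be cross-checked mechanically. The sketch below does that with the figures from the slide. The `ok_share` ratio (successful jobs over non-ping connections) is an illustrative metric and is not defined in the slides.

```python
# Rows from the validation CE table:
# (connections, pings, jobs OK, LCAS, CRL exp, Jobman, GSS)
VALIDATION_CES = {
    "LIP": (67365, 2319, 64995, 21, 0, 4, 26),
    "FZK": (8883, 64, 8671, 38, 12, 50, 48),
    "Demo": (10665, 0, 6170, 4, 6, 2, 4483),
}


def check_row(row):
    """True when connections equal pings + successful jobs + all failure counts."""
    connections, *rest = row
    return connections == sum(rest)


def ok_share(row):
    """Successful jobs as a percentage of connections that were not pings."""
    connections, pings, jobs_ok = row[:3]
    return 100.0 * jobs_ok / (connections - pings)


for site, row in VALIDATION_CES.items():
    print(site, check_row(row), f"{ok_share(row):.1f}%")

# Column-wise sums, to compare against the TOTAL row of the table.
totals = [sum(col) for col in zip(*VALIDATION_CES.values())]
print("TOTAL", totals)
```

The column sums reproduce the TOTAL row (86913 connections, 79836 successful jobs), and 86913 − 2383 = 84530 non-ping connections is consistent with the "more than 80,000 jobs" statement.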
Collaborative tools
– Travelling is expensive and wastes time.
– Frequent meetings are required to coordinate the activities (2-3 per month).
– VRVS was selected as the main tool for videoconferencing.
  • Inexpensive (uses the Internet).
  • Supports several platforms.
  • Supports a wide range of AV equipment.
– Email and discussion lists are also heavily used.
Test of X# applications
– Application prototypes are being tested in the current testbeds.
  • Prototype of a X# HEP application:
    • Distributed training of a neural network.
    • Tested by IFCA, LIP, Demokritos
    • Requiring:
      • MPICH-G2
      • Lowest latency possible (QoS will be important)
      • MPI traffic across sites
  • Other MPI applications following
    • Air pollution modelling.
    • Meteorological downscaling.
    • Flooding control.
    • …
  • MPI test programs.
Test of X# applications (2)
– The tests of the X# HEP application prototype using MPICH-G2 started in November.
  • Skeleton of the application.
  • The first application prototype.
– Tests were performed:
  • Using dedicated systems with Globus (IFCA).
  • Using the CrossGrid production testbed (LIP, Demokritos).
– The tests over the testbed have shown that:
  • It’s possible to run MPI jobs in the testbed.
  • MPI across sites with MPICH-G2 works.
  • However, problems were detected in sites using private IP addresses.
Test of X# applications (3)
– It was possible to run the application using processors in up to seven sites simultaneously.
– The application was compiled statically.
– Both PBS and FORK job managers were used in the tests.
– Issues:
  • There isn’t support for parallel jobs in the RB (yet), so matchmaking must be performed by the user.
    • Check that the user is authorized at the testbed sites.
    • Check that there are free CPUs available.
  • PBS jobs may end up waiting in a queue.
  • Sometimes processes don’t die when they should.
  • Sometimes the execution hangs.
  • Problems with invalid IP addresses.
  • Possible problems with firewalls.
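The first issue means that a user running an MPI job has to do by hand the two checks listed under it: authorization at each site and availability of free CPUs. A minimal sketch of that manual matchmaking follows. The tuple layout, the function name `plan_parallel_job`, the preference for spanning as few sites as possible, and all site figures are assumptions made for illustration and do not come from the slides.

```python
def plan_parallel_job(sites, processes):
    """Spread `processes` over sites where the user is authorized and CPUs are free.

    `sites` is a list of (name, authorized, free_cpus) tuples. Returns a
    {site: cpu_count} allocation, or None when the request cannot be satisfied.
    """
    allocation = {}
    remaining = processes
    # Prefer sites with the most free CPUs so the job spans as few sites as
    # possible, which limits the MPI traffic crossing site boundaries.
    usable = [s for s in sites if s[1] and s[2] > 0]
    for name, _authorized, free_cpus in sorted(usable, key=lambda s: -s[2]):
        if remaining == 0:
            break
        take = min(free_cpus, remaining)
        allocation[name] = take
        remaining -= take
    return allocation if remaining == 0 else None


# Hypothetical snapshot; none of these figures come from the slides.
sites = [
    ("site-a", True, 4),
    ("site-b", False, 16),  # user not authorized here
    ("site-c", True, 2),
    ("site-d", True, 0),    # no free CPUs
]
print(plan_parallel_job(sites, 5))
print(plan_parallel_job(sites, 10))
```

A request that exceeds the free CPUs at authorized sites returns `None` rather than a partial allocation, since a job started with fewer processes than requested, or left waiting in a PBS queue at one site, would stall the whole MPI run.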
Integration
– The first CrossGrid software integration started last week.
– The main goals were achieved:
  • Integration of tools.
  • Integration of middleware components.
  • Integration of application portals.
– A demonstration of the middleware and applications was performed yesterday.
– The integration work is not yet finished and will continue in the next days.
– The first X# software release will be available in the coming weeks.
IST Demonstration
– CrossGrid has participated in the World grid demonstration involving European and US sites from CrossGrid, DataGrid, GriPhyN and PPDG.
– It took place in November 2002.
– It was the largest grid testbed in the world.
– Applications from the CERN/LHC experiments CMS and Atlas were used.
– CrossGrid participated with 3 sites:
  • LIP - Lisbon
  • FZK - Karlsruhe
  • IFIC - Valencia
Near future
Integration of the first X# middleware release has started.
  • Monitoring tools.
  • Development tools.
  • Migrating desktop.
  • Remote access server.
  • Portals.
  • Parallel scheduler.
Test and validation period.
First production release.
Deployment in the production testbed.
The first X# middleware will use EDG 1.4.3 and Globus 2.
Next 12 months
– Support the extension of the testbed to new sites.
  • More sites internal to the project.
  • Possible external sites and users (policy needed).
– Establish a development testbed.
– Support clusters already running other Linux flavours.
  • Light installation.
– Prepare the test and possible migration to EDG 2.x and Linux 7.x (RH 6.2 is the current OS).
– Study the usage of QoS in CrossGrid.
  • Create a QoS test infrastructure.
– Start the security group activities.
  • Policies, guidelines, tracking of problems, patches.
– Stress testing of the infrastructure.
END
LIP, IFIC, IFCA, FZK