INFSO-RI-508833
Enabling Grids for E-sciencE
www.eu-egee.org
EGEE: What it is … and what it can do for you
Ian Bird, IT Department
CERN, Switzerland
EGEE Operations Manager
Discussions with ESRF/ILL/EMBL
Grenoble, 19th October 2005
ESRF, ILL, EMBL; Grenoble, October 19, 2005 2
Outline
• The EGEE project
  – Overview of the project
  – Applications deployed
  – Phase 1 to Phase 2
• Operations
  – Release and Deployment
  – Operations
  – Support
• Middleware distribution
  – Software stack
• Site installation process
  – Grid-enabling a site
• Joining the grid
  – … and sharing resources
The largest e-Infrastructure: EGEE
• Objectives
  – consistent, robust and secure service grid infrastructure
  – improving and maintaining the middleware
  – attracting new resources and users from industry as well as science
• Structure
  – 71 leading institutions in 27 countries, federated in regional Grids
  – leveraging national and regional grid activities worldwide
  – funded by the EU with ~32 M Euros for first 2 years starting 1st April 2004
EGEE Activities
• 48 % service activities (Grid Operations, Support and Management, Network Resource Provision)
• 24 % middleware re-engineering (Quality Assurance, Security, Network Services Development)
• 28 % networking (Management, Dissemination and Outreach, User Training and Education, Application Identification and Support, Policy and International Cooperation)
Emphasis in EGEE is on operating a production grid and supporting the end-users
EGEE/LCG-2 Grid Sites: September 2005

Country         Sites    Country      Sites    Country       Sites
Austria             2    India            1    Russia           10
Belgium             1    Israel           2    Singapore         1
Bulgaria            4    Italy           25    Slovakia          3
Canada              6    Japan            1    Slovenia          1
China               1    Korea            1    Spain            13
Croatia             1    Netherlands      2    Sweden            2
Cyprus              1    Macedonia        1    Switzerland       2
Czech Republic      2    Pakistan         2    Taiwan            4
France              8    Poland           4    Turkey            1
Germany             8    Portugal         1    UK & Ireland     35
Greece              6    Puerto Rico      1    USA               3
Hungary             1    Romania          1    Yugoslavia        1

EGEE/LCG-2 grid: 160 sites, 36 countries, >15,000 processors, ~5 PB storage
Other national & regional grids: ~60 sites, ~6,000 processors
Map legend: country providing resources; country anticipating joining
Operations Structure
• Operations Management Centre (OMC):
  – At CERN – coordination etc
• Core Infrastructure Centres (CIC)
  – Manage daily grid operations – oversight, troubleshooting
  – Run essential infrastructure services
  – Provide 2nd level support to ROCs
  – UK/I, Fr, It, CERN, + Russia (M12)
  – Hope to get non-European centres
• Regional Operations Centres (ROC)
  – Act as front-line support for user and operations issues
  – Provide local knowledge and adaptations
  – One in each region – many distributed
• User Support Centre (GGUS)
  – In FZK – support portal – provide single point of contact (service desk)
Grid middleware
• The Grid relies on advanced software, called middleware, which interfaces between resources and the applications
• The GRID middleware:
  – Finds convenient places for the application to be run
  – Optimises use of resources
  – Organises efficient access to data
  – Deals with authentication to the different sites that are used
  – Runs the job & monitors progress
  – Recovers from problems
  – Transfers the result back to the scientist
should
gLite middleware
– The 1st release of gLite (v1.0) made end March ’05, with frequent releases since – gLite 1.4 now
  http://glite.web.cern.ch/glite/packages/R1.0/R20050331
  http://glite.web.cern.ch/glite/documentation
– Lightweight services
– Interoperability & Co-existence with deployed infrastructure
– Performance & Fault Tolerance
– Portable
– Service oriented approach
– Site autonomy
– Open source license
gLite Release 1.0
• Job management Services
  – Workload Management
  – Computing Element
  – Logging and Bookkeeping
• Data management Services
  – File and Replica catalog
  – File Transfer and Placement Services
  – gLite I/O
• Information Services
  – R-GMA
  – Service Discovery
• Security
• Deployment Modules
  – Distribution available as RPMs, binary tarballs, source tarballs and APT cache
[Diagram of gLite services. Recoverable groups: Access Services (Grid Access Service, API); Job Management Services (Job Provenance, Computing Element, Workload Management, Package Manager); Data Services (Metadata Catalog, Storage Element, Data Management, File & Replica Catalog); Security Services (Authorization, Authentication, Auditing); Information & Monitoring Services (Information & Monitoring, Application Monitoring). Other labels: Site Proxy; Accounting; JRA3; UK; CERN; IT/CZ.]
Serious testing & certification is under way
User information & support
• More than 140 training events across many countries
  – >2000 people trained: induction; application developer; advanced; retreats
  – Material archive online with >200 presentations
  – GILDA training testbed
• Public and technical websites constantly evolving to expand information available and keep it up to date
• 3 conferences organized
  – ~300 @ Cork
  – ~400 @ Den Haag
  – ~450 @ Athens
• Pisa: 4th project conference 24-28 October ’05
Deployment of applications
• Pilot applications
  – High Energy Physics
  – Biomed applications: http://egee-na4.ct.infn.it/biomed/applications.html
• Generic applications – deployment under way
  – Computational Chemistry
  – Earth science research
  – EGEODE: first industrial application
  – Astrophysics
• With interest from
  – Hydrology
  – Seismology
  – Grid search engines
  – Stock market simulators
  – Digital video etc.
  – Industry (provider, user, supplier)
• Many users
  – broad range of needs
  – different communities with different background and internal organization
From Phase I to II
• From 1st EGEE EU Review in February 2005:
  – “The reviewers found the overall performance of the project very good.”
  – “… remarkable achievement to set up this consortium, to realize appropriate structures to provide the necessary leadership, and to cope with changing requirements.”
• EGEE I
  – Large scale deployment of EGEE infrastructure to deliver production level Grid services with selected number of applications
• EGEE II
  – Natural continuation of the project’s first phase
  – Emphasis on providing an infrastructure for e-Science
    increased support for applications
    increased multidisciplinary Grid infrastructure
    more involvement from Industry
  – Extending the Grid infrastructure world-wide
    increased international collaboration (Asia-Pacific is already a partner!)
Grid Operations
• Several aspects
  – Integration of middleware
    Testing, certification, release preparation
  – Deployment of middleware to sites
  – Operations and operational support
  – Operational security and policy
  – User support
    2 aspects – helpdesk and integration support
Release Process (simplified)
[Flow diagram with seven numbered steps. Recoverable labels, in extraction order: C&T; EIS; GIS; GDB; Applications; RC; CICs; Bugs/Patches/Task (Savannah); Head of Deployment: prioritization & selection; Developers; Applications; List for next release (can be empty); integration & first tests (C&T); Internal Releases; User Level install of client tools (EIS); full deployment on test clusters, functional/stress tests, ~1 week (C&T); assign and update cost; components ready at cutoff; Internal Client Release; Client Release; Service Release; Updates Release; Core Service Release.]
Deployment process
[Flow diagram. Recoverable labels, in extraction order: Release(s); Certification is run daily; Update User Guides (EIS); Update Release Notes (GIS); Release Notes, Installation Guides, User Guides; Re-Certify (CIC); Every Month; Client Release; Deploy Client Releases (User Space) (GIS); Deploy Service Releases (Optional) (CICs, RCs); Deploy Major Releases (Mandatory) (ROCs, RCs); YAIM; Every Month; Every 3 months on fixed dates!; at own pace.]
Grid Operations
• The grid is flat, but
• Hierarchy of responsibility
  – Essential to scale the operation
• CICs act as a single Operations Centre
  – Operational oversight (grid operator) responsibility rotates weekly between CICs
  – Report problems to ROC/RC
  – ROC is responsible for ensuring problem is resolved
  – ROC oversees regional RCs
• ROCs responsible for organising the operations in a region
  – Coordinate deployment of middleware, etc
• CERN coordinates sites not associated with a ROC
[Diagram: hierarchy of the OMC, CICs, ROCs and RCs]
RC = Resource Centre
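The weekly rotation of the grid-operator role, and the rule that CERN looks after sites with no ROC, can be sketched in a few lines of Python. This is an illustrative sketch only: the rotation order, the start date of the rota, the site names and the function names are all assumptions, not the procedure the CICs actually follow.

```python
from datetime import date

# CICs named on the Operations Structure slide; the ORDER is an assumption.
CICS = ["CERN", "France", "Italy", "UK/Ireland", "Russia"]

# Hypothetical start of the rota: Monday 3 January 2005.
ROTA_START = date(2005, 1, 3)


def operator_on_duty(day, cics=CICS, start=ROTA_START):
    """Return the CIC holding the grid-operator role in the week containing `day`."""
    weeks_elapsed = (day - start).days // 7
    return cics[weeks_elapsed % len(cics)]


def responsible_roc(site, site_to_roc):
    """Return the ROC that follows up problems at `site`.

    Sites that are not associated with any ROC fall back to CERN.
    """
    return site_to_roc.get(site, "CERN")


# Hypothetical site names, for illustration only.
site_to_roc = {"site-a": "ROC France", "site-b": "ROC Italy"}
print(operator_on_duty(date(2005, 10, 19)))
print(responsible_roc("site-c", site_to_roc))
```

Keeping the rota as a pure function of the date means every CIC can work out who is on duty without any shared state.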
Grid monitoring
– GIIS Monitor + Monitor Graphs
– Sites Functional Tests
– GOC Data Base
– Scheduled Downtimes
– Live Job Monitor
– GridIce – VO + Fabric View
– Certificate Lifetime Monitor
• Operation of Production Service: real-time display of grid operations
• Accounting Information
Such tools help the operations staff to ensure the sites work continuously
Operational Security
• Operational Security team in place
  – EGEE security officer, ROC security contacts
  – Concentrate on 3 activities:
    Incident response
    Best practice advice for Grid Admins – creating dedicated web
    Security Service Monitoring evaluation
• Incident Response
  – JSPG agreement on IR in collaboration with OSG
    Update existing policy “To guide the development of common capability for handling and response to cyber security incidents on Grids”
  – Basic framework for incident definition and handling
• Site registration process in draft
  – Part of basic SLA
• CA Operations
  – EUGridPMA – best practice, minimum standards, etc.
  – More and more CAs appearing
• Security group and work was started in LCG – was from the start a cross-grid activity.
• Much already in place at start of EGEE: usage policy, registration process and infrastructure, etc.
• We regard it as crucial that this activity remains broader than just EGEE
Now IGTF
Policy – Joint Security Group
– Security & Availability Policy
– Usage Rules
– Certification Authorities
– Audit Requirements
– Best practice Guides
– Incident Response
– User Registration
– Application Development & Network Admin Guide
http://cern.ch/proj-lcg-security/documents.html
[Diagram: support flow. Recoverable labels: GGUS General User Support; Grid “Operator on duty”; ROC Operations; Site … Site; ROC Local User Support; VO Support; 1st Level support; 2nd level support (Deployment, Middleware, etc.); arrows labelled “Reports problem”; NA4; SA1; JRA1; SA3; TPM.]
GGUS Support: The Model
• “I need help!” The user sends e-mail to [email protected] [email protected]
• The e-mail is automatically converted into a GGUS ticket
• Ticket Process Manager (TPM): monitors ticket assignments and directs each ticket to the correct support unit
• VO Support: receives VO-related tickets and follows them; solves or forwards VO-specific problems; recognises Grid-related problems and assigns them to specific support units
• Support units shown: VO Support, ROC Support, Middleware Support, Other Grids Support, Operations Support
• Mailing lists
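The ticket flow in this model (e-mail in, ticket created, assignment to a support unit) can be sketched as below. This is a toy illustration, not the GGUS implementation: the `Ticket` fields, the keyword rules and the default unit are all hypothetical, and in practice the TPM's assignment is a matter of human judgement rather than keyword matching.

```python
from dataclasses import dataclass

# Hypothetical keyword rules mapping text to the support units named above.
KEYWORD_RULES = [
    ("proxy", "Middleware Support"),
    ("job submission", "Middleware Support"),
    ("site down", "ROC Support"),
    ("vo membership", "VO Support"),
]


@dataclass
class Ticket:
    ticket_id: int
    sender: str
    subject: str
    body: str
    assigned_to: str = "TPM"  # new tickets wait for the Ticket Process Manager


def email_to_ticket(ticket_id, sender, subject, body):
    """Convert an incoming help e-mail into a ticket record."""
    return Ticket(ticket_id, sender, subject.strip(), body.strip())


def tpm_assign(ticket, rules=KEYWORD_RULES, default="Operations Support"):
    """Direct the ticket to the first support unit whose keyword matches."""
    text = f"{ticket.subject} {ticket.body}".lower()
    for keyword, unit in rules:
        if keyword in text:
            ticket.assigned_to = unit
            return ticket
    ticket.assigned_to = default
    return ticket


ticket = email_to_ticket(1, "user@example.org", "Proxy expired", "My job fails.")
print(tpm_assign(ticket).assigned_to)
```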
EGEE infrastructure usage
• Average job duration January 2005 – June 2005 for the main VOs
Infrastructure is continuously used by many groups
10,000 jobs /day
Middleware
• Middleware is sourced from
  – VDT (Globus, Condor, etc)
  – EDG LCG-x
  – gLite (EGEE)
  – Other components developed in LCG
    File catalogue, disk pool manager
gLite Services for Release 1.0: Components Summary and Origin
• Computing Element
  – Gatekeeper, WSS (Globus)
  – Condor-C (Condor)
  – CE Monitor (EGEE)
  – Local batch system (PBS, LSF, Condor)
• Workload Management
  – WMS (EDG)
  – Logging and bookkeeping (EDG)
  – Condor-C (Condor)
• Storage Element
  – File Transfer/Placement (EGEE)
  – glite-I/O (AliEn)
  – GridFTP (Globus)
  – SRM: Castor (CERN), dCache (FNAL, DESY), other SRMs
• Catalog
  – File and Replica Catalog (EGEE)
  – Metadata Catalog (EGEE)
• Information and Monitoring
  – R-GMA (EDG)
  – Service Discovery (EGEE)
• Security
  – VOMS (DataTAG, EDG)
  – GSI (Globus)
  – Authentication for C and Java based (web) services (EDG)
gLite Grid Middleware: guiding principles
• Service oriented approach
  – Allow for multiple interoperable implementations
• Lightweight (existing) services
  – Easily and quickly deployable
  – Use existing services where possible
    Condor, EDG, Globus, LCG, …
• Portable
  – Being built on Scientific Linux and Windows
• Security
  – Sites and Applications
• Performance/Scalability & Resilience/Fault Tolerance
  – Comparable to deployed infrastructure
• Co-existence with deployed infrastructure
  – Co-existence with LCG-2 and OSG (US) is essential for the EGEE Grid services
• Site autonomy
  – Reduce dependence on ‘global, central’ services
• Open source license
Architecture & Design
• A design team including representatives from Middleware providers (AliEn, Condor, EDG, Globus, …) and US partners produced the middleware architecture and design.
• Takes into account input and experiences from applications, operations, and related projects
• DJRA1.1 – EGEE Middleware Architecture (June 2004)
  – https://edms.cern.ch/document/476451/
• DJRA1.2 – EGEE Middleware Design (August 2004)
  – https://edms.cern.ch/document/487871/
• Much feedback from within the project (operation & applications) and from related projects
  – Being used and actively discussed by OSG, GridLab, etc. Input to various GGF groups
gLite Services for Release 1
[Diagram of gLite services. Recoverable groups: Access Services (Grid Access Service, API); Job Management Services (Job Provenance, Computing Element, Workload Management, Package Manager); Data Services (Metadata Catalog, Storage Element, Data Management, File & Replica Catalog); Security Services (Authorization, Authentication, Auditing); Information & Monitoring Services (Information & Monitoring, Application Monitoring). Other labels: Site Proxy; Accounting; JRA3; UK; CERN; IT/CZ.]
Job Management Services
Efficient and reliable scheduling of computational tasks on the available infrastructure
• Started with LCG-2 Workload Management System (WMS)
  – Inherited from EDG
  – Support partitioned jobs and jobs with dependencies
  – Support for different replica catalogs for data based scheduling
  – Modification of internal structure of WMS
    Task queue: queue of pending submission requests
    Information supermarket: repository of information on resources
    Better reliability, better performance, better interoperability, support push and pull mode
  – Under development
    Web Services interface supporting bulk submission (after V1.0)
• Bulk submission supported now by use of DAGs
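The roles of the task queue and the information supermarket, and the difference between push and pull mode, can be illustrated with a toy broker. This is a minimal sketch under stated assumptions, not the gLite WMS: the class and field names are hypothetical, matching is reduced to "VO supported and enough free CPUs", and push mode simply picks the resource with the most free CPUs.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    free_cpus: int
    supported_vos: frozenset


@dataclass
class Job:
    job_id: str
    vo: str
    cpus: int


class ToyWMS:
    def __init__(self):
        self.task_queue = deque()  # pending submission requests
        self.supermarket = {}      # resource name -> latest published state

    def publish(self, resource):
        """A resource publishes its state into the information supermarket."""
        self.supermarket[resource.name] = resource

    def submit(self, job):
        """A user submission is appended to the task queue."""
        self.task_queue.append(job)

    @staticmethod
    def _matches(job, resource):
        return job.vo in resource.supported_vos and resource.free_cpus >= job.cpus

    def push(self):
        """Push mode: the broker assigns each queued job to a matching resource."""
        placed = {}
        still_waiting = deque()
        while self.task_queue:
            job = self.task_queue.popleft()
            candidates = [r for r in self.supermarket.values() if self._matches(job, r)]
            if candidates:
                best = max(candidates, key=lambda r: r.free_cpus)
                best.free_cpus -= job.cpus
                placed[job.job_id] = best.name
            else:
                still_waiting.append(job)  # no match yet: keep it queued
        self.task_queue = still_waiting
        return placed

    def pull(self, resource_name):
        """Pull mode: a resource asks for the first queued job it can run."""
        resource = self.supermarket[resource_name]
        for job in self.task_queue:
            if self._matches(job, resource):
                self.task_queue.remove(job)
                resource.free_cpus -= job.cpus
                return job
        return None
```

In push mode the broker decides using possibly stale information from the supermarket; in pull mode the resource asks only when it actually has capacity, which is one reason a system might support both.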
WMS Interaction Overview
Data Management Services
• Efficient and reliable data storage, movement, and retrieval on the infrastructure
• Storage Element
  – Reliable file storage (SRM based storage systems)
  – Posix-like file access (gLite I/O)
  – Transfer (gridFTP)
• File and Replica Catalog
  – Resolves logical filenames (LFN) to physical location of files (URL understood by SRM) and storage elements
  – Hierarchical file-system-like view in LFN space
  – Single catalog or distributed catalog (under development) deployment possibilities
• File Transfer and Placement Service
  – Reliable file transfer and transactional interactions with catalogs
• Data Scheduler
  – Scheduled data transfer in the same spirit as jobs are being scheduled, taking into account e.g. network characteristics (collaboration with JRA4)
  – Under development
• Metadata Catalog
  – Limited metadata can be attached to the File and Replica Catalog
  – Interfaces to application specific catalogs have been defined
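The catalog's two jobs described above, resolving an LFN to its physical replicas and presenting a hierarchical view of the LFN space, can be sketched with an in-memory dictionary. This is an illustrative toy, not the gLite catalog API: the class, its method names, the LFN paths and the `srm://` URLs are all hypothetical.

```python
class ToyReplicaCatalog:
    """In-memory stand-in for a file and replica catalog."""

    def __init__(self):
        self._replicas = {}  # LFN -> list of storage URLs holding a copy

    def register(self, lfn, storage_url):
        """Record that a replica of `lfn` exists at `storage_url`."""
        if not lfn.startswith("/"):
            raise ValueError("LFNs are absolute paths in this sketch")
        self._replicas.setdefault(lfn, []).append(storage_url)

    def resolve(self, lfn):
        """Return every known physical location of the logical file."""
        return list(self._replicas.get(lfn, []))

    def listdir(self, directory):
        """Return the immediate children of a directory in LFN space."""
        prefix = directory.rstrip("/") + "/"
        children = set()
        for lfn in self._replicas:
            if lfn.startswith(prefix):
                children.add(lfn[len(prefix):].split("/", 1)[0])
        return sorted(children)


catalog = ToyReplicaCatalog()
catalog.register("/grid/biomed/run1/a.dat", "srm://se1.example.org/data/a.dat")
catalog.register("/grid/biomed/run1/a.dat", "srm://se2.example.org/data/a.dat")
catalog.register("/grid/biomed/run2/b.dat", "srm://se1.example.org/data/b.dat")
print(catalog.resolve("/grid/biomed/run1/a.dat"))
print(catalog.listdir("/grid/biomed"))
```

Because applications only ever name the LFN, replicas can be added or moved without changing any job description.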
DM Interaction Overview
[Diagram. Recoverable components: File and Replica Catalog (Storage Index, Fireman, Database); WMS; Storage Element (SRM, Storage, gLite I/O, gridFTP); File Transfer and Placement Service (FTS, FPS, Transfer Agent, Database); VOMS; MyProxy. Recoverable interaction labels: Get credential; Store credential; File I/O; File namespace and Metadata mgmt; File replication; Proxy renewal; Replica Location.]
Information and Monitoring Services
• R-GMA (Relational Grid Monitoring Architecture)
  – Implements GGF GMA standard
  – Development started in EDG, deployed on the production infrastructure for accounting and monitoring
[Diagram. Recoverable components: Producer application, Producer Service, Registry Service, Schema Service, Mediator, Consumer Service, Consumer application, with an API between each application and its service. Recoverable interaction labels: Publish Tuples; Send Query; Receive Tuples; Register; Locate; Query; Tuples; SQL “CREATE TABLE”; SQL “INSERT”; SQL “SELECT”.]
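The relational model in the diagram (a schema defined with CREATE TABLE, producers publishing tuples with INSERT, consumers querying with SELECT) can be tried out with Python's built-in `sqlite3` module. A single in-memory SQLite database is used here purely as a stand-in for the virtual database that R-GMA presents; this is not the R-GMA API, and the table name, column names and host names are hypothetical.

```python
import sqlite3


def create_schema(db):
    """Schema step: define the table that producers and consumers agree on."""
    db.execute("CREATE TABLE cpu_load (site TEXT, host TEXT, load_avg REAL)")


def publish(db, site, host, load_avg):
    """Producer step: publish one tuple with SQL INSERT."""
    db.execute("INSERT INTO cpu_load VALUES (?, ?, ?)", (site, host, load_avg))


def busy_hosts(db, threshold):
    """Consumer step: query with SQL SELECT and receive the matching tuples."""
    rows = db.execute(
        "SELECT site, host FROM cpu_load WHERE load_avg > ? ORDER BY site, host",
        (threshold,),
    )
    return rows.fetchall()


db = sqlite3.connect(":memory:")
create_schema(db)
publish(db, "site-a", "wn01", 0.4)
publish(db, "site-a", "wn02", 3.1)
publish(db, "site-b", "wn07", 2.5)
print(busy_hosts(db, 2.0))
```

In the real system the tuples stay with the distributed producers and the registry and mediator locate them for each query; the sketch collapses all of that into one local table.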
R-GMA
• Producer, Consumer, Registry and Schema services with supporting tools
  – Registry replication
  – Simpler API – matching the next (WS) release
    Provides smooth transition between old API and WS
  – Coping with life on the Grid: poorly configured networks, firewalls, MySQL corruptions etc
• Generic Service Discovery API
  – Defined but not yet implemented by any gLite services
• Under development
  – Web Service version
  – File (as well as memory and RDBMS) based Producers
  – Native python interface
  – Fine grained authorization
  – Schema replication
Other Reengineering Activities
• Prototypes of Grid Access Service and Package Manager implemented in the AliEn framework
• Grid Access Service
  – Acts on user’s behalf
  – Discovers and manages Grid services for the user
• Package Manager
  – Provides dynamic distribution of the application software needed
  – Does not install Grid middleware
Security
• Job Management Services
  – Authorization based on VOMS VO, groups, and user information
• Data Services
  – Authorization: ACL and Unix permissions
  – Fine-grained ACL on data enforced through gLite-IO and Catalogs
  – Catalog data itself is authorized through ACLs
    Currently supported through DNs
    VOMS integration being developed
• Information Services
  – Fine grained authorization based on VOMS certificates being implemented
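An ACL check keyed on the user's certificate DN, with VOMS groups as an optional second kind of principal, might look like the sketch below. It is a toy model of the idea only: the `AclEntry` type, the `group:` prefix convention, the DNs and the group name are hypothetical and do not reflect how gLite-IO or the catalogs store ACLs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    principal: str          # a certificate DN, or "group:<name>" for a VOMS group
    permissions: frozenset  # subset of {"read", "write"}


def is_allowed(acl, user_dn, permission, voms_groups=()):
    """Return True if any ACL entry grants `permission` to the user.

    The user matches an entry either by DN or through one of the VOMS
    groups presented with the request.
    """
    principals = {user_dn} | {f"group:{g}" for g in voms_groups}
    return any(
        entry.principal in principals and permission in entry.permissions
        for entry in acl
    )


# Hypothetical DNs and group name, for illustration only.
ALICE = "/C=XX/O=Example/CN=Alice Example"
BOB = "/C=XX/O=Example/CN=Bob Example"
acl = [
    AclEntry(ALICE, frozenset({"read", "write"})),
    AclEntry("group:/biomed", frozenset({"read"})),
]
print(is_allowed(acl, BOB, "read", voms_groups=["/biomed"]))
```

With DN-only entries every user must be listed individually; group principals are what would make VOMS integration worthwhile for large VOs.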
Main Differences to LCG-2
• Workload Management System works in push and pull mode
• Computing Element moving towards a VO based scheduler guarding the jobs of the VO (reduces load on GRAM)
• Re-factored file & replica catalogs
• Secure catalogs (based on user DN; VOMS certificates being integrated)
• Scheduled data transfers
• SRM based storage
• Information Services: R-GMA with improved API, Service Discovery and registry replication
• Move towards Web Services
Joining the grid …
New sites
New applications
New sites
• Join the EGEE infrastructure
  – Install the middleware
  – Become certified (check that installation is OK)
  – Set your policies
    Which applications you accept – might be only your local ones
    You still gain operational monitoring, possibility to use other resources, direct collaboration with other institutes
• Set up a clone of the EGEE infrastructure
  – Never been done
  – Potentially needed for (e.g.) medical applications
  – Requires investment in all the operational structure
  – Would re-use only the software … which is rapidly evolving
New applications
• Pilot applications
  – HEP and Biomed in EGEE-I
  – Several more in EGEE-II
• Reviewed applications
  – Go through a process to ascertain that science will benefit from the use of the grid
  – Can expect a certain level of direct support, but are also expected to have community support behind them
• Other applications
  – Actually quite a few – HEP, Astrophysics, …
  – More or less self-supporting in terms of getting started
• All benefit from operational and user support
  – Pilot and reviewed apps get more integration support
New applications
• Training courses
  – General courses, or potentially tailored to need
• GILDA demonstration system
• Pre-production service
• Production service
EGEE – What can it deliver?
• A managed operation – providing a service:
  – A large number of sites of different sizes and capabilities
  – Developed operational procedures
    Monitoring of the grid services providing access to resources
  – Support services: user support, training, etc.
  – Building up considerable experience in grid-enabling a variety of different applications
  – Tools for monitoring of resources at a site … if required
• A new VO joining EGEE with a few sites:
  – Benefits from the operations and support – the VO sites can be monitored and supported as part of the infrastructure
  – Potentially access to other resources
  – It is a significant effort to set up a grid infrastructure from scratch
… and what does it cost?
• “The application VO buys into the EGEE model”
  – Actually not so restrictive now – supports many linux flavours, IA64, (other teams have worked on AIX, SGI ports)
  – Simple installation of client software now (can be done on the fly)
  – Basic grid services are quite general, nothing really application-specific
• Some unresolved issues:
  – Commercial licensed software used by an application
  – Levels of privacy/security needed in some life-science applications
  – True interactivity
• … and of course, this is all new, rapidly evolving and many problems still to be overcome
Summary
• Grids are a powerful new tool for science
• Several applications are already benefiting from Grid technologies (biomedical is a good example, … and High Energy Physics of course)
• Europe is strong in the development of Grids also thanks to the success of EGEE and related projects
• EGEE offers:
  – A mechanism for linking together the people, resources and data of your scientific community
  – Continuous monitoring of the status of your Virtual Organisation
  – A set of middleware for gridifying applications with documentation, training and support
  – Regular forums for linking with grid experts, other communities and industry
• EGEE-II will further extend support for user communities and applications
Contacts
• EGEE Website: http://www.eu-egee.org
• How to join: http://public.eu-egee.org/join/
• EGEE Project [email protected]
Examples of applications
EGEE pilot applications (I)
• High-Energy Physics (HEP)
  – Provides computing infrastructure (LCG) for experiments at CERN in Geneva
  – Challenging:
    thousands of processors world-wide
    generating petabytes of data
    ‘chaotic’ use of grid with individual user analysis (thousands of users interactively operating within experiment VOs)
• Biomedical Applications
  – Similar computing and data storage requirements
  – Major additional challenge: security & access to data in many formats
[Photo labels: Mont Blanc (4810 m); Downtown Geneva]
BioMed Overview
• Infrastructure
  – ~2000 CPUs
  – ~21 TB of disk
  – in 12 countries
• >50 users in 7 countries working with 12 applications
• 18 research labs
• ~80,000 jobs launched since 04/2004
• ~10 CPU years
15 resource centres, 17 CEs, 16 SEs
[Chart: number of jobs per month. Map labels: PADOVA; BARI]
Bioinformatics
• GPS@: Grid Protein Sequence Analysis
  – Gridified version of NPSA web portal
    Offering protein databases and sequence analysis algorithms to bioinformaticians (3000 hits per day)
    Need for large databases and a large number of short jobs
  – Objective: increased computing power
  – Status: 9 bioinformatics software packages gridified
  – Grid added value: open to a wider community with larger bioinformatic computations
• xmipp_MLrefine
  – 3D structure analysis of macromolecules
    From (very noisy) electron microscopy images
    Maximum likelihood approach to find the optimal model
  – Objective: study molecule interaction and chem. properties
  – Status: algorithm being optimised and ported to 3D
  – Grid added value: parallel computation on different resources of independent jobs
Medical imaging
• GATE
  – Radiotherapy planning
    Improvement of precision by Monte Carlo simulation
    Processing of DICOM medical images
  – Objective: very short computation time compatible with clinical practice
  – Status: development and performance testing
  – Grid Added Value: parallelisation reduces computing time
• CDSS
  – Clinical Decision Support System
    Assembling knowledge databases
    Using image classification engines
  – Objective: access to knowledge databases from hospitals
  – Status: from development to deployment, some medical end users
  – Grid Added Value: ubiquitous, managed access to distributed databases and engines
Medical imaging
• SiMRI3D
  – 3D Magnetic Resonance Image Simulator
    MRI physics simulation, parallel implementation
    Very compute intensive
  – Objective: offering an image simulator service to the research community
  – Status: parallelised and now running on EGEE resources
  – Grid Added Value: enables simulation of high-res images
• gPTM3D
  – Interactive tool to segment and analyse medical images
    A non gridified version is distributed in several hospitals
    Need for very fast scheduling of interactive tasks
  – Objectives: shorten computation time using the grid
    Interactive reconstruction time: < 2 min and scalable
  – Status: development of the gridified version being finalized
  – Grid Added Value: permanent availability of resources
Drug Discovery
• Grid-enabled drug discovery process for neglected diseases
  – In silico docking: compute probability that potential drugs will dock with a target protein
  – To speed up the development of new drugs and reduce its cost
• WISDOM (Wide In Silico Docking On Malaria)
  – Drug Discovery Data Challenge
  – 11 July – 19 August
  – 46 million docked ligands produced (typical for computer clusters: 100 000 ligands)
  – Equivalent to 80 CPU years
  – 1000 computers in 15 countries used simultaneously
  – Millions of files (adding up to a few TB of data)
    Never done on a large scale production infrastructure
    Never done for a neglected disease
• Next steps
  – Sort through data to identify potential drugs
  – Develop the next steps of the process (molecular dynamics)
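The WISDOM figures can be cross-checked with a little arithmetic. The sketch below assumes the data challenge ran in 2005 (the slide gives no year), counts both end dates, and takes a CPU year as 365.25 days; the variable names are illustrative only.

```python
from datetime import date

docked_ligands = 46_000_000
typical_cluster_run = 100_000   # ligands, the "typical for computer clusters" figure
cpu_years = 80

# Assumed year; the slide only says "11 July – 19 August".
start, end = date(2005, 7, 11), date(2005, 8, 19)

calendar_days = (end - start).days + 1           # inclusive of both end dates
cpu_days = cpu_years * 365.25
average_busy_cpus = cpu_days / calendar_days     # CPUs kept busy on average
scale_factor = docked_ligands / typical_cluster_run
cpu_seconds_per_ligand = cpu_days * 86_400 / docked_ligands

print(f"{calendar_days} days, about {average_busy_cpus:.0f} CPUs busy on average")
print(f"{scale_factor:.0f} times a typical cluster run")
print(f"about {cpu_seconds_per_ligand:.0f} CPU seconds per docked ligand")
```

An average in the hundreds of busy CPUs over the whole period is consistent with the quoted 1000 computers used simultaneously, since not every machine runs docking jobs around the clock.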
Generic Applications
• EGEE Generic Applications Advisory Panel (EGAAP)
  – UNIQUE entry point for “external” applications
  – Reviews proposals and makes recommendations to EGEE management
    Deals with “scientific” aspects, not with technical details
    Generic Applications group in charge of introducing selected applications to the EGEE infrastructure
  – 6 applications selected so far:
    Earth sciences (earth observation, geophysics, hydrology, seismology)
    MAGIC (astrophysics)
    Computational Chemistry
    PLANCK (astrophysics and cosmology)
    Drug Discovery
    E-GRID (e-finance and e-business)
    GRACE (grid search engine, ended Feb 2005)
Earth sciences applications
• Earth Observations by Satellite
  – Ozone profiles
• Solid Earth Physics
  – Fast determination of mechanisms of important earthquakes
• Hydrology
  – Management of water resources in Mediterranean area (SWIMED)
• Geology
  – Geocluster: R&D initiative of the Compagnie Générale de Géophysique
A large variety of applications ported to EGEE, which attracts new users
Interactive collaboration of the teams around a project
MAGIC
• Ground based Air Cerenkov Telescope, 17 m diameter
• Physics Goals:
  – Origin of VHE Gamma rays
  – Active Galactic Nuclei
  – Supernova Remnants
  – Unidentified EGRET sources
  – Gamma Ray Bursts
• MAGIC II will come in 2007
• Grid added value
  – Enable “(e-)scientific” collaboration between partners
  – Enable the cooperation between different experiments
  – Enable the participation in Virtual Observatories
Computational Chemistry
• The Grid Enabled Molecular Simulator (GEMS)
  – Motivation:
    Modern computer simulations of biomolecular systems produce an abundance of data, which could be reused several times by different researchers.
    Data must be catalogued and searchable
  – GEMS database and toolkit:
    autonomous storage resources
    metadata specification
    automatic storage allocation and replication policies
    interface for distributed computation
Planck
• On the Grid: > 12 times faster (with ~5% failures)
• Complex data structure: data handling important
• The Grid as
  – collaboration tool
  – common user-interface
  – flexible environment
  – new approach to data and S/W sharing