Post on 19-Dec-2015
E. Ronchieri – n° 1
EDG release 2
Elisabetta Ronchieri INFN CNAF - DataGrid WP1 – Workload Management System
E. Ronchieri – n° 2
Outline
What is Grid?
Grid Projects – Focus on EU Data Grid Project
Selected Areas + Technologies: Security – Information and Monitoring Services – Storage Management – Data Management – Workload Management
Installation
E. Ronchieri – n° 3
Grid Vision
Researchers, Grid middleware, scientific instruments and experiments, and resources are the major players
- Researchers interact with colleagues, and share and access data
- Grid middleware provides part of the software infrastructure
- Experiments provide huge amounts of data
The Grid is a special form of distributed computing
- Computing and storage resources are distributed over several sites
- Sites are typically connected via wide-area network links
It is best applied to applications that have the following features:
- Distributed user community
- Need for lots of computing power (Computational Grid)
- Need for lots of storage capacity (Data Grid)
Currently, it is applied mainly in computing sciences
E. Ronchieri – n° 4
Grid Today
Many steps remain (especially to make the Grid accessible to conventional users)
Considerable expertise is still required (especially to use Grid technology efficiently)
There is no single Grid (several projects, ...)
Grids need to work together towards standardization
- Global Grid Forum (GGF, http://www.ggf.org)
- Its mission is to promote and develop Grid technologies and applications
- There are many working groups in several different areas (Scheduling and Resource Management, Security, ...)
E. Ronchieri – n° 6
Major US & European Grid Projects, many with strong HEP participation
US projects European projects
Many national and regional Grid projects: GridPP (UK), INFN-grid (I), NorduGrid, Dutch Grid, ...
The Virtual Data Toolkit (VDT)
The DataGrid Toolkit
E. Ronchieri – n° 8
EDG Globus-based middleware architecture
EDG is built on the emerging Grid technology
Start: Jan 1, 2001 End: Dec 31, 2003
Current EDG architectural functional blocks:
- Basic services provided by Globus 2.2.x (such as authentication, authorization, information providers, replica catalog, secure file transfers) and Condor (such as job submission, effective job cancellation, event monitoring, and monitoring support)
- Higher-level EDG middleware developed within EDG
- Applications (such as HEP, BIO, and EO)
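[Figure: layered architecture diagram. Layers as labelled: OS & Net services; Basic Services (GLOBUS 2.2.x and Condor); High-level Grid middleware; LHC VOs common application layer, plus other apps; specific application layer (ALICE, ATLAS, CMS, LHCb), plus other apps. Basic Services and High-level Grid middleware are grouped as "Grid middleware".]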
E. Ronchieri – n° 10
Selected Areas for Grid Technologies in EU DataGrid (and partly Globus)
Security
- All access to and interaction with Grid resources needs to be done in a secure way
- Major technologies: PKI (Public Key Infrastructure) and GSS
Information and Monitoring Services
- Before you start using the Grid, you need to know what resources are there and what you can use
- Major technologies: LDAP-based or Web Service approach
Data Management
- Main focus of a Data Grid
- Major technologies: LDAP-based or Web Service approach
Workload Management
- Submit your application to the Grid, where it is executed
E. Ronchieri – n° 12
Security in EDG
Why:
- User jobs might access several remote resources
- Users need to be authenticated (Who am I?) and authorized (What can I do?)
Mainly uses:
- The security infrastructure provided by Globus
- Based on PKI (Public Key Infrastructure) and GSS
E. Ronchieri – n° 13
Grid Security Requirements
User View
1) Easy to use
2) Single sign-on
3) Run applications
Resource Owner View
1) Specify local access control
2) Auditing, accounting, etc.
3) Integration with local systems: Kerberos, AFS, license managers
E. Ronchieri – n° 14
Grid Security Infrastructure (GSI)
Extensions to existing standard protocols & APIs
- Standards: SSL/TLS, X.509 & CA, GSS
- Extensions for single sign-on and delegation
Globus Toolkit reference implementation of GSI
- SSLeay/OpenSSL + GSS-API + delegation + single sign-on
E. Ronchieri – n° 15
Example of GSI usage
[Figure: diagram labels: User, Proxy Credential, Site A (Unix), Site B, Computer, Grid Service, Restricted Proxy, Remote file access request, Site N (Unix), GridFTP Server, Storage system]
E. Ronchieri – n° 16
VO-LDAP Architecture
[Figure: a VO Directory (o=xyz, dc=eu-datagrid, dc=org) with branches ou=People, ou=Testbed1, ou=??? and entries such as CN=Mario Rossi, CN=Franz Elmer, CN=John Smith, each with an authentication certificate. mkgridmap combines the directory with the local users and the ban list to produce the grid-mapfile.]
Adopted by
- DataGrid Testbed0 (2001/02)
- DataGrid Testbed1 (2003)
- DataTAG Testbed (2003)
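The flow in the figure (VO members in, ban list applied, grid-mapfile out) can be sketched roughly as follows. This is only an illustration, not the mkgridmap tool itself: the function name, the subject names and the local account `griduser` are hypothetical, and the output uses the usual `"subject" account` line layout of a grid-mapfile.

```python
def build_gridmap(vo_members, local_account, ban_list=()):
    """Return grid-mapfile lines mapping each non-banned subject to a local account."""
    banned = set(ban_list)
    lines = []
    for dn in sorted(set(vo_members)):
        if dn in banned:
            continue  # banned users never reach the map file
        lines.append(f'"{dn}" {local_account}')
    return lines


# Hypothetical subject names and account, for illustration only.
members = [
    "/O=xyz/CN=Mario Rossi",
    "/O=xyz/CN=Franz Elmer",
    "/O=xyz/CN=John Smith",
]
for line in build_gridmap(members, "griduser", ban_list=["/O=xyz/CN=John Smith"]):
    print(line)
```

In a real deployment the member list would come from an LDAP search against the VO Directory rather than from a hard-coded list.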
E. Ronchieri – n° 18
Grid Information and Monitoring Services
                      MDS 2.x                                    R-GMA
Data model            LDAP (hierarchical)                        Relational
Communication         LDAP                                       HTTP
Information storage   LDAP-based backends re-written by Globus   Relational database
Queries               LDAP queries                               SQL queries
Components            GRIS (SE), GRIS (CE), GIIS, WNs            Producer, Consumer, Registry
Example LDAP query:
  ldapsearch -x -H ldap://lxshare0225.cern.ch:2135 -b 'Mds-Vo-name=datagrid,o=grid' 'objectclass=StorageElement' seId SEsize
Example SQL query:
  SELECT * FROM StorageElement
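To make the relational side of the comparison concrete, here is a minimal sketch that runs the slide's SQL query against an in-memory table. It uses Python's built-in sqlite3 purely as a stand-in; it is not R-GMA, and the column names (borrowed from the attributes in the LDAP example) and the row values are assumptions for illustration.

```python
import sqlite3


def query_storage_elements(rows):
    """Load (seId, SEsize) rows into an in-memory table and run the slide's query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE StorageElement (seId TEXT, SEsize INTEGER)")
        conn.executemany("INSERT INTO StorageElement VALUES (?, ?)", rows)
        return conn.execute("SELECT * FROM StorageElement").fetchall()
    finally:
        conn.close()


# Hypothetical storage elements and sizes.
sample_rows = [("se1.example.org", 500), ("se2.example.org", 1200)]
for se_id, se_size in query_storage_elements(sample_rows):
    print(se_id, se_size)
```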
E. Ronchieri – n° 19
Grid Information and Monitoring Services in EDG
EDG release 1.x is based entirely on MDS 2.x
- Due to stability problems with this component, we have recently been deploying a plain LDAP server in front of a top-level GIIS
EDG release 2.x is based on both MDS 2.x and R-GMA
- Since the GIS is a vital service for the WM, the Broker will rely on MDS 2.x until R-GMA proves to be reliable
E. Ronchieri – n° 21
Interfaces to SE
First release of the SE control system
The three interfaces to the outside world are:
- Data transfer: GridFTP will be used to transfer files over the WAN, and the files will be available to local nodes via NFS
- Information: existing MDS information providers will be extended to provide the extra information in the GLUE storage schema
- Control: functions such as reservation for reading and writing, metadata modification, and access via GridFTP
It is an implementation of the Storage Resource Management (SRM) specification (http://sdm.lbl.gov/srm-wg)
The SE control interface to a generic MSS has already been tailored for CERN and RAL
Work is under way with IN2P3, WP10 and WP9 to adapt it to their MSS
E. Ronchieri – n° 23
Naming Schemes
GUID – Global Unique Identifier
  guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
LFN – Logical File Name
  lfn://event20030612
SFN – Storage File Name (host + path + filename)
  sfn://ibm139.cnaf.infn.it/edg/storageelement/dev/wpsix/pippo
[Figure: one GUID linked to several LFNs (LFN1, LFN2, LFN3) and to several SFNs (SFN1, SFN2, SFN3)]
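As a small illustration of the "host + path + filename" structure of an SFN, the sketch below splits the example SFN from the slide into its three parts using Python's standard URL parser. The helper name `split_sfn` is hypothetical and not part of any EDG tool.

```python
import posixpath
from urllib.parse import urlsplit


def split_sfn(sfn):
    """Split an sfn:// name into (host, directory path, filename)."""
    parts = urlsplit(sfn)
    if parts.scheme != "sfn" or not parts.hostname:
        raise ValueError(f"not an SFN: {sfn!r}")
    directory, filename = posixpath.split(parts.path)
    return parts.hostname, directory, filename


host, directory, filename = split_sfn(
    "sfn://ibm139.cnaf.infn.it/edg/storageelement/dev/wpsix/pippo"
)
print(host, directory, filename)
```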
E. Ronchieri – n° 24
Replication Services: EDG Replica Manager
The Replica Manager (edg-replica-manager) relies on:
- Replica Metadata Catalog (RMC): used for querying and assigning LFNs
- Replica Location Service (RLS): used for locating replicas and assigning SFNs
- File Transfer (GridFTP): used for transferring files
- Optimization Client
E. Ronchieri – n° 25
Replication Services Architecture
[Figure: diagram labels: VO; User Interface; Resource Broker; Replica Metadata Catalog (LFNs -> GUID); Replica Location Service (GUID -> SFNs); Local Replica Catalog; two sites, each with a Replica Manager, Storage Element, Computing Element and Optimiser]
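The two mappings in the figure (LFNs -> GUID, then GUID -> SFNs) can be pictured with the toy catalog below. It is a sketch under simplifying assumptions: two in-memory dictionaries stand in for the Replica Metadata Catalog and the Replica Location Service, and the class, the alias LFN and the second SFN are hypothetical.

```python
class ReplicaCatalog:
    """Toy two-step catalog: LFN -> GUID, then GUID -> SFNs."""

    def __init__(self):
        self.lfn_to_guid = {}   # plays the role of the Replica Metadata Catalog
        self.guid_to_sfns = {}  # plays the role of the Replica Location Service

    def add_alias(self, lfn, guid):
        self.lfn_to_guid[lfn] = guid

    def add_replica(self, guid, sfn):
        self.guid_to_sfns.setdefault(guid, []).append(sfn)

    def replicas(self, lfn):
        """Resolve a logical name to all storage file names of its replicas."""
        guid = self.lfn_to_guid[lfn]
        return list(self.guid_to_sfns.get(guid, []))


catalog = ReplicaCatalog()
guid = "guid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"
catalog.add_alias("lfn://event20030612", guid)
catalog.add_alias("lfn://my-alias", guid)  # hypothetical second LFN
catalog.add_replica(guid, "sfn://ibm139.cnaf.infn.it/edg/storageelement/dev/wpsix/pippo")
catalog.add_replica(guid, "sfn://se.example.org/data/pippo")  # hypothetical replica
print(catalog.replicas("lfn://my-alias"))
```

Both LFNs resolve to the same list of replicas because they share one GUID.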
E. Ronchieri – n° 27
Review of WMS architecture
WMS architecture reviewed
- To apply the lessons learned and address the shortcomings that emerged with the first release of the software
- To address the scalability problems
- To increase the reliability of the system
- To favor interoperability with other Grid frameworks, by allowing WP1 modules (e.g. the RB) to be used also outside the EDG WMS
E. Ronchieri – n° 28
WMS Revised Architecture
[Figure: diagram labels: UI; Replica Manager; Information Service; RB node; Network Server; Workload Manager; Match-Maker/Broker; Job Adapter; Job Controller - CondorG; Log Monitor; RB storage; Logging & Bookkeeping; CE characteristics & status; SE characteristics & status]
E. Ronchieri – n° 29
Improvements
Duplication of persistent information related to jobs avoided
- LB is the only repository of job information
- Possible to have multiple LB servers per RB (to avoid bottlenecks)
Techniques to quickly recover from failures
- E.g. communication among components of the WMS is much more reliable (done via persistent queues in the file system)
- Also less exposed to memory leaks (coming not only from EDG software)
Flexibility and interoperability increased
- E.g. RB-Matchmaker as a pluggable module
- Glue Schema compliance
Other enhancements in design and implementation
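The idea of a persistent queue in the file system can be sketched as below. This is not the EDG implementation; it is a minimal illustration that assumes a single producer and a single consumer, stores each message as a JSON file, and relies on an atomic rename so that a reader never sees a half-written message. All names are hypothetical.

```python
import json
import os
import tempfile


class FileQueue:
    """Minimal FIFO queue whose messages survive a process restart."""

    def __init__(self, directory):
        self.directory = directory
        os.makedirs(directory, exist_ok=True)
        # Resume numbering after the newest message already on disk.
        pending = self._pending()
        self._next = int(pending[-1].split(".")[0]) + 1 if pending else 0

    def _pending(self):
        return sorted(n for n in os.listdir(self.directory) if n.endswith(".msg"))

    def put(self, message):
        name = f"{self._next:012d}"
        self._next += 1
        tmp = os.path.join(self.directory, name + ".tmp")
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump(message, fh)
            fh.flush()
            os.fsync(fh.fileno())  # make sure the bytes are on disk
        # Atomic rename: the message appears fully written or not at all.
        os.replace(tmp, os.path.join(self.directory, name + ".msg"))

    def get(self):
        pending = self._pending()
        if not pending:
            return None
        path = os.path.join(self.directory, pending[0])
        with open(path, encoding="utf-8") as fh:
            message = json.load(fh)
        os.remove(path)
        return message


with tempfile.TemporaryDirectory() as spool:
    FileQueue(spool).put({"job": "hypothetical-job-1"})
    # A new object on the same directory simulates a restarted component.
    print(FileQueue(spool).get())
```

Because the queue state lives entirely in the directory, a component that crashes and restarts simply picks up the files that are still there.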
E. Ronchieri – n° 30
New functionalities
User APIs
- Including a Java GUI
Trivial job check-pointing service
- The user can save the state of the job (defined by the application) from time to time
- A job can be restarted from an intermediate (i.e. previously saved) job state
Gang-matching
- Allows both CE and SE information to be taken into account in the matchmaking
- For example, to require a job to run on a CE close to an SE with enough space
Support for parallel MPI jobs
Support for interactive jobs
- Jobs running on some CE worker node where a channel to the submitting (UI) node is available for the standard streams (by integrating the Condor Bypass software)
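The gang-matching example above (a CE close to an SE with enough space) can be sketched as a joint filter over CE and SE records. This is only an illustration of the idea, not the WMS matchmaker: the record fields, the function name and the sample resources are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class StorageElement:
    name: str
    free_space_gb: float


@dataclass
class ComputingElement:
    name: str
    free_cpus: int
    close_ses: list = field(default_factory=list)  # names of nearby SEs


def gang_match(ces, ses, min_cpus, min_space_gb):
    """Return (CE, SE) name pairs where the CE has enough CPUs and a close SE has enough space."""
    se_by_name = {se.name: se for se in ses}
    matches = []
    for ce in ces:
        if ce.free_cpus < min_cpus:
            continue
        for se_name in ce.close_ses:
            se = se_by_name.get(se_name)
            if se is not None and se.free_space_gb >= min_space_gb:
                matches.append((ce.name, se.name))
    return matches


# Hypothetical resources.
ses = [StorageElement("se-a", 50.0), StorageElement("se-b", 500.0)]
ces = [
    ComputingElement("ce-1", free_cpus=8, close_ses=["se-a"]),
    ComputingElement("ce-2", free_cpus=4, close_ses=["se-b"]),
    ComputingElement("ce-3", free_cpus=0, close_ses=["se-b"]),
]
print(gang_match(ces, ses, min_cpus=1, min_space_gb=100.0))
```

Here ce-1 is rejected because its close SE is too small and ce-3 because it has no free CPUs, so only the pair of ce-2 and se-b remains.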
E. Ronchieri – n° 32
Installation
EDG SW:
- Is delivered via RPMs
- Is managed in a CVS repository
Globus + Condor SW:
- Are provided via VDT (delivered as RPMs)
- Upgraded to Globus 2.2.4 and Condor 6.5.1
LCFGng:
- Is an automatic installation tool based on RPMs
- Is also used for the configuration of the middleware components
- Works for RH 6.2 and RH 7.3
Sites:
- Development testbed
E. Ronchieri – n° 33
EDG Deploying
R-GMA, RM, RLS, ROS, RMC, and WMS + GLUE schema
EDG release 2.0
- A temporary tag contains the functionalities for EDG 2.0 (deployed at CERN, NIKHEF, CNAF, and RAL)
- It will not be officially tagged as EDG 2.0 until the basic functionalities work (e.g. job submission, data transfers, etc.)
- The first EDG 2.0 tag is hoped for at the end of this week
The move to gcc 3.2.2 for all software is planned for this September
The integration of more functionalities is entirely at the mercy of LCG