The Disk Resource Manager: A Berkeley SRM for Disk Cache
Implementation and Testing Experience on the OSG
Jorge Luis Rodriguez
IBP/Grid Mini-Workshop
Apr 7-8, 2005
4/7/2005 Jorge L. Rodriguez 2
Storage in Grid3
• No real Storage Element in Grid3
• Defined a set of storage locations to enable jobs to run at a site
  – The middleware location
    • $Grid3: sharing recommended, the $VDT_LOCATION
  – Shared across the entire CE and chmod = 0777
    • $APP: Applications are installed here
    • $DATA: User scratch space, lifetime ~ per exp. run
    • $TMP: Another user scratch space, lifetime ~ per job
  – Non-shared, local to worker nodes
    • $WNTMP: Also chmod 0777, lifetime ~ per job
• Very simplistic, but something was needed to get us off the ground
(Side label on the slide: On the CE)
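The storage locations above lend themselves to a quick site-side sanity check. The sketch below is only an illustration, not part of Grid3 or the VDT: it assumes the four locations are exported as environment variables named APP, DATA, TMP and WNTMP, and the function name check_locations is hypothetical.

```python
import os
import stat

# Variable names follow the slide; treating them as environment
# variables is an assumption made for this sketch.
LOCATIONS = ("APP", "DATA", "TMP", "WNTMP")


def check_locations(env):
    """Return {name: status} for each expected storage location."""
    report = {}
    for name in LOCATIONS:
        path = env.get(name)
        if not path:
            report[name] = "unset"
        elif not os.path.isdir(path):
            report[name] = "missing"
        else:
            mode = stat.S_IMODE(os.stat(path).st_mode)
            # 0777 means read/write/execute for owner, group and others.
            if mode & 0o777 == 0o777:
                report[name] = "ok"
            else:
                report[name] = "mode %04o" % mode
    return report
```

Calling check_locations(os.environ) on a worker node would report which locations are unset, missing, or not world-writable.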
4/7/2005 Jorge L. Rodriguez 3
DRM Service Deployment Goals
• To provide managed “tactical” storage to an OSG site
  – Apply SRM interface to $DATA… shared space
  – Storage space reservation
  – Storage request scheduling
• Should function in the OSG environment
  – OSG Packaging
    • pacman
  – OSG MIS
    • Publish relevant service parameters (port #s, service names…)
  – OSG Authorization
    • Multi-VO, VOMS-managed gridmap-file
4/7/2005 Jorge L. Rodriguez 4
Service Description
• Brief overview of the DRM Service
  – CORBA-based service
    • Based on SRM v.1.1 specification
    • Movement of files into and out of SRM cache
    • Default lifetime, space quota, caching and queuing policies are supported
    • Can be used through command-line DRM clients
  – Web Service component is provided by a Java-based wrapper
    • Interoperable with other SRMs (e.g. dCache-SRM)
    • Can be used through the Fermilab srmcp client
4/7/2005 Jorge L. Rodriguez 5
Service Description
• DRM dependencies
  – Functions on top of ordinary UNIX file systems: local, NFS…
  – Binaries available for RH7.x and RHEL 3
  – Statically linked and self-contained
    • Requires Java 1.4+
    • Requires data movement services: gTK, gsiftp
4/7/2005 Jorge L. Rodriguez 7
OSG Organization
[Figure: OSG organization chart. Labels: Enterprise, Technical Groups, Research Grid Projects, VOs, Researchers, Sites, Service Providers, Labs; Activities; Advisory Committee; Core OSG Staff (few FTEs, manager); OSG Council (all members above a certain threshold; Chair, officers); Executive Board (8-15 representatives; Chair, officers).]
4/7/2005 Jorge L. Rodriguez 8
The DRM Activity
[Figure: the DRM activity and related OSG activities. Labels: TG-Storage, OSG deployment, Monitoring, Privilege, Integration, DRM; sites: UFlorida, LBL, Argonne, Vanderbilt, FNAL, …; items: Documents, Readiness Plan, Testbed, Plone website.]
4/7/2005 Jorge L. Rodriguez 9
The DRM Activity
• Readiness Plan
  – Details what we plan to do to get the service ready for deployment in the OSG
  – Examine deployment strategies
    • Deploy with other CE services
    • Deploy as stand-alone service (on separate host)
  – Examine impact on other OSG services: authorization, monitoring, packaging…
  – Develop testing algorithm
    • Include service verification
    • Include mock-up and real application tests
4/7/2005 Jorge L. Rodriguez 10
The DRM Activity
• Logistics and Documentation
  – Telecon meetings on a semi-regular basis
  – E-mail list osg-drm@opensciencegrid.org
• Plone website plone.opensciencegrid.org
  – Installation instructions
  – Meetings and minutes
  – Collection of interesting documents
  – DRM testbed site information
4/7/2005 Jorge L. Rodriguez 12
Installation
• DRM is included as part of VDT 1.3.3+
  – DRM vers 1.2.4 in the VDT
  – The CORBA components are installed and configured by the VDT
• Can be deployed with everything else…
  – How it's currently deployed in the DRM testbed
  – Canonical Grid3 site…
• Can also be deployed stand-alone
  – VDT will handle dependencies
  – Also being tried in the DRM testbed
4/7/2005 Jorge L. Rodriguez 13
Configuration
• Should install as a non-privileged user (srm)
  – Service provider recommends this for security reasons
• File system configuration
  – These recommendations are a work in progress!
  – Allocate a separate partition for the cache
  – RW for the service uid, all others RO
• Can and should use service certificates
• Auto-configuration and SysV startup script included
  – Hand-entered parameters will likely be required
  – Web Service configuration is now working
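The file system recommendations above (separate partition, RW for the service uid, RO for everyone else) can be checked with a few lines of Python. This is a minimal sketch under stated assumptions, not something shipped with DRM or the VDT: the function names are hypothetical, and "separate partition" is approximated by testing whether the cache directory is itself a mount point.

```python
import os
import stat


def cache_permissions_ok(path, service_uid):
    """True if path is owned by service_uid, readable and writable by
    its owner, and not writable by group or others."""
    st = os.stat(path)
    mode = stat.S_IMODE(st.st_mode)
    owner_rw = bool(mode & stat.S_IRUSR) and bool(mode & stat.S_IWUSR)
    others_read_only = not (mode & (stat.S_IWGRP | stat.S_IWOTH))
    return st.st_uid == service_uid and owner_rw and others_read_only


def on_own_partition(path):
    """Approximation: the cache has its own partition if the cache
    directory is a mount point."""
    return os.path.ismount(path)
```

If the service runs under the srm account mentioned above, the uid could be looked up with pwd.getpwnam("srm").pw_uid, provided that account exists on the host.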
4/7/2005 Jorge L. Rodriguez 15
Site Level Testing
• CORBA and Java Web Services are running
• Simple file movement exercise
  – Move files into and out of the DRM cache via the SRM interface
  – Move files into and out of the DRM cache from a remote DRM client
  – Move files into and out of the DRM cache from a remote SRM-dCache client
• DRM test harness (from LBL team)
  – Included with DRM vers 1.2.4+
  – Will be part of the overall site verification
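A file movement exercise of this kind boils down to a round trip: move a file into the cache, move it back out, and confirm the bytes survived. The sketch below is not the LBL test harness; it is a hypothetical stand-in in which the transfer step is a pluggable callable that defaults to a plain local copy. A real exercise would substitute a function that invokes a DRM or SRM client.

```python
import hashlib
import shutil


def sha256_of(path):
    """Checksum a file in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(source, cache_path, returned_path, transfer=shutil.copyfile):
    """Move a file into the cache and back out, then compare checksums.

    transfer(src, dst) is a stand-in for the real data mover.
    """
    transfer(source, cache_path)         # into the cache
    transfer(cache_path, returned_path)  # back out of the cache
    return sha256_of(source) == sha256_of(returned_path)
```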
4/7/2005 Jorge L. Rodriguez 16
Application Level Test
• Simplified (mockup) application test will include:
  1. Stage file into DRM cache from remote DRM/SRM-dCache client
  2. Execute job that accesses the “staged in” file in DRM cache via POSIX calls
  3. Job creates an output file on local storage ($WNTMP or $TMP)
  4. Job moves file from local storage into local DRM cache
  5. Job moves file from DRM cache to SRM-managed or regular remote location
• Testing by real application(s)
  – Make and document recommendations for use of DRM services in job work flow
  – Make DRM-enabled sites available for application testers
    • Schedule with application administrators
    • Monitor activity by both site and application administrators
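The five mockup steps can be sketched as a single function. Everything here is an assumption for illustration: the function name run_mock_job, the output file name, the upper-casing "work" done by the job, and the use of a local copy as the default transfer are all hypothetical placeholders for the real DRM/SRM client calls and the real application.

```python
import os
import shutil


def run_mock_job(remote_input, cache_dir, scratch_dir, remote_output,
                 transfer=shutil.copyfile):
    """Walk through the five mockup steps with a pluggable data mover."""
    # 1. Stage the input file into the cache from the remote location.
    staged = os.path.join(cache_dir, os.path.basename(remote_input))
    transfer(remote_input, staged)

    # 2. The job reads the staged file with ordinary POSIX file calls.
    with open(staged, "rb") as f:
        data = f.read()

    # 3. The job writes its output to local scratch ($WNTMP or $TMP).
    local_out = os.path.join(scratch_dir, "job_output.dat")
    with open(local_out, "wb") as f:
        f.write(data.upper())  # placeholder for the real computation

    # 4. Move the output from local storage into the local cache.
    cached_out = os.path.join(cache_dir, "job_output.dat")
    transfer(local_out, cached_out)
    os.remove(local_out)

    # 5. Move the output from the cache to the remote destination.
    transfer(cached_out, remote_output)
    return remote_output
```

Keeping the transfer as a parameter lets the same workflow run against a local directory first and against a DRM-enabled site later.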
4/7/2005 Jorge L. Rodriguez 17
Service Monitoring
• Will need input from Service Provider as to what to monitor and what to measure
• Performance of storage requests
  – How does the reservation system work?
  – How does storage request scheduling function?
• Resource load from DRM activities
  – Impact on server load
  – Impact on other running services
4/7/2005 Jorge L. Rodriguez 19
Where Are We Now?
• DRM will NOT be ready for OSG-0
  – Service was deployed on two OSG-based sites
  – Could not complete simple service operational exercises in time for OSG deployment decision
    • Need more “user” effort (testbed…)
    • Dependency on other services critical
    • Service difficult to use due to lack of documentation
    • Ongoing service development, changes and significant new features between versions
• Will try for next OSG release
  – OSG integration is very busy with other services anyway…
4/7/2005 Jorge L. Rodriguez 20
Where Are We Now?
• Communication channels created
  – Plone activity, email lists and meetings
  – Coordinating with OSG-TG
• Four sites in the readiness testbed
  – Argonne: Ed May
  – Buffalo: Mark Green
  – Florida: Jorge Rodriguez
  – LBL: Alex Sim (DRM developers)
4/7/2005 Jorge L. Rodriguez 21
Where Are We Now?
• Activities now focused on using DRM software
  – CORBA services installed and used
    • Exercised file movement with gsiftp and SRM
    • Moving files into the DRM cache worked OK
    • Problems with new/old proxies
  – Web Services
    • All of these are now up and running out of the box
    • So far only experts have operated the service using these
• Need to continue going through the procedures outlined in the readiness plan
4/7/2005 Jorge L. Rodriguez 22
Next Steps
• Continue to work out kinks in the installation and configuration procedures
• Feed those back into the VDT packaging
• Run through the testing and validation steps outlined in slides 8 through 11
• Understand, document and communicate the service functionality:
  – To OSG site admins:
    • Impact on an individual site and services
    • Impact on OSG grid-wide services
  – To OSG application community:
    • How will it impact previously used work flows?
    • What other changes are needed to effectively use the new services?