Use of SRM File Streaming by Gateway
Alex Sim
Arie Shoshani
May 2008
File Streaming – how it works
[Diagram labels: “Get file”, “Release file”, “user quota” (four boxes), “large request for files” (two), “small file request”.]
File Streaming – what are the advantages
• Can accommodate very large requests with a limited quota
• No waste of space: use only space needed for files
• Reuse space as soon as files are transferred and “released”
• Share files that multiple users ask for
• Keep “popular” (hot) files in cache as long as space permits
• Can have smaller quotas to accommodate more users
• Lifetimes can be longer, as long as files are released (avoids cutting off requests before they finish)
• With DML, transfer can be started right away, even if only some of the files are in cache
• Overlap transfer to cache (from archive or another site) with transfer to User
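The sharing, reuse and "hot file" points above can be illustrated with a small sketch. It assumes a pin-count cache with least-recently-used eviction of unpinned files. The slides do not say how the SRM implements this, so the class name `PinnedCache` and its methods are hypothetical and not part of any SRM API.

```python
from collections import OrderedDict


class PinnedCache:
    """Toy disk cache: files are pinned per user and evicted only when unpinned."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.used = 0
        # name -> [size, pin_count]; insertion order tracks recency (oldest first)
        self.files = OrderedDict()

    def get(self, name, size):
        """Pin a file for one user, staging it if needed. Returns True on success."""
        if name in self.files:
            self.files[name][1] += 1       # shared: another user pins the same copy
            self.files.move_to_end(name)   # mark as recently used
            return True
        if size > self.capacity:
            return False                   # can never fit, even in an empty cache
        # Evict unpinned files, least recently used first, until the new file fits.
        for victim in list(self.files):
            if self.used + size <= self.capacity:
                break
            victim_size, pins = self.files[victim]
            if pins == 0:
                del self.files[victim]
                self.used -= victim_size
        if self.used + size > self.capacity:
            return False                   # remaining files are pinned; caller must wait
        self.files[name] = [size, 1]
        self.used += size
        return True

    def release(self, name):
        """Unpin a file; it stays cached (hot) until its space is needed."""
        self.files[name][1] -= 1
```

With a capacity of 100, two users can pin the same 60-unit file at the cost of one copy. A second 60-unit file is refused while the first is pinned, and is accepted once both users have released it.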
Scenario 1: Simple Scenario for User File Access
• Simple http or wget download from ESG Gateway site
• User goes to Gateway, selects files, requests files
• User’s quota must be sufficient to hold all requested files (otherwise request refused)
• Gateway gets files into SRM disk from local or remote sites
• User finds out the status from the Gateway either by browser or email
• User downloads files either by clicking file links on the browser or by wget
• User “clicks” Gateway to release files that are downloaded (no way to enforce that)
[Diagram labels: User’s browser (browser or wget) on the user’s machine; http transfer; ESG Gateway with DiskCache (NCAR); SRM with DiskCache in front of NCAR/MSS, NERSC/HPSS (LBNL/NERSC) and ORNL/HPSS (ORNL).]
Scenario 2: Download with DML from ESG - No File Streaming
• User downloads DataMoverLite (DML)
• User goes to Gateway, selects files, requests files
• User’s quota must be sufficient to hold all requested files (otherwise request refused)
• Gateway submits the request to the SRM
• SRM returns the request token to the user
• User launches DML with request token
• DML checks the status of the request and starts transferring files from Gateway to the user’s local disk (right away for files already in cache)
• DML downloads files and sends “release” to SRM through the Gateway, so space is not wasted
• After file transfers are completed, DML sends “request completed” to Gateway
[Diagram labels: User’s browser and DataMoverLite on the user’s machine; request, request token, release, HTTP/HTTPS transfer; ESG Gateway with DiskCache (NCAR); SRM with DiskCache in front of MSS, NERSC/HPSS (LBNL/NERSC) and ORNL/HPSS (ORNL).]
Scenario 3: Download with DML from ESG - File Streaming
• File Streaming
  • When files fit the user quota space: DML starts downloading files (right away for files already in the cache)
  • When files do not fit the user quota space: DML uses the “file streaming” feature (DML repeatedly sends Status to Gateway, and downloads available files)
• SRM brings into the Gateway’s SRM cache only as many files as fit into the user’s space quota
• As DML downloads files and sends “release” to SRM through the Gateway, more files are brought in and “streamed” to the client
[Diagram labels: User’s browser and DataMoverLite on the user’s machine; request, request token, Status, release, file transfer; ESG Gateway with DiskCache (NCAR); SRM with DiskCache in front of MSS, NERSC/HPSS (LBNL/NERSC) and ORNL/HPSS (ORNL).]
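The streaming loop in this scenario can be sketched as a small in-memory simulation. `FakeGateway` is a hypothetical stand-in for the Gateway and SRM together, and its method names are illustrative only, not the BeStMan or DML interfaces. The "download" is a placeholder, and a real client would wait between Status polls.

```python
from collections import deque


class FakeGateway:
    """In-memory stand-in for the Gateway/SRM pair (hypothetical, for illustration)."""

    def __init__(self, file_sizes, quota):
        self.sizes = dict(file_sizes)
        self.quota = quota
        self.queue = deque(file_sizes)   # files not yet staged, in request order
        self.in_cache = []               # staged files waiting to be downloaded
        self.used = 0                    # quota space currently occupied
        self.peak = 0                    # highest occupancy seen so far

    def _stage(self):
        # Bring in queued files while the next one still fits in the quota.
        while self.queue and self.used + self.sizes[self.queue[0]] <= self.quota:
            name = self.queue.popleft()
            self.in_cache.append(name)
            self.used += self.sizes[name]
            self.peak = max(self.peak, self.used)

    def status(self):
        """Return the files currently available for download."""
        self._stage()
        return list(self.in_cache)

    def release(self, name):
        """Free the quota space held by a downloaded file."""
        self.in_cache.remove(name)
        self.used -= self.sizes[name]

    def done(self):
        return not self.queue and not self.in_cache


def stream_download(gateway):
    """Client loop: poll status, download what is ready, release, repeat."""
    downloaded = []
    while not gateway.done():
        ready = gateway.status()
        if not ready:
            # Nothing staged and nothing releasable: the next file exceeds the quota.
            raise RuntimeError("a file is larger than the quota")
        for name in ready:
            downloaded.append(name)      # placeholder for the actual file transfer
            gateway.release(name)
    return downloaded
```

For five files of 40 units each and a quota of 100, this sketch delivers all 200 units while never holding more than 80 units in the cache at once.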
File Streaming
[Diagram labels: User’s browser and DataMoverLite on the user’s machine; file selection and request, request token, Status, release, file transfer; ESG Gateway with DiskCache (NCAR); SRM with DiskCache in front of MSS, NERSC/HPSS (LBNL/NERSC) and ORNL/HPSS (ORNL); Data Nodes with DiskCache, reached through SRM, GridFTP, FTP and HTTP.]
Scenario 4: Download with DML from any data source location
• User downloads DataMoverLite
• User goes to Gateway, selects files
• Gateway gets file information with source location (Gateway does not request any files)
• User launches DML with file info provided by the Gateway
• DML contacts Data Source with appropriate transfer protocol (security mechanism needs to be worked out for GSI access)
• When a file is available, DML downloads it to its local disk
• DML releases files when done transferring
• DML reports statistics to the Gateway
[Diagram: numbered steps (1) Find file(s), get SURLs; (2) Request file(s), including from NCAR’s MSS; (3) Transfer file(s); (4) Release file(s). Labels: User’s browser and DataMoverLite on the user’s machine; ESG Gateway with DiskCache; SRMs with MSS (NCAR’s MSS); SRMs with Disk; HTTP, GridFTP and FTP sources.]
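How DML might pick the "appropriate transfer protocol" from a file's source location can be sketched as a lookup on the URL scheme. The mapping and the function name below are assumptions made for illustration; the slides name only the source types (SRM, HTTP, GridFTP, FTP) and do not describe DML's actual selection logic. The host names in the usage line are placeholders.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to the transfer method to use.
TRANSFER_METHODS = {
    "http": "HTTP",
    "https": "HTTP",
    "ftp": "FTP",
    "gsiftp": "GridFTP",   # gsiftp:// is the URL scheme used for GridFTP
    "srm": "SRM",          # SURLs; these need request and release steps as well
}


def pick_transfer_method(source_url):
    """Return the transfer method for a source URL, or raise ValueError."""
    scheme = urlparse(source_url).scheme.lower()
    try:
        return TRANSFER_METHODS[scheme]
    except KeyError:
        raise ValueError(f"unsupported source scheme: {scheme!r}") from None
```

For example, `pick_transfer_method("gsiftp://datanode.example.org/data/file.nc")` selects GridFTP, while a `srm://` SURL selects the SRM path with its extra request and release steps.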
Status of BeStMan Java Interfaces
• Provide API to find out cache usage
  • Done: srmPing provides Total_space and Used_space
• Provide API for request status summary
  • Done: srmRequestSummary provides no_of_files (requested, in-cache, released, in-queue)
• Provide API to get all available Transfer URLs within a user’s quota
  • Done: srmRequestStatus provides that
• Provide API to abort a request
  • Done: srmAbortRequest aborts request and releases all files
• Gateway needs to estimate total size of request
  • in order to check if request fits in quota
  • Accordingly, it advises the user what to use
  • Discuss: should wget be used in streaming mode (or all-in-cache mode)?
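The size check in the last item can be sketched as follows. The function name and the mode strings are hypothetical. The sketch assumes that streaming is offered only to DML users, since whether wget should be used in streaming mode is still listed above as an open question.

```python
def advise_download_mode(file_sizes, quota, has_dml):
    """Suggest a download mode from the estimated request size (illustrative only)."""
    total = sum(file_sizes)
    if total <= quota:
        return "all-in-cache"   # whole request fits: browser, wget or DML all work
    if has_dml:
        return "streaming"      # request exceeds quota: DML streams it in pieces
    return "refuse"             # no DML and over quota, as in Scenario 1
```

A request of three 40-unit files against a quota of 100 would be advised to use streaming if the user has DML, and would be refused otherwise.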