reliable and efficient grid data placement using stork and diskrouter
DESCRIPTION
Reliable and Efficient Grid Data Placement using Stork and DiskRouter. Tevfik Kosar University of Wisconsin-Madison [email protected] April 15 th , 2004. A Single Project. LHC (Large Hadron Collider) Comes online in 2006 Will produce 1 Exabyte data by 2012 - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/1.jpg)
Reliable and Efficient Grid Data Placement
using Stork and DiskRouter
Tevfik Kosar University of Wisconsin-Madison
April 15th, 2004
![Page 2: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/2.jpg)
A Single Project..
LHC (Large Hadron Collider) Comes online in 2006 Will produce 1 Exabyte data by 2012 Accessed by ~2000 physicists, 150
institutions, 30 countries
![Page 3: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/3.jpg)
And Many Others..
Genomic information processing applicationsBiomedical Informatics Research Network (BIRN) applicationsCosmology applications (MADCAP)Methods for modeling large molecular systems Coupled climate modeling applicationsReal-time observatories, applications, and data-management (ROADNet)
![Page 4: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/4.jpg)
The Same Big Problem..
Need for data placement: Locate the data Send data to processing sites Share the results with other sites Allocate and de-allocate storage Clean-up everythingDo these reliably and efficiently
![Page 5: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/5.jpg)
Outline
IntroductionStorkDiskRouterCase StudiesConclusions
![Page 6: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/6.jpg)
Stork
A scheduler for data placement activities in the GridWhat Condor is for computational jobs, Stork is for data placement Stork comes with a new concept:“Make data placement a first class
citizen in the Grid.”
![Page 7: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/7.jpg)
The Concept
• Stage-in
• Execute the Job
• Stage-out
Stage-in
Execute the job
Stage-outRelease input space
Release output space
Allocate space for input & output data
Individual Jobs
![Page 8: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/8.jpg)
The Concept
• Stage-in
• Execute the Job
• Stage-out
Stage-in
Execute the job
Stage-outRelease input space
Release output space
Allocate space for input & output data
Data Placement Jobs
Computational Jobs
![Page 9: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/9.jpg)
DAGMan
The Concept
CondorJob
QueueDaP A A.submitDaP B B.submitJob C C.submit…..Parent A child BParent B child CParent C child D, E…..
C
StorkJob
Queue
E
DAG specification
A CBD
E
F
![Page 10: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/10.jpg)
Why Stork?
Stork understands the characteristics and semantics of data placement jobs.Can make smart scheduling decisions, for reliable and efficient data placement.
![Page 11: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/11.jpg)
Failure Recovery and Efficient Resource Utilization
Fault tolerance Just submit a bunch of data placement jobs,
and then go away..
Control number of concurrent transfers from/to any storage system Prevents overloading
Space allocation and De-allocations Make sure space is available
![Page 12: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/12.jpg)
Support for Heterogeneity
Protocol translation using Stork memory buffer.
![Page 13: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/13.jpg)
Support for Heterogeneity
Protocol translation using Stork Disk Cache.
![Page 14: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/14.jpg)
Flexible Job Representation and Multilevel Policy Support[
Type = “Transfer”; Src_Url =
“srb://ghidorac.sdsc.edu/kosart.condor/x.dat”; Dest_Url =
“nest://turkey.cs.wisc.edu/kosart/x.dat”;…………Max_Retry = 10;Restart_in = “2 hours”;
]
![Page 15: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/15.jpg)
Run-time AdaptationDynamic protocol selection[ dap_type = “transfer”; src_url = “drouter://slic04.sdsc.edu/tmp/test.dat”; dest_url = “drouter://quest2.ncsa.uiuc.edu/tmp/test.dat”; alt_protocols = “nest-nest, gsiftp-gsiftp”;]
[ dap_type = “transfer”; src_url = “any://slic04.sdsc.edu/tmp/test.dat”; dest_url = “any://quest2.ncsa.uiuc.edu/tmp/test.dat”;]
![Page 16: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/16.jpg)
Run-time Adaptation
Run-time Protocol Auto-tuning[
link = “slic04.sdsc.edu – quest2.ncsa.uiuc.edu”; protocol = “gsiftp”;
bs = 1024KB; //block sizetcp_bs = 1024KB; //TCP buffer sizep = 4;
]
![Page 17: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/17.jpg)
Outline
IntroductionStorkDiskRouterCase StudiesConclusions
![Page 18: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/18.jpg)
DiskRouter
A mechanism for high performance, large scale data transfersUses hierarchical buffering to aid in large scale data transfers Enables application-level overlay network for maximizing bandwidthSupports application-level multicast
![Page 19: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/19.jpg)
Store and Forward
Improves performance when bandwidth fluctuation between A and B is independent of the bandwidth fluctuation between B and C
DiskRouter
With DiskRouter
Without DiskRouter
A
B
C
![Page 20: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/20.jpg)
DiskRouter Overlay Network
A B
90 Mb/s
![Page 21: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/21.jpg)
DiskRouter Overlay Network
A B
DiskRouter
90 Mb/s
400 Mb/s 400 Mb/s
C
Add a DiskRouter Node C which is not necessarily on the path from A to B, to enforce use of an
alternative path.
![Page 22: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/22.jpg)
Data Mover/Distributed Cache
Source writes to the closest DiskRouter and Destination receives it up from its closest DiskRouter
Source Destination
DiskRouter Cloud
![Page 23: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/23.jpg)
Outline
IntroductionStorkDiskRouterCase StudiesConclusions
![Page 24: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/24.jpg)
Case Study I: SRB-UniTree Data Pipeline
Transfer ~3 TB of DPOSS data from SRB @SDSC to UniTree @NCSAA data pipeline created with Stork and DiskRouter
SRB Server UniTree
Server
SDSC Cache
NCSA Cache
Submit Site
![Page 25: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/25.jpg)
UniTree not responding Diskrouter reconfigured and restarted
SDSC cache reboot & UW CS Network outage Software problem
Failure Recovery
![Page 26: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/26.jpg)
Case Study -II
![Page 27: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/27.jpg)
Dynamic Protocol Selection
![Page 28: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/28.jpg)
Runtime Adaptation
Before Tuning:
• parallelism = 1
• block_size = 1 MB
• tcp_bs = 64 KBAfter Tuning:
• parallelism = 4
• block_size = 1 MB
• tcp_bs = 256 KB
![Page 29: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/29.jpg)
Conclusions
Regard data placement as first class citizen.Introduce a specialized scheduler for data placement.Introduce a high performance data transfer tool.End-to-end automation, fault tolerance, run-time adaptation, multilevel policy support, reliable and efficient transfers.
![Page 30: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/30.jpg)
Future work
Enhanced interaction between Stork, DiskRouter and higher level planners co-scheduling of CPU and I/O
Enhanced authentication mechanismsMore run-time adaptation
![Page 31: Reliable and Efficient Grid Data Placement using Stork and DiskRouter](https://reader036.vdocuments.us/reader036/viewer/2022062517/56813e5c550346895da85964/html5/thumbnails/31.jpg)
You don’t have to FedEx your data anymore.. We deliver it for you!
For more information Stork:
• Tevfik Kosar• Email: [email protected]• http://www.cs.wisc.edu/condor/stork
DiskRouter:• George Kola• Email: [email protected]• http://www.cs.wisc.edu/condor/diskrouter