Tevfik Kosar
Computer Sciences Department
University of Wisconsin-Madison
kosart@cs.wisc.edu
http://www.cs.wisc.edu/condor

Managing and Scheduling Data Placement (DaP) Requests

www.cs.wisc.edu/condor

Outline

› Motivation

› DaP Scheduler

› Case Study: DAGMan

› Conclusions

Demand for Storage

› Applications require access to larger and larger amounts of data
  • Database systems
  • Multimedia applications
  • Scientific applications (e.g. High Energy Physics & Computational Genomics)
  • Currently terabytes, soon petabytes of data

Is Remote access good enough?

› Huge amounts of data (mostly on tapes)
› Large number of users
› Distance / low bandwidth
› Different platforms
› Scalability and efficiency concerns
=> A middleware layer is required

Two approaches

› Move job/application to the data
  • Less common
  • Insufficient computational power on storage site
  • Not efficient
  • Does not scale

› Move data to the job/application

Move data to the Job

[Figure: diagram with the components Huge tape library (terabytes), Remote Staging Area, WAN, Local Storage Area (e.g. local disk, NeST server), LAN, and Compute cluster.]

Main Issues

› 1. Insufficient local storage area

› 2. CPU should not wait much for I/O

› 3. Crash Recovery

› 4. Different Platforms & Protocols

› 5. Make it simple

Data Placement Scheduler (DaPS)

› Intelligently Manages and Schedules Data Placement (DaP) activities/jobs

› DaPS is to DaP jobs what Condor is to computational jobs

› Just submit a bunch of DaP jobs and then relax.

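DaPS Architecture

[Figure: DaPS clients send requests (Req.) to the DaPS server, which is drawn with Accept, Sched., and Exec. components, a request queue, and a buffer. The storage endpoints shown, split into local and remote, are Local Disk, NeST Server, GridFTP Servers, SRB Server, and SRM Server. The operations shown are Get, Put, and third-party transfer.]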

DaPS Client Interface

› Command line: dap_submit <submit file>

› API: dapclient_lib.a (library), dapclient_interface.h (header)

DaP jobs

› Defined as ClassAds

› Currently four types:
  • Reserve
  • Release
  • Transfer
  • Stage

DaP Job ClassAds

[
  Type = Reserve;
  Server = nest://turkey.cs.wisc.edu;
  Size = 100MB;
  reservation_no = 1;
  ……
]
[
  Type = Transfer;
  Src_url = srb://ghidorac.sdsc.edu/kosart.condor/x.dat;
  Dst_url = nest://turkey.cs.wisc.edu/kosart/x.dat;
  reservation_no = 1;
  ......
]
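
As a rough illustration of the ClassAd layout above, the following Python sketch renders a Reserve job and a Transfer job as bracketed "Name = value;" text. The helper name format_dap_classad is hypothetical and not part of DaPS. Only the attributes shown on the slide are included; the elided attributes are left out.

```python
def format_dap_classad(attrs):
    """Render a dict in the bracketed 'Name = value;' layout shown on the slide.

    Hypothetical helper for illustration only; it is not part of DaPS.
    """
    lines = ["["]
    for name, value in attrs.items():
        lines.append(f"  {name} = {value};")
    lines.append("]")
    return "\n".join(lines)


# Attribute names and values are copied from the slide.
reserve = {
    "Type": "Reserve",
    "Server": "nest://turkey.cs.wisc.edu",
    "Size": "100MB",
    "reservation_no": 1,
}
transfer = {
    "Type": "Transfer",
    "Src_url": "srb://ghidorac.sdsc.edu/kosart.condor/x.dat",
    "Dst_url": "nest://turkey.cs.wisc.edu/kosart/x.dat",
    "reservation_no": 1,
}

# Both jobs share reservation_no = 1, linking the transfer to the reserved space.
submit_text = "\n".join(format_dap_classad(job) for job in (reserve, transfer))
print(submit_text)
```

The resulting text could be written to a file and passed to dap_submit, assuming the submit file uses the same layout as the slide.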

Supported Protocols

› Currently supported:
  • FTP
  • GridFTP
  • NeST (chirp)
  • SRB (Storage Resource Broker)

› Very soon:
  • SRM (Storage Resource Manager)
  • GDMP (Grid Data Management Pilot)
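
One way to picture protocol selection is a lookup on the URL scheme of the source and destination. The Python sketch below is only an illustration under stated assumptions: the schemes nest and srb appear in the DaP job example, while ftp and gsiftp (a common scheme for GridFTP URLs) are assumed here, and the function names are hypothetical.

```python
from urllib.parse import urlparse

# Mapping from URL scheme to the protocols listed as currently supported.
# "nest" and "srb" appear in the slide's URLs; "ftp" and "gsiftp" are assumptions.
SUPPORTED_SCHEMES = {
    "ftp": "FTP",
    "gsiftp": "GridFTP",
    "nest": "NeST (chirp)",
    "srb": "SRB",
}


def protocol_for(url):
    """Return the protocol name for a URL, or raise ValueError if unsupported."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported protocol scheme: {scheme!r}")
    return SUPPORTED_SCHEMES[scheme]


def check_transfer(src_url, dst_url):
    """Return the (source, destination) protocols for a transfer request."""
    return protocol_for(src_url), protocol_for(dst_url)


print(check_transfer(
    "srb://ghidorac.sdsc.edu/kosart.condor/x.dat",
    "nest://turkey.cs.wisc.edu/kosart/x.dat",
))
```

In this sketch a protocol from the "very soon" list, such as SRM, is rejected until its scheme is added to the table.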

Case Study: DAGMan

[Figure: DAGMan reads a .dag file and submits jobs A, B, C, and D to the Condor job queue.]

Current DAG structure

› All jobs are assumed to be computational jobs

[Figure: DAG with Job A at the top, Job B and Job C in the middle, and Job D at the bottom.]

Current DAG structure

› If data transfer to/from remote sites is required, this is performed via pre- and post-scripts attached to each job.

[Figure: the DAG of Job A, Job B, Job C, and Job D, with PRE and POST scripts attached to Job B.]

New DAG structure

› Add DaP jobs to the DAG structure

[Figure: Job B with its PRE and POST scripts is expanded into the jobs Reserve in & out, Transfer in, Job B, Transfer out, Release in, and Release out.]
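
The expanded structure can be modeled as a small dependency graph. The Python sketch below uses the standard-library graphlib module (Python 3.9+) to compute one valid execution order. The node names come from the slide, but the edges are an assumption about how the diagram is wired: space is reserved first, input is transferred in, Job B runs, then input space is released while output is transferred out and its space released.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of jobs that must finish before it can start.
# Node names are from the slide; the edges are assumed for illustration.
dependencies = {
    "Transfer in": {"Reserve in & out"},
    "Job B": {"Transfer in"},
    "Release in": {"Job B"},
    "Transfer out": {"Job B"},
    "Release out": {"Transfer out"},
}

# static_order() yields every node once, with predecessors before successors.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Under these assumed edges, "Release in" and "Transfer out" have no ordering between them, so a scheduler would be free to run them concurrently.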

New DAGMan Architecture

[Figure: DAGMan reads the .dag file and works with two queues: the Condor job queue, holding jobs A, B, C, and D, and the DaPS job queue, holding jobs X and Y.]
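
The two-queue arrangement can be sketched as a dispatch step that routes each DAG node by its type. This Python example is a simplified illustration, not DAGMan's actual logic: the function name dispatch, the "Compute" label, and the (name, type) job representation are assumptions, while the four DaP job types come from the earlier slide.

```python
from collections import deque

# The four DaP job types named earlier in the talk.
DAP_TYPES = {"Reserve", "Release", "Transfer", "Stage"}


def dispatch(jobs):
    """Split (name, type) pairs into a Condor queue and a DaPS queue.

    Hypothetical sketch: DaP job types go to the DaPS queue,
    everything else is treated as a computational job for Condor.
    """
    condor_queue, daps_queue = deque(), deque()
    for name, job_type in jobs:
        if job_type in DAP_TYPES:
            daps_queue.append(name)
        else:
            condor_queue.append(name)
    return condor_queue, daps_queue


condor_queue, daps_queue = dispatch([
    ("A", "Compute"),
    ("X", "Transfer"),
    ("B", "Compute"),
    ("Y", "Release"),
])
print(list(condor_queue), list(daps_queue))
```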

Conclusion

› More intelligent management of remote data transfer & staging
  • Increase local storage utilization
  • Maximize CPU throughput

Future Work

› Enhanced interaction with DAGMan

› Data Level Management instead of File Level Management

› Possible integration with Kangaroo to keep the network pipeline full

Thank You for Listening & Questions

› For more information
  • Drop by my office anytime (Room 3361, Computer Science & Stats. Bldg.)
  • Email: condor-admin@cs.wisc.edu