
Page 1

Building Efficient Data Planner for Peta-scale Science

Michal Zerola1, Jérôme Lauret3, Roman Barták2, Michal Šumbera1

1Nuclear Physics Institute, Academy of Sciences, Czech Republic

2Faculty of Mathematics and Physics, Charles University, Czech Republic

3Brookhaven National Laboratory, USA

ACAT 2010, Jaipur

Michal Zerola (NPI, ASCR) Building Efficient Data Planner 25.2.2010 1 / 18

Page 2

Outline

1 Introduction

Motivation

FAQ

Optimization

Requested features

2 Implementation

Architecture

Planner

Requirements

Database schema

Data Mover

3 Show case

4 Conclusions


Page 6

Why planning and scheduling in a distributed environment?

Exploiting remote sites and centers implicitly opens the question:

How to handle, control and efficiently use the resources?

balance between being fair to the users and optimizing utilization

random and uncoordinated access to the resources will hardly be optimal

Our research tackles the efficient data movement:

decoupling the data movement from job processing first

Current goal

Create a mechanism for an efficient and controlled way of moving (replicated) datasets to their destinations in the fastest way

⇒ combination of deliberative and reactive planning ⇐


Page 7

Situation


Page 8

Goal


Page 9

FAQ

Are you building yet another data transfer tool?

No, we are building a mechanism sitting between users and existing efficient data transfer tools, providing control, optimization, and load-balancing.

Are you mirroring the topology and characteristics of the full network in your model?

No, the model is based on an approximation of the latencies/bandwidths and is, from its startup point, self-adaptive to the environment.

Can I use my own planner or fair-share policy?

Yes, all decision-making modules are separate independent plugins.
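The self-adaptive link model mentioned above can be illustrated with a toy bandwidth estimator. This is only a sketch of the idea: the EWMA form, the `alpha` smoothing factor, and the function name are assumptions, not the planner's published model.

```python
def ewma_bandwidth(observed_mbps, prev_estimate, alpha=0.3):
    """Blend each newly observed transfer rate into a running link-speed
    estimate (exponentially weighted moving average), so the model adapts
    to network or service changes without mirroring the full topology.

    alpha is a hypothetical smoothing factor: higher means faster adaptation
    but noisier estimates.
    """
    return alpha * observed_mbps + (1 - alpha) * prev_estimate

# Example: a link estimated at 10 MB/s observes a 20 MB/s transfer.
estimate = ewma_bandwidth(20.0, 10.0)   # moves partway toward the observation
```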


Page 10

Optimization

Two levels of optimization:

1 Among data services (sharing the data)

2 Among sites - data centers (geographically spread)


Page 11

Requested features

Control:

respect different user priorities and usage history

support any queue-based fair share policy

provide estimates and status for the users

Load balancing:

prevent uncontrolled access and overloading of resources

load balancing of storage elements and network

Adaptability:

proper balance between reactive and deliberative planning

adapt to the network or service changes automatically


Page 12

Architecture


Page 13

Planner

Constraint-based, two approaches:

Constraint Programming (Choco)

Mixed Integer Programming (GLPK)

Fast, short-term deliberative planning
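To give a feel for what such a short-term plan decides, here is a toy greedy sketch (a longest-processing-time heuristic) that assigns replicated files to source links so the slowest link finishes as early as possible. It is an illustration only, not the Choco/GLPK models from the talk; the function and data shapes are assumptions.

```python
def plan_transfers(files, links):
    """Assign each file to the link that would finish it earliest.

    files: dict file -> size in MB (every file assumed reachable via every
           link, as in the show case where each file sits on HPSS, NFS
           and Xrootd).
    links: dict link -> speed in MB/s.
    Returns (assignment, makespan_seconds).
    """
    finish = {link: 0.0 for link in links}   # projected busy time per link
    assignment = {}
    # Schedule the largest files first (classic LPT heuristic).
    for f, size in sorted(files.items(), key=lambda kv: -kv[1]):
        best = min(links, key=lambda l: finish[l] + size / links[l])
        finish[best] += size / links[best]
        assignment[f] = best
    return assignment, max(finish.values())

# Hypothetical example: two fast services and one slow tape-backed one.
plan, makespan = plan_transfers(
    {"f1": 100, "f2": 100, "f3": 50},
    {"hpss": 10, "nfs": 25, "xrootd": 25},
)
```

A real deliberative planner adds constraints the greedy rule ignores (shared link capacities, per-request fairness), which is where CP or MIP formulations earn their keep.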


Page 14

Requested features

resources should be used effectively

objective: minimize time to bring files to the users


Page 15

Requested features (cont.)

use information about links usage from previously scheduled transfers

avoid creating bottlenecks


Page 16

Database schema

most exposed parts: 10^5-10^6 records in STAR

support for MySQL (InnoDB) and PostgreSQL

nodes: id (PK, char 32), cache_size (int), service_id (char 32), host (varchar 255), port (int), path_prefix (varchar 1024), site (char 32, nullable)

links: id (PK, int), node_from_id (FK -> nodes.id), node_to_id (FK -> nodes.id), speed (int), enabled (bool)

scheduled_transfers: id (PK, int), file_lfn (FK -> files.lfn), link_id (FK -> links.id), start_time (datetime), end_time (datetime), speed (float), status_id (FK -> statuses.id), retries (int)

statuses: id (PK, int), description (varchar 255)

files: lfn (PK, int), size (int)

requested_files: request_id (PK, FK -> requests.id), file_lfn (PK, FK -> files.lfn), status_id (FK -> statuses.id)

catalogue: file_pfn (PK, varchar 255), node_id (PK, FK -> nodes.id), file_lfn (FK -> files.lfn)

membership: user_id (PK, FK -> users.id), group_id (PK, FK -> groups.id)

groups: id (PK, char 32), priority (int)

users: id (PK, char 32), full_name (varchar 255)

requests: id (PK, int), user_id (FK -> users.id), group_id (FK -> groups.id), destination (FK -> nodes.id), init_date (datetime, nullable), query (varchar 1024, nullable), total_files (int, nullable), success_files (int, nullable), failed_files (int, nullable), status_id (FK -> statuses.id)
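A minimal, runnable sketch of the three tables most involved in scheduling, using SQLite for self-containment (the deck targets MySQL/InnoDB and PostgreSQL; column types are simplified accordingly):

```python
import sqlite3

# In-memory database standing in for the planner's MySQL/PostgreSQL backend.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (
    id          TEXT PRIMARY KEY,   -- char(32) in the real schema
    cache_size  INTEGER,
    service_id  TEXT,
    host        TEXT,
    port        INTEGER,
    path_prefix TEXT,
    site        TEXT
);
CREATE TABLE links (
    id           INTEGER PRIMARY KEY,
    node_from_id TEXT REFERENCES nodes(id),
    node_to_id   TEXT REFERENCES nodes(id),
    speed        INTEGER,
    enabled      BOOLEAN
);
CREATE TABLE scheduled_transfers (
    id         INTEGER PRIMARY KEY,
    file_lfn   TEXT,
    link_id    INTEGER REFERENCES links(id),
    start_time TEXT,
    end_time   TEXT,
    speed      REAL,
    status_id  INTEGER,
    retries    INTEGER
);
""")
```

With this layout, the planner writes rows into `scheduled_transfers` and each site's Data Mover reads back the rows for its own links.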


Page 17

Data Mover

distributed component, using efficient and existing transfer tools

running at each computing center, implemented in Python

back-ends for available services realize the transfers

works in a reactive way, following the computed plan
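The reactive loop above can be sketched as follows. This is a toy model: `PlanQueue`, `fetch_next`, and `mark_done` are hypothetical stand-ins for the real database queries, and the back-ends would wrap actual transfer tools rather than plain callables.

```python
class PlanQueue:
    """Toy in-memory stand-in for the scheduled_transfers table."""
    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.results = {}

    def fetch_next(self):
        # Next planned transfer for this site, or None when the plan is drained.
        return self.tasks.pop(0) if self.tasks else None

    def mark_done(self, task_id, ok):
        self.results[task_id] = ok


def run_mover(db, backends):
    """Reactive sketch of a per-site Data Mover: follow the computed plan,
    dispatching each transfer to the back-end for its service (HPSS, NFS,
    Xrootd, ...). Returns the number of transfers handled."""
    done = 0
    while True:
        task = db.fetch_next()
        if task is None:
            return done          # a real mover would sleep and poll again
        ok = backends[task["service"]](task["src"], task["dst"])
        db.mark_done(task["id"], ok)
        done += 1
```

The design point is that the mover itself makes no routing decisions; it only executes what the deliberative planner recorded, which keeps the reactive and deliberative layers cleanly decoupled.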


Page 18

Show case - setup

moving a data set to a single destination - an NFS location in Prague

every file from the dataset available from the HPSS, NFS, and Xrootd services at BNL

metric for success: being limited by WAN speed alone


Page 19

Show case - comparison

Planner was set up to reason about:

1 all services (HPSS, NFS, Xrootd) as possible data sources,

2 only HPSS (slow),

3 only NFS (fast),

4 only Xrootd (fast)

The use of resources was: HPSS - 19%, NFS - 38%, Xrootd - 43%

[Figure: Transferred files (%) vs. time (17:02-01:02), comparing Xrootd+NFS+HPSS, HPSS, Xrootd, and NFS]


Page 20

Show case - conclusion

Remarks:

Files are usually not all on NFS (central storage is reserved for ongoing data production in STAR, not past data production series)

Users would have to grab files from Xrootd or HPSS

Xrootd would create a load and impact analysis - our system provides immediate relief without sacrificing the transfer plan time

Frequent access to HPSS, and hence to tape, could be damaging (tape wear-out) and competes with other access (production sync to HPSS)

Our system balances resources in an adaptive fashion.

[Figure: same transferred-files-vs-time plot as on the previous slide]

Ultimately

Utilizing all resources brings the same performance as relying only on the fastest one (NFS/Xrootd), while adding load-balancing, control, and redundancy.


Page 21

Conclusions

Status:

planner, database, web interface are prepared and functional

performance of the pure planner extensively studied and tested in a simulated environment

all components are functional and installed in STAR, currently running tests

Perspectives:

implement multi-site transfers: we expect similar benefits in balancing

immediate relief to the Tier-0 center

self-adaptive capabilities will determine the best transfer path

data integrity benefits for "free"

Summary:

the concept of controlled and efficient data movement brings:

better efficiency due to an intelligent planner

controlled coordination

load-balancing


Page 22


Thank you!
