Final Presentation, Summer Internship Program, CERN


Upload: wbinventor

Post on 29-Nov-2014

DESCRIPTION

This was my final presentation of the summer to my research group at CERN. I gave this presentation with the other student I worked with, Martin Barisits, to the Distributed Data Management group of ATLAS at CERN.

TRANSCRIPT

Page 1: Final Presentation, Summer Internship Program, CERN

MARTINWILLSIM GRID Simulator

Martin Barisits, Will Boyd

Supervised by Mario Lassnig and Vincent Garonne

August 13, 2009

Barisits, Boyd (Vienna UT, Georgia Tech) MARTINWILLSIM August 13, 2009 1 / 41

Page 2: Final Presentation, Summer Internship Program, CERN

Content

1 Introduction

2 Approach

3 Topology Generator

4 Load Generator

5 Simulator

6 Simulation Results

7 Conclusion

8 Acknowledgements

9 References

Page 3: Final Presentation, Summer Internship Program, CERN

Introduction

Page 4: Final Presentation, Summer Internship Program, CERN

Introduction Authors

The Authors

Martin Barisits (Vienna UT, Austria)
• BSc: Medical Computer Science
• MSc: Computational Intelligence

Will Boyd (Georgia Tech, USA)
• BSc: Physics & Computer Science

Page 5: Final Presentation, Summer Internship Program, CERN

Introduction Problem

The Problem

• Goal: Test data distribution strategies
• Need: Simulator
• Need: Ability to load the Simulator with the current GRID Topology
• Need: Inject the Simulator with realistic workloads
• Process Results

Page 6: Final Presentation, Summer Internship Program, CERN

Approach

Page 7: Final Presentation, Summer Internship Program, CERN

Approach Design

• Two basic challenges
  • Write a tool to get a snapshot of the whole GRID environment (Topology, Usage) and to generate Loads
  • Write a Simulator which can execute this input
• Different Simulators for GRID analysis are available in the research community
• For time reasons we decided to base our Simulator on an existing Simulator package

Page 10: Final Presentation, Summer Internship Program, CERN

Approach Package Evaluation

Package Evaluation

• Evaluation of GRID/cloud computing simulation packages
  • SimGrid [2]
    • Based on pure C
    • Pros: Fast execution time; low memory consumption; scalable
    • Cons: Lacking in some functionality; high level of abstraction
  • GridSim [3]
    • Java-based
    • Pros: Highly developed; internal logging of network traffic; easier to use; packet-based
    • Cons: Slow execution time; high memory consumption; not scalable

Page 13: Final Presentation, Summer Internship Program, CERN

Approach Package Evaluation

Package Performance

• Attempted to simulate one day on the GRID (1.5 million file transfers)
• GridSim: CPU time grows exponentially with the number of transfers
• SimGrid: CPU time grows linearly with the number of transfers

Page 14: Final Presentation, Summer Internship Program, CERN

Approach Flow

Flow

Page 15: Final Presentation, Summer Internship Program, CERN

Topology Generator

Page 16: Final Presentation, Summer Internship Program, CERN

Topology Generator The GRID

GRID Sites

GRID sites across the world

Page 17: Final Presentation, Summer Internship Program, CERN

Topology Generator The GRID

ATLAS Computing Model

• Hierarchical computing network
  • Tier-0
  • Tier-1
  • Tier-2
• Tier-0 (CERN) generates data
• Tier-1s store data
• Tier-2s process data

Figure: The Tier-0 and Tier-2 network configuration [1]

Page 21: Final Presentation, Summer Internship Program, CERN

Topology Generator Simulator Topology

The Topology Generator

• TopologyGen.py
  • Script to construct GRID topology
  • Parses TiersOfATLASCache.py
    • Finds and associates Tier-1s and Tier-2s
  • Queries the DQ2 database
    • Total disk space capacity
    • Used disk space
  • Topology is written to two XML files
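The association step listed above can be sketched as follows. This is a minimal Python illustration, not the actual TopologyGen.py: the `SITES` dictionary, its field layout, and the idea of grouping sites by a shared "cloud" label are assumptions made for this example, since the slides do not show the format of TiersOfATLASCache.py. The site names are borrowed from the deck's own XML example.

```python
from collections import defaultdict

# Hypothetical site catalogue: name -> (tier, cloud label).
# The layout is invented for illustration; it is not the TiersOfATLASCache.py format.
SITES = {
    "INFN-T1_DATADISK": (1, "IT"),
    "INFN-MILANO-ATLASC_DATADISK": (2, "IT"),
    "INFN-NAPOLI-ATLAS_DATADISK": (2, "IT"),
    "RAL-LCG2_MCDISK": (1, "UK"),
}


def associate_tiers(sites):
    """Map each Tier-1 to the sorted list of Tier-2s that share its cloud label."""
    tier1_by_cloud = {
        cloud: name for name, (tier, cloud) in sites.items() if tier == 1
    }
    associations = defaultdict(list)
    for name, (tier, cloud) in sorted(sites.items()):
        if tier == 2 and cloud in tier1_by_cloud:
            associations[tier1_by_cloud[cloud]].append(name)
    # Tier-1s without any Tier-2 still appear, with an empty list.
    for tier1 in tier1_by_cloud.values():
        associations.setdefault(tier1, [])
    return dict(associations)


if __name__ == "__main__":
    for tier1, tier2s in sorted(associate_tiers(SITES).items()):
        print(tier1, "->", ";".join(tier2s))
```

In a real run the disk figures for each site would come from the DQ2 queries rather than from literals, and the resulting mapping would feed the two XML files.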

Page 28: Final Presentation, Summer Internship Program, CERN

Topology Generator Simulator Topology

Platform and Deployment Files

• Platform file
  • Node declarations
  • Link declarations
  • Route declarations
• Deployment file
  • Logfiles for each node
  • Total and used disk space
  • Used disk space by datatype
  • Tier-0 loadfiles
  • Associated Tier-1s and Tier-2s

Page 31: Final Presentation, Summer Internship Program, CERN

Topology Generator Simulator Topology

Route declaration from a Tier-0 to a Tier-1 in the platform file

    <route src='CERN_52' dst='RAL-LCG2_MCDISK'>
      <link:ctn id='RAL-LCG2_MCDisk_InternalLink'/>
      <link:ctn id='RAL_OPNLinkInternal'/>
      <link:ctn id='CERN_52_InternalLink'/>
    </route>

Process definition for a host in the deployment file

    <process function='Tier1Storage' host='INFN-T1_DATADISK'>
      <argument value='1'/>         <!-- logfile -->
      <argument value='214576722'/> <!-- total disk space -->
      <argument value='75266283'/>  <!-- used disk space -->
      <argument value='2631309'/>   <!-- RAW -->
      <argument value='0'/>         <!-- SIM -->
      <argument value='0'/>         <!-- DRD -->
      <argument value='28882683'/>  <!-- ESD -->
      <argument value='21172405'/>  <!-- AOD -->
      <argument value='0'/>         <!-- DPD -->
      <argument value='244615'/>    <!-- TAG -->
      <argument value='INFN-MILANO-ATLASC_DATADISK;INFN-NAPOLI-ATLAS_DATADISK;'/>
    </process>
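A process entry like the one above could be emitted with Python's standard `xml.etree.ElementTree` module. The sketch below is only an assumption about how such a writer might look. The helper name `process_element` is hypothetical, and the argument list is abridged (the per-datatype values are left out for brevity).

```python
import xml.etree.ElementTree as ET


def process_element(function, host, arguments):
    """Build a <process> element with one <argument> child per value, in order."""
    proc = ET.Element("process", {"function": function, "host": host})
    for value in arguments:
        ET.SubElement(proc, "argument", {"value": str(value)})
    return proc


# Abridged argument list: logfile flag, total disk space, used disk space,
# and the semicolon-separated list of associated storage nodes.
proc = process_element(
    "Tier1Storage",
    "INFN-T1_DATADISK",
    [1, 214576722, 75266283,
     "INFN-MILANO-ATLASC_DATADISK;INFN-NAPOLI-ATLAS_DATADISK;"],
)
xml_text = ET.tostring(proc, encoding="unicode")
print(xml_text)
```

Because the simulator reads the arguments positionally, the order of the list matters more than any naming, so a writer like this has to keep the order fixed.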

Page 32: Final Presentation, Summer Internship Program, CERN

Topology Generator Simulator Topology

MARTINWILLSIM GRID Topology

The topology that is generated for simulation

Page 33: Final Presentation, Summer Internship Program, CERN

Load Generator

Page 34: Final Presentation, Summer Internship Program, CERN

Load Generator LoadGen.py

Generating a Load

• Loadfile given to each Tier-0
• Loadfiles define dataset transfers
  • Unique dataset ID
  • Random (uniform) target Tier-1 storage node
  • Random (uniform) filesize (0.5-6 GB)
  • Random (weekly distribution) inter-submission time
  • Dataset datatype (e.g., RAW)
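The fields of a loadfile entry can be illustrated with a short Python sketch. It is not the actual LoadGen.py. The function name, the dictionary keys, and the `mean_gap_s` parameter are hypothetical, and the inter-submission time is drawn from a plain exponential distribution as a stand-in for the weekly distribution used in the project.

```python
import random


def generate_load(n_datasets, tier1_nodes, mean_gap_s=10.0, seed=0):
    """Return a list of dataset-transfer entries in submission order."""
    rng = random.Random(seed)
    now = 0.0
    load = []
    for dataset_id in range(n_datasets):           # unique dataset ID
        # Stand-in for the weekly distribution: exponential gaps between submissions.
        now += rng.expovariate(1.0 / mean_gap_s)
        load.append({
            "id": dataset_id,
            "target": rng.choice(tier1_nodes),     # uniform choice of Tier-1 node
            "size_gb": rng.uniform(0.5, 6.0),      # uniform filesize in GB
            "submit_time_s": now,
            "datatype": "RAW",                     # fixed here for simplicity
        })
    return load


if __name__ == "__main__":
    for entry in generate_load(3, ["INFN-T1_DATADISK", "RAL-LCG2_MCDISK"]):
        print(entry)
```

Passing a fixed `seed` makes a generated load reproducible, which helps when comparing distribution strategies on the same input.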

Page 36: Final Presentation, Summer Internship Program, CERN

Load Generator Simulating Real Loads

Load Distribution

• Dataset distribution
  • Uniform background traffic
  • Wednesday/Friday peak traffic
  • Random "spikes" of traffic
• Each component is weighted
• Distribution can easily be adjusted

Figure: An example weekly dataset transfer distribution
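One way to combine the three weighted components into a single weekly distribution is sketched below in Python. The shapes and numbers are assumptions, not taken from the slides: hourly resolution, Gaussian bumps centred at noon on Wednesday and Friday, single-hour spikes, and the default weights are all chosen for illustration only.

```python
import math
import random

HOURS_PER_WEEK = 7 * 24


def normalised(values):
    """Scale values to sum to 1; leave an all-zero list unchanged."""
    total = sum(values)
    return [v / total for v in values] if total else list(values)


def weekly_distribution(w_background=0.5, w_peaks=0.4, w_spikes=0.1,
                        n_spikes=5, peak_width_h=6.0, seed=0):
    """Return 168 hourly probabilities mixing background, peaks, and spikes."""
    rng = random.Random(seed)
    # Peak centres: noon on Wednesday (day 2) and Friday (day 4), with Monday as day 0.
    centres = [2 * 24 + 12, 4 * 24 + 12]
    background = [1.0] * HOURS_PER_WEEK
    peaks = [
        sum(math.exp(-0.5 * ((h - c) / peak_width_h) ** 2) for c in centres)
        for h in range(HOURS_PER_WEEK)
    ]
    spikes = [0.0] * HOURS_PER_WEEK
    for _ in range(n_spikes):
        spikes[rng.randrange(HOURS_PER_WEEK)] += 1.0
    parts = [
        (w_background, normalised(background)),
        (w_peaks, normalised(peaks)),
        (w_spikes, normalised(spikes)),
    ]
    mixed = [sum(w * p[h] for w, p in parts) for h in range(HOURS_PER_WEEK)]
    return normalised(mixed)


if __name__ == "__main__":
    dist = weekly_distribution()
    # Draw ten submission hours according to the mixed distribution.
    print(random.Random(1).choices(range(HOURS_PER_WEEK), weights=dist, k=10))
```

Adjusting the distribution then amounts to changing the three weights or the component shapes, which matches the slide's point that each component is weighted separately.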

Page 42: Final Presentation, Summer Internship Program, CERN

Simulator

Page 43: Final Presentation, Summer Internship Program, CERN

Simulator Facts

Facts

• Based on SimGrid [2]
• Implemented in C
• Intent to implement an extensible Simulation-Framework rather than a strict Simulator
• Goals:
  • Fast
  • Scalable
  • Representative

Page 47: Final Presentation, Summer Internship Program, CERN

Simulator Facts

Features

• Build network topology according to the injected Topology File
• Simulate the shipment and processing of DataSets
• Give the user a framework to implement or change node behavior
• Provide functions to write simulation output
• Background Noise Generation (traffic from other experiments, ...)

Page 52: Final Presentation, Summer Internship Program, CERN

Simulator Design

Major Entities

• Nodes (Tier0, Tier1, Tier2)
• Tasks (Data transfer)
• DataSets
• Links

Page 54: Final Presentation, Summer Internship Program, CERN

Simulator Design

Task

Task: Taskname, Size, (DataSet)

• Taskname
  • Command
• Size
  • Communication Size
  • Execution Size
• DataSet
  • DataSet ID
  • DataSet Size
  • DataSet Type

Page 58: Final Presentation, Summer Internship Program, CERN

Simulator Design

Command Language

• (PUSH, DataSet)
• (PULL, DataSetTemplate)
• (DELETE, DataSetTemplate)
• (PROCESS, DataSet)
• (NOISE)
• (INITSHUTDOWN)
• (FINALIZE)
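The Task structure and the command language can be modelled compactly. The simulator itself is written in C; the Python sketch below only illustrates the data model. The class and field names, the units, and the `push_task` helper are assumptions, not the simulator's actual definitions.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Command(Enum):
    """The seven commands of the command language."""
    PUSH = "PUSH"
    PULL = "PULL"
    DELETE = "DELETE"
    PROCESS = "PROCESS"
    NOISE = "NOISE"
    INITSHUTDOWN = "INITSHUTDOWN"
    FINALIZE = "FINALIZE"


@dataclass(frozen=True)
class DataSet:
    dataset_id: int
    size_gb: float      # unit is an assumption
    datatype: str       # e.g. "RAW"


@dataclass(frozen=True)
class Task:
    name: str
    command: Command
    comm_size: float = 0.0              # communication size (data to transfer)
    exec_size: float = 0.0              # execution size (work to compute)
    dataset: Optional[DataSet] = None   # NOISE, INITSHUTDOWN, FINALIZE carry none


def push_task(dataset):
    """Hypothetical helper: wrap a dataset in a PUSH task sized by the dataset."""
    return Task(name=f"push-{dataset.dataset_id}", command=Command.PUSH,
                comm_size=dataset.size_gb, dataset=dataset)


if __name__ == "__main__":
    print(push_task(DataSet(dataset_id=7, size_gb=2.5, datatype="RAW")))
```

Keeping the dataset optional reflects the command list, where some commands take a DataSet or DataSetTemplate and others take no payload.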

Page 64: Final Presentation, Summer Internship Program, CERN

Simulator Design

Nodes

• Different types of nodes with different functions
  • Producer (Tier 0)
  • Storages (Tier 1, Tier 2)
  • Hosts (Tier 1, Tier 2)
  • FinalizeNode
• All nodes understand the Command Language
  • Different nodes execute commands differently
  • It's up to the user to define the semantics of a node
• Node Features
  • DataSet Store (Hashmap)
  • Ability to write simulation output
  • Queues
  • Simulation Functions (Execute, Sleep, ...)

Page 68: Final Presentation, Summer Internship Program, CERN

Simulator Design

Pseudo Code of a Tier-1 Storage

    while (1) {
        task = receiveTask();
        switch (task) {
            case <PUSH, dataSet>:
                store(dataSet);              // Store in the FileSystem
                tier2 = getNextTier2();
                send(tier2, PUSH, dataSet);  // Send the task
                writeStatistics();
                break;
            case <DELETE, dataSet>:
                ...
            case <FINALIZE, dataSet>:
                writeStatistics();
                break;
            ...
        }
    }
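The pseudo code above can be turned into a small runnable sketch. The version below is Python rather than the simulator's C, and several details are assumptions: tasks are plain `(command, dataset_id, size)` tuples, `getNextTier2()` is replaced by round-robin selection over a non-empty list, sending is modelled by appending to an `outbox` list, and the DELETE branch (elided on the slide) simply drops the dataset from the store.

```python
from collections import deque
from itertools import cycle


def run_tier1_storage(inbox, tier2_names):
    """Process queued tasks until FINALIZE; return (store, outbox, stats)."""
    store = {}                        # DataSet store: dataset_id -> size
    outbox = []                       # (destination, command, dataset_id) that would be sent
    stats = []                        # used disk space recorded after each command
    next_tier2 = cycle(tier2_names)   # round-robin stand-in for getNextTier2()
    while inbox:
        command, dataset_id, size = inbox.popleft()
        if command == "PUSH":
            store[dataset_id] = size                               # store locally
            outbox.append((next(next_tier2), "PUSH", dataset_id))  # forward to a Tier-2
            stats.append(sum(store.values()))
        elif command == "DELETE":
            store.pop(dataset_id, None)    # assumed semantics; elided on the slide
            stats.append(sum(store.values()))
        elif command == "FINALIZE":
            stats.append(sum(store.values()))
            break                          # leave the receive loop
    return store, outbox, stats


if __name__ == "__main__":
    tasks = deque([("PUSH", 1, 2.0), ("PUSH", 2, 3.0),
                   ("DELETE", 1, 0.0), ("FINALIZE", None, 0.0)])
    print(run_tier1_storage(tasks, ["T2_A", "T2_B"]))
```

Other node types would reuse the same loop with different branch bodies, which is how the same command language can carry different semantics per node.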

Page 69: Final Presentation, Summer Internship Program, CERN

Simulator Design

The way of a DataSet (1/4)

Page 70: Final Presentation, Summer Internship Program, CERN

Simulator Design

The way of a DataSet (2/4)

Page 71: Final Presentation, Summer Internship Program, CERN

Simulator Design

The way of a DataSet (3/4)

Page 72: Final Presentation, Summer Internship Program, CERN

Simulator Design

The way of a DataSet (4/4)

Page 73: Final Presentation, Summer Internship Program, CERN

Simulation Results

Page 74: Final Presentation, Summer Internship Program, CERN

Simulation Results Disk Space Evolution

Overloading the GRID

Figure panels:
• Tier-0 dataset submission distribution
• Disk space evolution with increasing daily dataset transfers

Page 75: Final Presentation, Summer Internship Program, CERN

Simulation Results Disk Space Evolution

Disk Space Evolution

Figure panels:
• Tier-1 storage node
• An associated Tier-2 storage node

Page 76: Final Presentation, Summer Internship Program, CERN

Simulation Results Disk Space Evolution

Data Storage by Datatype

Figure panels:
• Uniform dataset transfer distribution
• Simulated dataset transfer distribution

Page 77: Final Presentation, Summer Internship Program, CERN

Simulation Results Scalability

Scalability of MARTINWILLSIM

• MartinWillSim run to simulate an increasing number of days
• 250,000 tasks/day (800 TB/day)
• Simulated one month of dataset transfers in 40 min
• CPU time is linear in the number of simulated tasks
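The figures above can be cross-checked with a little arithmetic. The Python sketch below assumes three things the slides do not state: that task sizes follow the uniform 0.5-6 GB range from the Load Generator slides, that a month means 30 days, and that TB is decimal (1 TB = 1000 GB).

```python
TASKS_PER_DAY = 250_000
MEAN_SIZE_GB = (0.5 + 6.0) / 2       # mean of a uniform 0.5-6 GB size: 3.25 GB
DAYS_PER_MONTH = 30                  # assumption
WALL_CLOCK_S = 40 * 60               # 40 minutes of simulation run time

daily_volume_tb = TASKS_PER_DAY * MEAN_SIZE_GB / 1000      # 812.5 TB/day
total_tasks = TASKS_PER_DAY * DAYS_PER_MONTH               # 7,500,000 tasks
tasks_per_second = total_tasks / WALL_CLOCK_S              # 3125.0 tasks/s

print(f"daily volume  : {daily_volume_tb} TB")
print(f"tasks / month : {total_tasks}")
print(f"simulated rate: {tasks_per_second} tasks per wall-clock second")
```

Under these assumptions the expected daily volume of about 812 TB agrees with the quoted 800 TB/day, and the run handles roughly 3,000 simulated tasks per second of wall-clock time.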

Page 78: Final Presentation, Summer Internship Program, CERN

Conclusion

Page 79: Final Presentation, Summer Internship Program, CERN

Conclusion Conclusion

Recap

• Evaluation of different Simulator Packages [2 Weeks]
• Design and Implementation of:
  • Topology & Load Generator [6 Weeks]
  • MartinWillSim Simulator [6 Weeks]
• Result Analysis
• Documentation

Page 83: Final Presentation, Summer Internship Program, CERN

Conclusion Future Work

Future Work

• Add LISP hooks for Decision Making
• Add more detail of the ATLAS Computing model to the simulator
• Add functions for random errors (node failures, link failures)
• Add recording of other statistics for result analysis
  • Link Throughput
  • Usage of Processing Capacities
  • Replication Factors
  • User behavior
• For higher detail: Add Files as smallest Entity to the Simulator
• Further Validation

Page 89: Final Presentation, Summer Internship Program, CERN

Acknowledgements

Thank you!

• Mario Lassnig

• Vincent Garonne

• Angelos Molfetas

• Ingrid Schmid

• All those who helped make the 2009 Summer Student Programme possible!

Page 90: Final Presentation, Summer Internship Program, CERN

References

[1] https://twiki.cern.ch/twiki/pub/LHCOPN/ImplementationDetails/map-lhcopn.png

[2] Casanova, Legrand, Quinson: SimGrid: a Generic Framework for Large-Scale Distributed Experiments, 2008

[3] Buyya, Murshed: GridSim: A Toolkit for the Modeling and Simulation of Distributed Resource Management and Scheduling for Grid Computing, 2002