stateless load balancing - early results


Upload: andrea-tino

Post on 21-May-2015


DESCRIPTION

Showing early results for the implementation of an algorithm used to balance data load on a distributed system of stations.

TRANSCRIPT

Page 1: Stateless load balancing - Early results

Medilink srl
Sezione Ricerca e Sviluppo (Research and Development Section)

A distributed algorithm for
STATELESS LOAD BALANCING

Abstract: Distributing data packets across stations with scalable and optimal store and retrieval functionality. Ensuring load balance without collecting load information from the stations.

Keywords: Distributed-Systems, Algorithms, Big-Data, Cloud, Balancing

[research early results]

University tutor: Prof. Eng. O. Tomarchio, Università di Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica

Company supervisor: Eng. A. Maddalena, Medilink srl, Team Leader - R&D Manager

Trainee: Dr. A. Tino, Università di Catania, Facoltà di Ingegneria Informatica Specialistica

Showing results

August 2013

Page 2: Stateless load balancing - Early results

Medilink srl, Sezione Ricerca e Sviluppo

Research trainee: Dr. Andrea Tino, Università degli Studi di Catania, Ingegneria Informatica

Supervisor: Eng. Andrea Maddalena, Software Development, Medilink srl

Tutor: Prof. Eng. Orazio Tomarchio, DIIEI, Università di Catania

STATUS: SIMULATING...

Some real-scenario simulations have been conducted over several rings. Early results are shown here while other simulations are still running.

About simulations: state of the art

Ring size: Every simulation creates a fixed number of stations, each of which is assigned its own hash (its identifier in the ring).

Packets volume: When running, a simulation generates a fixed (usually very high) number of random packets to be fed to the ring.

Packet size: When running, a simulation generates packets of fixed length. Each packet is fed to a hash function.
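The three quantities above can be illustrated with a minimal Python sketch. It is not the authors' C++ simulator. MD5 is the hash function H named later in the deck; the successor-style routing rule and the names `make_ring`, `packet_hash`, `route` and `simulate` are assumptions introduced here for illustration only.

```python
import bisect
import hashlib
import random

HMAX = 2 ** 128  # MD5 digests span 0 .. 2^128 - 1


def make_ring(n_stations, rng):
    """Return a sorted list of random station identifiers (hashes)."""
    return sorted(rng.randrange(HMAX) for _ in range(n_stations))


def packet_hash(packet):
    """Hash a packet (bytes) with MD5 and return the digest as an integer."""
    return int.from_bytes(hashlib.md5(packet).digest(), "big")


def route(ring, h):
    """Index of the first station whose identifier is >= h, wrapping around.

    Assumption: successor-style routing; the slides do not state the rule.
    """
    return bisect.bisect_left(ring, h) % len(ring)


def simulate(n_stations, n_packets, packet_size, seed=0):
    """Feed random fixed-length packets to the ring; return per-station load."""
    rng = random.Random(seed)
    ring = make_ring(n_stations, rng)
    load = [0] * n_stations
    for _ in range(n_packets):
        packet = rng.getrandbits(8 * packet_size).to_bytes(packet_size, "big")
        load[route(ring, packet_hash(packet))] += 1
    return load
```

For example, `simulate(10, 100_000, 10)` would correspond to an N10 ring fed with M = 100k packets of P = 10 bytes, and returns one packet count per station.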

Page 3: Stateless load balancing - Early results

TUNING SIMULATIONS

Simulations have been designed to be scalable, configurable and flexible. Basic functionalities enabled at the moment, further improvements planned.

Simulation parameters

Number of stations: N

Number of generated packets: M

Packet length/size: P

Simulation tech details

Developed in C++

Template based, fast, support for reals and big-reals

Results output on files: raw+stats

[Figure: ring diagram with stations s1 to s6 and labels T1 to T6 over the hash range 0 to hmax-1.]

Page 4: Stateless load balancing - Early results

SIMULATIONS OVERVIEW

Simulating small, medium-sized and large rings against small to large packet volumes and sizes.

Ring size: N (small vs big)

Small rings simulated: N10 to N30

Mid-sized rings simulated: N30 to N50

Big rings simulated: N50 to N100

Packets volume: M (few vs many)

Low volumes: M1,000 to M100,000

High volumes: over M1,000,000

Packet length/size (bytes): P (short vs long)

Simulated range: P10 to P1000

Page 5: Stateless load balancing - Early results

INGREDIENTS 4 SIMULATIONS

Simulations are designed to be flexible, configurable, fast, type independent and extensible.

Intel Threading Building Blocks (TBB)

Intel Core i7 Computer architecture

GNU C/C++ compiler (gcc/g++)

Unix RHEL based CentOS systems

Boost C++ Libraries

Tina’s rnd number generators

OpenSSL cryptographic libraries

Circos circo-diagrams drawing library

Page 6: Stateless load balancing - Early results

SIMULATIONS ARCHITECTURE

When simulating an NMP-scenario, the whole process consists of different stages.

Configuration through bash scripts: Bash scripts are used to change the source code to configure each simulation.

Compilation needed for fast sims: Templated classes (on N, M, P as well) make compilation a necessary step.

Quantities are evaluated: Bash scripts create random hpartitions for stations that are passed to simulations.

Simulation is executed: Sims are executed on a machine dedicated to long-running tasks. Simulations use all machine cores.

Results on files: At the end of each simulation, data is written and summarized on files.

Identifiers shown on the slide: dhtlb::SIM_ENV_N, dhtlb::SIM_ENV_M, dhtlb::SIM_ENV_PKTSIZE, dhtlb::Ring<N,M,P>::HPart, dhtlb::Ring<N,M,P>::seed.

[Figure: pipeline diagram with the labels: cpp and hpp sources, g++ -lx, obj01, out01, sh scripts, ./x.sh out01 bin/cppsim, ./x.out, running, writing..., dat/tab/kry files, and station values st1: 0x0385fa6bc, st2: 0x746c6aa6b, stn: 0xa12345ddd...]

Page 7: Stateless load balancing - Early results

ANATOMY OF A SIMULATION

When a simulation starts, many things happen...

A considerable amount of memory is needed to run a simulation.

The core part of the simulation is handled in parallel thanks to its Monte Carlo scheme.

A packet is created in every parallel cycle!

Stages: Memory initialization, Packets handling, Data manipulation, Results on files, Resource collection.

Steps in each parallel cycle: Packet generation, Hash/PhiH computation, Routing, Result in memory.

Files shown: seed, dat, tab, kry.

(To be continued)

Page 8: Stateless load balancing - Early results

ANATOMY OF A SIMULATION

A better scheme makes it possible to run more simulations on the same set of packets.

Packets are ALL generated at once first.

Each parallel cycle finds its corresponding packet in a specific, assigned array position.

Stages: Memory initialization, Packets handling, Data manipulation, Results on files, Resource collection.

Packets generation happens once, before the parallel cycles. Steps in each parallel cycle: Hash/PhiH computation, Routing, Result in memory.

Files shown: seed, dat, tab, kry.

(End)
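The pre-generation scheme can be sketched as follows. This is an illustration only: the real simulator is written in C++ with Intel TBB, whereas this sketch uses Python threads as a stand-in and says nothing about performance. The names `generate_packets`, `process_slot` and `run` are hypothetical, and MD5 stands in for the hash step.

```python
import hashlib
import random
from concurrent.futures import ThreadPoolExecutor


def generate_packets(n_packets, packet_size, seed=0):
    """Generate ALL packets up front so that several runs can reuse them."""
    rng = random.Random(seed)
    return [rng.getrandbits(8 * packet_size).to_bytes(packet_size, "big")
            for _ in range(n_packets)]


def process_slot(packets, results, i):
    """One parallel cycle: read packet i and store its hash in slot i."""
    results[i] = int.from_bytes(hashlib.md5(packets[i]).digest(), "big")


def run(packets, workers=4):
    """Hash every packet; each cycle touches only its own array position."""
    results = [None] * len(packets)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all cycles and re-raises any worker exception.
        list(pool.map(lambda i: process_slot(packets, results, i),
                      range(len(packets))))
    return results
```

Because every cycle reads and writes only index `i`, no locking is needed, and the same `packets` list can be passed to `run` again to repeat a simulation on identical input.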

Page 9: Stateless load balancing - Early results

OTHER FACTS

The simulator has several aspects to be detailed and many extensible components.

Advantages

Packet generation policy can be decided through compilation flags.

Templated classes support real and big-real hash values and the packet internal representation.

Parallel vs. single implementations: simulations can be run in parallel or not, using different classes.

High performance on Intel multi-core architectures when in parallel mode.

Drawbacks

To achieve fast implementations, compilation is necessary every time a simulation parameter changes. Compilation does not take much time, and bash scripts handle it.

Parallel mode is only available for Intel architectures. Single mode is available for all architectures.

Page 10: Stateless load balancing - Early results

SHOWING RESULTS

Simulations are still ongoing, so far good results to be detailed in next slides.

Showing heavy simulations’ results collected so far (and still counting...).

Overall sim time: 19d

Overall # of sims: 84

Avg time per sim: 3h

Overall generated pkts #: 14.1M

Showing simulation machine details...

Machine architecture: HP ProLiant DL180 G6

Operating system: CentOS 6 (RHEL)

CPU details: Intel Xeon (4-Core)

Legend: a marker on the slide flags data updating in time.

Page 11: Stateless load balancing - Early results

GROUPING SIMULATIONS

For a better analysis, simulations are grouped into clusters based on the ring size.

Focusing on different rings: from 10 to 100 stations.

Packets volume ranging from 100k to 3M.

N10 sims #: 41

N30 sims #: 60

N50 sims #: 10

N100 sims #: 10

Page 12: Stateless load balancing - Early results

N10MxPy SIMULATIONS

Ring size 10. Many packet volumes.

Sims #: 41

Packets volumes: 31x100k, 10x1M

Overall sim time: 13h

Generated pkts #: 13.1M

[Figure: Std. deviation for PHI & HPART for each simulation.]

[Figure: Dispersion against std. deviation for H.]

[Figure: Dispersion against std. deviation for PHI.]

[Figure: Relating HPART variance to PHI variance.]

Axis labels: Std deviation for HPART (mult. coeff.: E37), Std deviation for PHI, Std deviation for H, Dispersion for H, Dispersion for PHI.

Page 13: Stateless load balancing - Early results

N30MxPy SIMULATIONS

Ring size 30. Many packet volumes.

Sims #: 60

Packets volumes: 49x1M, 10x3M, 1x10M

Overall sim time: 17d

Generated pkts #: 89.0M

[Figure: Std. deviation for PHI & HPART for each simulation.]

[Figure: Dispersion against std. deviation for H.]

[Figure: Dispersion against std. deviation for PHI.]

[Figure: Relating HPART variance to PHI variance.]

Axis labels: Std deviation for HPART (mult. coeff.: E36), Std deviation for PHI, Std deviation for H, Dispersion for H, Dispersion for PHI.

Page 14: Stateless load balancing - Early results

N50MxPy SIMULATIONS

Ring size 50. Many packet volumes.

Sims #: 10

Packets volumes: 10x1M

Overall sim time: 3d

Generated pkts #: 10.0M

[Figure: Std. deviation for PHI & HPART for each simulation.]

[Figure: Dispersion against std. deviation for H.]

[Figure: Dispersion against std. deviation for PHI.]

[Figure: Relating HPART variance to PHI variance.]

Axis labels: Std deviation for HPART (mult. coeff.: E36), Std deviation for PHI, Std deviation for H, Dispersion for H, Dispersion for PHI.

Page 15: Stateless load balancing - Early results

N30 SIMS CIRCO-DIAGS + LLEVELS

Load balancing performed! Showing load-levels and circo-diagrams for pkt-migrations.

How to read this diagram? See slide “How to read simulations”.

[Figures: load-level charts and circo-diagrams for simulations N30M1MP1k 2013.11.25:084556, N30M1MP1k 2013.11.25:104210, N30M1MP1k 2013.11.25:124859, N30M1MP1k 2013.11.25:145003, N30M1MP1k 2013.11.25:165131.]

(To be continued)

Page 16: Stateless load balancing - Early results

N30 SIMS CIRCO-DIAGS + LLEVELS

Load balancing performed! Showing load-levels and circo-diagrams for pkt-migrations.

How to read this diagram? See slide “How to read simulations”.

[Figures: load-level charts and circo-diagrams for simulations N30M1MP1k 2013.11.25:185047, N30M1MP1k 2013.11.25:204218, N30M1MP1k 2013.11.25:224923, N30M1MP1k 2013.11.26:005204, N30M1MP1k 2013.11.26:025536.]

(To be continued)

Page 17: Stateless load balancing - Early results

N30 SIMS CIRCO-DIAGS + LLEVELS

Load balancing performed! Showing load-levels and circo-diagrams for pkt-migrations.

How to read this diagram? See slide “How to read simulations”.

[Figures: load-level charts and circo-diagrams for simulations N30M1MP1k 2013.11.26:045420, N30M1MP1k 2013.11.26:065230, N30M1MP1k 2013.11.26:084514, N30M1MP1k 2013.11.26:103934, N30M1MP1k 2013.11.26:123917.]

(End)

Page 18: Stateless load balancing - Early results

FOCUSING ON A N30M1MP1k

Showing an ideal case: almost perfect balancing, levels get more homogeneous.

How to read this diagram? See slide “How to read simulations”.

[Figures: H load-levels and PHI load-levels for simulation N30M1MP1k 2013.11.28:063449.]

Page 19: Stateless load balancing - Early results

FOCUSING ON A N30M1MP1k

Showing a NOT SO ideal case: almost perfect balancing but one level drops significantly.

How to read this diagram? See slide “How to read simulations”.

Sometimes strange behaviors appear: very narrow-coverage stations are not correctly balanced.

Symptom: very high hpart std. dev., std_dev = 1.2e37 (MD5: 0..2^128-1).

[Figures: H load-levels and PHI load-levels for simulation N30M1MP1k 2013.11.27:204818.]
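To make the symptom concrete, the following Python sketch shows one way a standard deviation over station coverage could be computed in the MD5 hash space. It rests on an assumption that the slides do not confirm: that a station's hpart is the arc of hash space between the previous station's identifier and its own. The function names are hypothetical.

```python
import random
import statistics

HMAX = 2 ** 128  # size of the MD5 hash space, 0 .. 2^128 - 1


def hpart_widths(station_ids):
    """Width of the hash-space arc covered by each station.

    Assumption: a station covers the arc from the previous station's
    identifier up to its own, wrapping around the ring.
    """
    ids = sorted(station_ids)
    if len(ids) == 1:
        return [HMAX]  # a single station covers the whole space
    return [(ids[i] - ids[i - 1]) % HMAX for i in range(len(ids))]


def hpart_std_dev(station_ids):
    """Population standard deviation of the coverage widths."""
    return statistics.pstdev(hpart_widths(station_ids))


# Example: 30 random station identifiers, as in an N30 ring.
rng = random.Random(0)
ids = [rng.randrange(HMAX) for _ in range(30)]
print(f"hpart std dev: {hpart_std_dev(ids):.3e}")
```

Evenly spaced stations give a standard deviation of zero; a ring containing a very narrow arc next to very wide ones gives a large value, which is the situation the slide describes.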

Page 20: Stateless load balancing - Early results

SEARCHING FOR HIDDEN PATTERNS

Using circo-diagrams it is possible to reveal hidden patterns.

Wide-coverage stations are more likely to donate packets to other stations.

Narrow-coverage stations are more likely to receive packets from other stations.

Page 21: Stateless load balancing - Early results

HOW TO READ SIMULATIONS

Simulation data are presented using dimension reduction so that patterns can be grasped more quickly. This slide is a short guide to the most important quantities.

Migration flow: the number of pkts that virtually move from one station to another when two different station assignments are compared.

Assignments are computed using the H and PHI hash functions.

Circo-diagrams show the migration flows.

Function H: MD5 (128 bit, cryptographic hash function).

Function PHI: Balancing hash function (secret).
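The migration flow defined above can be sketched in Python as a count over pairs of stations. Since PHI is not disclosed, the sketch takes the two assignments as plain lists of station indices, one entry per packet; how they are produced is left out. The names `migration_flows` and `load_levels` are hypothetical.

```python
from collections import Counter


def migration_flows(assign_h, assign_phi):
    """Count packets moving from their station under H to their station under PHI.

    Both arguments list one station index per packet, in the same order.
    Packets that stay on the same station are not counted.
    """
    if len(assign_h) != len(assign_phi):
        raise ValueError("assignments must cover the same packets")
    flows = Counter()
    for src, dst in zip(assign_h, assign_phi):
        if src != dst:
            flows[(src, dst)] += 1
    return flows


def load_levels(assignment, n_stations):
    """Number of packets held by each station under one assignment."""
    counts = Counter(assignment)
    return [counts.get(s, 0) for s in range(n_stations)]
```

Each `(source, destination)` key with its count corresponds to one ribbon of a circo-diagram, while `load_levels` gives the bars of the H and PHI load-level charts.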
