stateless load balancing - research overview

DESCRIPTION

Research project for a Master's degree training program. The presentation introduces the main objectives of the thesis and describes (without providing in-depth details) the most important aspects of the activity.

TRANSCRIPT

Page 1: Stateless load balancing - Research overview

Medilink srl, Sezione Ricerca e Sviluppo

A distributed algorithm for STATELESS LOAD BALANCING

Abstract: Distributing data packets across stations with scalable and optimal store and retrieval functionality, ensuring load balance without collecting load information from stations.

Keywords: Distributed-Systems, Algorithms, Big-Data, Cloud, Balancing

[research overview]

University tutor: Prof. Eng. O. Tomarchio, Università di Catania, Dipartimento di Ingegneria Elettrica, Elettronica e Informatica

Company supervisor: Eng. A. Maddalena, Medilink srl, Team Leader - R&D Manager

Trainee: Dr. A. Tino, Università di Catania, Facoltà di Ingegneria Informatica Specialistica

August 2013

Page 2: Stateless load balancing - Research overview

Research trainee: Dr. Andrea Tino, Università degli Studi di Catania, Ingegneria Informatica

Supervisor: Eng. Andrea Maddalena, Software Development, Medilink srl

Tutor: Prof. Eng. Orazio Tomarchio, DIIEI, Università di Catania

Medilink srl, Sezione Ricerca e Sviluppo

PROBLEM DESCRIPTION

Many stations and much data to store. Data can be fragmented into small units (packets) and sent to stations. When balancing load, some problems occur.

Problems, and the solutions of modern algorithms:

Problem: Which station to choose for a packet?
Solution: Based on info collected from stations, or by uniformly distributed random algorithms.

Problem: How to send a packet to a station?
Solution: IP address database, centralized solutions, distributed IP tables.

Problem: How to retrieve a packet? How to locate the station it is stored in?
Solution: Need to memorize the couple (packet-id, station-id) after choosing the destination station.

Problem: How to balance packets among different stations?
Solution: Round-robin (stateless) approaches, or based on station loads.
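
The need to memorize the (packet-id, station-id) couple can be made concrete with a small sketch. It is not taken from the presentation: the class, its method names and the station labels are hypothetical, and a uniformly random choice stands in for whatever selection rule a real system would use.

```python
import random


class RandomPlacement:
    """Hypothetical illustration: random station choice plus a lookup table."""

    def __init__(self, station_ids, seed=None):
        self.station_ids = list(station_ids)
        self.rng = random.Random(seed)
        # State that must be kept (and kept consistent) for retrieval to work.
        self.location = {}

    def store(self, packet_id):
        station = self.rng.choice(self.station_ids)
        self.location[packet_id] = station  # memorize (packet-id, station-id)
        return station

    def locate(self, packet_id):
        # Without the table there is no way to know where the packet went.
        return self.location[packet_id]


placement = RandomPlacement(["s0", "s1", "s2"], seed=1)
chosen = placement.store("pkt-42")
print(chosen, placement.locate("pkt-42"))
```

The table grows with the number of packets and has to live somewhere, which is the cost the targets in the following slides try to avoid.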

Page 3: Stateless load balancing - Research overview

DEFINING TARGETS

What we want to find is a load-balancing algorithm able to reach the following objectives.

distributed system: No centralization. If one station fails, the system keeps running. Performance degradation is, however, allowed.

stateless: The algorithm does not need any info about a station's current load to perform station selection.

packet lookup: When retrieving a packet from a station, the process must be as efficient as possible.

scalability: The architecture must be scalable. More stations can be added (also at runtime). Detached stations must not cause the system to fail.

Page 4: Stateless load balancing - Research overview

WHAT ABOUT THE OTHERS?

Load balancing is a known field in literature. Common practices exist.

(to be continued)

A typical architecture centralizes load balancing in a single network component responsible for that task.

The Load Balancer typically knows everything about all stations. Its task is to open connections on stations upon request. The decision consists of selecting a station to open a connection to.

Very often, common architectures, like those from Cisco and IBM, organize servers in clusters and pools to handle group configurations.

The balancer is not physically connected to stations. Everything is done through TCP/IP, and a list of IPs is kept. In any case, the balancer has complete knowledge.

Page 5: Stateless load balancing - Research overview

WHAT ABOUT THE OTHERS?

Load balancing is a known field in literature. There are famous algorithms out there.

(end)

dummy/naive: First alive, static assignment. Stateless approach. Provides poor balancing.

hash oriented: Uses hashes of IP-header entries to calculate the destination station. Stateless. Direct data retrieval, bad balancing.

round-robin: Rotates IP addresses. Stateless. Needs to keep track of the destination station. Good balancing on servers with uniform capabilities.

predictive: Station state is monitored over a few fixed periods. Predictions of the current state are made based on ascending/descending trends.

weighted round-robin: Like round-robin, but the rotation pauses on stations with higher weights. Needs to keep track of the destination station. Good balancing under static conditions.

station state: Decision taken based on each station's state (e.g. current load). Introduces overhead on the network. Good balancing in all conditions.
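
Two of the stateless entries above differ in whether the destination has to be remembered. The sketch below illustrates that difference only. The station addresses, the function names, the choice of source IP and port as header fields, and the use of SHA-256 are all assumptions made for the example; the slides do not say which fields or hash functions real balancers use.

```python
import hashlib
from itertools import cycle

STATIONS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical addresses


def make_round_robin(stations):
    """Rotate over stations; the caller must remember each returned choice."""
    rotation = cycle(stations)
    return lambda: next(rotation)


def hash_oriented(src_ip, src_port, stations):
    """Derive the station from header fields; same input, same station."""
    key = f"{src_ip}:{src_port}".encode()
    digest = hashlib.sha256(key).digest()
    return stations[int.from_bytes(digest[:8], "big") % len(stations)]


next_station = make_round_robin(STATIONS)
print([next_station() for _ in range(4)])            # wraps around after the last station
print(hash_oriented("192.168.1.7", 5050, STATIONS))  # repeatable, no table needed
```

Round-robin spreads requests evenly but gives no way to recompute where an earlier request went, while the hash-oriented choice can always be recomputed from the same header fields.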

Page 6: Stateless load balancing - Research overview

KEY CONCEPT: NO CENTRALIZATION

The architecture must not include any centralized device or station. Think about P2P, but a little bit more reliable and less chaotic.

Topology must ensure the absence of centralized schemes.

The system is deployed in each station as a distributed architecture.

Networking is like P2P, but data exchange and stations are more reliable.

Packets are routed!

Page 7: Stateless load balancing - Research overview

KEY CONCEPT: DIRECT ADDRESSING

When assigning a station to a packet, the system will not save data about this association anywhere. At retrieval, given the packet-id, the station must be located immediately.

On packet forwarding: destination station is computed but not memorized anywhere. The packet will be stored at the corresponding station with no further overhead.

On packet retrieval: destination station is computed without relying on other info. Destination station is reached and packet correctly fetched.
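
Direct addressing can be sketched as follows. The presentation does not disclose how the destination is actually computed, so the rule used here (SHA-256 of the packet id, modulo the pool size) is only an assumed stand-in, and `station_for` and `Pool` are hypothetical names.

```python
import hashlib


def station_for(packet_id: str, station_count: int) -> int:
    """Compute the station index from the packet id alone (assumed rule: hash mod N)."""
    digest = hashlib.sha256(packet_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % station_count


class Pool:
    """Hypothetical in-memory pool; each station is a plain dict."""

    def __init__(self, station_count: int):
        self.stations = [dict() for _ in range(station_count)]

    def store(self, packet_id: str, payload: bytes) -> None:
        # The destination is computed and used, never recorded.
        index = station_for(packet_id, len(self.stations))
        self.stations[index][packet_id] = payload

    def fetch(self, packet_id: str) -> bytes:
        # Retrieval recomputes the same destination from the id.
        index = station_for(packet_id, len(self.stations))
        return self.stations[index][packet_id]


pool = Pool(station_count=4)
pool.store("pkt-001", b"hello")
print(pool.fetch("pkt-001"))
```

Both operations cost one hash computation regardless of how many packets are stored, and no (packet-id, station-id) table exists anywhere.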

Page 8: Stateless load balancing - Research overview

KEY CONCEPT: STATELESS BALANCING

To balance data load on stations, no info is required from stations. The packet is assigned to a station without any further operation.

Data load balancing does not require data from stations prior to station assignment or at any later moment.

Balancing data loads generates no overhead, either on the network or in evaluation time.

Stations keep (almost) the same number of packets all the time.
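
How close "almost the same" can be is easy to quantify under one simplifying assumption that the presentation does not state: each of M packets lands on one of N stations independently and uniformly at random. The packet count X_i on station i is then binomial:

```latex
X_i \sim \mathrm{Binomial}\!\left(M, \tfrac{1}{N}\right), \qquad
\mathbb{E}[X_i] = \frac{M}{N}, \qquad
\sigma(X_i) = \sqrt{M \cdot \frac{1}{N}\left(1 - \frac{1}{N}\right)}, \qquad
\frac{\sigma(X_i)}{\mathbb{E}[X_i]} = \sqrt{\frac{N - 1}{M}}
```

For example, with M = 1000 and N = 10 this gives a mean of 100 packets per station and a standard deviation of sqrt(90), about 9.5. The relative spread shrinks as more packets are stored.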

Page 9: Stateless load balancing - Research overview

SUMMARIZING KEY CONCEPTS

To balance data-load on stations, no info is required from stations. The packet is assigned to a station without any further operation.

distributed system: Allows the architecture to benefit from P2P properties: scalability, flexibility and fault tolerance.

direct addressing: Fast resource management. Packets can be located with constant complexity algorithms.

stateless balancing: No need to introduce overhead in communications. No need to wait for or store state data from stations.

Page 10: Stateless load balancing - Research overview

SHOWING EARLY RESULTS

The most simplistic simulations show very good load balancing on basic station pools.

Simulations on a basic pool of 10 stations, with 1000 packets fed to the pool. Difference shown.
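
A simulation of the same shape (10 stations, 1000 packets) can be sketched in a few lines. It does not reproduce the results of the presentation: the placement rule is an assumed stand-in (SHA-256 of a made-up packet id, modulo the pool size), and the function names are hypothetical.

```python
import hashlib
from collections import Counter


def station_for(packet_id: str, station_count: int) -> int:
    """Assumed placement rule for this sketch: SHA-256 of the id, modulo pool size."""
    digest = hashlib.sha256(packet_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % station_count


def simulate(packet_count: int, station_count: int) -> list[int]:
    """Return the number of packets each station ends up holding."""
    counts = Counter(
        station_for(f"pkt-{i}", station_count) for i in range(packet_count)
    )
    return [counts[s] for s in range(station_count)]


loads = simulate(packet_count=1000, station_count=10)
ideal = 1000 / 10
for station, load in enumerate(loads):
    print(f"station {station}: {load:4d} packets ({load - ideal:+.0f} vs. ideal)")
print("max - min:", max(loads) - min(loads))
```

The printed deviation from the ideal share of 100 packets per station is one simple way to express the "difference" between stations.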

Page 11: Stateless load balancing - Research overview

NOT A FIELD OF DAISIES

There are many problems to solve. In particular, accurate simulations are needed.

Good simulations should try to emulate real scenarios with hundreds of thousands of packets => big loads sent to stations, and many more stations => big station pools.

The simulations developed so far are slow (MathWorks MATLAB, Wolfram Mathematica). Mathematical environments + functional languages cannot provide good performance. Need for better simulations => parallelization is possible!

Numerical problems on the way. Need for numerical methods => Need for good and fast libraries!

Parallelization would definitely speed up simulations. Need for coded simulations => C/C++: good performance. Parallelization libraries + good performance: architecture-dependent parallel libraries.
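
Why parallelization is possible can be shown with a sketch of the decomposition: if each packet is placed independently of the others, the packet range can be split into chunks, each chunk counted on its own, and the partial histograms merged. Everything here is an assumption for illustration (the hash-based placement rule, the function names, the chunking scheme). Threads are used only to keep the example portable; in CPython they would not speed up this CPU-bound work, and real gains need processes or native code such as the C/C++ route mentioned above.

```python
import hashlib
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def station_for(packet_id: str, station_count: int) -> int:
    """Assumed placement rule: SHA-256 of the id, modulo pool size."""
    digest = hashlib.sha256(packet_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % station_count


def count_chunk(start: int, stop: int, station_count: int) -> Counter:
    """Place packets start..stop-1 and return a partial per-station histogram."""
    return Counter(
        station_for(f"pkt-{i}", station_count) for i in range(start, stop)
    )


def simulate_in_chunks(packet_count: int, station_count: int, workers: int = 4) -> Counter:
    """Split the packet range into chunks, count each independently, then merge."""
    step = max(1, -(-packet_count // workers))  # ceiling division, never zero
    bounds = [(lo, min(lo + step, packet_count)) for lo in range(0, packet_count, step)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda b: count_chunk(b[0], b[1], station_count), bounds)
        for partial in partials:
            total += partial  # merging is a plain sum of histograms
    return total


print(sorted(simulate_in_chunks(100_000, 50).items())[:5])
```

Because the merge step is just an addition of counters, the chunks need no communication while they run, which is the property a multi-core implementation would exploit.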

Page 12: Stateless load balancing - Research overview

WHERE TO GO FROM HERE

Most simplistic simulations show very good load balancing on basic station pools.

Coding new simulations in C/C++. Very fast, but also difficult!

Integrating libraries for numerical methods.

Integrating libraries for cryptography and networking.

Integrating Intel Cilk or Intel TBB libraries for multi-core parallelization.

Need for high performance architectures: 4-core or 6-core.

Page 13: Stateless load balancing - Research overview

MORE THINGS TO HANDLE

The balancing architecture discovered so far is good, but more questions arise.

What if packets do not all have the same size? => Balancing with a known (continuous?) packet-size distribution.

How to handle dynamic station attachment/detachment from the pool?

Naive simulations show quite interesting (undesired) behaviors. What are the causes? How to solve these problems?
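
The attachment/detachment question can be illustrated with a naive placement rule. The sketch below says nothing about how the thesis handles pool changes; it only assumes a plain "hash modulo pool size" rule (with hypothetical function names) and measures how many packets would be looked up on a different station after the pool grows from 10 to 11 stations.

```python
import hashlib


def station_for(packet_id: str, station_count: int) -> int:
    """Assumed naive rule: SHA-256 of the id, modulo pool size."""
    digest = hashlib.sha256(packet_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % station_count


def moved_fraction(packet_count: int, old_size: int, new_size: int) -> float:
    """Fraction of packets whose computed station changes when the pool is resized."""
    moved = sum(
        station_for(f"pkt-{i}", old_size) != station_for(f"pkt-{i}", new_size)
        for i in range(packet_count)
    )
    return moved / packet_count


print(f"{moved_fraction(10_000, 10, 11):.1%} of packets map to a different station")
```

For a uniform hash, a value keeps its station only when its remainders modulo 10 and modulo 11 coincide, which happens for about 1 in 11 values, so roughly 10/11 of the packets would have to be relocated. Any direct-addressing scheme that supports runtime pool changes has to deal with this effect in some way.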
