
Post on 12-Jan-2016

Page 1: Applying Biology to Distributed Computing

Applying Biology to Distributed Computing

Teodoro Cipresso [email protected]

Course: CS249 FALL 2006

Professor: Dr. Teng Moh

Teodoro Cipresso, [email protected], CS249, Fall 2006

Based on the research papers:

[1] O. Babaoglu et al., Design Patterns From Biology for Distributed Computing

[2] G. Canright et al., Chemotaxis-Inspired Load Balancing

[3] Van Renesse, R. The importance of aggregation

Page 2: Applying Biology to Distributed Computing

Bio-Inspired Design Patterns

• Patterns emerge through experience

• Observe nature's successful solutions to problems

• Ecosystems are rich in well-tested solutions

• Cooperation and Competition

• Fault-tolerance

• Large-scale, dynamic distributed systems

• Face problems similar to those found in ecosystems:

• Withstand unexpected events

• Adapt behavior with limited information

• Handle a massive population (network)


Page 3: Applying Biology to Distributed Computing

Plain Diffusion In Biology

Plain Diffusion

• Spontaneous spreading of matter

• Particles move from areas of high concentration to areas of low concentration

• Areas of low concentration can observe a gradient

• Eventually equal distribution is achieved
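For reference, the spreading described above is usually written as Fick's laws in one dimension. The symbols are the conventional ones and are not taken from the slides: J is the flux, D the diffusion coefficient, and φ(x, t) the concentration.

```latex
\begin{align*}
J &= -D \, \frac{\partial \varphi}{\partial x} \\
\frac{\partial \varphi}{\partial t} &= D \, \frac{\partial^{2} \varphi}{\partial x^{2}}
\end{align*}
```

The minus sign states that matter flows down the gradient, from high to low concentration. In a closed container the second equation settles at a uniform concentration, which matches the "equal distribution" bullet.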

[Figure: particles spreading from a region of high concentration to a region of low concentration; caption: "Fick's law of diffusion"]

Page 4: Applying Biology to Distributed Computing

Nodes are Idealized Portions of Space

[Figure: seven nodes, each labeled P]

• Nodes are connected using the fluid transport

• An overlay network may be constructed

Page 5: Applying Biology to Distributed Computing

Material (information) Diffuses Using Passive Transport

[Figure: liquid passive transport]

Page 6: Applying Biology to Distributed Computing

Nodes Form a Connected Graph

[Figure: spanning overlay network connecting nodes P0 through P19]

• Epidemic-like behavior touches all nodes

• Nodes interact at random to converge to a state

• One example of such a state is a global average

Particles are continually moving from areas of high concentration to areas of lower concentration.

Page 7: Applying Biology to Distributed Computing

Mapping Diffusion to Distributed Computing

• Plain Diffusion Pattern Identified in [1]

• Applications to Distributed Computing:

• Use Distributed Aggregation to solve for the:

• Network size (COUNT)

• Total free storage

• Maximum load

• Location and Intensity of hotspots (gradients)

• Control, monitor and optimize the network


Page 8: Applying Biology to Distributed Computing

Applying Plain Diffusion Pattern : Network Size

• Aggregate computation without central control

• Want to estimate size θ of an overlay network

• Infuse a value of 1 into the network at node Pi

• Nodes converge to a global average Σ by combining estimates
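The bullets above can be sketched as a small single-process simulation. This is a minimal Python sketch, not the author's JR implementation: the function names `gossip_average` and `network_size`, the four-node fully connected topology, and the fixed seed are all assumptions made for illustration.

```python
import random


def gossip_average(estimates, neighbors, cycles, rng):
    """Pairwise averaging: in each cycle every node contacts one random neighbor."""
    est = list(estimates)
    for _ in range(cycles):
        for i in range(len(est)):
            j = rng.choice(neighbors[i])
            # Both partners adopt the mean of their two estimates.
            est[i] = est[j] = (est[i] + est[j]) / 2
    return est


def network_size(estimates):
    """Each node's size estimate is the infused value (1) over its local estimate."""
    return [1 / e if e > 0 else float("inf") for e in estimates]


if __name__ == "__main__":
    # Hypothetical overlay: 4 nodes, every node adjacent to every other node.
    neighbors = [[j for j in range(4) if j != i] for i in range(4)]
    infused = [1.0, 0.0, 0.0, 0.0]  # value 1 infused at a single node
    est = gossip_average(infused, neighbors, cycles=50, rng=random.Random(0))
    print(est)
    print(network_size(est))
```

No node ever sees the whole network; each one only averages with a neighbor, yet every local estimate drifts toward the same global average.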

[Figure: four nodes start with estimates 1, 0, 0, 0; successive pairwise averaging steps (for example, 1 and 0 become .5 and .5) bring every estimate to ≈ .25, so Σ ≈ .25]

Page 9: Applying Biology to Distributed Computing

Applying Plain Diffusion Pattern : Network Size

• Find the network size estimate θ from the infused value and the global average Σ

[infused value] / [global average] ≈ [network size]

[infused value] / Σ = θ

1 / .25 = 4 processors

[Figure: four nodes with initial estimates 1, 0, 0, 0, each converging to ≈ .25, so Σ ≈ .25]
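The reason the division recovers the network size can be written out as a short derivation. The notation x_k (estimate held by node k) and N (true number of nodes) is introduced here for illustration and does not appear on the slides.

```latex
\begin{align*}
\sum_{k=1}^{N} x_k &= 1
  && \text{conserved by every exchange, since } x_i + x_j = 2 \cdot \tfrac{x_i + x_j}{2} \\
x_k \to \Sigma \ \text{for all } k \;&\Longrightarrow\; N \, \Sigma \approx 1
  && \text{all estimates agree at convergence} \\
\theta = \frac{1}{\Sigma} &\approx N
  && \text{e.g. } \Sigma \approx 0.25 \;\Rightarrow\; \theta \approx 4
\end{align*}
```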

Page 10: Applying Biology to Distributed Computing

Plain Diffusion Uniform Algorithm*

Initially neighborEstimate = 0, localEstimate = 0;

Upon receiving no message:                                    // periodically initiate
    wait(τ time units)                                        // create cycles (T = cycle)
    Pj = getNeighbor()                                        // select a random neighbor
    send <localEstimate> to Pj                                // push local estimate to neighbor
    neighborEstimate = receive(Pj)                            // pull neighbor estimate (wait)
    localEstimate = (localEstimate + neighborEstimate) / 2    // converge toward average

Upon receiving <estimate> from neighbor Pi:                   // neighbor initiated
    neighborEstimate = <estimate>                             // store neighbor's estimate
    send <localEstimate> to Pi                                // send local estimate to neighbor
    localEstimate = (localEstimate + neighborEstimate) / 2    // converge toward average

*This algorithm is a less abstract rewrite of Figure 1 on page 37 of [1]

Page 11: Applying Biology to Distributed Computing

Plain Diffusion Visualization

[Figure: sequence diagram of one exchange between processors P0 and P1, each drawn with threads labeled A, P, and S. Messages, in order: <connect0>, <connected1>, <estimate0>, <estimate1>, <disconnect0>, <disconnected1>. The two estimates start at Σ=1 and Σ=0, and both become Σ=.5 after the exchange.]

Legend: A = Active Thread, P = Passive Thread, S = Synchronization Thread

Page 12: Applying Biology to Distributed Computing

Diffusion JR Implementation Challenges

Plain Diffusion

• Processors execute asynchronously

• Asynchronous message passing for synchronization

• Synchronous message passing for estimate exchange

• Active and Passive Threads both update local estimate

• Inherent Problem of Deadlock

• Used randomization to alleviate deadlock (yield)

• Processors perform a random bounded wait if the neighbor is already connected
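The deadlock risk and the randomized remedy listed above can be illustrated with threads and locks. This is a hedged Python sketch rather than the JR code from the talk: the `Processor` class, the `busy` lock standing in for the connect/disconnect handshake, and the 5 ms bound are all assumptions.

```python
import random
import threading
import time


class Processor:
    """A node holding one estimate; `busy` marks an exchange in progress."""

    def __init__(self, name, estimate):
        self.name = name
        self.estimate = estimate
        self.busy = threading.Lock()


def exchange(a, b, rng, max_wait=0.005):
    """Average the estimates of a and b, backing off whenever b is busy."""
    while True:
        with a.busy:
            # Non-blocking attempt: never wait for b while still holding a,
            # otherwise two processors contacting each other could deadlock.
            if b.busy.acquire(blocking=False):
                try:
                    mean = (a.estimate + b.estimate) / 2
                    a.estimate = b.estimate = mean
                    return
                finally:
                    b.busy.release()
        # Neighbor already connected: yield for a random bounded time, then retry.
        time.sleep(rng.uniform(0, max_wait))


def run(num_procs=4, rounds=50, seed=0):
    """Start one active thread per processor and return the final estimates."""
    procs = [Processor(f"P{i}", 1.0 if i == 0 else 0.0) for i in range(num_procs)]

    def worker(i):
        rng = random.Random(seed + i)  # one generator per thread
        others = [k for k in range(num_procs) if k != i]
        for _ in range(rounds):
            exchange(procs[i], procs[rng.choice(others)], rng)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [p.estimate for p in procs]
```

Because a processor releases its own lock before sleeping, no thread ever holds one lock while blocking on another, and the random wait makes it unlikely that two processors keep colliding in lockstep.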


Theme for the performance analysis on the next 3 slides: the connectedness of the network influences the convergence rate

Page 13: Applying Biology to Distributed Computing

Convergence Performance I

• Spanning tree overlay network with 7 processors

[Figure: spanning tree topology over processors 0 through 6; chart of mean count estimate and variance versus cycle allowance]

Mean of 3 trials at each cycle marker

Cycle Allowance        5      10     15    20    25    30
Mean Count Estimate    11.16  8.4    7.29  7     7     7
Variance               66.03  14.06  5.27  0.5   0.3   0

Spanning tree has worst case performance, why?

Page 14: Applying Biology to Distributed Computing

Convergence Performance II

• Ring overlay network with 7 processors

[Figure: ring topology over processors 0 through 6; chart of mean count estimate and variance versus cycle allowance]

Mean of 3 trials at each cycle marker

Cycle Allowance        1      3      5     7     9     11
Mean Count Estimate    7.41   8.27   7.13  7.04  7     7
Variance               16.88  18.08  0.89  0.27  0.01  0

Ring converges faster than spanning tree, why?

Page 15: Applying Biology to Distributed Computing

Convergence Performance III

• Mesh (fully connected) network with 7 processors

[Figure: fully connected topology over processors 0 through 6; chart of mean count estimate and variance versus cycle allowance]

Mean of 3 trials at each cycle marker

Cycle Allowance        1      2     3     4     5     6
Mean Count Estimate    7.64   7.14  7.03  7     7     7
Variance               15.83  0.97  0.2   0.5   0     0

Mesh has best case convergence, why?
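One way to explore the three "why?" questions is to rerun the experiment in simulation. The sketch below is an assumption-laden stand-in for the JR measurements: the tree shape (node i's parent is (i - 1) // 2), the cycle definition, and all function names are hypothetical, and it reports the variance of the raw estimates rather than of the count estimates.

```python
import random
from statistics import mean, pvariance


def _adjacency(n, edges):
    adj = {i: [] for i in range(n)}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def tree(n):
    # Assumed shape: a binary tree in which node i's parent is (i - 1) // 2.
    return _adjacency(n, [((i - 1) // 2, i) for i in range(1, n)])


def ring(n):
    return _adjacency(n, [(i, (i + 1) % n) for i in range(n)])


def mesh(n):
    return _adjacency(n, [(i, j) for i in range(n) for j in range(i + 1, n)])


def estimate_variance(adj, cycles, rng):
    """Infuse 1 at node 0, gossip for `cycles` cycles, return variance of estimates."""
    est = {i: 0.0 for i in adj}
    est[0] = 1.0
    for _ in range(cycles):
        for i in adj:
            j = rng.choice(adj[i])
            est[i] = est[j] = (est[i] + est[j]) / 2
    return pvariance(est.values())


def mean_variance(adj, cycles, trials=100, seed=0):
    """Average the remaining disagreement over several independent trials."""
    rng = random.Random(seed)
    return mean(estimate_variance(adj, cycles, rng) for _ in range(trials))


if __name__ == "__main__":
    for name, build in (("tree", tree), ("ring", ring), ("mesh", mesh)):
        print(name, mean_variance(build(7), cycles=5))
```

A plausible reading of the slides' results is that information injected at one node must cross more hops, through fewer alternative paths, in a tree than in a ring or a mesh, so a sparsely connected overlay needs more cycles before all estimates agree.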

Page 16: Applying Biology to Distributed Computing

JR Implementation of Network Size Count

CONN = Connect to processor
RCON = Result of connect attempt
DISC = Disconnect from processor
RDIS = Result of disconnect attempt
CALL = Send synchronous message
SEND = Send asynchronous message
RECV = Got message from neighbor
PUSH = Send estimate to neighbor
PULL = Get estimate from neighbor
UPDT = Update local estimate

JR Demo

Page 17: Applying Biology to Distributed Computing


Chemotaxis In Biology

• Taxis is an innate behavioral response

• Response to a stimulus (signal) coming from some direction

• Positive and negative taxis

[Figure: movement toward (+) or away from (-) a stimulus]

Page 18: Applying Biology to Distributed Computing


Chemotaxis In Biology

• Chemotaxis is taxis in response to chemical stimuli

• Positive taxis: move toward food (glucose)

• Negative taxis: move away from a poison (formalin)

• An increasing concentration of the chemical forms a gradient, which acts as the signal

[Figure: positive (+) and negative (-) chemotaxis]

Page 19: Applying Biology to Distributed Computing

Mapping Chemotaxis to Distributed Computing

• Chemotaxis is a composite pattern (it uses Plain Diffusion)

• Plain Diffusion can be used in primitive load balancing

• Nodes send fractions of excess load to neighbors

[Figure: P0 max load is 12. P0 starts with a load of 48 (P0=48) and its three neighbors with 0; after diffusion each of the four nodes holds 12]

• A problem arises: heavily loaded network areas can be created

• Movement of load is not based on knowledge of hotspots

• Chemotaxis-inspired load balancing can help
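The primitive scheme and its weakness can be sketched in a few lines. This is an illustrative Python sketch under stated assumptions: the function `diffuse_excess`, the per-node capacity table, and the choice to split the excess evenly among neighbors are not taken from the slides, which only say that nodes send fractions of excess load to neighbors.

```python
def diffuse_excess(load, neighbors, capacity):
    """One round: every overloaded node splits its excess evenly among its neighbors."""
    new_load = dict(load)
    for node, amount in load.items():
        excess = amount - capacity[node]
        if excess > 0 and neighbors[node]:
            share = excess / len(neighbors[node])
            new_load[node] -= excess
            for nb in neighbors[node]:
                # The sender does not look at how loaded the receiver already is.
                new_load[nb] += share
    return new_load


if __name__ == "__main__":
    # The slide's example: P0 holds 48, three idle neighbors, max load 12 each.
    neighbors = {"P0": ["P1", "P2", "P3"], "P1": ["P0"], "P2": ["P0"], "P3": ["P0"]}
    capacity = {name: 12 for name in neighbors}
    print(diffuse_excess({"P0": 48, "P1": 0, "P2": 0, "P3": 0}, neighbors, capacity))

    # Hotspot case: the only neighbor is already full, so the overload just moves.
    print(diffuse_excess({"A": 24, "B": 12}, {"A": ["B"], "B": ["A"]}, {"A": 12, "B": 12}))
```

The first call balances the star perfectly, while the second pushes the excess onto a node that is already at its limit, which is the kind of blind movement the last three bullets describe.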

Page 20: Applying Biology to Distributed Computing

Applying Chemotaxis Pattern : Load Balancing

• Propagate a signal (stimulus) from all nodes of the network

• Load can move towards/away from signal (+/- chemotactic)

• Gradients are created which indicate load on network areas

• Signal is lightweight and moves more quickly than load (fast diffusion)

[Figure: signal gradient between a heavily loaded area and a lightly loaded area]

Page 21: Applying Biology to Distributed Computing

Applying Chemotaxis Pattern : Load Balancing

• Load entering network will be forwarded to solve imbalance

• Moves away from hotspots, toward lightly loaded areas
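The idea of load following a fast-moving signal can be sketched as follows. This is a deliberately simplified illustration, not the algorithm from [2]: it assumes each node emits a signal equal to its load, that the signal is smoothed by a few rounds of neighborhood averaging, and that an overloaded node sends its whole excess to the neighbor with the weakest signal. All names are hypothetical.

```python
def diffuse_signal(load, neighbors, steps):
    """Fast diffusion: smooth a load-proportional signal over each neighborhood."""
    signal = [float(x) for x in load]
    for _ in range(steps):
        signal = [
            (signal[i] + sum(signal[j] for j in neighbors[i])) / (1 + len(neighbors[i]))
            for i in range(len(signal))
        ]
    return signal


def chemotaxis_step(load, neighbors, capacity, signal_steps=2):
    """Move each node's excess load toward the neighbor with the weakest signal."""
    signal = diffuse_signal(load, neighbors, signal_steps)
    new_load = list(load)
    for i, amount in enumerate(load):
        excess = amount - capacity
        if excess > 0 and neighbors[i]:
            # Negative chemotaxis: head away from the heavily loaded side.
            target = min(neighbors[i], key=lambda j: signal[j])
            new_load[i] -= excess
            new_load[target] += excess
    return new_load


if __name__ == "__main__":
    # Line of five nodes; the right side is full, the left side is idle.
    neighbors = [[1], [0, 2], [1, 3], [2, 4], [3]]
    print(chemotaxis_step([0, 0, 24, 12, 12], neighbors, capacity=12))
```

In this example the middle node has two neighbors, one idle and one full. The smoothed signal is lower on the idle side, so the excess goes there instead of being split blindly between both neighbors.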


Page 22: Applying Biology to Distributed Computing

Future Work


Solve open problems in Distributed Computing through identification of additional patterns in Biology.

Implement a simulation of Chemotaxis-Inspired Load Balancing (Signal-aided Diffusion) by extending the Plain Diffusion JR implementation.

Page 23: Applying Biology to Distributed Computing


Thank You!

Page 24: Applying Biology to Distributed Computing

References I

[1] O. Babaoglu et al., Design Patterns From Biology for Distributed Computing, ACM Transactions on Autonomous and Adaptive Systems (TAAS), Volume 1, Issue 1 (September 2006) at http://doi.acm.org/10.1145/1152934.1152937

[2] G. Canright et al., Chemotaxis-Inspired Load Balancing, TELENOR Research and Development, Fornebu, Norway at http://www.cs.unibo.it/bison/publications/chemo.ECCS05.pdf

[3] Van Renesse, R. The importance of aggregation. Cornell University, Ithaca, NY. At http://www.cs.cornell.edu/home/rvr/papers/ImportanceAggregation.pdf

[4] S. I. Rubinow, Introduction to Mathematical Biology, Graduate School of Medical Sciences, Cornell University, 1975 by Dover Publications, Inc.

[5] A. W. Keen et al., The JR Programming Language: Concurrent Programming in an Extended Java, 2004 by Kluwer Academic Publishers.

[6] S. S. Epp, Discrete Mathematics with Applications, second edition, 1995 by PWS Publishing Company.

[7] Diffusion, Wikipedia the free encyclopedia (retrieved October 2006), at http://en.wikipedia.org/w/index.php?title=Diffusion&printable=yes

[8] Taxis, Wikipedia the free encyclopedia (retrieved October 2006), at http://en.wikipedia.org/w/index.php?title=Taxis&printable=yes


Page 25: Applying Biology to Distributed Computing

References II

[9] Chemotaxis, Wikipedia the free encyclopedia (retrieved October 2006), at http://en.wikipedia.org/w/index.php?title=Chemotaxis&printable=yes

[10] Humoral immunity, Wikipedia the free encyclopedia (retrieved October 2006), at http://en.wikipedia.org/w/index.php?title=Humoral_immunity&printable=yes

[11] Overlay network, Wikipedia the free encyclopedia (retrieved October 2006), at http://en.wikipedia.org/w/index.php?title=Overlay_network&printable=yes

[12] Virus, Encyclopedia Britannica, Encyclopedia Britannica Online (retrieved October 2006), at http://www.britannica.com/eb/article-32750

[13] Glossary of graph theory, Wikipedia the free encyclopedia (retrieved October 2006), at http://en.wikipedia.org/w/index.php?title=Glossary_of_graph_theory&printable=yes
