evaluating the running time of a communication round over the internet omar bakr idit keidar mit...

20
Evaluating the Running Time of a Communication Round over the Internet Omar Bakr Idit Keidar MIT MIT/Technion PODC 2002

Upload: gwendolyn-matilda-briggs

Post on 28-Dec-2015

214 views

Category:

Documents


0 download

TRANSCRIPT

Evaluating the Running Time of a Communication Round over the Internet

Omar Bakr Idit Keidar

MIT MIT/Technion

PODC 2002

© Omar Bakr and Idit Keidar; PODC July 2002

Communication Round

• Exchange of information from all hosts to all hosts

• Part of many distributed algorithms, systems– consensus, atomic commit, replication, ...

© Omar Bakr and Idit Keidar; PODC July 2002

Common Metric for Evaluating Algorithms

• Number of rounds (or steps) they require

© Omar Bakr and Idit Keidar; PODC July 2002

Questions

• What is the best way to implement a communication round over the Internet– decentralized vs. centralized

• How long is a communication round over the Internet?

© Omar Bakr and Idit Keidar; PODC July 2002

Prediction is Hard

• Internet is unpredictable, diverse, …• Different answers for different topologies,

different times• Different performance metrics

– local running time one host is engaged in algorithm

– overall running time from when first host starts to when last host finishes

© Omar Bakr and Idit Keidar; PODC July 2002

“Communication Round” Primitive

• Initiated by some host• Propagates data from every host to every other host

connected to it

© Omar Bakr and Idit Keidar; PODC July 2002

Example Implementations

All-to-allLeader

Secondary Leader

© Omar Bakr and Idit Keidar; PODC July 2002

Experiment I

• 10 hosts: Taiwan, Korea, US academia, ISPs• TCP/IP (connections always up)• Algorithms:

– All-to-all– Leader (initiator)– Secondary leader (not initiator)

periodically initiated at each host

- 650 times over 3.5 days

© Omar Bakr and Idit Keidar; PODC July 2002

Computing Overall Running Time

• Elapsed time from initiation (at initiator) until all hosts terminate

• Requires estimating clock differences– Clocks not synchronized, drift– We compute difference over short intervals– Compute 3 different ways – Achieve accuracy within 20 ms. on 90% of runs

© Omar Bakr and Idit Keidar; PODC July 2002

Teaser: Comparing Performance Based on Number of Steps

All-to-all: 2

Leader: 3

Secondary Leader: 4

© Omar Bakr and Idit Keidar; PODC July 2002

Predicting Overall Runnig Times From MIT

• Ping-measured latencies (IP):– Longest link latency 240 milliseconds– Longest link to MIT 150 milliseconds

150+240 = 390 150+150+150 = 450

© Omar Bakr and Idit Keidar; PODC July 2002

Measured Running Times Runs Initiated at MIT

All-to-All Leader

Overall Local Overall LocalPrediction 390 300 450 300Average (runs under 2 sec)

811 295 541 335

% runs over

2 seconds55% 3% 13% 6%

Running times in milliseconds

© Omar Bakr and Idit Keidar; PODC July 2002

What’s Going On?

• Loss rates on two links are very high– 42% and 37%– Taiwan to two ISPs in the US

• Loss rates on other links up to 8%• Upon loss, TCP’s timeout is big

– More than round-trip-time

• All-to-all sends messages on lossy links– Often delayed by loss

© Omar Bakr and Idit Keidar; PODC July 2002

Distribution of Running Times Up to 1.3 sec. at MIT

MIT

0

100

200

300

400

milliseconds

num

ber

of r

uns

Leader

Sleader

All2All

© Omar Bakr and Idit Keidar; PODC July 2002

Running Times Runs Initiated at Taiwan

% runs over

2 seconds

Average (runs under 2 sec)

Sec. Leader

overall local

Leader

overall local

All-to-all

overall local

7%13%43%64%24%54%

6076798441120645866

Running times in milliseconds

© Omar Bakr and Idit Keidar; PODC July 2002

Distribution of Running Timesin Taiwan

Taiwan

050

100150200250300350400

milliseconds

num

ber

of r

uns

Leader

Sleader

All2All

© Omar Bakr and Idit Keidar; PODC July 2002

What’s Going On?

All-to-all Leader Secondary Leader

Taiwan

MIT

Hosts with bad links to Taiwan

Other Hosts

Good link

Lossy link

© Omar Bakr and Idit Keidar; PODC July 2002

Experiment II: Removing Taiwan

• Overall running times much better– For every initiator and algorithm, less than 10%

over 2 seconds (as opposed to 55% previously)

• All-to-all overall still worse than others!– either Leader or Secondary Leader best,

depending on initiator– loss rates of 2% - 8% are not negligible– all-to-all sends O(n2) messages; suffers

• But, all-to-all has best local running times

© Omar Bakr and Idit Keidar; PODC July 2002

Probability of Delay due to Loss

• If all links would have same latency– assume 1% loss on all links; 10 hosts (n=10)– Leader sends 3(n-1) = 27 messages

• probability of at least one loss: 1 -.9927 24%

– All-2-all sends n(n-1) = 90 messages• probability of at least one loss: 1 -.9990 60%

• In reality, links don’t have same latency– only loss on long links matters

© Omar Bakr and Idit Keidar; PODC July 2002

Conclusions• Message loss causes high variation in TCP link latencies

– latency distribution has high variance, heavy tail

• Latency distribution determines expected time for receiving O(n) concurrent messages

• Secondary leader helps – No triangle inequality, especially for loss

• Different for overall vs. local running times

• Number of rounds/steps not sufficient metric– One-to-all and all-to-all have different costs