predicting internet network distance with coordinates-based
TRANSCRIPT
T. S. Eugene Ng [email protected] Carnegie Mellon University 1
Predicting Internet Network Distance with Coordinates-Based Approaches
T.S. Eugene Ng and Hui ZhangDepartment of Computer Science
Carnegie Mellon University
T. S. Eugene Ng [email protected] Carnegie Mellon University 2
Problem Statement
• Network distance– Round-trip propagation and transmission delay– Relatively stable, may be predictable
• Given two Internet hosts, can we accurately estimate the network distance between them without sending any RTT probes between them?
T. S. Eugene Ng [email protected] Carnegie Mellon University 3
Why Predict Network Distance?
• Want to measure network performance to improve performance of applications– Napster, content addressable overlays, overlay multicast
• Huge number of paths to measure• TCP bandwidth and RTT probes are time-consuming
• Predicted network distance enables fast and scalable first-order performance optimization– Eliminate poor choices– Refine when needed
T. S. Eugene Ng [email protected] Carnegie Mellon University 4
IDMaps [Francis et al. ‘99]
• Servers maintain simplified topological map
A
B
T. S. Eugene Ng [email protected] Carnegie Mellon University 4
IDMaps [Francis et al. ‘99]
• Servers maintain simplified topological map
A
B
Tracer
Tracer
Tracer
T. S. Eugene Ng [email protected] Carnegie Mellon University 4
IDMaps [Francis et al. ‘99]
• Servers maintain simplified topological map
A
B
Tracer
Tracer
Tracer
T. S. Eugene Ng [email protected] Carnegie Mellon University 4
IDMaps [Francis et al. ‘99]
• Servers maintain simplified topological map
A
B
Tracer
Tracer
Tracer
T. S. Eugene Ng [email protected] Carnegie Mellon University 4
IDMaps [Francis et al. ‘99]
• Servers maintain simplified topological map
A
B
Tracer
Tracer
Tracer
Server
T. S. Eugene Ng [email protected] Carnegie Mellon University 4
IDMaps [Francis et al. ‘99]
• Servers maintain simplified topological map
A
B
Tracer
Tracer
Tracer
Server
A/B
T. S. Eugene Ng [email protected] Carnegie Mellon University 4
IDMaps [Francis et al. ‘99]
• Servers maintain simplified topological map
A
B
Tracer
Tracer
Tracer
Server
T. S. Eugene Ng [email protected] Carnegie Mellon University 4
IDMaps [Francis et al. ‘99]
• Servers maintain simplified topological map
A
B
Tracer
Tracer
Tracer
Server50ms
T. S. Eugene Ng [email protected] Carnegie Mellon University 5
Disadvantage of Client-Server Architecture
• Unavoidable additional delay in communicating with servers
• Shared servers can become performance bottleneck
T. S. Eugene Ng [email protected] Carnegie Mellon University 6
Can this Problem be Solved with a Peer-to-Peer Architecture?
• Flip the problem: Can end hosts maintain “coordinates” that describe their network locations?
• End hosts exchange coordinates to compute distance– High performance, high scalability
(10,50,22) (-9,-10,3)
Distance function
68ms
T. S. Eugene Ng [email protected] Carnegie Mellon University 7
Approach 1: Global Network Positioning (GNP) Coordinates
• Model the Internet as a geometric space (e.g. 3-D Euclidean)
• Characterize the position of any end host with geometric coordinates
• Use geometric distances to predict network distances
y(x2,y2,z2)
x
z
(x1,y1,z1)
(x3,y3,z3)(x4,y4,z4)
T. S. Eugene Ng [email protected] Carnegie Mellon University 8
GNP Landmark Operationsy
xInternet
L1
L2
L3
• Small number of distributed hosts called Landmarks measure inter-Landmark distances
T. S. Eugene Ng [email protected] Carnegie Mellon University 8
GNP Landmark Operationsy
xInternet
L1
L2
L3
• Small number of distributed hosts called Landmarks measure inter-Landmark distances
Internet
L1
L
L
T. S. Eugene Ng [email protected] Carnegie Mellon University 8
GNP Landmark Operationsy
xInternet
L1
L2
L3
• Small number of distributed hosts called Landmarks measure inter-Landmark distances
Internet
L1
L
L
• Compute Landmark coordinates by minimizing the overall discrepancy between measured distancesand computed distances– Cast as a generic multi-dimensional global minimization
problem
T. S. Eugene Ng [email protected] Carnegie Mellon University 8
GNP Landmark Operationsy
xInternet
L1
L2
L3
• Small number of distributed hosts called Landmarks measure inter-Landmark distances
Internet
L1
L
L
• Compute Landmark coordinates by minimizing the overall discrepancy between measured distancesand computed distances– Cast as a generic multi-dimensional global minimization
problem
(x2,y2)
(x1,y1)
(x3,y3)
L1
L2
L3
T. S. Eugene Ng [email protected] Carnegie Mellon University 9
GNP Ordinary Host Operations
x
Internet
(x2,y2)
(x1,y1)
(x3,y3)
L1
L2
L3
yL2
L1
L3
T. S. Eugene Ng [email protected] Carnegie Mellon University 9
GNP Ordinary Host Operations
x
Internet
(x2,y2)
(x1,y1)
(x3,y3)
L1
L2
L3
yL2
L1
L3
• Each ordinary host measures its distances to the Landmarks, Landmarks just reflect pings
T. S. Eugene Ng [email protected] Carnegie Mellon University 9
GNP Ordinary Host Operations
x
Internet
(x2,y2)
(x1,y1)
(x3,y3)
L1
L2
L3
yL2
L1
L3
• Each ordinary host measures its distances to the Landmarks, Landmarks just reflect pings
Internet
L1
L2
L3
T. S. Eugene Ng [email protected] Carnegie Mellon University 9
GNP Ordinary Host Operations
x
Internet
(x2,y2)
(x1,y1)
(x3,y3)
L1
L2
L3
yL2
L1
L3
• Each ordinary host measures its distances to the Landmarks, Landmarks just reflect pings
Internet
L1
L2
L3
• Ordinary host computes its own coordinates relative to the Landmarks by minimizing the overall discrepancy between measured distances and computed distances– Cast as a generic multi-dimensional global minimization
problem
T. S. Eugene Ng [email protected] Carnegie Mellon University 9
GNP Ordinary Host Operations
x
Internet
(x2,y2)
(x1,y1)
(x3,y3)
L1
L2
L3
yL2
L1
L3
• Each ordinary host measures its distances to the Landmarks, Landmarks just reflect pings
Internet
L1
L2
L3
• Ordinary host computes its own coordinates relative to the Landmarks by minimizing the overall discrepancy between measured distances and computed distances– Cast as a generic multi-dimensional global minimization
problem
(x4,y4)
L2
L1
L3
T. S. Eugene Ng [email protected] Carnegie Mellon University 10
Important Questions
• What geometric model to use? • How to measure error in minimizations?• How to select Landmarks?• How many Landmarks?• What are the sources of prediction error?• How to reduce overhead?• Can we use geographical coordinates?
• Please see our paper• This talk: focus on performance comparisons
T. S. Eugene Ng [email protected] Carnegie Mellon University 11
Approach 2: Triangulated Heuristic Coordinates
• Proposed by Hotz in 1994 for A* heuristic shortest network path search
• Provides upper and lower bounds for network distance– Assumes shortest path routing enforced
• This paper is the first study to apply and evaluate triangulated heuristic as a network distance prediction mechanism
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base1
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base1
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base11(
1
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base11(
1
A
1
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base11(
1
2, 2
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base11(
1
2, 2
2 ,1
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base11(
1
2, 2
2 ,1
2, )2
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base11(
1
2, 2
2 ,1
2, )2
Upper bound (U) = min (ai + bi)
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base11(
1
2, 2
2 ,1
2, )2
Upper bound (U) = min (ai + bi) = 3
T. S. Eugene Ng [email protected] Carnegie Mellon University 12
Approach 2: Triangulated Heuristic Coordinates
BA
Hop count is used in this example
Base2
Base1( , )
( , )
a1 a2b1 b2
Base11(
1
2, 2
2 ,1
2, )2
Upper bound (U) = min (ai + bi) = 3 Correct!
T. S. Eugene Ng [email protected] Carnegie Mellon University 13
Evaluation Methodology
• 19 Probes– 12 in North America, 5 in East Asia, 2 in Europe
• 869 IP addresses called Targets we do not control– Span 44 countries
• Probes measure– Inter-Probe distances– Probe-to-Target distances
• See paper for more results
T. S. Eugene Ng [email protected] Carnegie Mellon University 14
Evaluation Methodology (Cont’d)
• Choose a subset of well-distributed Probes to be Tracers/Base nodes/Landmarks
• Use the rest for evaluation
T
T
T
T
T
P3
P1
P4
P2
T
T. S. Eugene Ng [email protected] Carnegie Mellon University 14
Evaluation Methodology (Cont’d)
• Choose a subset of well-distributed Probes to be Tracers/Base nodes/Landmarks
• Use the rest for evaluation
T
T
T
T
T
P3
P1
P4
P2
TP3
P
P2
T. S. Eugene Ng [email protected] Carnegie Mellon University 14
Evaluation Methodology (Cont’d)
• Choose a subset of well-distributed Probes to be Tracers/Base nodes/Landmarks
• Use the rest for evaluation
T
T
T
T
T
P3
P1
P4
P2
TP3
P
P2(x1,y1)
T. S. Eugene Ng [email protected] Carnegie Mellon University 14
Evaluation Methodology (Cont’d)
• Choose a subset of well-distributed Probes to be Tracers/Base nodes/Landmarks
• Use the rest for evaluation
T
T
T
T
T
P3
P1
P4
P2
TP3
P
P2(x1,y1)
3
P1
P
T
T. S. Eugene Ng [email protected] Carnegie Mellon University 14
Evaluation Methodology (Cont’d)
• Choose a subset of well-distributed Probes to be Tracers/Base nodes/Landmarks
• Use the rest for evaluation
T
T
T
T
T
P3
P1
P4
P2
TP3
P
P2(x1,y1)
3
P1
P
T (x2, y2)
T. S. Eugene Ng [email protected] Carnegie Mellon University 14
Evaluation Methodology (Cont’d)
• Choose a subset of well-distributed Probes to be Tracers/Base nodes/Landmarks
• Use the rest for evaluation
T
T
T
T
T
P3
P1
P4
P2
TP3
P
P2(x1,y1)
3
P1
P
T (x2, y2)
2
T
T. S. Eugene Ng [email protected] Carnegie Mellon University 15
Performance Metrics
• Directional relative error– Symmetrically measure over and under predictions
• Relative error = abs(Directional relative error)
),min( predictedmeasured
measuredpredicted−
T. S. Eugene Ng [email protected] Carnegie Mellon University 16
Relative Error Comparison
90% = 0.97
T. S. Eugene Ng [email protected] Carnegie Mellon University 17
Relative Error Comparison
90% = 0.9790% = 0.59
T. S. Eugene Ng [email protected] Carnegie Mellon University 18
Relative Error Comparison
90% = 0.9790% = 0.59
90% = 0.50
T. S. Eugene Ng [email protected] Carnegie Mellon University 19
Directional Relative Error Analysis
DRE
Frequency
95%
75%
Mean50%
25%
5%
*
T. S. Eugene Ng [email protected] Carnegie Mellon University 20
Directional Relative Error Comparison
T. S. Eugene Ng [email protected] Carnegie Mellon University 21
Directional Relative Error Comparison
T. S. Eugene Ng [email protected] Carnegie Mellon University 22
Directional Relative Error Comparison
T. S. Eugene Ng [email protected] Carnegie Mellon University 23
Why the Difference in Over-Predictions?
IDMaps
Straight-line distance used in this example
T. S. Eugene Ng [email protected] Carnegie Mellon University 23
Why the Difference in Over-Predictions?
IDMaps
Straight-line distance used in this example
T. S. Eugene Ng [email protected] Carnegie Mellon University 23
Why the Difference in Over-Predictions?
IDMaps
Straight-line distance used in this example
Triangulated Heuristic
T. S. Eugene Ng [email protected] Carnegie Mellon University 23
Why the Difference in Over-Predictions?
IDMaps
Straight-line distance used in this example
Triangulated Heuristic
T. S. Eugene Ng [email protected] Carnegie Mellon University 23
Why the Difference in Over-Predictions?
IDMaps
Straight-line distance used in this example
Triangulated Heuristic
GNP (1-dimensional model)
T. S. Eugene Ng [email protected] Carnegie Mellon University 23
Why the Difference in Over-Predictions?
IDMaps
Straight-line distance used in this example
Triangulated Heuristic
GNP (1-dimensional model)
T. S. Eugene Ng [email protected] Carnegie Mellon University 24
Sensitivity to Infrastructure Node Placement
• Which nodes are used as Tracers/Bases/Landmarks matter
• High sensitivity means the approach is less robust• Test sensitivity by picking 20 random combinations of
6 infrastructure nodes and observe performance variance
T. S. Eugene Ng [email protected] Carnegie Mellon University 25
6 Random Infrastructure Nodes(20 Experiments)
00.20.40.60.8
11.21.41.61.8
2
GNP TriangulatedHeuristic/U
IDMaps
90%
Rel
ativ
e E
rror
Max Mean Min Standard Deviation
T. S. Eugene Ng [email protected] Carnegie Mellon University 26
Conclusions
• Coordinates-based approaches represent a new class of solutions
• These solutions fit well with the peer-to-peer architecture– Potentially better performance and scalability than the client-
server architecture
• Careful Internet evaluation shows that coordinates-based approaches are more accurate than IDMaps
• GNP is the most accurate and robust solution
T. S. Eugene Ng [email protected] Carnegie Mellon University 27
Internet as an Euclidean Space? Remarkable!
Internet map as of 1998 byBill Cheswick, Bell LabsHal Burch, CMU