On the Interaction between Dynamic Routing in the
Native and Overlay Layers
INFOCOM 2006
Srinivasan Seetharaman, Mostafa Ammar
College of Computing, Georgia Institute of Technology
Inter-Layer Interaction Problem
Infrastructure overlay networks offer better services by deploying intelligent routing schemes.
Uncoordinated dynamic routing in the two layers leads to many problems.
We focus on the effect of native link failures, as they trigger each layer to reroute independently (dual rerouting).
Temporal Dynamics
Consider a native link failure in CE. Only one overlay link is affected. The native path AE is rerouted over F (ACE → ACFDE).
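[Figure: cost of the overlay path over time. Time-axis events: Native Failure, Native Recovery, Native Repair. Cost levels: Original: 2; Native rerouting: 2; Overlay rerouting: 4; Overlay recovery: 8; ∞. Inset: native topology (nodes A to I) and overlay topology (nodes A, E, I, G) with link costs.]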
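The native rerouting in this example (ACE → ACFDE) can be reproduced with a fewest-hop path search. The sketch is illustrative only: the edge list holds just the native links implied by the two paths named on the slide (A-C, C-E, C-F, F-D, D-E), the function name is hypothetical, and hop count is an assumed cost; the full topology, link costs, and routing protocol are not given here.

```python
from collections import deque

def shortest_path(edges, src, dst):
    """Breadth-first search; returns the fewest-hop path as a list, or None."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the predecessor chain back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None

# Assumption: only the native links implied by the slide's two paths.
NATIVE_EDGES = [("A", "C"), ("C", "E"), ("C", "F"), ("F", "D"), ("D", "E")]

before = shortest_path(NATIVE_EDGES, "A", "E")
# Fail native link CE and search again.
after = shortest_path([e for e in NATIVE_EDGES if e != ("C", "E")], "A", "E")
print("".join(before), "->", "".join(after))
```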
Downside to Dual Rerouting
1. Overlap of functionality between layers, causing a large number of route flaps (oscillations)
2. Unawareness of the other layer’s decisions, leading to:
   resource overloading
   multiple simultaneous failures
   a low success rate in rerouting
   sub-optimal paths after rerouting
3. Lack of flexibility and control
Problem Statement I
Assume the two ends of each link (native & overlay) use a keepAlive protocol for link verification.
3 keepAlive messages lost → failure declared
Understand the effects of different parameters on the rerouting performance.
KeepAlive-time: time between two keepAlive messages
Hold-time: time window to declare link as down
Overlay link cost scheme (e.g., native hops, overlay hops)
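A minimal sketch of how the two timers determine detection delay, under an assumed model: keepAlive messages are sent at multiples of the keepAlive-time, the receiver restarts its hold timer on each message it receives, and the link is declared down when the hold timer expires. The function name and the timer values are hypothetical, not taken from the slides.

```python
def detection_delay(fail_at_ms, keepalive_ms, hold_ms):
    """Time from link failure until the hold timer expires (all in ms).

    Assumed model: keepAlives are sent at 0, keepalive_ms, 2*keepalive_ms, ...
    and a message sent exactly at the failure instant still gets through.
    """
    last_received = (fail_at_ms // keepalive_ms) * keepalive_ms
    declared_down_at = last_received + hold_ms
    return declared_down_at - fail_at_ms

# Hold-time of 3 keepAlive intervals, matching "3 keepAlive messages lost".
print(detection_delay(fail_at_ms=2500, keepalive_ms=1000, hold_ms=3000))
```

Under this model the delay always lies between (hold-time minus keepAlive-time) and the hold-time, depending on where the failure falls between two keepAlive messages.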
Performance Metrics
1. Hit-time: time taken for traffic to be recovered.
   Hit-time = Detection time (depends on timers) + Convergence time (protocol specific) + Device time (negligible)
2. Success rate of recovery
   Success rate of a layer = (Number of paths recovered) / (Number of failed overlay paths)
3. Number of route flaps
   Average route flaps = (Number of route flaps) / (Number of failed overlay paths)
4. Peak and stabilized inflation (before repair)
   Path cost inflation = (Path cost after rerouting) / (Path cost before failure)
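The four metric definitions translate directly into small helper functions. This is a sketch with hypothetical function names; the sample values are the path costs and flap count quoted for overlay path AE in the Temporal Dynamics example (cost 2 before failure, 8 at peak, 4 once stabilized, 3 route flaps, 1 failed path).

```python
def hit_time(detection, convergence, device=0.0):
    """Hit-time = detection time + convergence time + device time."""
    return detection + convergence + device

def success_rate(paths_recovered, failed_overlay_paths):
    """Fraction of failed overlay paths that were recovered."""
    return paths_recovered / failed_overlay_paths

def average_route_flaps(route_flaps, failed_overlay_paths):
    """Route flaps per failed overlay path."""
    return route_flaps / failed_overlay_paths

def path_cost_inflation(cost_after_rerouting, cost_before_failure):
    """Ratio of the rerouted path cost to the original path cost."""
    return cost_after_rerouting / cost_before_failure

peak = path_cost_inflation(8, 2)        # peak inflation = 8/2
stabilized = path_cost_inflation(4, 2)  # stabilized inflation = 4/2
flaps = average_route_flaps(3, 1)
rate = success_rate(1, 1)
print(peak, stabilized, flaps, rate)
```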
Temporal Dynamics
[Figure: cost of overlay path AE over time, with the hit time marked. Time-axis events: Native Failure, Native Recovery, Native Repair. Cost levels: Original: 2; Native rerouting: 2; Overlay rerouting: 4; Overlay recovery: 8; ∞.]
Overlay path AE:
   Overlay detects first
   100% success rate
   3 route flaps
   Peak inflation = 8/2
   Stabilized inflation = 4/2
Performance Evaluation – ns2
Using GT-ITM, we randomly generate 25 topologies = (5 overlay networks) x (5 native networks).
Two scenarios:
1. Inspect intra-domain failures in a single-domain native network
2. Inspect inter-domain failures in a multi-domain native network
In each scenario, we tabulate failure recovery statistics of all overlay paths by breaking one native link at a time.
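The one-link-at-a-time procedure can be pictured as a simple sweep. The sketch below is not the ns2 setup used in the evaluation; it only shows the tabulation skeleton. The native links, the overlay paths, and the function name are all hypothetical.

```python
# Hypothetical native links, each named by its two endpoints.
NATIVE_LINKS = [("A", "C"), ("C", "E"), ("C", "F"), ("F", "D"), ("D", "E")]

# Hypothetical overlay paths, each listed as the native links it currently uses.
OVERLAY_PATHS = {
    "A-E": [("A", "C"), ("C", "E")],
    "A-D": [("A", "C"), ("C", "F"), ("F", "D")],
}

def failure_sweep(native_links, overlay_paths):
    """Fail one native link at a time; record which overlay paths it breaks."""
    table = {}
    for link in native_links:
        table[link] = sorted(
            name for name, links in overlay_paths.items() if link in links
        )
    return table

table = failure_sweep(NATIVE_LINKS, OVERLAY_PATHS)
for link, failed in table.items():
    print(link, failed)
```

In a full study, each entry of the table would then be extended with the recovery statistics (hit-time, route flaps, inflation) observed for every failed overlay path.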
Effect of Routing Parameters
Observations: by varying the overlay keepAlive-time, hold-time and cost scheme, we observe that:
   hold-time affects hit-time (only until overlay hold-time < native hold-time)
   hold-time affects the number of route flaps
   hold-time affects sub-optimality
   keepAlive-time affects hit-time
Conclusion I
Dual rerouting can be made optimal by adopting the following recommendations:
Set the overlay hold-time very close to the native hold-time.
Set the overlay keepAlive-time to half the hold-time, as this leads to earlier detection.
Problem Statement II
Main observation from previous simulations: “Native-rerouting yields the optimal path, albeit a bit later”
Make the overlay layer aware of this observation and give higher precedence to native rerouting attempts.
Improve overlay routing performance by adjusting the overlay layer functioning
Three Levels of Layer Awareness
1. No awareness: Dual rerouting
2. Awareness of native layer’s existence:
   Probabilistically Suppressed Overlay Rerouting (PSOR): suppress overlay rerouting attempt with probability ‘p’
   Deferred Overlay Rerouting (DOR): delay overlay recovery by time ‘d’
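The two existence-aware schemes reduce to very small decision rules. The sketch below uses hypothetical function names and only encodes what the definitions state: PSOR suppresses an attempt with probability p, and DOR shifts the overlay's recovery by d. Whether a deferred attempt is cancelled when the native layer recovers the path in the meantime is not specified here and is not modelled.

```python
import random

def psor_should_reroute(p, rng=random):
    """PSOR: suppress the overlay rerouting attempt with probability p.

    Returns True when the overlay goes ahead and reroutes.
    """
    # rng.random() is uniform on [0, 1), so the attempt is suppressed
    # exactly when the draw falls below p.
    return rng.random() >= p

def dor_reroute_time(detected_at, d):
    """DOR: the overlay starts recovery d time units after detecting the failure."""
    return detected_at + d

print(psor_should_reroute(0.0), psor_should_reroute(1.0))
print(dor_reroute_time(detected_at=3.0, d=2.0))
```

With p = 0 PSOR degenerates to plain dual rerouting, and with p = 1 the overlay never reroutes; likewise d = 0 makes DOR identical to dual rerouting.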
Three Levels of Layer Awareness (contd.)
3. Awareness of native layer’s parameters:
   Follow-on Suppressed Overlay Rerouting (FSOR): if follow-on time < threshold ‘f’, then suppress overlay rerouting
[Figure: timeline marking Failure, Overlay layer detects failure, and Native layer detects failure, with the follow-on time marked on the time axis.]
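The FSOR rule can be sketched as a single comparison. The sketch assumes that the follow-on time is the gap between the overlay layer detecting the failure and the native layer detecting it, which is how the timeline labels read; the slides do not define it explicitly, and how the overlay learns the native detection time is also not specified. Function names are hypothetical.

```python
def follow_on_time(overlay_detects_at, native_detects_at):
    """Assumed definition: how long after the overlay the native layer detects."""
    return native_detects_at - overlay_detects_at

def fsor_suppress(overlay_detects_at, native_detects_at, f):
    """FSOR: suppress overlay rerouting if the follow-on time is below threshold f."""
    return follow_on_time(overlay_detects_at, native_detects_at) < f

# Native layer follows within 0.5 time units: overlay rerouting is suppressed.
print(fsor_suppress(overlay_detects_at=1.0, native_detects_at=1.5, f=1.0))
# Native layer follows 2.0 time units later: the overlay reroutes.
print(fsor_suppress(overlay_detects_at=1.0, native_detects_at=3.0, f=1.0))
```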
Effect of Adjusting Overlay
All three schemes are simple and offer significant control over the tradeoffs between hit-time and the other metrics.
PSOR:
   Least number of route flaps
   Least peak inflation
DOR and FSOR behave similarly (FSOR has slightly better hit-time):
   Better success rate
   Lower stabilized inflation
Conclusion II
By appropriately tuning:
   keepAlive-time
   hold-time
   suppression probability
   delay
   follow-on threshold
…we can improve results for:
   Hit-time
   Number of route flaps
   Path cost inflation
   Stabilization time
   Success rate
Problem Statement III
Main observation from previous simulations: “It is not possible to improve all metrics simultaneously. Hence, performance is still bounded!”
As overlay applications proliferate, the native layer should gradually evolve to suit them.
Improve overlay routing performance by adjusting the native layer functioning
Tuning the Native keepAlive-time
We adopt a non-invasive procedure to advance the native layer rerouting
Tuning of the native layer keepAlive-time
Constraints:
   Tuning should not generate any extra overhead
   Effective detection time should be the same
Tuning the Native keepAlive-time (contd.)
Consider the following scenarios for tuning:
   Scenario A is the layer-aware overlay rerouting scheme
   Scenario B is vanilla dual rerouting
   Scenario C is the tuning we recommend here
Conclusions III
The native layer tuning we propose achieves the best performance across all our metrics.
Summary
We propose means to mitigate the problems associated with inter-layer interaction.
We explore two directions:
1. Adjusting the overlay layer functioning
2. Adjusting the native layer functioning