adaptive failover mechanism motivation end-to-end connectivity can suffer during net failures...
TRANSCRIPT
![Page 1: Adaptive Failover Mechanism Motivation End-to-end connectivity can suffer during net failures Internet path outage detection and recovery is slow (shown](https://reader037.vdocuments.us/reader037/viewer/2022110403/56649e5f5503460f94b58ad0/html5/thumbnails/1.jpg)
Adaptive Failover MechanismAdaptive Failover Mechanism
Motivation• End-to-end connectivity can suffer during net failures
•Internet path outage detection and recovery is slow (shown to take as long as 3 to 30+ mins)
•Path outages are common in mobile networks
• Network fault tolerance should cope with dynamic network conditions and varying applications needs
• Current SCTP failovers are not adaptive to application requirements and network conditions
Motivation• End-to-end connectivity can suffer during net failures
•Internet path outage detection and recovery is slow (shown to take as long as 3 to 30+ mins)
•Path outages are common in mobile networks
• Network fault tolerance should cope with dynamic network conditions and varying applications needs
• Current SCTP failovers are not adaptive to application requirements and network conditions
Current Research • SCTP’s current failover mechanism uses the
Path.Max.Retrans parameter for failover
• We propose a two-level (-) threshold failover mechanism
- maintain primary, but failover temporarily
- change the primary, making failover permanent
• Two-level threshold mechanism provides added control over failover actions
• Using ns-2, we have modeled SCTP data transfer latency to a dual homed destination with failovers incorporating the two-level threshold mechanism
• We are investigating the relationships between the thresholds and the network parameters to develop an adaptive failover mechanism
Current Research • SCTP’s current failover mechanism uses the
Path.Max.Retrans parameter for failover
• We propose a two-level (-) threshold failover mechanism
- maintain primary, but failover temporarily
- change the primary, making failover permanent
• Two-level threshold mechanism provides added control over failover actions
• Using ns-2, we have modeled SCTP data transfer latency to a dual homed destination with failovers incorporating the two-level threshold mechanism
• We are investigating the relationships between the thresholds and the network parameters to develop an adaptive failover mechanism
Future Research• Develop an adaptive failover mechanism for SCTP
• Incorporate application requirements in the mechanism
Future Research• Develop an adaptive failover mechanism for SCTP
• Incorporate application requirements in the mechanism
Related Research• Resilient Overlay Networks (RON)
•Allows a small group of Internet applications to detect and recover from path outages within several seconds
• Rocks: Reliable Sockets
•Protects apps from path failures common to mobile computing such as link failures and IP address changes
• Migrate
•End-to-end framework for Internet mobility that supports rebinding of endpoints for established TCP connections
•Fine-grained server failover mechanism of long-running connections
• Migratory TCP (M-TCP)
•Mechanism to migrate live TCP sessions to a redundant server upon server overload, network congestion, etc.
Related Research• Resilient Overlay Networks (RON)
•Allows a small group of Internet applications to detect and recover from path outages within several seconds
• Rocks: Reliable Sockets
•Protects apps from path failures common to mobile computing such as link failures and IP address changes
• Migrate
•End-to-end framework for Internet mobility that supports rebinding of endpoints for established TCP connections
•Fine-grained server failover mechanism of long-running connections
• Migratory TCP (M-TCP)
•Mechanism to migrate live TCP sessions to a redundant server upon server overload, network congestion, etc.
End-to-End Load BalancingEnd-to-End Load Balancing
Motivation• Exploit all network resources visible at transport layer
• Perform fine-grain load balancing in transport layer
• Avoid replication of work and coarse-grain load balancing in applications
Motivation• Exploit all network resources visible at transport layer
• Perform fine-grain load balancing in transport layer
• Avoid replication of work and coarse-grain load balancing in applications
Issues• Scheduling of traffic on multiple paths
• Reordering introduced by the sender
• Loss detection & recovery
• Congestion control: shared or separate?
Issues• Scheduling of traffic on multiple paths
• Reordering introduced by the sender
• Loss detection & recovery
• Congestion control: shared or separate?
Future Research• Investigate new loss detection and recovery
techniques, which are robust to changeover
• Investigate dynamic shared bottleneck detection techniques for congestion control
• Investigate algorithms for scheduling traffic on multiple paths
Future Research• Investigate new loss detection and recovery
techniques, which are robust to changeover
• Investigate dynamic shared bottleneck detection techniques for congestion control
• Investigate algorithms for scheduling traffic on multiple paths
Current Work: Reordering Issues• Reordering due to changeover causes spurious fast
retransmissions and congestion window overgrowth
• Occurrence of reordering increases due to sender introduced route changes
• Illustration of cwnd overgrowth problem
• Is this problem a “corner case”? …NO!
• Cause of the problem: inadequacies and solutions
•Inadequacy: Retransmission ambiguity
Solution: Rhein Algorithm (variation of Eifel Algorithm)Distinguish between acks for transmissions
and retransmissions
•Inadequacy: Congestion control is unaware of changeover
Solution: Changeover Aware Congestion Control (CACC)Algorithms - prevents spurious fast
retransmits
Current Work: Reordering Issues• Reordering due to changeover causes spurious fast
retransmissions and congestion window overgrowth
• Occurrence of reordering increases due to sender introduced route changes
• Illustration of cwnd overgrowth problem
• Is this problem a “corner case”? …NO!
• Cause of the problem: inadequacies and solutions
•Inadequacy: Retransmission ambiguity
Solution: Rhein Algorithm (variation of Eifel Algorithm)Distinguish between acks for transmissions
and retransmissions
•Inadequacy: Congestion control is unaware of changeover
Solution: Changeover Aware Congestion Control (CACC)Algorithms - prevents spurious fast
retransmits
(C2=2) 81.3
2121.121.2
(C2=2) 41.2
61.261.3
22
23
(C2=3) 82
(C2=4) 83
81
616263
64 828384
69
89
70
90
B1A1 A2 B2Sender
A Receiver
BReceiver
B
(C2=2) 2
41
* TSNs 1-50 have been buffered at the sender’s link layer corresponding to A1 and are being sent.
101.3101.4
102.1
104.1
65
85
66
86
103.1
1-50*
105.1
106.1
4243
(C2=2) 41.1
(C2=2) 81.2
(C2=2) 1
(C2=5) 84
(C2=6) 85
(C2=7) 86
(C2=10) 89
(C2=11) 90
0
OverviewOverview
History• Originally intended for telephony signaling
• Overcomes several TCP & UDP limitations
• Designed as a general purpose transport protocol
• Became an IETF Proposed Standard in October 2000 (RFC2960) under the SIGTRAN working group
• Handed to Transport Area working group for continued work
History• Originally intended for telephony signaling
• Overcomes several TCP & UDP limitations
• Designed as a general purpose transport protocol
• Became an IETF Proposed Standard in October 2000 (RFC2960) under the SIGTRAN working group
• Handed to Transport Area working group for continued work
Features• Reliable data transfer
• Ordered and unordered data delivery
• Multistreaming – no head-of-line blocking
• Multihoming
Features• Reliable data transfer
• Ordered and unordered data delivery
• Multistreaming – no head-of-line blocking
• Multihoming
Changeover• Sender decides primary destination address for traffic
• Sender can change primary destination address during an active association
• Utilities (pending further research):
•End-to-end mobility (with ADD/DELETE IP extension)
•End-to-end load balancing
Changeover• Sender decides primary destination address for traffic
• Sender can change primary destination address during an active association
• Utilities (pending further research):
•End-to-end mobility (with ADD/DELETE IP extension)
•End-to-end load balancing
Using SCTP Multihoming for Fault Tolerance & Load BalancingArmando L. Caro Jr. and Janardhan R. Iyengar
Paul D. Amer, Gerard J. Heinz, Randall R. Stewart
http://pel.cis.udel.edu
Using SCTP Multihoming for Fault Tolerance & Load BalancingArmando L. Caro Jr. and Janardhan R. Iyengar
Paul D. Amer, Gerard J. Heinz, Randall R. Stewart
http://pel.cis.udel.eduProtocol · Engineering · LaboratoryUniversity of Delaware
new path avail bw (kbps)
changeover time (ms)
Using:•Old path avail bw – 500 kbps•Old path end-to-end delay for negligible sized packets – 50 ms•New path end-to-end delay for negligible sized packets – 50 ms
min packets outstanding on old path during changeover
Multihoming
• 4 possible TCP connections:•(A1,B1) or (A1,B2) or (A2,B1) or (A2,B2)
• 1 SCTP association:•({A1,A2},{B1,B2})
•Primary destinations for A & B (e.g., A1 & B1)
•Heartbeats determine reachability of idle destinations•Failover to an alternate destination if primary fails
Multihoming
• 4 possible TCP connections:•(A1,B1) or (A1,B2) or (A2,B1) or (A2,B2)
• 1 SCTP association:•({A1,A2},{B1,B2})
•Primary destinations for A & B (e.g., A1 & B1)
•Heartbeats determine reachability of idle destinations•Failover to an alternate destination if primary fails
NetworkHost A Host BB1
B2
A1
A2
Failover• What happens if B1 fails?
• TCP connection: (A1,B1)•Connection dies
• SCTP association: ({A1,A2},{B1,B2})
•Failover to B2 (ie, traffic temporarily migrates to B2)
•Upon B1’s restoration, traffic migrates back to B1
Failover• What happens if B1 fails?
• TCP connection: (A1,B1)•Connection dies
• SCTP association: ({A1,A2},{B1,B2})
•Failover to B2 (ie, traffic temporarily migrates to B2)
•Upon B1’s restoration, traffic migrates back to B1
NetworkHost A Host BB1
B2
A1
A2