A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance. Feng Wang 1, Zhuoqing Morley Mao 2, Jia Wang 3, Lixin Gao 1, Randy Bush 4. 1 University of Massachusetts, Amherst; 2 University of Michigan; 3 AT&T Labs-Research; 4 Internet Initiative Japan.
![Page 1: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/1.jpg)
A Measurement Study on the Impact of Routing Events on
End-to-End Internet Path Performance
Feng Wang1, Zhuoqing Morley Mao2, Jia Wang3, Lixin Gao1, Randy Bush4
1 University of Massachusetts, Amherst
2 University of Michigan
3 AT&T Labs-Research
4 Internet Initiative Japan
Presentation modified with permission
Presenter: Young-Rae Kim
Date: Feb. 24, 2009
![Page 2: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/2.jpg)
Table of Contents
Background
Motivation
Open Questions
Our Work
Methodology
How Routing Failures Occur
Summary
Conclusions
R-BGP
Appendix
![Page 3: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/3.jpg)
Background: Border Gateway Protocol (BGP)
The Border Gateway Protocol (BGP) is the core routing protocol of the Internet. It maintains a table of IP networks or ‘prefixes’ which designate network reachability among autonomous systems (AS).
Most Internet users do not use BGP directly. However, most ISPs must use BGP to establish routing between one another.
![Page 4: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/4.jpg)
Background: Border Gateway Protocol (BGP) Beacons
BGP Beacons are for research purposes to improve our understanding of BGP dynamics.
A BGP Beacon is an unused prefix which has a well-defined schedule for announcement and withdrawal.
Given the known schedule of announcements and withdrawals, we can study the dynamics of BGP using publicly available BGP update data.
![Page 5: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/5.jpg)
Background: MRAI timer
The MRAI (Minimum Route Advertisement Interval) timer is specified in BGP. This timer acts to rate-limit updates on a per-destination basis.
BGP (BGP-4) suggests values of 30 s and 5 s for this interval for external BGP (eBGP) and internal BGP (iBGP), respectively.
The MRAI serves to suppress messages which BGP would otherwise send out to describe transitory states, and so allows BGP to converge with significantly fewer messages sent.
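The rate-limiting idea can be sketched as follows. This is a toy, single-neighbor model and not how any particular router implements the timer: the class name `MraiLimiter` and its methods `offer` and `flush` are hypothetical, and routes are represented simply as AS-path tuples.

```python
class MraiLimiter:
    """Toy per-destination MRAI rate limiter (illustrative only)."""

    def __init__(self, interval_s=30.0):
        self.interval_s = interval_s
        self.last_sent = {}   # prefix -> time the last update was sent
        self.pending = {}     # prefix -> newest route held back by the timer

    def offer(self, prefix, route, now):
        """Offer a new best route; return it if it may be sent now, else None."""
        last = self.last_sent.get(prefix)
        if last is None or now - last >= self.interval_s:
            self.last_sent[prefix] = now
            self.pending.pop(prefix, None)
            return route
        # Timer still running: keep only the newest route, so transitory
        # intermediate states are overwritten and never advertised.
        self.pending[prefix] = route
        return None

    def flush(self, now):
        """Return {prefix: route} for held-back routes whose timer expired."""
        ready = {}
        for prefix in list(self.pending):
            if now - self.last_sent[prefix] >= self.interval_s:
                ready[prefix] = self.pending.pop(prefix)
                self.last_sent[prefix] = now
        return ready
```

With a 30 s interval, two route changes arriving 5 s and 10 s after an update are collapsed into one message carrying only the later route, which is the message saving the slide describes.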
![Page 6: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/6.jpg)
Background: Internet Control Message Protocol (ICMP)
Chiefly used by networked computers’ operating systems to send error messages (e.g., indicating that a requested service is not available or that a host or router could not be reached).
It differs in purpose from TCP/UDP in that it is typically not used to send and receive data between end systems.
Users can exercise ICMP directly with tools such as ping and traceroute.
![Page 7: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/7.jpg)
Motivation
Real-time services have made high availability of end-to-end Internet paths of paramount importance.
– low packet loss rate, low delay, high network availability, and fast reaction time
Internet path failures are widespread [Labovitz:98, Markopoulou:04, Feamster:03].
– can last as long as 10 minutes
Degraded end-to-end path performance is correlated with routing dynamics.
![Page 8: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/8.jpg)
Open Questions
How do routing changes result in degraded end-to-end path performance?
What kinds of routing dynamics cause degraded end-to-end performance?
How do factors such as topological properties or routing policies affect performance degradation?
![Page 9: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/9.jpg)
Our Work
Study end-to-end performance under realistic topologies.
Investigate several metrics to characterize the end-to-end loss, delay, and out-of-order packets.
Characterize the kinds of routing changes that impact end-to-end path performance.
Analyze the impact of topology, routing policies, MRAI timer and iBGP configurations on end-to-end path performance.
![Page 10: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/10.jpg)
Methodology
A multi-homed prefix
– BGP Beacon prefix: 192.83.230.0/24
Controlled Routing Changes
– Failover events: Beacon changes from the state of having both providers to the state of having only a single provider.
– Recovery events: Beacon changes from the state of having a single provider for connectivity to the state of having both providers.
[Figure: the Beacon and its links to Provider 1 and Provider 2, illustrating a failover event and a recovery event.]
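The two event definitions can be expressed as a small classifier over the set of providers that currently carry the Beacon prefix. This is only a sketch: the function name `classify_event` and the provider labels are illustrative and do not come from the study's tooling.

```python
def classify_event(before, after):
    """Classify a Beacon state change by the providers announcing the prefix.

    `before` and `after` are collections of provider names. Returns
    "failover", "recovery", or None for any other transition.
    """
    before, after = set(before), set(after)
    if len(before) == 2 and len(after) == 1 and after < before:
        return "failover"   # both providers -> a single provider
    if len(before) == 1 and len(after) == 2 and before < after:
        return "recovery"   # a single provider -> both providers
    return None
```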
![Page 11: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/11.jpg)
Active Probing
From 37 PlanetLab hosts to the Beacon host (a host within the Beacon prefix)
– Back-to-back traceroutes
– Back-to-back pings
– UDP probing (50 ms interval)
Data plane performance metrics
– Packet loss
– Delay
– Out-of-order
[Figure: hosts A, B and C probing the Beacon host across the Internet via Provider 1 and Provider 2, using traceroute, ping and UDP probing.]
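As a rough illustration of the three metrics, the sketch below summarizes one probing run from sequence-numbered replies. The record layout, the function name `summarize_probes`, and the reordering rule (a reply counts as out of order if its sequence number is lower than one already seen) are assumptions made for this example; the slides do not give the exact definitions used.

```python
def summarize_probes(sent_count, received):
    """Summarize one probing run.

    `received` is a list of (seq, rtt_seconds) tuples in arrival order.
    Returns (lost, max_rtt, out_of_order).
    """
    seen = set()
    highest = -1
    out_of_order = 0
    max_rtt = 0.0
    for seq, rtt in received:
        if seq in seen:
            continue              # ignore duplicate replies
        seen.add(seq)
        if seq < highest:
            out_of_order += 1     # arrived after a later-numbered probe
        highest = max(highest, seq)
        max_rtt = max(max_rtt, rtt)
    lost = sent_count - len(seen)
    return lost, max_rtt, out_of_order
```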
![Page 12: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/12.jpg)
Packet Loss
Loss burst: consecutive UDP probing packets lost during a routing change event.
[Figure panels: Failover, Recovery.]
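A loss burst as defined here can be extracted from the sequence numbers of the probes that did arrive. The sketch below is a minimal version under two assumptions: probes are numbered from 0 to `sent_count - 1`, and a single lost probe counts as a burst of length 1. The function name `loss_bursts` is illustrative.

```python
def loss_bursts(sent_count, received_seqs):
    """Return (start_seq, length) for each run of consecutive lost probes."""
    received = set(received_seqs)
    bursts = []
    start = None
    for seq in range(sent_count):
        if seq not in received:
            if start is None:
                start = seq                        # a new burst begins
        elif start is not None:
            bursts.append((start, seq - start))    # burst ended at seq - 1
            start = None
    if start is not None:
        bursts.append((start, sent_count - start))  # burst runs to the end
    return bursts
```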
![Page 13: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/13.jpg)
Correlating Packet Loss with Routing Failures
ICMP replies
– temporary loss of reachability (!N or !H)
– forwarding loops (exceeded TTL)
Routing failures
– temporary loss of reachability and transient routing loops
Correlate loss bursts with ICMP messages
– time window [-1 sec, 1 sec]
The number of loss bursts due to routing failures may be underestimated
– missing ICMP packets
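The correlation step can be sketched as a time-window match. How the window is anchored is an assumption here: an ICMP message is matched to a burst if it arrives between 1 second before the burst starts and 1 second after it ends. The function name `correlate_bursts` and the kind labels `"unreachable"` and `"ttl_exceeded"` are illustrative.

```python
def correlate_bursts(bursts, icmp_events, window_s=1.0):
    """Label each loss burst with the routing-failure types seen near it.

    `bursts` is a list of (start_time, end_time) in seconds.
    `icmp_events` is a list of (time, kind), where kind is "unreachable"
    (!N or !H) or "ttl_exceeded" (forwarding loop).
    Returns one set of kinds per burst; an empty set means the burst
    could not be attributed to a routing failure.
    """
    labels = []
    for start, end in bursts:
        kinds = {kind for t, kind in icmp_events
                 if start - window_s <= t <= end + window_s}
        labels.append(kinds)
    return labels
```

Bursts whose ICMP replies were dropped or never generated end up with an empty set, which is why this kind of matching gives a lower bound on routing-induced loss.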
![Page 14: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/14.jpg)
An Example
planet02.csc.ncsu.edu experiences packet loss on July 30, 2005
![Page 15: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/15.jpg)
Loss Bursts due to Routing Failures
Failover events: 76% packets lost
Recovery events: 26% packets lost
[Figure panels: Failover, Recovery.]
![Page 16: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/16.jpg)
How Routing Failures Occur (Failover)?
[Figure: topology with routers R1 to R6, Provider 1 and Provider 2 joined by a peer link, and the Beacon in AS 0; routers are annotated with their AS paths to AS 0. Legend: peer link, customer link.]
Prefer-customer routing policy: routes received from a provider’s customers are always preferred over those received from its peers.
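The prefer-customer rule amounts to ranking routes by the business relationship with the neighbor before looking at path length. In this sketch the names `RANK` and `best_route` are hypothetical, and ranking peers above providers is an assumption; the slide only states that customer routes beat peer routes.

```python
# Lower rank is more preferred. Customer-over-peer comes from the slide;
# the peer-over-provider ordering is an assumption for this example.
RANK = {"customer": 0, "peer": 1, "provider": 2}

def best_route(routes):
    """Pick the best route from (neighbor_relationship, as_path) tuples.

    The relationship is compared first; AS-path length only breaks ties.
    """
    return min(routes, key=lambda r: (RANK[r[0]], len(r[1])))
```

Note that a longer customer route wins over a shorter peer route, so a router can keep pointing at a customer path that is about to be withdrawn.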
![Page 17: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/17.jpg)
How Routing Failures Occur (Failover)? (contd.)
[Figure: topology with routers R1 to R9 across Provider 1, Provider 2 and Provider 3, joined by peer links, and the Beacon in AS 0; routers are annotated with their AS paths to AS 0.]
No-valley routing policy: peers do not transit traffic from one peer to another.
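The no-valley rule can be written as an export filter. The slide states only the peer-to-peer case; the remaining cases below follow the commonly used valley-free export convention and should be read as an assumption. The function name `may_export` is illustrative.

```python
def may_export(learned_from, export_to):
    """Return True if a route learned from one neighbor type may be
    exported to another neighbor type under the no-valley rule.

    Both arguments are one of "customer", "peer", "provider".
    """
    if learned_from == "customer":
        return True                  # customer routes go to every neighbor
    return export_to == "customer"   # peer/provider routes go only to customers
```

Because a peer-learned route is never passed to another peer, a provider that loses its customer route may have no alternative to advertise even though a physical path through a third provider exists.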
![Page 18: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/18.jpg)
How Routing Failures Occur (Recovery)?
[Figure: routers R1, R2, R3 and R4 across Provider 1 and Provider 2, with the Beacon in AS 0; labels show path (0) and the message Withdraw (2 0).]
1. Path (0) recovers at R3.
2. R3 sends the path to R2.
3. R2 sends a withdrawal to R1.
4. R3 sends the recovered path to R1.
5. R1 regains its connection to the Beacon.
iBGP constraint: a route received from an iBGP router cannot be re-advertised to another iBGP router.
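The iBGP constraint can be stated as a one-line re-advertisement check. This sketch assumes a plain iBGP full mesh without route reflectors; the function name `may_readvertise` and the session labels are illustrative.

```python
def may_readvertise(learned_via, send_via):
    """Return True if a route learned over one session type may be
    advertised over another. Session types are "ibgp" or "ebgp"."""
    return not (learned_via == "ibgp" and send_via == "ibgp")

# In the recovery example, R2 learns the recovered path from R3 over iBGP,
# so R2 may not pass that path on to R1 over iBGP. R1 must wait for R3's
# own announcement, and is left without a route in between if R2's
# withdrawal arrives first.
r2_can_forward_to_r1 = may_readvertise("ibgp", "ibgp")
```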
![Page 19: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/19.jpg)
Summary
During failover and recovery events
– Routing changes impact packet loss significantly.
– Multiple loss bursts are observed in 60% of events.
– Routing changes can lead to long packet round-trip delays and reordering.
Loss bursts explained by routing failures last longer than unidentified ones.
Loss bursts caused by forwarding loops last longer than those caused by loop-free routing failures.
![Page 20: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/20.jpg)
Conclusions
During failover and recovery events
– routing failures contribute to end-to-end packet loss significantly.
Routing policies, iBGP configuration and MRAI timer values play a major role in causing packet loss during routing events.
Degraded end-to-end performance can be experienced by a diverse set of hosts when there is a routing change.
Accommodating routing redundancy may eliminate the majority of identified path failures.
![Page 21: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/21.jpg)
Resilient Border Gateway Protocol (R-BGP)
![Page 22: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/22.jpg)
The End
Thanks!
![Page 23: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/23.jpg)
Location of Loss Bursts (Failover events)
Location of the first loss bursts caused by routing failures.
From ISP 2’s BGP updates:
– Routing failures do occur but are not visible in ICMP messages because of their short duration.
From another AS’s BGP updates and Oregon RouteViews:
– Routing failures cascade to other ASes.

| Class | ISP 1 | ISP 2 | Other tier-1 | Non tier-1 |
|---|---|---|---|---|
| Failover 1 | 92% | 0 | 5% | 3% |
| Failover 2 | 0 | 9% | 73% | 18% |
![Page 24: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/24.jpg)
Location of Loss Bursts (Recovery events)
Location of the first loss bursts caused by routing failures.
BGP updates from ISP 2
– 12 withdrawals over 724 recovery events

| Class | ISP 1 | ISP 2 | Other tier-1 | Non tier-1 |
|---|---|---|---|---|
| Failover 1 | 90% | N/A | 0% | 10% |
| Failover 2 | N/A | 0% | 59% | 41% |
![Page 25: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/25.jpg)
Representativeness
Connectivity of Destination Prefixes
– SS: Single-homed prefixes via a single upstream link
– SM: Single-homed prefixes via multiple upstream links
– MS: Multi-homed prefixes via a single upstream link
– MM: Multi-homed prefixes via multiple upstream links
Routing tables from one tier-1 ISP on January 15, 2006

| Class | SS | SM | MS | MM |
|---|---|---|---|---|
| Percentage | 48% | 6% | 29% | 17% |
![Page 26: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/26.jpg)
Representativeness (contd.)
Multi-homed destination prefixes
[Figure: topology with ISP 1, ISP 2, ISP 3 and the destination. Legend: customer links, peer link.]
![Page 27: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/27.jpg)
Representativeness (contd.)
Multi-homed destination prefixes with multi-upstream links
[Figure: topologies with ISP 1 and ISP 2.]
![Page 28: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/28.jpg)
Loss Burst Length
Loss burst length can be as long as 480 packets for failover events and 180 packets for recovery events.
[Figure: loss burst length for failover events and recovery events.]
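Since the UDP probes are sent every 50 ms, a burst length in packets converts directly into an approximate outage duration. The helper below is a small worked example; the name `burst_duration_s` is illustrative, and the result is approximate because it ignores jitter in probe spacing.

```python
PROBE_INTERVAL_S = 0.050  # UDP probing interval stated in the methodology

def burst_duration_s(lost_packets, interval_s=PROBE_INTERVAL_S):
    """Approximate time spanned by a run of consecutive lost probes."""
    return lost_packets * interval_s

failover_worst = burst_duration_s(480)   # about 24 seconds
recovery_worst = burst_duration_s(180)   # about 9 seconds
```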
![Page 29: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/29.jpg)
Multiple Loss Bursts
Multiple loss bursts occur after the injection of a withdrawal message or an announcement.
[Figure panels: Failover, Recovery.]
![Page 30: A Measurement Study on the Impact of Routing Events on End-to-End Internet Path Performance](https://reader036.vdocuments.us/reader036/viewer/2022062816/5681569f550346895dc44221/html5/thumbnails/30.jpg)
Methodology Evaluation
Our measurement is not significantly biased by ICMP blocking.
– ICMP messages seen in the absence of routing changes: 0.6%.
– ICMP messages come from 68 ASes, and 53% of the messages come from 10 tier-1 ASes.
– 52% of ISP 1’s routers and 95% of ISP 2’s routers generate ICMP messages.