Week 11: Interdomain routing with BGP
Ethernet
Agenda
• Interdomain routing
• Peering links
• BGP basics
• BGP convergence
Interdomain routing
• Goals
• Allow IP packets to be transmitted along the best path towards their destination through several transit domains, taking into account the routing policy of each domain without knowing its detailed topology
• From an interdomain viewpoint, the best path often means the cheapest path
  • Each domain is free to specify in its routing policy the domains for which it agrees to provide a transit service and the method it uses to select the best path to reach each destination
Routing policies
• A domain specifies its routing policy by defining on each BGP router two sets of filters for each peer
• Import filter
  • Specifies which routes can be accepted by the router among all the routes received from a given peer
• Export filter
  • Specifies which routes can be advertised by the router to a given peer
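One way to picture the two sets of filters is as one predicate per peer and per direction. The sketch below is only an illustration: the `Route` class, the peer names, the prefix `2001:db8:99/48` and the rules themselves are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple  # nearest AS first, origin AS last


# Hypothetical prefix owned by the local domain.
OWN_PREFIXES = {"2001:db8:99/48"}

# Hypothetical policy: one predicate per peer and per direction.
IMPORT_FILTERS = {
    "AS2": lambda r: True,                    # accept everything from AS2
    "AS4": lambda r: r.as_path[-1] == "AS1",  # only routes originated by AS1
}
EXPORT_FILTERS = {
    "AS2": lambda r: r.prefix in OWN_PREFIXES,  # no transit service for AS2
    "AS4": lambda r: True,                      # advertise everything to AS4
}


def accepted(peer, routes):
    """Routes received from peer that pass the import filter."""
    return [r for r in routes if IMPORT_FILTERS[peer](r)]


def advertised(peer, routes):
    """Routes that may be sent to peer according to the export filter."""
    return [r for r in routes if EXPORT_FILTERS[peer](r)]
```

With such a table, refusing a transit service to a peer amounts to an export predicate that only lets the domain's own prefixes through.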
Border Gateway Protocol
• Path vector protocol
  • BGP router advertises its best route to each destination
[Figure: AS1, AS2, AS4 and AS5; 2001:db8:1/48 is originated by AS1. Advertisements shown:]
  • prefix: 2001:db8:1/48, ASPath: AS1
  • prefix: 2001:db8:1/48, ASPath: AS4:AS1
  • prefix: 2001:db8:1/48, ASPath: AS2:AS4:AS1
  • prefix: 2001:db8:1/48, ASPath: AS1
• ... with incremental updates
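The way the ASPath grows as an advertisement crosses domains can be sketched in a few lines. This is a minimal illustration, assuming that each domain prepends its own AS number before re-advertising and ignores paths that already contain it; the function names are hypothetical.

```python
def advertise(sender_as, as_path):
    """Path vector sent by sender_as: its own AS prepended to its best path."""
    return (sender_as,) + tuple(as_path)


def acceptable(receiver_as, as_path):
    """A path that already contains the receiver's AS would create a loop."""
    return receiver_as not in as_path


# AS1 originates the prefix, AS4 re-advertises it, then AS2 does the same.
from_as1 = advertise("AS1", ())
from_as4 = advertise("AS4", from_as1)
from_as2 = advertise("AS2", from_as4)
print(":".join(from_as2))  # AS2:AS4:AS1
```

Carrying the whole path is what lets a router detect loops without knowing the topology of the other domains.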
BGP: Principles of operation
• Principles
• BGP relies on the incremental exchange of path vectors
  • A BGP session is established over a TCP connection between peers
  • Each peer sends all its active routes
  • As long as the BGP session remains up, the peers incrementally update their BGP routing tables
[Figure: BGP session between routers R1 and R2 (AS3 and AS4), carrying BGP Msgs]
BGP basics (2)
• 2 types of BGP messages
• UPDATE (path vector)
  • advertises a route towards one prefix
  • Destination address/prefix
  • Interdomain path (AS-Path)
  • Nexthop
• WITHDRAW
  • a previously announced route is not reachable anymore
  • Unreachable destination address/prefix
BGP router
[Figure: organisation of a BGP router]
• BGP Msgs from Peer[1] ... Peer[N] go through a per-peer import filter and attribute manipulation
  • Import filter(Peer[i]): determines which BGP Msgs are acceptable from Peer[i]
  • Output of the import filters: all acceptable routes
• BGP Loc-RIB (BGP Routing Information Base)
  • Contains all the acceptable routes learned from all Peers + internal routes
  • BGP decision process selects the best route towards each destination
  • Output of the decision process: one best route to each destination
• The best routes go through a per-peer export filter and attribute manipulation and are sent as BGP Msgs to Peer[1] ... Peer[N]
  • Export filter(Peer[i]): determines which routes can be sent to Peer[i]
• Other labels in the figure: BGP Decision Process, BGP Adj-RIB-In, BGP Adj-RIB-Out
Example
[Figure: R1 (AS10), R2 (AS20), R3 (AS30) and R4 (AS40) connected by BGP sessions; 2001:db8:12/48 is attached to AS10. UPDATE messages shown:]
• UPDATE prefix: 2001:db8:12/48, NextHop: R1, ASPath: AS10
• UPDATE prefix: 2001:db8:12/48, NextHop: R1, ASPath: AS10
• UPDATE prefix: 2001:db8:12/48, NextHop: R4, ASPath: AS40:AS10
• UPDATE prefix: 2001:db8:12/48, NextHop: R2, ASPath: AS20:AS10
• What happens if link AS10-AS20 goes down?
How to prefer some routes over others?
[Figure: R1 (AS1) connected to RA and RB (AS2) over two links. Backup: 2 Mbps, Primary: 34 Mbps]
How to prefer some routes over others
• Limitations
[Figure: AS1, AS2, AS3, AS4 and AS5 with routers R1, R2, R3, R5, RA and RB; one link is labelled Cheap and another Expensive]
BGP router
[Figure: BGP Msgs from Peer[1] ... Peer[N] go through the import filter and attribute manipulation; all acceptable routes are stored in the BGP RIB; the BGP Decision Process selects one best route to each destination; the best routes go through the export filter and attribute manipulation and are sent as BGP Msgs to Peer[1] ... Peer[N]]
Import filter
• Selection of acceptable routes
• Addition of local-pref attribute inside received BGP Msg
  • Normal quality route: local-pref=100
  • Better than normal route: local-pref=200
  • Worse than normal route: local-pref=50
Simplified BGP Decision Process
• Select routes with highest local-pref
• If there are several routes, choose routes with the shortest ASPath
• If there are still several routes, tie-breaking rule
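The three steps of the simplified decision process can be expressed as a single sort key. The sketch below is an assumption-laden illustration: the `Route` class is hypothetical, and "lowest next hop name" is used as the tie-breaking rule only because the slides do not specify one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple
    next_hop: str
    local_pref: int = 100  # value set by the import filter


def best_route(routes):
    """Simplified decision process over the routes towards one prefix."""
    if not routes:
        return None
    return min(
        routes,
        key=lambda r: (
            -r.local_pref,    # 1. highest local-pref
            len(r.as_path),   # 2. shortest ASPath
            r.next_hop,       # 3. tie-break (assumed: lowest next hop name)
        ),
    )


backup = Route("2001:db8:1/48", ("AS2",), "RA", local_pref=100)
primary = Route("2001:db8:1/48", ("AS2", "AS9"), "RB", local_pref=200)
print(best_route([backup, primary]).next_hop)  # RB
```

Note that local-pref is checked first, so the route with local-pref=200 wins here even though its ASPath is longer.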
BGP session
• Session establishment
def initialize_BGP_session(RemoteAS, RemoteIP):
    # Initialize and start BGP session
    # Send BGP OPEN Message to RemoteIP on port 179
    # Follow BGP state machine
    # Advertise local routes and routes learned from peers
    for d in BGPLocRIB:
        B = build_BGP_Update(d)
        S = apply_export_filter(RemoteAS, B)
        if S is not None:
            send_UPDATE(S, RemoteAS, RemoteIP)
    # Entire RIB has been sent
    # New UPDATEs will be sent to reflect local or distant
    # changes in routers
Simple export filter
def apply_export_filter(RemoteAS, BGPMsg):
    # Check if RemoteAS already received route
    if RemoteAS in BGPMsg.ASPath:
        BGPMsg = None
    # Many additional export policies can be configured:
    # Accept or refuse the BGPMsg
    # Modify selected attributes inside BGPMsg
    return BGPMsg
Simple import filter
def apply_import_filter(RemoteAS, BGPMsg):
    # Refuse routes whose ASPath already contains our own AS (loop)
    if MyAS in BGPMsg.ASPath:
        BGPMsg = None
    # Many additional import policies can be configured:
    # Accept or refuse the BGPMsg
    # Modify selected attributes inside BGPMsg
    return BGPMsg
Processing UPDATE
def Recvd_BGPMsg(Msg, RemoteAS):
    B = apply_import_filter(RemoteAS, Msg)
    if B is None:  # Msg not acceptable
        return
    if IsUPDATE(Msg):
        Old_Route = BestRoute(Msg.prefix)
        Insert_in_RIB(Msg)
        Run_Decision_Process(RIB)
        if BestRoute(Msg.prefix) != Old_Route:
            # best route changed
            B = build_BGP_Message(Msg.prefix)
            S = apply_export_filter(RemoteAS, B)
            if S is not None:  # announce best route
                send_UPDATE(S, RemoteAS, RemoteIP)
            elif Old_Route is not None:
                send_WITHDRAW(Msg.prefix, RemoteAS, RemoteIP)
Processing WITHDRAW
    else:  # Msg is WITHDRAW
        Old_Route = BestRoute(Msg.prefix)
        Remove_from_RIB(Msg)
        Run_Decision_Process(RIB)
        if BestRoute(Msg.prefix) != Old_Route:
            # best route changed
            B = build_BGP_Message(Msg.prefix)
            S = apply_export_filter(RemoteAS, B)
            if S is not None:  # still one best route towards Msg.prefix
                send_UPDATE(S, RemoteAS, RemoteIP)
            elif Old_Route is not None:  # no best route anymore
                send_WITHDRAW(Msg.prefix, RemoteAS, RemoteIP)
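The pseudocode above leaves the RIB and the helper functions undefined. The following self-contained sketch shows one possible way to fill them in. It is not the reference behaviour of any BGP implementation: the `MiniBGP` class is hypothetical, peers are identified by their AS name, the decision process is reduced to "shortest ASPath, then lowest peer name", and messages are returned as tuples instead of being sent over a TCP session.

```python
class MiniBGP:
    def __init__(self, my_as, peers):
        self.my_as = my_as
        self.peers = list(peers)
        self.rib = {}  # prefix -> {peer: as_path}

    def best(self, prefix):
        """Return (peer, as_path) of the best route, or None."""
        routes = self.rib.get(prefix, {})
        if not routes:
            return None
        peer = min(routes, key=lambda p: (len(routes[p]), p))
        return peer, routes[peer]

    def _announce(self, prefix, old):
        new = self.best(prefix)
        if new == old:  # best route unchanged: nothing to send
            return []
        out = []
        for peer in self.peers:
            # export filter: do not send a route to a peer already on its path
            if new is not None and peer not in new[1]:
                out.append(("UPDATE", peer, prefix, (self.my_as,) + new[1]))
            elif old is not None:
                out.append(("WITHDRAW", peer, prefix))
        return out

    def receive_update(self, peer, prefix, as_path):
        if self.my_as in as_path:  # import filter: loop detected
            return []
        old = self.best(prefix)
        self.rib.setdefault(prefix, {})[peer] = tuple(as_path)
        return self._announce(prefix, old)

    def receive_withdraw(self, peer, prefix):
        old = self.best(prefix)
        self.rib.get(prefix, {}).pop(peer, None)
        return self._announce(prefix, old)


P = "2001:db8:12/48"
r = MiniBGP("AS20", ["AS10", "AS30", "AS40"])
first = r.receive_update("AS10", P, ("AS10",))
second = r.receive_update("AS40", P, ("AS40", "AS10"))
after_failure = r.receive_withdraw("AS10", P)
```

In this run, `second` is empty because the route learned from AS40 is longer than the current best route, while `after_failure` contains an UPDATE that advertises the path through AS40 to AS30. Like the pseudocode, the sketch sends a WITHDRAW whenever the export filter rejects the new best route and an old best route existed.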
How to prefer some routes over others? (3)
[Figure: R1 (AS1) connected to RA and RB (AS2) over two links. Backup: 2 Mbps, Primary: 34 Mbps]
RPSL-like policy for AS1
aut-num: AS1
import: from AS2 RA at R1 set localpref=100;
        from AS2 RB at R1 set localpref=200;
        accept ANY
export: to AS2 RA at R1 announce AS1
        to AS2 RB at R1 announce AS1
RPSL-like policy for AS2
aut-num: AS2
import: from AS1 R1 at RA set localpref=100;
        from AS1 R1 at RB set localpref=200;
        accept AS1
export: to AS1 R1 at RA announce ANY
        to AS1 R1 at RB announce ANY
How to prefer some routes over others? (4)
[Figure: AS1, AS2, AS3, AS4 and AS5 with routers R1, R2, R3, R5, RA and RB; one link is labelled Cheap and another Expensive]
RPSL policy for AS1
aut-num: AS1
import: from AS2 RA at R1 set localpref=100;
        from AS4 R2 at R1 set localpref=200;
        accept ANY
export: to AS2 RA at R1 announce AS1
        to AS4 R2 at R1 announce AS1
  • AS1 will prefer to send over the cheap link
  • But the flow of the packets destined to AS1 will depend on the routing policy of the other domains
Limitations of local-pref
• In theory
  • Each domain is free to define its order of preference for the routes learned from external peers
  • How to reach 1.0.0.0/8 from AS3 and
[Figure: AS1, AS3 and AS4; 2001:db8:1/48 is attached to AS1]
Preferred paths for AS3: 1. AS4:AS1  2. AS1
Preferred paths for AS4: 1. AS3:AS1  2. AS1
• AS1 sends its UPDATE messages ...
[Figure: AS1 sends UPDATE (prefix: 2001:db8:1/48, ASPath: AS1) to AS3 and to AS4]
Preferred paths for AS3: 1. AS4:AS1  2. AS1
Routing table for AS3
  2001:db8:1/48 ASPath: AS1 (best)
Preferred paths for AS4: 1. AS3:AS1  2. AS1
Routing table for AS4
  2001:db8:1/48 ASPath: AS1 (best)
Limitations of local-pref
• First possibility
• AS3 sends its UPDATE first...
[Figure: AS3 sends UPDATE (prefix: 2001:db8:1/48, ASPath: AS3:AS1) to AS4]
Preferred paths for AS3: 1. AS4:AS1  2. AS1
Routing table for AS3
  2001:db8:1/48 ASPath: AS1 (best)
Preferred paths for AS4: 1. AS3:AS1  2. AS1
Routing table for AS4
  2001:db8:1/48 ASPath: AS1
  2001:db8:1/48 ASPath: AS3:AS1 (best)
  • Stable route assignment
Limitations of local-pref
• AS4 sends its UPDATE first...
[Figure: AS4 sends UPDATE (prefix: 2001:db8:1/48, ASPath: AS4:AS1) to AS3]
Preferred paths for AS4: 1. AS3:AS1  2. AS1
Routing table for AS4
  2001:db8:1/48 ASPath: AS1 (best)
Preferred paths for AS3: 1. AS4:AS1  2. AS1
Routing table for AS3
  2001:db8:1/48 ASPath: AS1
  2001:db8:1/48 ASPath: AS4:AS1 (best)
  • Another (but different) stable route assignment
Limitations of local-pref
• AS3 and AS4 send their UPDATE together...
[Figure: AS3 sends UPDATE (prefix: 2001:db8:1/48, ASPath: AS3:AS1) while AS4 sends UPDATE (prefix: 2001:db8:1/48, ASPath: AS4:AS1)]
Preferred paths for AS3: 1. AS4:AS1  2. AS1
Preferred paths for AS4: 1. AS3:AS1  2. AS1
  • AS3 prefers indirect path -> withdraw
  • AS4 prefers indirect path -> withdraw
Limitations of local-pref
• AS3 and AS4 send their UPDATE together...
[Figure: AS3 and AS4 each send WITHDRAW (prefix: 2001:db8:1/48)]
Preferred paths for AS3: 1. AS4:AS1  2. AS1
Preferred paths for AS4: 1. AS3:AS1  2. AS1
  • AS3: indirect route is not available anymore
  • AS3 will reannounce its direct route...
  • AS4: indirect route is not available anymore
  • AS4 will reannounce its direct route...
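The three scenarios of the preceding slides can be replayed with a small model. This is a deliberately reduced sketch: each AS is summarised by the path it currently uses, the direct route from AS1 is assumed to be always available, and a neighbour's path is usable only if it does not already contain the local AS (which stands in for the WITHDRAW). The function names are hypothetical.

```python
# Preferences from the slides: earlier in the list means more preferred.
PREFS = {
    "AS3": [("AS4", "AS1"), ("AS1",)],
    "AS4": [("AS3", "AS1"), ("AS1",)],
}
NEIGHBOUR = {"AS3": "AS4", "AS4": "AS3"}


def best_path(asn, neighbour_path):
    """Path chosen by asn given the path its neighbour currently uses."""
    candidates = [("AS1",)]  # direct route learned from AS1
    if asn not in neighbour_path:  # otherwise the indirect route is a loop
        candidates.append((NEIGHBOUR[asn],) + neighbour_path)
    for preferred in PREFS[asn]:
        if preferred in candidates:
            return preferred
    return None


def step_together(state):
    """Both ASes react at the same time to each other's previous choice."""
    return {a: best_path(a, state[NEIGHBOUR[a]]) for a in state}


def step_one(state, asn):
    """Only asn reacts to the choice of its neighbour."""
    new = dict(state)
    new[asn] = best_path(asn, state[NEIGHBOUR[asn]])
    return new


start = {"AS3": ("AS1",), "AS4": ("AS1",)}   # after AS1's UPDATE messages
as3_first = step_one(start, "AS4")           # AS4 processes AS3's UPDATE
together_1 = step_together(start)
together_2 = step_together(together_1)
```

In this model `as3_first` is a fixed point, whereas `together_2` equals `start`: when both domains react simultaneously, they alternate forever between the direct and the indirect routes.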
More limitations of local-pref
• Unfortunately, interdomain routing may not converge at all in some cases...
  • How to reach a destination inside AS0 in this case?
[Figure: AS0, AS1, AS3 and AS4]
Preferred paths for AS1: 1. AS3:AS0  2. other paths
Preferred paths for AS3: 1. AS4:AS0  2. other paths
Preferred paths for AS4: 1. AS1:AS0  2. other paths
local-pref and economic relationships
• In practice, local-pref is often combined with filters to enforce economic relationships
[Figure: AS1 with providers Prov1 and Prov2, peers Peer1, Peer2, Peer3 and Peer4, and customers Cust1 and Cust2. Legend: $ = customer-provider link; unmarked = shared-cost link]
Local-pref values used by AS1
• > 1000 for the routes received from a Customer
• 500 - 999 for the routes learned from a Peer
• < 500 for the routes learned from a Provider
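As a sketch of how such ranges behave, the snippet below assigns one local-pref value per relationship and picks the preferred route. The concrete values 1500, 700 and 300 are assumptions chosen inside the ranges of the slide, and the route tuples are hypothetical.

```python
# Assumed values, one inside each range used by AS1 on the slide.
LOCAL_PREF = {"customer": 1500, "peer": 700, "provider": 300}


def relationship(local_pref):
    """Map a local-pref value back to the relationship it encodes."""
    if local_pref > 1000:
        return "customer"
    if 500 <= local_pref <= 999:
        return "peer"
    if local_pref < 500:
        return "provider"
    return None  # 1000 itself is not covered by the ranges on the slide


def preferred(routes):
    """routes: list of (relationship, as_path); local-pref first, then length."""
    return max(routes, key=lambda r: (LOCAL_PREF[r[0]], -len(r[1])))


via_provider = ("provider", ("Prov1", "AS9"))
via_customer = ("customer", ("Cust1", "AS7", "AS8", "AS9"))
print(preferred([via_provider, via_customer])[0])  # customer
```

The customer route wins although its ASPath is longer, since local-pref is evaluated before the ASPath length.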
local-pref
• Which route will be used by AS1 to reach AS5?
• and how will AS5 reach AS1?
[Figure: AS1 to AS8 interconnected by customer-provider ($) and shared-cost links]
Internet paths are often asymmetrical
Internet 1990s
• NSFNet
• American backbone
• AUP: no commercial traffic
• Some regional networks
• US regions
• national networks in Europe
• Universities/research labs
• connected to regional
Internet early 2000s
• Dozen transit ISPs, shared-cost
• Uunet, Level3, OTIP, ...
• Tier-2 ISPs
• Regional/National ISPs
• FT, BT, DT, ...
• Tier-3 ISPs
• Smaller ISPs, Enterprise Networks, Content providers
• Customers of T2 or
Today’s Internet
• Hyper Giants
• Google, Microsoft, Yahoo, Amazon, ...
• Google peers with 70% of ISPs
• Tier-1 ISPs
• Tier-2 ISPs
• Tier-3 ISPs
• Many peerings at IXPs
Craig Labovitz, Scott Iekel-Johnson, Danny McPherson, Jon Oberheide, Farnam Jahanian, Internet Inter-Domain Traffic, SIGCOMM 2010
AS7007 incident
RIPE RIS
https://stat.ripe.net/widget/looking-glass#w.resource=2001:6a8::/32
https://stat.ripe.net/AS2611#tabId=at-a-glance
https://stat.ripe.net/2001:6A8::/32#tabId=routing
Youtube and Pakistan
http://www.ripe.net/internet-coordination/news/industry-developments/youtube-hijacking-a-ripe-ncc-ris-case-study
Exercise
Agenda
• Interdomain routing
• Ethernet
Ethernet/802.3
• Most widely used LAN
• First developed by Digital, Intel and Xerox
• Standardised by IEEE and ISO
• Medium Access Control
  • CSMA/CD with exponential backoff
• Bandwidth: 10 Mbps
• Two-way delay
  • 51.2 microsec on Ethernet/802.3
  • => minimum frame size: 512 bits
• Cabling
  • 10Base5: (thick) coaxial cable, maximum 500 m, 100 stations
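The minimum frame size follows directly from the bandwidth and the two-way delay: a station must still be transmitting when a collision signal comes back. The short calculation below reproduces the figure from the slide; the variable names are illustrative and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

BANDWIDTH_BITS_PER_US = 10          # 10 Mbps = 10 bits per microsecond
TWO_WAY_DELAY_US = Fraction(512, 10)  # 51.2 microseconds

min_frame_bits = BANDWIDTH_BITS_PER_US * TWO_WAY_DELAY_US
min_frame_bytes = min_frame_bits / 8
print(min_frame_bits, "bits =", min_frame_bytes, "bytes")
```

The result, 512 bits or 64 bytes, is consistent with the frame format: 6 + 6 + 2 bytes of addresses and type, at least 46 bytes of data and 4 bytes of CRC.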
Ethernet Addresses
• Each Ethernet adapter has a unique address
• Ethernet Address format
• 48 bits addresses
  • Source Address
  • Destination address
  • If high order bit (the first bit transmitted, i.e. the least significant bit of the first byte) is 0, host unicast address
  • If high order bit is 1, host multicast address
[Figure: address layout: 24 bits OUI (adapter manufacturer) followed by 24 bits (identifier of adapter)]
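A small sketch of how the address format can be inspected in software. It assumes the usual colon-separated hexadecimal notation and the standard convention that the unicast/multicast bit is the first bit sent on the wire, which is the least significant bit of the first byte; the helper names and the sample addresses are illustrative.

```python
def parse_mac(text):
    """Convert 'aa:bb:cc:dd:ee:ff' into 6 bytes."""
    octets = bytes(int(part, 16) for part in text.split(":"))
    if len(octets) != 6:
        raise ValueError("an Ethernet address has 48 bits")
    return octets


def is_multicast(mac):
    """True when the group bit (low-order bit of the first byte) is set."""
    return bool(mac[0] & 0x01)


def oui(mac):
    """First 24 bits: identifier of the adapter manufacturer."""
    return mac[:3]


host = parse_mac("00:1b:21:3c:4d:5e")       # sample unicast address
group = parse_mac("01:00:5e:00:00:01")      # sample multicast address
broadcast = parse_mac("ff:ff:ff:ff:ff:ff")  # all bits set
```

The broadcast address has every bit set, so `is_multicast` also returns True for it.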
Ethernet Frames
• DIX Format
  • proposed by Digital, Intel and Xerox
Frame fields, in transmission order:
• Preamble [8 bytes]: used to mark the beginning of the frame; allows the receiver to synchronise its clock to the sender’s clock
• Destination address
• Source address
• Type [2 bytes]: indication of the type of packet contained inside the frame
• Data [46-1500 bytes]: upper layer protocol must ensure that the payload of the Ethernet frame is at least 46 bytes and at most 1500 bytes
• CRC [32 bits]
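The header of a DIX frame can be packed and unpacked with Python's `struct` module. This sketch covers only the destination, source, type and data fields; the preamble and the CRC are assumed to be handled by the adapter and are left out. The function names are hypothetical, and 0x86DD is used as the Type value for IPv6.

```python
import struct

# Destination (6 bytes), source (6 bytes), type (2 bytes), network byte order.
HEADER = struct.Struct("!6s6sH")


def build_frame(dst, src, ethertype, payload):
    """Header followed by the payload; the caller must respect the size limits."""
    if not 46 <= len(payload) <= 1500:
        raise ValueError("payload must be 46 to 1500 bytes")
    return HEADER.pack(dst, src, ethertype) + payload


def parse_frame(frame):
    """Return (dst, src, ethertype, payload)."""
    dst, src, ethertype = HEADER.unpack_from(frame)
    return dst, src, ethertype, frame[HEADER.size:]


dst = bytes.fromhex("ffffffffffff")
src = bytes.fromhex("001b213c4d5e")
frame = build_frame(dst, src, 0x86DD, bytes(46))
```

Rejecting short payloads, rather than padding them silently, mirrors the statement on the slide that the upper layer protocol is responsible for the payload size.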
Ethernet Service
• An Ethernet network provides a connectionless unreliable service
• Transmission modes
  • unicast, multicast, broadcast
• Even if in theory the Ethernet service is unreliable, a good Ethernet network should
  • deliver frames to their destination with a very high probability of delivery
  • not reorder the transmitted frames
  • reordering is obviously impossible on a bus
Ethernet with structured cabling
• How to perform CSMA/CD in a star-shaped network?
Collision domain: set of stations that could be in collision
Ethernet hub
• A hub is a relay operating in the physical layer
[Figure: protocol stacks: Host A (Datalink, Physical), Hub (Physical), Host B (Datalink, Physical)]
Larger Ethernet
• With hubs?
• Interconnect hubs together
Issue: maximum 51.2 microsec delay between any pair of stations
Collision domain: entire network
[Figure: three interconnected hubs]
Ethernet Switch
• Ethernet switch
  • understands MAC addresses and filters frames based on their addresses
Address  Port
A        West
B        South
C        East
[Figure: switch with hosts Eth: A, Eth: B and Eth: C attached; a frame with Src:A Dst:B is shown]
Ethernet switch
• A switch is a relay that operates in the datalink layer
[Figure: protocol stacks: Host A (Network, Datalink, Physical), Switch (Datalink, Phys., Phys.), Host B (Network, Datalink, Physical)]
Frame processing
Arrival of frame F on port P
src=F.Source_Address;
dst=F.Destination_Address;
UpdateTable(src, P); // src heard on port P
if (dst==broadcast) || (dst is multicast)
{
  for(Port p!=P) // forward on all other ports
    ForwardFrame(F,p);
}
else
{
  if(dst isin AddressPortTable)
  {
    ForwardFrame(F,AddressPortTable(dst));
  }
  else
  {
    for(Port p!=P) // unknown destination: forward on all other ports
      ForwardFrame(F,p);
  }
}
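The frame processing rules translate almost line by line into a runnable sketch. The `Switch` class below is hypothetical; addresses are plain strings, multicast is signalled by a flag rather than decoded from the address, and the method returns the list of output ports instead of transmitting anything. Dropping a frame whose destination sits on the arrival port is an addition that is not in the pseudocode.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


class Switch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.table = {}  # address -> port on which it was last heard

    def receive(self, src, dst, in_port, multicast=False):
        """Return the ports on which the frame is forwarded."""
        self.table[src] = in_port  # src heard on in_port
        if dst != BROADCAST and not multicast and dst in self.table:
            out = self.table[dst]
            # Addition to the pseudocode: never send a frame back where it came from.
            return [out] if out != in_port else []
        # Broadcast, multicast or unknown destination: all other ports.
        return [p for p in self.ports if p != in_port]


sw = Switch(["West", "South", "East"])
flooded = sw.receive("A", "B", "West")   # B unknown: flooded
reply = sw.receive("B", "A", "South")    # A was learned on West
direct = sw.receive("A", "B", "West")    # B was learned on South
```

After the first two frames the table contains A on West and B on South, so the third frame is sent on a single port.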
Network redundancy
• How to design networks that survive link and node failures?
• Add redundant switches
[Figure: three interconnected switches with hosts Eth: A, Eth: B and Eth: C; a frame with Src:A Dst:C is shown. Address tables of the three switches:]
Address  Port
A        West
B        South
C        East

Address  Port
A        North
B        South
C        East

Address  Port
A        West
B        South
C        North
Network redundancy (2)
[Figure: hosts Eth: A, Eth: B and Eth: C; a frame with Src:A Dst:C is shown; three Address/Port tables without entries]
Spanning tree
[Figure: network made of Switch 1, Switch 2, Switch 7, Switch 9, Switch 22 and Switch 44]
• Each switch has a unique identifier
• The switch with the lowest id is the root
• Disable all links that do not belong to the spanning tree
Building the spanning tree
• 802.1d protocol
• 802.1d uses Bridge PDUs (BPDUs) containing
  • Root ID: identifier of the current root switch
  • Cost: cost of the shortest path between the switch transmitting the BPDU and the root switch
  • Transmitting ID: identifier of the switch that transmits the BPDU
• The BPDUs are sent by switches over their attached LANs as multicast frames but they are never forwarded
  • switches that implement 802.1d listen to a
Ordering of BPDUs
• BPDUs can be strictly ordered
• BPDU1 [R=R1, C=C1, T=T1] is better than BPDU2 [R=R2, C=C2, T=T2] if
  • R1<R2
  • R1=R2 and C1<C2
  • R1=R2 and C1=C2 and T1<T2
• Example
  BPDU1         BPDU2
  R1  C1  T1    R2  C2  T2
  29  15  35    31  12  32
  35  80  39    35  80  40
  35  15  80    35  18  38
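The ordering rule is a lexicographic comparison on (R, C, T), which Python tuples provide natively. The sketch below checks the three example rows; the `BPDU` class and the `better` function are illustrative names.

```python
from typing import NamedTuple


class BPDU(NamedTuple):
    root: int         # R: identifier of the current root switch
    cost: int         # C: cost of the path to the root
    transmitter: int  # T: identifier of the transmitting switch


def better(a, b):
    """True if BPDU a is better than BPDU b (lexicographic on R, C, T)."""
    return a < b


EXAMPLES = [
    (BPDU(29, 15, 35), BPDU(31, 12, 32)),  # decided by R: 29 < 31
    (BPDU(35, 80, 39), BPDU(35, 80, 40)),  # decided by T: 39 < 40
    (BPDU(35, 15, 80), BPDU(35, 18, 38)),  # decided by C: 15 < 18
]
for bpdu1, bpdu2 in EXAMPLES:
    print(bpdu1, "better than", bpdu2, ":", better(bpdu1, bpdu2))
```

In all three rows BPDU1 is the better one, each time for a different reason.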
802.1d port states
• 802.1d port state based on received BPDUs
• Root port
  • port on which best 802.1d BPDU received
• Designated port
  • a port is designated if the switch’s BPDU is better than the best BPDU received on this port
  • Switch’s BPDU is
  • current root, cost to root, switch identifier
• Blocked port (only receives 802.1d BPDUs)
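The three port states can be derived mechanically from the best BPDU received on each port. The function below is a simplified sketch, not the full 802.1d algorithm: BPDUs are (root, cost, transmitter) tuples, every link is assumed to add a cost of 1, timers are ignored, and ties between ports are broken arbitrarily. The port names and switch identifiers in the example are hypothetical.

```python
def port_roles(switch_id, received, link_cost=1):
    """received maps each port to the best BPDU heard on it, or None."""
    own = (switch_id, 0, switch_id)  # BPDU sent while believing to be the root
    heard = {p: b for p, b in received.items() if b is not None}
    root_port, my_bpdu = None, own
    if heard:
        port, bpdu = min(heard.items(), key=lambda item: item[1])
        if bpdu < own:  # someone else is a better root
            root_port = port
            my_bpdu = (bpdu[0], bpdu[1] + link_cost, switch_id)
    roles = {}
    for port, bpdu in received.items():
        if port == root_port:
            roles[port] = "root"
        elif bpdu is None or my_bpdu < bpdu:
            roles[port] = "designated"
        else:
            roles[port] = "blocked"
    return roles


roles = port_roles(7, {
    "north": (1, 0, 1),  # heard directly from the root, switch 1
    "east": (1, 1, 2),   # switch 2 is as close to the root and has a lower id
    "south": (1, 1, 9),  # switch 9 is as close to the root but has a higher id
    "west": None,        # no switch on this port
})
print(roles)
```

For switch 7 this gives a root port on north, a blocked port on east and designated ports on south and west.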
Port states and activity
Port state   Receive BPDUs  Transmit BPDUs
Blocked      yes            no
Root         yes            no
Designated   yes            yes

Port activity  Learn Addresses  Forward Data Frames
Inactive       no               no
Active         yes              yes