VS Overview
• VS aims to be a practical and scalable data center network architecture that meets the following objectives:
– Maximize Bandwidth Utilization:
• Use L3 routing to overcome the limitations of STP.
– Layer-2 Connectivity Service:
• Servers of a given service domain communicate just as if they were on a LAN or a subnet.
– Service Domain Isolation:
• Due to performance isolation and security considerations, servers of different service domains should be isolated from each other, just as if they were isolated via VLANs.
– Broadcast Flooding Suppression:
• Keep the scope of broadcast flooding (e.g., ARP broadcast traffic, unknown unicast traffic) as small as possible.
VS Overview (cont)
• VS provides an IP-only L2VPN service for server interconnection in data center networks, mainly by combining L3VPN with the ARP proxy technique [RFC 925] (invented by Jon Postel).
• On PE control plane
– Host routes (i.e., /32) for local CE hosts are generated automatically from learnt ARP entries.
– Host routes for remote CE hosts are learnt by using the existing L3VPN technology to distribute the above local CE host routes across PEs.
– Acting as an ARP proxy, the PE returns its own MAC in response to an ARP request sent by a local CE host for a remote CE host.
• On PE data plane
– Use the L3VPN forwarding mechanism WITHOUT ANY CHANGE.
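The control-plane behaviour can be illustrated with a minimal sketch. It assumes the ARP table is a plain `{ip: mac}` mapping and that routes learnt over BGP arrive as a `{prefix: advertising PE}` mapping. The names `HostRoute`, `local_host_routes` and `build_vrf` are hypothetical and are not taken from any router implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostRoute:
    prefix: str    # always a /32 host route
    next_hop: str  # "Local" or the name of the remote PE
    protocol: str  # "ARP" for local hosts, "BGP" for remote hosts


def local_host_routes(arp_table):
    """Turn learnt ARP entries {ip: mac} into local /32 host routes."""
    return [HostRoute(f"{ip}/32", "Local", "ARP") for ip in sorted(arp_table)]


def build_vrf(arp_table, bgp_routes):
    """Merge ARP-derived local routes with host routes learnt from remote PEs.

    bgp_routes maps a /32 prefix to the name of the advertising PE.
    """
    routes = local_host_routes(arp_table)
    routes += [HostRoute(p, pe, "BGP") for p, pe in sorted(bgp_routes.items())]
    return routes
```

Feeding it two local ARP entries and two prefixes advertised by a remote PE gives two `Local`/`ARP` routes followed by two `BGP` routes. Only the local routes would be advertised onward to other PEs.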
Unicast Communication Example
• Topology: PE-1 and PE-2 are connected over an MPLS/IP backbone. Each PE serves one site of VPN Blue (1.1.1.0/24) through a ToR switch and acts as an ARP proxy.
– Site attached to PE-1: Host A (1.1.1.1), Host C (1.1.1.2)
– Site attached to PE-2: Host B (1.1.1.3), Host D (1.1.1.4)
• VRF Blue on PE-1:
Prefix        Next-hop   Protocol
1.1.1.1/32    Local      ARP
1.1.1.2/32    Local      ARP
1.1.1.3/32    PE-2       BGP
1.1.1.4/32    PE-2       BGP
• VRF Blue on PE-2:
Prefix        Next-hop   Protocol
1.1.1.1/32    PE-1       BGP
1.1.1.2/32    PE-1       BGP
1.1.1.3/32    Local      ARP
1.1.1.4/32    Local      ARP
• ARP cache on Host A:
IP       MAC
IP(C)    MAC(C)
IP(B)    MAC(PE-1)
IP(D)    MAC(PE-1)
• ARP cache on Host B:
IP       MAC
IP(D)    MAC(D)
IP(A)    MAC(PE-2)
IP(C)    MAC(PE-2)
• Packet from Host A to Host B, hop by hop:
– Host A to PE-1: MAC(A)->MAC(PE-1) | VLAN ID | IP(A)->IP(B)
– PE-1 to PE-2: Tunnel to PE-2 | VPN Label | IP(A)->IP(B)
– PE-2 to Host B: MAC(PE-2)->MAC(B) | VLAN ID | IP(A)->IP(B)
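The hop-by-hop encapsulation in this example can be traced with a small sketch. The addresses and VRF entries are copied from the example. The dictionary layout and the function name `trace` are hypothetical, and header contents are represented as plain strings rather than real frames.

```python
# Attachment of each host and the VRF Blue host routes of each PE
# (values from the example; the data layout is an illustration only).
ATTACHED_PE = {"1.1.1.1": "PE-1", "1.1.1.2": "PE-1",
               "1.1.1.3": "PE-2", "1.1.1.4": "PE-2"}
VRF_BLUE = {
    "PE-1": {"1.1.1.1": "Local", "1.1.1.2": "Local",
             "1.1.1.3": "PE-2", "1.1.1.4": "PE-2"},
    "PE-2": {"1.1.1.1": "PE-1", "1.1.1.2": "PE-1",
             "1.1.1.3": "Local", "1.1.1.4": "Local"},
}


def trace(src_ip, dst_ip):
    """Return the (from, to, headers) hops of a unicast packet between sites."""
    ingress = ATTACHED_PE[src_ip]
    next_hop = VRF_BLUE[ingress][dst_ip]
    if next_hop == "Local":
        raise ValueError("destination is in the same site; the PE is not involved")
    ip = f"IP {src_ip}->{dst_ip}"
    return [
        # The sender's ARP cache maps the remote IP to the ingress PE's MAC.
        ("host", ingress, [f"MAC({src_ip})->MAC({ingress})", "VLAN ID", ip]),
        # Unmodified L3VPN forwarding: tunnel header plus VPN label.
        (ingress, next_hop, [f"Tunnel to {next_hop}", "VPN Label", ip]),
        # The egress PE rewrites the frame using its own ARP entry for the host.
        (next_hop, "host", [f"MAC({next_hop})->MAC({dst_ip})", "VLAN ID", ip]),
    ]
```

Calling `trace("1.1.1.1", "1.1.1.3")` yields the three hops of the Host A to Host B packet. The IP header is unchanged across all of them.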
Local CE Host Discovery
• Local CE hosts are discovered through ARP learning.
– The PE periodically sends unicast ARP requests to the learnt local CE hosts to keep their ARP entries from expiring.
• To ensure the PE has learnt all local CE hosts, especially after a reboot, an ARP scan should be performed at least once after rebooting:
– Option 1 (available today):
• The PE sends to its local site an ARP request for each IP address within the configured IP subnet in turn.
– Option 2 (extensions to existing ARP needed):
• The PE sends to its local site an ARP request for the limited broadcast address (i.e., 255.255.255.255) or the All-Systems multicast group address (i.e., 224.0.0.1).
• Any CE host receiving such an ARP request should respond with an ARP reply containing its IP and MAC addresses.
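As a rough illustration of Option 1, the sketch below only enumerates the addresses a PE would probe. Sending the ARP requests themselves is left out. The function name `arp_scan_targets` and the optional `own_ip` parameter (used to skip the PE's own interface address) are assumptions made for this example.

```python
import ipaddress


def arp_scan_targets(subnet, own_ip=None):
    """List, in order, the addresses a PE would send ARP requests for."""
    net = ipaddress.ip_network(subnet)
    # hosts() skips the network and broadcast addresses of the subnet.
    return [str(ip) for ip in net.hosts() if str(ip) != own_ip]
```

For the /24 used in the unicast example this gives 254 targets, from 1.1.1.1 through 1.1.1.254. The scan cost therefore grows with the size of the configured subnet, which is one motivation for Option 2.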
ARP Reduction
• Besides ARP learning, the PE should perform the ARP proxy [RFC 925] function:
– For an ARP request for a local CE host, discard it.
– For an ARP request for a remote CE host, return its own MAC as an ARP reply.
– For an ARP request for an unknown CE host (i.e., no matching VRF entry found), discard it.
• ARP broadcast traffic from CE hosts is limited to local VPN sites.
– ARP broadcast traffic is not flooded across PEs.
– An ARP update for a CE host (e.g., triggered by VM mobility) does not trigger any BGP update as long as that CE host is still attached to its original PE and VRF instance (e.g., VM mobility within the VPN site).
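The three ARP proxy rules amount to a single VRF lookup. The following is a minimal sketch, assuming the VRF is represented as a mapping from host IP to either `"Local"` or the name of a remote PE. The function name and return convention are hypothetical.

```python
def arp_proxy_action(target_ip, vrf, pe_mac):
    """Decide how the PE handles an ARP request received from a local CE host.

    vrf maps a host IP to "Local" or to the name of a remote PE.
    Returns ("reply", mac) or ("discard", None).
    """
    next_hop = vrf.get(target_ip)
    if next_hop is None:
        return ("discard", None)  # unknown CE host: no matching VRF entry
    if next_hop == "Local":
        return ("discard", None)  # the local CE host answers for itself
    return ("reply", pe_mac)      # remote CE host: answer with the PE's own MAC
```

Because the PE answers only for remote hosts and never forwards the request, no ARP broadcast leaves the local site.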
CE Multi-homing
• CE multi-homing is an important feature for redundancy and load-balancing, especially in data center networks.
– Multiple equal-cost host routes with different BGP next-hops (i.e., remote PEs) for a given multi-homed CE host can be used to achieve maximum capacity for server interconnection.
• CE hosts can be multi-homed to PEs via intermediary bridges (e.g., ToR switches) in the following way:
– VRRP is enabled on the PEs of a given redundancy group, and
– only the VRRP master is delegated to act as ARP proxy and respond with its VIRTUAL MAC.
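One way a remote PE might spread traffic over the equal-cost host routes is per-flow hashing. The slides do not specify a selection policy, so the hash-based choice below is an assumption, as is the function name `pick_next_hop`.

```python
import zlib


def pick_next_hop(next_hops, src_ip, dst_ip):
    """Pick one of several equal-cost BGP next-hops for a flow.

    Hashing the address pair keeps a given flow on one path while
    spreading different flows across the PEs (assumed policy).
    """
    if not next_hops:
        raise ValueError("no route to host")
    key = f"{src_ip}->{dst_ip}".encode()
    # Sort so the choice does not depend on the order routes were learnt in.
    return sorted(next_hops)[zlib.crc32(key) % len(next_hops)]
```

With a single next-hop the function degenerates to ordinary single-path forwarding.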
CE Mobility (e.g., VM Mobility)
• CE mobility within a VPN site.
– The PE just needs to update the corresponding ARP entry.
– No BGP update is triggered.
• CE mobility across VPN sites.
– Upon learning via BGP a host route for a CE host that it currently holds as local, the PE should immediately send an ARP request to that host to determine whether the host is still connected to it.
• If not, the PE should delete the corresponding ARP entry and host route for that CE host, and withdraw the corresponding BGP route it advertised before.
• Otherwise, the case is treated as CE multi-homing.
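The cross-site mobility check can be sketched as follows. The ARP probe is abstracted as a callable, `still_answers_arp`, which is a hypothetical stand-in for sending a unicast ARP request and waiting for a reply. The returned action strings are illustrative, and installing the remote route afterwards is assumed to be ordinary L3VPN behaviour.

```python
def on_bgp_host_route(ip, remote_pe, arp_table, still_answers_arp):
    """Handle a BGP host route for an address the PE may also hold locally.

    arp_table: {ip: mac} of local CE hosts (mutated in place).
    still_answers_arp: callable(ip) -> bool (hypothetical ARP probe hook).
    Returns the list of actions the PE takes.
    """
    if ip not in arp_table:
        # Ordinary remote host: nothing local to reconcile.
        return [f"install route {ip}/32 via {remote_pe}"]
    if still_answers_arp(ip):
        # Host is reachable through both PEs: treat as CE multi-homing.
        return [f"keep local route {ip}/32",
                f"add route {ip}/32 via {remote_pe}"]
    # Host has moved away: clean up local state and withdraw our advertisement.
    del arp_table[ip]
    return [f"delete local route {ip}/32",
            f"withdraw BGP route {ip}/32",
            f"install route {ip}/32 via {remote_pe}"]  # assumed L3VPN behaviour
```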
Multicast/Broadcast
• MVPN technology can be used directly, without any change, to distribute customer multicast traffic among PEs.
– Inclusive multicast distribution tree
– Selective multicast distribution tree
• Customer broadcast traffic can be processed as a special customer multicast group.
Comparison: IPLS vs. VS
• CE reachability information distribution
– IPLS: MAC reachability advertisement via LDP.
– VS: IP reachability advertisement via BGP.
• ARP reduction mechanism
– IPLS: ARP cache/snooping (returns the real MAC of the requested CE).
– VS: ARP proxy (returns the MAC of the ARP proxy).
• Eliminating ARP/unknown unicast flooding across PEs
– IPLS: No.
– VS: Yes.
• CE multi-homing
– IPLS: Not supported.
– VS: Supported natively.
• MAC table capacity pressure on intermediary bridges
– IPLS: Need to learn the MACs of both local and remote CEs. Not aging out learned MAC entries worsens this pressure.
– VS: Only need to learn local CE hosts’ MAC addresses.
Next-steps
• Any comments?
IPLS vs. VS (CE Reachability Advertisement)
• In IPLS, MAC reachability is advertised via LDP.
– LDP sessions face a scalability challenge in a large, fully meshed data center network.
– Adding a new PE requires configuration on all remote PEs.
• In VS, IP reachability is advertised via BGP.
– BGP sessions scale well with the help of the route reflector mechanism.
– Adding a new PE only requires configuration on the RRs.
• The forwarding table size on a PE is the same for both IPLS and VS.
– Neither host routes nor MAC routes are aggregatable.
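The scaling difference between a full mesh and route reflection can be made concrete with a short calculation. The sketch assumes every PE peers with every RR and ignores RR-to-RR sessions. The default of two RRs is an arbitrary illustrative value, not a figure from the slides.

```python
def full_mesh_sessions(n_pes):
    """Sessions needed when every PE peers with every other PE."""
    return n_pes * (n_pes - 1) // 2


def route_reflector_sessions(n_pes, n_rrs=2):
    """Sessions needed when each PE peers only with each route reflector.

    RR-to-RR sessions are ignored for simplicity (an assumption).
    """
    return n_pes * n_rrs
```

For 100 PEs, a full mesh needs 4950 sessions, while two route reflectors need 200. Adding one more PE to the full mesh touches all 100 existing PEs, but with route reflection it touches only the two RRs.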
IPLS vs. VS (ARP Reduction)
• In IPLS, the ARP storm issue is not solved completely.
– ARP packets, even unicast ARP replies, are forwarded from attachment circuits to "multicast" PWs, and ARP packets received from the "multicast" PWs are flooded to all CE hosts.
– Keeping the ARP caches on different PE routers consistent is a hard problem.
• In VS, by using ARP proxy on PE routers, ARP traffic is confined to the site scope.
IPLS vs. VS (CE Multi-homing)
• IPLS prohibits connection of a common LAN or VLAN to more than one PE router.
– That is to say, IPLS cannot support redundancy and load-balancing of PE-CE connections.
• VS can support CE multi-homing natively.
IPLS vs. VS (Intermediary Bridge’s MAC Table Size)
• In IPLS, the intermediary bridges between PEs and CEs have to learn the MAC addresses of all CE hosts (both local and remote).
– An IP frame received over a unicast PW is prepended with the PE router’s own local MAC address before it is transmitted on the appropriate attachment circuit. However, the destination MAC address of a packet sent from a local CE host to a remote CE host is the MAC of the remote CE host, rather than the local PE router’s MAC. Thus, flooding of unknown unicast frames on these Ethernet bridges will happen sooner or later.
– To avoid flooding unknown unicast frames, these bridges are configured not to age out learned MAC entries.
• In VS, the intermediary bridges only need to learn the MAC addresses of local CE hosts and local PE routers.