![Page 1: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/1.jpg)
VROOM: Virtual ROuters On the Move
Aditya Akella
Based on slides from Yi Wang
![Page 2: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/2.jpg)
Virtual ROuters On the Move (VROOM)
• Key idea
  – Routers should be free to roam around
• Useful for many different applications
  – Simplify network maintenance
  – Simplify service deployment and evolution
  – Reduce power consumption
  – …
• Feasible in practice
  – No performance impact on data traffic
  – No visible impact on routing protocols
![Page 3: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/3.jpg)
VROOM: The Basic Idea
• Virtual routers (VRs) form logical topology
[Figure: nodes labeled 1 to 5. Legend: physical router, virtual router, logical link.]
![Page 4: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/4.jpg)
VROOM: The Basic Idea
• VR migration does not affect the logical topology
[Figure: nodes labeled 1 to 5. Legend: physical router, virtual router, logical link.]
![Page 5: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/5.jpg)
Outline
• Why is VROOM a good idea?
• What are the challenges?
  – Or is it just technically trivial?
• How does VROOM work?
  – The migration process
• Is VROOM practical?
  – Prototype system
  – Performance evaluation
• Where to migrate?
  – The scheduling problem
• Still have questions? Feel free to ask!
![Page 6: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/6.jpg)
The Coupling of Logical and Physical
• Today, the physical and logical configurations of a router are tightly coupled
• Physical changes break protocol adjacencies, disrupt traffic
• Logical configuration as a tool to reduce the disruption
  – E.g., the “cost-out/cost-in” of IGP link weights
  – Cannot eliminate the disruption
  – Account for over 73% of network maintenance events
![Page 7: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/7.jpg)
VROOM Separates the Logical and Physical
• Make a logical router instance migratable among physical nodes
• All logical configurations/states remain the same before/after the migration
  – IP addresses remain the same
  – Routing protocol configurations remain the same
  – Routing-protocol adjacencies stay up
• No protocol (BGP/IGP) reconvergence
  – Network topology stays intact
• No disruption to data traffic
![Page 8: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/8.jpg)
Case 1: Planned Maintenance
• Today’s best practice: “cost-out/cost-in”
  – Router reconfiguration & protocol reconvergence
• VROOM
  – NO reconfiguration of VRs, NO reconvergence
[Figure labels: PR-A, PR-B, VR-1]
![Page 9: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/9.jpg)
![Page 10: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/10.jpg)
![Page 11: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/11.jpg)
Case 2: Service Deployment & Evolution
• Deploy a new service in a controlled “test network” first
[Figure: a production network and three test networks, with three CE labels.]
![Page 12: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/12.jpg)
Case 2: Service Deployment & Evolution
• Roll out the service to the production network after it matures
• VROOM guarantees seamless service to existing customers during the roll-out and later evolution
[Figure: a production network and three test networks.]
![Page 13: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/13.jpg)
Case 3: Power Savings
• Big power consumption of routers
  – Millions of routers in the U.S.
  – Electricity bill: hundreds of millions of dollars per year
[Figure: chart in TWh/year (axis 0 to 4): 1.1 in 2000, 2.4 in 2005, 3.9 in 2010.]
(Source: National Technical Information Service, Department of Commerce, 2000. Figures for 2005 & 2010 are projections.)
![Page 14: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/14.jpg)
Case 3: Power Savings
• Observation: the diurnal traffic pattern
• Idea: contract and expand the physical network according to the traffic demand
![Page 15: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/15.jpg)
Case 3: Power Savings
Dynamically contract & expand the physical network in a day - 3PM
![Page 16: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/16.jpg)
Case 3: Power Savings
Dynamically contract & expand the physical network in a day - 9PM
![Page 17: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/17.jpg)
Case 3: Power Savings
Dynamically contract & expand the physical network in a day - 4AM
![Page 18: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/18.jpg)
Virtual Router Migration: the Challenges
• Migrate an entire virtual router instance
  – All control plane & data plane processes / states
• Minimize disruption
  – Data plane: up to millions of packets per second
  – Control plane: less stringent (w/ routing message retransmission)
• Migrate links
![Page 19: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/19.jpg)
Outline
• Why is VROOM a good idea?
• What are the challenges?
• How does VROOM work?
  – The migration enablers
  – The migration process
    • What is to be migrated?
    • How? (in order to minimize disruption)
• Is VROOM practical?
• Where to migrate?
![Page 20: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/20.jpg)
VROOM Architecture
• Three enablers that make VR migration possible
  – Router virtualization
  – Control and data plane separation
  – Dynamic interface binding
![Page 21: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/21.jpg)
A Naive Migration Process
1. Freeze the virtual router
2. Copy states
3. Restart
4. Migrate links
• Practically unacceptable
  – Packet forwarding should not stop during migration
![Page 22: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/22.jpg)
VROOM’s Migration Process
• Key idea: separate the migration of control and data plane
  – No data-plane interruption
  – Low control-plane interruption
1. Control-plane migration
2. Data-plane cloning
3. Link migration
![Page 23: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/23.jpg)
Control-Plane Migration
• Two things to be copied
  – Router image
    • Binaries, configuration files, etc.
  – Memory
    • 1st stage: pre-copy
    • 2nd stage: stall-and-copy (when the control plane is “frozen”)
[Figure: timeline t1 to t4. 1: router-image copy; 2: memory copy (pre-copy, then stall-and-copy).]
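The two memory-copy stages can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the prototype's migration code: the function name `migrate_memory`, the page counts, the dirtying model, and the stopping threshold are all hypothetical and chosen only to show why pre-copy shrinks the amount of memory that must be sent while the control plane is frozen.

```python
import random


def migrate_memory(num_pages, dirty_fraction, max_rounds=5,
                   stop_threshold=50, seed=0):
    """Toy model of pre-copy followed by stall-and-copy.

    Returns (pages_sent_while_running, pages_sent_while_frozen).
    Every parameter here is an illustrative assumption.
    """
    rng = random.Random(seed)
    to_send = set(range(num_pages))  # first round: all pages are pending
    sent_live = 0

    # Stage 1: pre-copy. The control plane keeps running, so pages it
    # writes to during a round have to be sent again in the next round.
    for _ in range(max_rounds):
        if len(to_send) <= stop_threshold:
            break
        sent_live += len(to_send)
        # Assumed model: pages dirtied during a round are proportional
        # to the size of that round.
        n_dirty = int(len(to_send) * dirty_fraction)
        to_send = set(rng.sample(range(num_pages), n_dirty))

    # Stage 2: stall-and-copy. The control plane is frozen, nothing is
    # dirtied, and the remaining pages are sent exactly once.
    sent_frozen = len(to_send)
    return sent_live, sent_frozen


live, frozen = migrate_memory(num_pages=10_000, dirty_fraction=0.1)
print(f"sent while running: {live}, sent while frozen: {frozen}")
```

In this model the frozen phase transfers only the pages left over after the last pre-copy round, which is why the control-plane downtime can be much shorter than the total copy time.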
![Page 24: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/24.jpg)
Data-Plane Cloning
• Clone the data plane by repopulation
  – Copying the data plane states is wasteful, and could be hard
  – Instead, repopulate the new data plane using the migrated control plane
  – The old data plane continues working during migration
[Figure: timeline t1 to t5. 1: router-image copy; 2: memory copy; 3: data-plane cloning.]
![Page 25: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/25.jpg)
Remote Control Plane
• The migrated control plane plays two roles
  – Act as a “remote control plane” for the old data plane
  – Populate the new data plane
[Figure: timeline t1 to t5 (1: router-image copy; 2: memory copy; 3: data-plane cloning), with rows for the old node and new node and labels for the control plane and remote control plane.]
![Page 26: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/26.jpg)
Keep the Control Plane “Online”
• Data-plane cloning takes time
  – Around 110 µs per FIB entry update (for a high-end router) *
  – Installing 250k routes could take over 20 seconds
• The control plane needs connectivity during this period
  – Redirect the routing messages through tunnels
*: P. Francois et al., Achieving sub-second IGP convergence in large IP networks, ACM SIGCOMM CCR, no. 3, 2005.
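As a quick check of the estimate on this slide, assuming the 110 µs per-entry cost applies uniformly to all 250k routes:

$$
250{,}000\ \text{entries} \times 110\ \mu\text{s/entry} = 27.5\ \text{s}
$$

This is consistent with the slide's "over 20 seconds".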
![Page 27: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/27.jpg)
Double Data Planes
• At the end of data-plane cloning, two data planes are ready to forward traffic (i.e., “double data planes”)
[Figure: timeline t0 to t6. 0: tunnel setup; 1: router-image copy; 2: memory copy; 3: data-plane cloning; 4: asynchronous link migration. Rows for the old node and new node show the control plane, remote control plane, data plane, and double data plane.]
![Page 28: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/28.jpg)
Asynchronous Link Migration
• With the double data planes, each link can be migrated independently
  – Eliminate the need for a synchronization system
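Why double data planes remove the need for synchronization can be sketched with a small model. Everything named here (`forward`, `migrate_links`, the link and neighbor names, the prefixes) is hypothetical. The sketch only shows that a lookup succeeds on whichever node a link is attached to, in any migration order; it does not model how a packet physically reaches a next hop whose link is still attached to the other node.

```python
from itertools import permutations


def forward(link_location, fibs, ingress_link, dest_prefix):
    """Look up the next hop on the node that currently hosts ingress_link."""
    node = link_location[ingress_link]  # "old" or "new"
    return node, fibs[node][dest_prefix]


def migrate_links(order, fib, probe_link, probe_prefix):
    """Move links one at a time, probing forwarding after every move."""
    # After data-plane cloning, both nodes hold the same FIB contents.
    fibs = {"old": dict(fib), "new": dict(fib)}
    link_location = {link: "old" for link in order}
    trace = []
    for link in order:
        link_location[link] = "new"  # this link moves on its own
        trace.append(forward(link_location, fibs, probe_link, probe_prefix))
    return trace


FIB = {"10.0.0.0/8": "neighbor-B", "192.168.0.0/16": "neighbor-C"}
LINKS = ("link-A", "link-B", "link-C")

# Every possible migration order yields the same forwarding decision.
for order in permutations(LINKS):
    trace = migrate_links(order, FIB, "link-A", "10.0.0.0/8")
    assert all(next_hop == "neighbor-B" for _, next_hop in trace)
```

Because both nodes answer lookups identically, no step in the loop has to wait for any other link to move.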
![Page 29: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/29.jpg)
Outline
• Why is VROOM a good idea?
• What are the challenges?
• How does VROOM work?
• Is VROOM practical?
  – Prototype system
  – Performance evaluation
• Where to migrate?
![Page 30: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/30.jpg)
Prototype Implementation
• PC + OpenVZ
  – OpenVZ: OS-level virtualization
    • Lighter-weight
    • Supports live migration
• Two prototypes
  – Software-based data plane (SD): Linux kernel
  – Hardware-based data plane (HD): NetFPGA
    • NetFPGA: 4-port gigabit Ethernet PCI card with an FPGA
• Why two prototypes?
  – To validate the data-plane hypervisor design (e.g., migration between SD and HD)
![Page 31: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/31.jpg)
The Out-of-box OpenVZ Approach
• Packets are forwarded inside each VE
• When a VE is being migrated, packets are dropped
![Page 32: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/32.jpg)
Control and Data Plane Separation
• Move the FIBs out of the VEs
  – shadowd in each VE, “pushing down” route updates
  – virtd in VE0, as the “data-plane hypervisor”
![Page 33: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/33.jpg)
Dynamic Interface Binding
• bindd provides two types of bindings:
  – Map substrate interfaces to the right FIB
  – Map substrate interfaces to the right virtual interfaces
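The two bindings can be pictured as two lookup tables keyed by substrate interface. The class below is a toy stand-in, not bindd's real interface: the method names, the interface names (`eth1`, `veth0`), and the FIB identifier are assumptions made for illustration.

```python
class Bindd:
    """Toy stand-in for bindd's two lookup tables (hypothetical interface)."""

    def __init__(self):
        self.fib_for = {}  # substrate interface -> FIB identifier
        self.vif_for = {}  # substrate interface -> (VE name, virtual interface)

    def bind(self, substrate_if, fib_id, ve, virtual_if):
        self.fib_for[substrate_if] = fib_id
        self.vif_for[substrate_if] = (ve, virtual_if)

    def unbind(self, substrate_if):
        self.fib_for.pop(substrate_if, None)
        self.vif_for.pop(substrate_if, None)


bindd = Bindd()
bindd.bind("eth1", fib_id="fib-vr1", ve="VE1", virtual_if="veth0")

# Rebinding at run time is what makes the binding "dynamic": the same FIB
# and virtual interface can be attached to a different substrate interface.
bindd.unbind("eth1")
bindd.bind("eth2", fib_id="fib-vr1", ve="VE1", virtual_if="veth0")
```

Keeping both tables keyed by the substrate interface means one rebind updates the FIB mapping and the virtual-interface mapping together.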
![Page 34: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/34.jpg)
Putting It All Together: Realizing Migration
1. The migration program notifies shadowd about the completion of the control plane migration
![Page 35: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/35.jpg)
Putting It All Together: Realizing Migration
2. shadowd requests zebra to resend all the routes, and pushes them down to virtd
![Page 36: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/36.jpg)
Putting It All Together: Realizing Migration
3. virtd installs routes into the new FIB, while continuing to update the old FIB
![Page 37: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/37.jpg)
Putting It All Together: Realizing Migration
4. virtd notifies the migration program to start link migration after finishing populating the new FIB
5. After link migration is completed, the migration program notifies virtd to stop updating the old FIB
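The five steps can be traced with a toy model of the message flow. This is a sketch under stated assumptions: the classes borrow the names shadowd and virtd from the slides, but their methods, the single `Virtd` object holding both FIBs, and the `rib` dictionary standing in for the routes zebra would resend are all hypothetical simplifications.

```python
class Virtd:
    """Stand-in for the data-plane hypervisor (hypothetical interface)."""

    def __init__(self, old_fib):
        self.old_fib = old_fib
        self.new_fib = {}
        self.update_old = True

    def install(self, prefix, next_hop):
        self.new_fib[prefix] = next_hop
        if self.update_old:  # step 3: keep the old FIB current as well
            self.old_fib[prefix] = next_hop

    def stop_updating_old(self):  # step 5
        self.update_old = False


class Shadowd:
    def __init__(self, rib, virtd):
        self.rib = rib  # stands in for the routes zebra would resend
        self.virtd = virtd

    def on_control_plane_migrated(self):  # steps 1 and 2
        for prefix, next_hop in self.rib.items():
            self.virtd.install(prefix, next_hop)
        # Step 4 precondition: the new FIB holds every route.
        return len(self.virtd.new_fib) == len(self.rib)


def run_migration(rib, old_fib, migrate_links):
    """Drive the sequence; migrate_links is a caller-supplied callable."""
    virtd = Virtd(old_fib)
    shadowd = Shadowd(rib, virtd)
    log = ["control plane migrated"]
    if shadowd.on_control_plane_migrated():
        log.append("new FIB populated")
        migrate_links()
        log.append("links migrated")
        virtd.stop_updating_old()
        log.append("old FIB updates stopped")
    return virtd, log


rib = {"10.0.0.0/8": "B", "172.16.0.0/12": "C"}
old_fib = dict(rib)
virtd, log = run_migration(rib, old_fib, migrate_links=lambda: None)
print(log)
```

Updating both FIBs until step 5 keeps the old data plane correct for any link that has not moved yet.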
![Page 38: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/38.jpg)
Evaluation
• Answer three questions
  – Performance of individual migration steps?
  – Impact on data traffic?
  – Impact on routing protocol?
• Experiments on Emulab
![Page 39: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/39.jpg)
Performance of Migration Steps
• Memory copy time
  – With different numbers of routes (dump file sizes)
[Figure: time (seconds, 0 to 6) versus number of routes (0, 10k, 100k, 200k, 300k, 400k, 500k). Series: suspend + dump, copy dump file, undump + resume, bridging setup.]
![Page 40: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/40.jpg)
Performance of Migration Steps
• FIB population time
  – Grows linearly w.r.t. the number of route entries
  – Installing a FIB entry into NetFPGA: 7.4 microseconds
  – Installing a FIB entry into Linux kernel: 1.94 milliseconds
• FIB update time: time for virtd to install entries to FIB
• Total time: FIB update time + time for shadowd to send routes to virtd
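Since the FIB update time grows linearly, the per-entry costs on this slide can be extrapolated to a full table. The sketch below assumes a purely linear model with no fixed overhead (the function name `fib_update_time` is hypothetical) and covers only the FIB update time, not the time for shadowd to send routes to virtd, which the slide does not quantify.

```python
# Per-entry install costs quoted on the slide, in seconds.
PER_ENTRY_SECONDS = {
    "HD (NetFPGA)": 7.4e-6,
    "SD (Linux kernel)": 1.94e-3,
}


def fib_update_time(num_routes, per_entry_seconds):
    """Linear model: total install time = entries x per-entry cost."""
    return num_routes * per_entry_seconds


for name, cost in PER_ENTRY_SECONDS.items():
    seconds = fib_update_time(250_000, cost)
    print(f"{name}: {seconds:.2f} s for 250k routes")
```

Under this model, 250k routes take about 1.85 s on the hardware data plane and about 485 s on the software one, a gap of roughly 260 times.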
![Page 41: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/41.jpg)
Data Plane Impact
The diamond testbed
64-byte UDP packets, round-trip traffic
![Page 42: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/42.jpg)
Data Plane Impact
• HD router with separate migration bandwidth
  – No delay increase or packet loss
• SD router with separate migration bandwidth
  – Up to 3.7% delay increase at 5k packets/s
  – Less than 0.4% delay increase at 25k packets/s
[Figure: SD, 5k packets/s]
![Page 43: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/43.jpg)
The Importance of Separate Migration Bandwidth
The dumbbell testbed
250k routes in the RIB
![Page 44: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/44.jpg)
Separate Migration Bandwidth is Important
Throughput of the migration traffic
![Page 45: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/45.jpg)
Separate Migration Bandwidth is Important
Delay increase of the data traffic
![Page 46: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/46.jpg)
Separate Migration Bandwidth is Important
Loss rate of the data traffic
![Page 47: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/47.jpg)
Control Plane Impact
• The Abilene testbed
• Assume a backbone running MPLS
• VR5 configured as
  – Core router (running OSPF only)
  – Edge router (running OSPF + BGP)
![Page 48: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/48.jpg)
Core Router Migration
• No events during migration
  – Average control plane downtime: 0.972 seconds (0.924 - 1.008 seconds in 10 runs)
  – Support 1-second OSPF hello-interval (with 4-second dead-interval)
  – Miss at most one hello message
![Page 49: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/49.jpg)
Core Router Migration
• Events happen during migration
  – Introducing events (LSA) by flapping link VR2-VR3
  – Miss at most one LSA
  – Get retransmission 5 seconds later (the default LSA retransmission-interval)
  – Can use smaller LSA retransmission-interval (e.g., 1 second)
![Page 50: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/50.jpg)
Edge Router Migration
• 255k BGP routes + OSPF
  – Dump file size grows from 3.2MB to 76.0MB
  – Average control plane downtime: 3.560 seconds (3.484 - 3.594 seconds in 10 runs)
  – Support 2-second OSPF hello-interval (with 8-second dead-interval)
  – BGP sessions stay up
• In practice, ISPs often use the default values
  – 10-second hello-interval
  – 40-second dead-interval
![Page 51: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/51.jpg)
Outline
• Why is VROOM a good idea?
• What are the challenges?
• How does VROOM work?
• Is VROOM practical?
• Where to migrate?
![Page 52: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/52.jpg)
Deciding Where To Migrate
• Physical constraints
  – Latency
    • E.g., NYC to Washington D.C.: 2 msec
  – Link capacity
    • Enough remaining capacity for extra traffic
  – Platform compatibility
    • Routers from different vendors
  – Router capability
    • E.g., number of access control lists (ACLs) supported
• Good news: these constraints limit the search space
![Page 53: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/53.jpg)
Two Optimization Problems
• For planned maintenance/service deployment
  – Minimize path stretch
  – With constraints on link capacity, platform compatibility, router capability, etc.
• For power savings
  – Maximize power savings
    • With different regional electricity prices
  – With constraints on path stretch, link capacity, etc.
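The first problem (minimize path stretch subject to constraints) can be sketched as a filter followed by a minimum. The slides do not say how the optimization is actually solved, so this brute-force approach is an assumption, and the `Candidate` fields, router names, and numbers are hypothetical. The power-savings variant is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    path_stretch_ms: float      # extra latency if the VR moves here
    spare_capacity_gbps: float  # remaining link capacity
    platform: str               # vendor/platform identifier
    max_acls: int               # router capability limit


def choose_target(candidates, demand_gbps, platform, acls_needed):
    """Return the feasible candidate with the least path stretch, or None."""
    feasible = [
        c for c in candidates
        if c.spare_capacity_gbps >= demand_gbps
        and c.platform == platform
        and c.max_acls >= acls_needed
    ]
    return min(feasible, key=lambda c: c.path_stretch_ms, default=None)


candidates = [
    Candidate("PR-X", 1.0, 2.0, "vendor-1", 500),   # too little capacity
    Candidate("PR-Y", 2.0, 10.0, "vendor-1", 500),
    Candidate("PR-Z", 0.5, 10.0, "vendor-2", 500),  # incompatible platform
]
target = choose_target(candidates, demand_gbps=5.0,
                       platform="vendor-1", acls_needed=100)
print(target.name if target else "no feasible target")
```

Filtering first reflects the slide's observation that the physical constraints shrink the search space before any optimization happens.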
![Page 54: VROOM: Virtual ROuters On the Move Aditya Akella Based on slides from Yi Wang](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649d215503460f949f61c5/html5/thumbnails/54.jpg)
Conclusions
• VROOM offers a useful network-management primitive
  – Separates the tight coupling between physical and logical
  – Simplifies network management, enables new applications
• Live router migration with minimal disruption
  – Data-plane hypervisor enables
    • Data-plane cloning
    • Remote control plane
    • Double data plane and asynchronous link migration
  – No data-plane disruption
  – No visible control-plane disruption