Fido: Fast Inter-Virtual-Machine Communication for Enterprise Appliances
Anton Burtsev†, Kiran Srinivasan, Prashanth Radhakrishnan, Lakshmi N. Bairavasundaram,
Kaladhar Voruganti, Garth R. Goodson
†University of Utah,
School of Computing
NetApp, Inc
Enterprise appliances
• High performance
• Scalable and highly-available access
Network attached storage, routers, etc.
Example Appliance
• Monolithic kernel
• Kernel components
Problems:
• Fault isolation
• Performance isolation
• Resource provisioning
Split architecture
Benefits of virtualization
• High availability
  • Fault-isolation
  • Micro-reboots
  • Partial functionality in case of failure
• Performance isolation
• Resource allocation
  • Consolidation and load balancing, VM migration
• Non-disruptive updates
  • Hardware upgrades via VM migration
  • Software updates as micro-reboots
• Computation to data migration
Main Problem: Performance
Is it possible to match the performance of a monolithic environment?
• Large amount of data movement between components
  • Mostly cross-core
  • Connection oriented (established once)
  • Throughput optimized (asynchronous)
  • Coarse grained (no one-word messages)
  • Multi-stage data processing
• Main cost contributors
  • Transitions to hypervisor
  • Memory map/copy operations
  • Not VM context switches (multi-cores)
  • Not IPC marshaling
Main Insight: Relaxed Trust Model
• Appliance is built by a single organization
• Components:
  • Pre-tested and qualified
  • Collaborative and non-malicious
• Share memory read-only across VMs!
• Fast inter-VM communication
  • Exchange only pointers to data
  • No hypervisor calls (only cross-core notification)
  • No memory map/copy operations
  • Zero-copy across entire appliance
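The read-only sharing idea on this slide can be illustrated with a small single-process analogy. This is only a sketch: a Python `bytearray` stands in for a producer VM's memory and a read-only `memoryview` stands in for the consumer VM's mapping. The names `producer_buf`, `consumer_view`, and `try_write` are hypothetical, and nothing here reflects Fido's actual Xen-level implementation.

```python
# Illustrative analogy only: one process stands in for two VMs.
producer_buf = bytearray(b"payload written by the producer VM")

# The consumer gets a read-only view of the same memory, not a copy.
consumer_view = memoryview(producer_buf).toreadonly()


def try_write(view):
    """Return True if a write through the view succeeds, False if rejected."""
    try:
        view[0] = 0
        return True
    except TypeError:
        # Read-only memoryviews reject item assignment.
        return False


# Zero-copy: an in-place producer update is visible through the consumer's view.
producer_buf[0:7] = b"PAYLOAD"
seen_by_consumer = bytes(consumer_view[0:7])

# The consumer cannot modify the producer's data.
write_allowed = try_write(consumer_view)
```

The consumer observes the producer's update without any copy, while its own write attempt is rejected, which mirrors the asymmetry that the relaxed trust model relies on.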
Contributions
• Fast inter-VM communication mechanism
• Abstraction of a single address space for traditional systems
• Case study
  • Realistic microkernelized network attached storage
Design
Design Goals
• Performance
  • High-throughput
• Practicality
  • Minimal guest system and hypervisor dependencies
  • No intrusive guest kernel changes
• Generality
  • Support for different communication mechanisms in the guest system
Transitive Zero Copy
• Goal
  • Zero-copy across entire appliance
  • No changes to guest kernel
• Observation
  • Multi-stage data processing
Pseudo Global Virtual Address Space
[Figure: virtual address space spanning 0 to 2^64]
Insight:
• CPUs support 64-bit address space
• Individual VMs do not need all of it
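One way to picture a pseudo global virtual address space is to give each VM a fixed-size slot inside the 64-bit range, so that a pointer can be translated by adding or removing the slot base. The sketch below assumes such a layout purely for illustration: the slot size (`SLOT_BITS`), the numeric VM ids, and the helper names `to_global` and `from_global` are hypothetical, and the slides do not specify the actual layout Fido uses.

```python
ADDRESS_BITS = 64
SLOT_BITS = 40                      # assumed: 1 TiB of address space per VM
SLOT_SIZE = 1 << SLOT_BITS
MAX_VMS = 1 << (ADDRESS_BITS - SLOT_BITS)


def to_global(vm_id: int, local_addr: int) -> int:
    """Translate a VM-local address into the shared global address space."""
    if not 0 <= vm_id < MAX_VMS:
        raise ValueError("vm_id out of range")
    if not 0 <= local_addr < SLOT_SIZE:
        raise ValueError("local address does not fit in one slot")
    return vm_id * SLOT_SIZE + local_addr


def from_global(global_addr: int) -> tuple:
    """Recover (vm_id, local_addr) from a global address."""
    if not 0 <= global_addr < (1 << ADDRESS_BITS):
        raise ValueError("address outside the 64-bit space")
    return divmod(global_addr, SLOT_SIZE)


# A pointer produced in VM 3 keeps its meaning when handed to another VM.
g = to_global(3, 0x1000)
owner, offset = from_global(g)
```

Under this assumed layout a pointer stays valid as it is passed from stage to stage, because every VM agrees on which slot belongs to which peer.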
Transitive Zero Copy
Fido: High-level View
• “c” – connection management
• “m” – memory mapping
• “s” – cross-VM signaling
IPC Organization
• Shared memory ring
  • Pointers to data
• For complex data structures
  • Scatter-gather array
• Translate pointers
• Signaling:
  • Cross-core interrupts (event channels)
  • Batching and in-ring polling
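The ring organization above can be sketched as a single-process simulation. Each ring slot holds a scatter-gather list of `(address, length)` pairs, so only pointers travel through the ring. A counter stands in for cross-core interrupts: the producer signals only when the ring goes from empty to non-empty while the consumer is not already polling. The class `PointerRing`, its fields, and this notification policy are assumptions chosen for illustration, not Fido's actual data structures.

```python
from typing import List, Optional, Tuple

Segment = Tuple[int, int]  # (address, length): a pointer to data, not the data


class PointerRing:
    """Fixed-size ring of scatter-gather lists (hypothetical sketch)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.slots: List[Optional[List[Segment]]] = [None] * capacity
        self.head = 0                  # next slot the consumer reads
        self.tail = 0                  # next slot the producer writes
        self.consumer_polling = False  # set while the consumer is in the ring
        self.notifications = 0         # simulated cross-core interrupts

    def push(self, sg: List[Segment]) -> bool:
        """Enqueue one request; return False if the ring is full."""
        if self.tail - self.head == self.capacity:
            return False
        was_empty = self.tail == self.head
        self.slots[self.tail % self.capacity] = sg
        self.tail += 1
        # Batching: signal only on the empty -> non-empty transition,
        # and never while the consumer is already polling the ring.
        if was_empty and not self.consumer_polling:
            self.notifications += 1
        return True

    def drain(self) -> List[List[Segment]]:
        """In-ring polling: consume until the ring is empty."""
        self.consumer_polling = True
        batch = []
        while self.head != self.tail:
            batch.append(self.slots[self.head % self.capacity])
            self.head += 1
        self.consumer_polling = False
        return batch


def total_bytes(sg: List[Segment]) -> int:
    """Sum the lengths of all segments in one scatter-gather list."""
    return sum(length for _, length in sg)
```

In this sketch, two requests pushed back to back cost a single notification, and one `drain()` call picks up both of them.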
Fast device-level communication
• MMNet
  • Link-level
  • Standard network device interface
  • Supports full transitive zero-copy
• MMBlk
  • Block-level
  • Standard block device interface
  • Zero-copy on write
  • Incurs one copy on read
Evaluation
MMNet Evaluation
• AMD Opteron with two 2.1GHz 4-core CPUs (8 cores total)
• 16GB RAM
• NVidia 1Gbps NICs
• 64-bit Xen (3.2), 64-bit Linux (2.6.18.8)
• Netperf benchmark (2.4.4)
[Figure: configurations Loop, NetFront, MMNet, XenLoop]
MMNet: TCP Throughput
[Figure: throughput (Mbps, 0 to 12000) vs. message size (KB, 0.5 to 256) for Monolithic, Netfront, XenLoop, and MMNet]
MMBlk Evaluation
• Same hardware
  • AMD Opteron with two 2.1GHz 4-core CPUs (8 cores total)
  • 16GB RAM
  • NVidia 1Gbps NICs
• VMs are configured with 4GB and 1GB RAM
• 3GB in-memory file system (TMPFS)
• IOZone benchmark
[Figure: configurations MMNet, XenBlk, Monolithic]
MMBlk Sequential Writes
[Figure: throughput (MB/s, 0 to 600) vs. record size (KB, 4 to 4K) for Monolithic, XenBlk, and MMBlk]
Case Study
Network-attached Storage
• RAM
  • VMs have 1GB each, except FS VM (4GB)
  • Monolithic system has 7GB RAM
• Disks
  • RAID5 over three 64MB/s disks
• Benchmark
  • IOZone reads/writes 8GB file over NFS (async)
Sequential Writes
[Figure: throughput (MB/s, 0 to 90) vs. record size (KB, 4 to 4K) for Monolithic, Native-Xen, and MM-Xen]
Sequential Reads
[Figure: throughput (MB/s, 0 to 80) vs. record size (KB, 4 to 4K) for Monolithic, Native-Xen, and MM-Xen]
TPC-C (On-Line Transactional Processing)
[Figure: transactions/minute (tpmC, 0 to 350) for Monolithic, MMXen, and Native-Xen]
Conclusions
• We match monolithic performance
  • “Microkernelization” of traditional systems is possible!
• Fast inter-VM communication
  • The search for VM communication mechanisms is not over
• Important aspects of design
  • Trust model
  • VM as a library (for example, FSVA)
  • End-to-end zero copy
  • Pseudo Global Virtual Address Space
• There are still problems to solve
  • Full end-to-end zero copy
  • Cross-VM memory management
  • Full utilization of pipelined parallelism
Backup Slides
Related Work
• Traditional microkernels [L4, Eros, CoyotOS]
  • Synchronous (effectively thread migration)
  • Optimized for single-CPU, fast context switch, small messages (often in registers), efficient marshaling (IDL)
• Buffer management [Fbufs, IOLite, Beltway Buffers]
  • Shared buffer is a unit of protection
  • Fast-forward – fast cache-to-cache data transfer
• VMs [Xen split drivers, XWay, XenSocket, XenLoop]
  • Page flipping, later buffer sharing
  • IVC, VMCI
• Language-based protection [Singularity]
  • Shared heap, zero-copy (only pointer transfer)
• Hardware acceleration [Solarflare]
• Multi-core OSes [Barrelfish, Corey, FOS]