Trillium: The Code is the IR
Amogh Akshintala, Hangchen Yu,
Arthur M. Peters, Christopher J. Rossbach
A. Akshintala, H. Yu, A. M. Peters and C. J. Rossbach, Trillium, Virt’19
Brief Overview
GPGPU Virtualization
End-to-end comparison of prior approaches
Lessons learnt:
● Virtual ISA unnecessary
● Para-virtual API remoting only feasible option
General purpose computing on GPUs
[Figure: Performance Gain, GPU vs. CPU]
Scientific Computing
Machine Learning
The need to virtualize GPUs
[Figure: Performance vs. Cost]
P3.2xlarge: 1x V100, $2,200/mo
P2.xlarge: 1x K80, $650/mo
g3s.xlarge: 1x M60, $540/mo
Credits to BitFusion Inc.
The need to virtualize GPUs
[Figure: Performance vs. Cost]
P3.2xlarge: 1x V100, $2,200/mo
P2.xlarge: 1x K80, $650/mo
g3s.xlarge: 1x M60, $540/mo
Virtual V100: 0.75x V100, $1,500
Virtual V100: 0.5x V100, $1,000
Credits to BitFusion Inc.
Existing techniques are impractical
Sharing Interposition Isolation Compatibility Slowdown
Pass Through ❌ 1x
Full-virtualization
User-mode API Remoting
Para-virtualization
Everybody sacrifices something
Full-virtualization
[Diagram labels: Hypervisor, Virtual GPU, Native stack, Vendor Library, Vendor Driver, GPU, VM, Application, Vendor Library, Vendor Driver]
Sharing Interposition Isolation Compatibility Slowdown
✔️ ✔️ ✔️ ❌ 100x
User-mode API Remoting
[Diagram labels: Native stack, Hypervisor, Custom API Server, Vendor Library, Vendor Driver, GPU, VM, Application, Custom User-mode Library]
Sharing Interposition Isolation Compatibility Slowdown
✔️ ❌ ✔️/❌ ❌ 1.5x
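The user-mode API remoting stack above can be sketched as a guest library that marshals each call and an API server that replays it against the vendor library. This is a minimal illustration, not the design of any particular system: `VendorLibrary`, `vector_add`, the JSON wire format, and the direct function-call transport are all assumptions made for the example. The transport goes straight from guest library to server, which mirrors the missing interposition mark in the row above.

```python
import json


class VendorLibrary:
    """Stand-in for the host-side vendor library (hypothetical API)."""

    def vector_add(self, a, b):
        return [x + y for x, y in zip(a, b)]


class ApiServer:
    """Host-side server: decodes a request and calls the vendor library."""

    def __init__(self, library):
        self.library = library

    def handle(self, request_bytes):
        request = json.loads(request_bytes)
        func = getattr(self.library, request["func"])
        result = func(*request["args"])
        return json.dumps({"result": result}).encode()


class GuestLibrary:
    """Guest-side user-mode library: same call surface as the vendor
    library, but every call is marshalled and sent to the API server."""

    def __init__(self, transport):
        self.transport = transport  # callable: bytes -> bytes

    def vector_add(self, a, b):
        request = json.dumps({"func": "vector_add", "args": [a, b]}).encode()
        reply = self.transport(request)
        return json.loads(reply)["result"]


# The transport is a plain function call here; a real system would use a
# socket or shared memory that the hypervisor does not inspect.
server = ApiServer(VendorLibrary())
guest = GuestLibrary(transport=server.handle)
print(guest.vector_add([1, 2, 3], [10, 20, 30]))
```

The application only sees `GuestLibrary`, so it needs no GPU or vendor driver inside the VM.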
Para-virtualization
[Diagram labels: Native stack, Hypervisor, Vendor Library, Vendor Driver, GPU, VM, Application, Custom User-mode Library, Interface Translator, Custom Guest Driver, Custom Virtual GPU]
Sharing Interposition Isolation Compatibility Slowdown
✔️ ✔️ ✔️ ❌ 10x
End-to-end performance comparison
Y. Suzuki, S. Kato, H. Yamada, K. Kono, “GPUvm: Why Not Virtualizing GPUs at the Hypervisor?”, ATC’14
More details in the paper...
Everybody sacrifices something
Sharing Interposition Isolation Compatibility Slowdown
Full-virtualization ✔️ ✔️ ✔️ ❌ 100x
User-mode API Remoting ✔️ ❌ ✔️/❌ ❌ 1.5x
Para-virtualization ✔️ ✔️ ✔️ ❌ 10x
Para-virtual API-remoting...
Sharing Interposition Isolation Compatibility Slowdown
Full-virtualization ✔️ ✔️ ✔️ ❌ 100x
User-mode API Remoting ✔️ ❌ ✔️/❌ ❌ 1.5x
Para-virtualization ✔️ ✔️ ✔️ ❌ 10x
Para-virtual API remoting ✔️ ✔️ ✔️ 1.5x
SVGA2: TGSI as IR and vISA
[Diagram labels: Native stack, Hypervisor, Vendor Library, Vendor Driver, GPU, VM, Application, Translator, OpenGL/DX11 Libraries, vmwgfx.ko, SVGA Device]
TGSI, Tungsten Graphics Shader Infrastructure, is an intermediate language for describing shaders.
GLSL Code → TGSI → Native ISA
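The role of an IR that doubles as a virtual ISA can be illustrated with a toy two-stage lowering: one step turns source operations into IR, a second step turns IR into native instructions. Everything in this sketch is made up for illustration: the operation strings are placeholders rather than real GLSL, TGSI, or GPU opcodes, and placing the first step in the guest and the second on the host is an assumption.

```python
# Toy two-stage lowering: source -> IR -> native ISA.
# All operation names are placeholders, not real TGSI or GPU opcodes.

SOURCE_TO_IR = {"a + b": "IR_ADD", "a * b": "IR_MUL"}
IR_TO_NATIVE = {"IR_ADD": "native.add", "IR_MUL": "native.mul"}


def frontend_to_ir(source_ops):
    """First stage (assumed guest side): lower source operations to IR."""
    return [SOURCE_TO_IR[op] for op in source_ops]


def backend_to_native(ir_ops):
    """Second stage (assumed host side): lower IR to native instructions."""
    return [IR_TO_NATIVE[op] for op in ir_ops]


shader = ["a + b", "a * b"]
ir = frontend_to_ir(shader)      # the form handed to the virtual device
native = backend_to_native(ir)   # the form the physical GPU would run
print(ir)
print(native)
```

Every feature of the source language has to be expressible in the IR tables for this pipeline to work, which is the coupling the later slides question.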
Xen-SVGA: computing support
[Diagram labels: Native stack, Hypervisor, Vendor Library, GPU, VM, Application, Mesa3D OpenCL Library, Nouveau, TGSI LLVM Compiler, Nouveau, TGSI, Xen-SVGA Device]
OpenCL Code → TGSI → Native ISA
vISA - Boon or bane?
PTX, a low-level parallel thread execution virtual machine and instruction set architecture (ISA).
Trillium: Eliminates vISA
[Diagram labels: Native stack, Hypervisor, Vendor Library, Nouveau, GPU, VM, Application, Mesa3D OpenCL Library, Nouveau, Trillium Device]
OpenCL Code → Native ISA
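A device interface without a virtual ISA can be sketched as one that takes kernel source as its program format and compiles it in a single step. This is an illustration only: the slide shows just the OpenCL Code to Native ISA flow, so `SourceForwardingDevice`, `load_program`, and the placeholder `compile_to_native` are hypothetical names, and where the compile step actually runs in Trillium is not stated here.

```python
import hashlib


def compile_to_native(source: str) -> bytes:
    """Stand-in for a real OpenCL-to-native compiler (hypothetical).
    It only tags a digest of the source so the example stays runnable."""
    digest = hashlib.sha256(source.encode()).hexdigest()[:8]
    return b"NATIVE:" + digest.encode()


class SourceForwardingDevice:
    """Hypothetical virtual device whose program format is kernel source."""

    def __init__(self):
        self.programs = {}  # handle -> native binary

    def load_program(self, source: str) -> int:
        handle = len(self.programs) + 1
        # Single translation step: no intermediate representation is
        # produced or stored anywhere along the way.
        self.programs[handle] = compile_to_native(source)
        return handle


kernel = "__kernel void inc(__global int *x) { x[get_global_id(0)] += 1; }"
device = SourceForwardingDevice()
handle = device.load_program(kernel)
print(handle, device.programs[handle][:7])
```

Because the interface carries source text, the device definition does not need to change when the source language gains new features.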
So are we done?
Checkpoint
✓ Virtual ISA is unnecessary
✓ Para-virtual API remoting system performs better
Sharing Interposition Isolation Compatibility Slowdown
User-mode API Remoting ✔️ ❌ ✔️/❌ ❌ 1.5x
Para-virtualization ✔️ ✔️ ✔️ ❌ 10x
Trillium ✔️ ✔️ ✔️ ❌ 2.4x
Para-virtual API-remoting...
Sharing Interposition Isolation Compatibility Slowdown
User-mode API Remoting ✔️ ❌ ✔️/❌ ❌ 1.5x
Para-virtualization ✔️ ✔️ ✔️ ❌ 1-10x
Trillium ✔️ ✔️ ✔️ ❌ 2.4x
✕ Interposing too low in the stack
✕ API-specific
Para-virtual API-remoting...
Sharing Interposition Isolation Compatibility Slowdown
User-mode API Remoting ✔️ ❌ ✔️/❌ ❌ 1.5x
Para-virtualization ✔️ ✔️ ✔️ ❌ 1-10x
Trillium ✔️ ✔️ ✔️ ❌ 2.4x
AvA ✔️ ✔️ ✔️ ✔️ <1.5x
✕ Interposing too low in the stack
✕ API-specific
Automatic virtualization of accelerators
[Diagram legend: Para-virtual API Stack, Native stack]
[Diagram labels: CL.h + Annotations, Generated User-mode Library, Generated API Server, Hypervisor, Vendor Library, Vendor Driver, Accelerators, VM, Application, AvA Guest Driver, AvA Virtual Device]
H. Yu, A. M. Peters, A. Akshintala, C. J. Rossbach, Automatic Virtualization of Accelerators, HotOS’19
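The idea of generating both ends of the remoting stack from an annotated API description can be sketched as follows. This is not AvA's specification language or code generator: the `SPEC` format, the two generator functions, the JSON wire format, and `InterposingTransport` (a stand-in for a hypervisor-mediated channel that can observe every call) are all assumptions made for illustration.

```python
import json
from types import SimpleNamespace

# Hypothetical annotated API description, standing in for "CL.h + Annotations".
SPEC = [
    {"name": "add", "params": ["a", "b"]},
    {"name": "scale", "params": ["values", "factor"]},
]


class VendorLibrary:
    """Stand-in for the real accelerator library on the host."""

    def add(self, a, b):
        return a + b

    def scale(self, values, factor):
        return [v * factor for v in values]


def generate_api_server(spec, library):
    """Build a request handler that dispatches to the vendor library."""
    dispatch = {e["name"]: getattr(library, e["name"]) for e in spec}

    def handle(request_bytes):
        request = json.loads(request_bytes)
        result = dispatch[request["func"]](**request["args"])
        return json.dumps({"result": result}).encode()

    return handle


def generate_guest_library(spec, transport):
    """Build one marshalling stub per function in the description."""

    def make_stub(entry):
        def stub(*args):
            if len(args) != len(entry["params"]):
                raise TypeError(f"{entry['name']} expects {entry['params']}")
            payload = {"func": entry["name"],
                       "args": dict(zip(entry["params"], args))}
            reply = transport(json.dumps(payload).encode())
            return json.loads(reply)["result"]
        return stub

    return SimpleNamespace(**{e["name"]: make_stub(e) for e in spec})


class InterposingTransport:
    """Stand-in for a hypervisor-mediated channel: it sees every call."""

    def __init__(self, server):
        self.server = server
        self.calls = []

    def __call__(self, request_bytes):
        self.calls.append(json.loads(request_bytes)["func"])
        return self.server(request_bytes)


transport = InterposingTransport(generate_api_server(SPEC, VendorLibrary()))
guest = generate_guest_library(SPEC, transport)
print(guest.add(2, 3), guest.scale([1, 2], 3), transport.calls)
```

Supporting another API in this sketch means writing a new `SPEC`, not a new guest library or server by hand.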
Preliminary development effort
System  Type                                 APIs  LoC          Time   Difficulty
GPUvm   Full-virt                            1     20 000       Years  ★★★★
SVGA2   Para-virt                            2     MANY!        Years  ★★★★
AvA     Automatic Para-virtual API Remoting  9     OpenCL: 835  Days   ★
Conclusion
Lessons
● Virtual ISA is unnecessary
  ○ Decouple device virtualization from GPU ISA virtualization
● Para-virtual API remoting can lead to better performance and properties
New para-virtual API remoting system: AvA
● “No IR” enables interposition at user-mode APIs
● Compatibility is compensated