Post on 30-Sep-2018
FPGA Accelerator Virtualization in an OpenPOWER cloud
Fei Chen, Yonghua Lin
IBM China Research Lab
Trend of Acceleration Technology

• Microsoft Catapult: used FPGAs to accelerate Bing search on 1,632 servers, with a 6×8 2D-torus design for a high-throughput network topology
• Storage >2,000 PB, processing 10~100 PB/day, logs 100 TB~1 PB/day; FPGAs used for the storage controller; GPUs used for deep learning
Acceleration in Cloud is Taking Off

Innovations required:
• Scalable acceleration fabric
• Open framework for accelerator integration and sharing
• Accelerator resource abstraction, re-configuration, and scheduling in cloud
• Modeling & advisory tool for dynamic acceleration system composition

Accelerator programming is becoming a hot topic: OpenCL, Sumatra (Oracle), LiMe (IBM), …

From appliance to acceleration in cloud:
• TB-scale problems → PB-scale problems
• Acceleration architecture for a single node → architecture for thousands of nodes
• Dedicated acceleration resources → shared acceleration resources
• Proprietary accelerator framework → open framework enabling accelerator sharing & integration
• Closed innovation model → open innovation model through an ecosystem
Resources on FPGA are Huge

• Programmable resources
– Logic cells (LCs)
– DSP slices: fixed/floating-point
– On-chip memory blocks
– Clock resources
• Miscellaneous peripherals (Xilinx Virtex as an example)
– DDR3 controllers
– PCIe Gen3 interfaces
– 10G Ethernet controllers
– …
• Hard processor cores
– PowerPC: Xilinx Virtex-5 FXT
– ARM: Xilinx Zynq-7000
– Atom: Intel + Altera E600C

The Xilinx Virtex UltraScale 440, the largest FPGA in the world when delivered in 2014, consists of more than 4 million logic cells (LCs). Using this chip, we can build up to 250 AES crypto accelerators, or 520 ARM7 processor cores.

FPGA Capacity Trends
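A rough back-of-the-envelope check of these capacity figures, with per-core logic-cell costs derived from the slide's own totals (not vendor data):

```python
# Back-of-the-envelope capacity check for the Virtex UltraScale 440
# figures above (4M+ logic cells, 250 AES cores, 520 ARM7 cores).
# Per-core LC costs are inferred from the slide's totals.

TOTAL_LCS = 4_000_000            # "more than 4 million logic cells"

lcs_per_aes = TOTAL_LCS // 250   # ≈ 16,000 LCs per AES crypto accelerator
lcs_per_arm7 = TOTAL_LCS // 520  # ≈ 7,692 LCs per ARM7 soft core

def max_cores(total_lcs: int, lcs_per_core: int) -> int:
    """How many identical cores fit in a given LC budget."""
    return total_lcs // lcs_per_core

print(max_cores(TOTAL_LCS, lcs_per_aes))   # → 250
print(max_cores(TOTAL_LCS, lcs_per_arm7))  # → 520
```

The interesting consequence for a cloud operator is that one chip can host hundreds of independent accelerator instances, which is what makes sharing a single FPGA across tenants worthwhile.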
FPGA on Cloud – A Double Win

Cloud benefits from FPGA:
• Performance
• Power consumption

FPGA benefits from cloud:
• Lower cost: tenants need not purchase and maintain FPGAs, and pay for accelerators only when using them
• More applications: high FPGA utilization
• Ecosystem: grows with the cloud ecosystem
Motivation of Accelerator/FPGA as a Service in Cloud

• Enable manageability: can an FPGA (pool) be managed in the data center — ID, location, re-configuration, performance, etc.?
• Reduce system cost: how can system cost be reduced by sharing FPGA resources among applications, VMs, and containers — dynamically, flexibly, with priority control?
• Reduce deployment complexity: how can FPGA/accelerator resources be orchestrated together with VM, network, and storage resources easily, according to the needs of the application?
• Bring high value to the cloud infrastructure: could we generate new value for IaaS?
FPGA Ecosystem in Cloud

Cloud tenants:
• Pay for the usage of the accelerator, rather than for licenses and hardware
• Get the accelerator service in a self-service way
• Use the single HEAT orchestrator to deploy workloads with accelerators, together with compute, network, and storage

Cloud service provider:
• Buys "cloudified" accelerators on the marketplace
• Creates the service category for FPGA accelerators and sells them on the cloud as a service

Accelerator developers:
• Companies or individual developers can upload and sell their accelerators through the marketplace (e.g. on OpenPOWER)

Accelerator marketplace:
• Cloudifies accelerators by integrating the service layer with the accelerator and its compilation (Accelerator Cloudify Tool, in plan)
• All integration, compilation, test, verification, and certification is done automatically

Stack: HEAT orchestrator → OpenStack extension for accelerator service (compute, network, storage, FPGA accelerator) → POWER8/PowerKVM → FPGA cards carrying the service logic for the accelerator service.
Accelerator as a Service on SuperVessel

• Accelerator MarketPlace for developers to upload and compile accelerators for the SuperVessel POWER cloud
• Allows users to request clusters of different sizes

Fig. 1: Accelerator MarketPlace for SuperVessel Cloud
Fig. 2: Cloud users can request an accelerator when creating a VM
Enabling FPGA Virtualization in an OpenStack Cloud

Enhanced OpenStack components for the FPGA framework (from the architecture diagram): an OpenStack control node running the scheduler manages multiple compute nodes.

• KVM-based compute node: inside the tenant VM, a guest process uses the accelerator APIs, utilities, and bitfile library; a guest control module and guest driver in the guest OS expose a virtual FPGA. The hypervisor's host control module and host driver, together with an OpenStack agent, map virtual FPGAs onto the physical FPGA hardware (service logic, hardware modules, DRAM).
• Docker-based compute node: applications in containers use the same APIs, images, libraries, utilities, and driver; a kernel control module/driver accesses the CAPI-attached FPGA hardware, again alongside an OpenStack agent.
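The host-side bookkeeping implied by this diagram — a host control module handing out virtual FPGA slots to guests — can be sketched as follows. All class and method names here are illustrative; the actual SuperVessel host driver is not public.

```python
# Minimal sketch of a host control module that multiplexes virtual FPGA
# slots among guests (VMs or containers). Hypothetical names throughout.

from dataclasses import dataclass, field
from typing import Optional

@dataclass
class VirtualFPGA:
    slot: int                      # partial-reconfiguration region on the chip
    bitfile: Optional[str] = None  # accelerator image loaded into the slot
    owner: Optional[str] = None    # VM or container ID

@dataclass
class HostControlModule:
    slots: list = field(default_factory=lambda: [VirtualFPGA(i) for i in range(4)])

    def attach(self, guest_id: str, bitfile: str) -> VirtualFPGA:
        """Assign a free vFPGA slot to a guest and record its accelerator."""
        for vf in self.slots:
            if vf.owner is None:
                vf.owner, vf.bitfile = guest_id, bitfile
                return vf
        raise RuntimeError("no free vFPGA slot")

    def detach(self, guest_id: str) -> None:
        """Release every slot held by the departing guest."""
        for vf in self.slots:
            if vf.owner == guest_id:
                vf.owner = vf.bitfile = None

host = HostControlModule()
vf = host.attach("vm-42", "aes128.bit")
print(vf.slot)  # → 0
```

The guest driver would see only its own `VirtualFPGA`, while the host control module retains the global view needed for scheduling and reconfiguration.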
FPGA Accelerator as a Service Online on SuperVessel Cloud

SuperVessel services (two online, two in preparation):
• SuperVessel Cloud Service: 1. VM and container service; 2. Storage service; 3. Network service; 4. Accelerator as a service; 5. Image service
• SuperVessel Big Data and HPC Service: 1. Big Data: MapReduce (Symphony), Spark; 2. Performance tuning service
• OpenPOWER Enablement Service: 1. X-to-P migration; 2. AutoPort Tool; 3. OpenPOWER new system test service
• Super Class Service: 1. Online video courses; 2. Teacher course management; 3. User contribution management
• Super Project Team Service: 1. Project management service; 2. DevOps automation
• Super Marketplace

SuperVessel Cloud Infrastructure: storage, IBM POWER servers, OpenPOWER servers, FPGA/GPU, Docker.

Try it here: www.ptopenlab.com
Thanks!
FPGA Implementation

The service logic on the FPGA chip is split into three sublayers:
• User Sublayer: the shared FPGA resource — accelerator slots A, B, C, D
• Service Sublayer: job queue, job scheduler, switch, security controller, context controller, DMA engine, reconfiguration controller, registers, …
• Platform Sublayer: DRAM, PCIe/CAPI, Ethernet, ICAP, high-bandwidth I/O, …

Just as a computer stacks apps on an OS on hardware, the FPGA subsystem is designed as a computer system: the platform sublayer is the hardware, the service sublayer plays the role of the OS, and the user-sublayer accelerators are the applications.
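The "OS" role of the service sublayer — jobs enter a queue and a scheduler dispatches them to accelerator slots — can be sketched in software. The names and the FIFO policy are illustrative; the real design is a hardware job scheduler.

```python
# Sketch of the service-sublayer job flow: jobs are queued, then the job
# scheduler dispatches each to an accelerator slot (A-D) in the user
# sublayer. Plain Python callables stand in for hardware accelerators.

from collections import deque

class JobScheduler:
    def __init__(self, accelerators):
        self.queue = deque()               # the hardware job queue
        self.accelerators = accelerators   # slot id -> accelerator function

    def submit(self, accel_id, payload):
        """Enqueue a job targeting one accelerator slot."""
        self.queue.append((accel_id, payload))

    def run_next(self):
        """Dispatch the oldest queued job, as the hardware scheduler would."""
        accel_id, payload = self.queue.popleft()
        return self.accelerators[accel_id](payload)

# Slot "A" hosts a stand-in "accelerator" that reverses its input buffer.
sched = JobScheduler({"A": lambda p: p[::-1]})
sched.submit("A", b"data")
print(sched.run_next())  # → b'atad'
```

In hardware the same roles are filled by the job queue, the switch routing data to slots, and the DMA engine moving payloads, with the context and security controllers isolating tenants.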
System Implementation

• Control Node: Nova, Glance, Horizon, Neutron, Swift
• Compute Node: Nova Compute
• Compiler: FPGA incremental compilation environment

Deployment flow (dashboard → compiler → Glance → scheduler → compute):
1. Accelerator source code package is submitted
2. The compiler produces an image_file
3. A VM request (with accelerator) is issued
4. The acc_file is delivered to the compute node
5. The VM is launched
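The five-step flow above can be sketched as three stub functions; the real system uses OpenStack Nova and Glance plus the FPGA incremental compiler, and every name below is illustrative only.

```python
# Sketch of the five-step provisioning flow. Each component is a stub
# standing in for an OpenStack service; names are hypothetical.

def compile_accelerator(source_pkg: str) -> str:
    """Steps 1-2: the compiler turns the accelerator source package
    into an image_file (registered with Glance in the real system)."""
    return source_pkg.replace(".tar.gz", ".image")

def schedule(image_file: str) -> dict:
    """Steps 3-4: a VM request carrying the accelerator reaches the
    scheduler, which picks a compute node and hands it the acc_file."""
    return {"compute_node": "node-1", "acc_file": image_file}

def launch_vm(placement: dict) -> str:
    """Step 5: the compute node launches the VM with the vFPGA attached."""
    return f"VM on {placement['compute_node']} with {placement['acc_file']}"

image = compile_accelerator("aes_accel.tar.gz")
print(launch_vm(schedule(image)))  # → VM on node-1 with aes_accel.image
```

The point of the incremental compiler in this flow is that only the accelerator's partial bitstream is rebuilt, so step 2 does not require a full-chip synthesis per tenant request.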
Evaluation

(1) Accelerator Sharing Evaluation
Figure: total bandwidth (MB/s, roughly 900–1300) vs. number of processes (1–8), for four configurations:
• Host: all processes run in the host environment
• One VM: all processes run in one VM
• VMs: each process runs in its own VM
• AESs: each VM uses one independent AES accelerator

(2) Management – Bandwidth Control
Figure: per-process bandwidth (MB/s) over time for Process 0, Process 1, and the total, with three management actions annotated: reduce VM bandwidth, increase P0 bandwidth, reduce P0 bandwidth.

(3) Management – Priority Control
Figures: bandwidth (MB/s, log scale), average latency (ms), and coefficient of variation (CV) over ~61 seconds for two processes:
• Process 0: 256 KB payload, 100 times per second
• Process 1: 4 MB payload, best-effort use
• Both processes have the same priority during seconds 1~38; Process 0's priority is raised at second 38
• Annotated results: bandwidth 1194 MB/s vs. 25 MB/s; latency 0.21 ms and 0.22 ms vs. 2.3 ms
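The priority-control behavior in (3) can be modeled as a strict-priority picker over per-process job queues: once Process 0's priority is raised, its small periodic jobs are always served ahead of Process 1's best-effort bulk jobs. This sketch models the policy only, not the hardware implementation, and all names are illustrative.

```python
# Strict-priority job picker modeling the policy behind experiment (3):
# higher priority is always served first; FIFO within a priority level.

from collections import deque

class PriorityPicker:
    def __init__(self):
        self.queues = {}  # priority -> deque of (process, payload_bytes)

    def submit(self, priority, process, payload_bytes):
        self.queues.setdefault(priority, deque()).append((process, payload_bytes))

    def pick(self):
        """Serve the highest non-empty priority; FIFO within a priority."""
        for prio in sorted(self.queues, reverse=True):
            if self.queues[prio]:
                return self.queues[prio].popleft()
        return None

p = PriorityPicker()
p.submit(0, "P1", 4 * 2**20)    # best-effort 4 MB job (Process 1)
p.submit(1, "P0", 256 * 2**10)  # high-priority 256 KB job (Process 0)
print(p.pick()[0])  # → P0
```

Under this policy, Process 0's small jobs see near-idle latency (the measured 0.21–0.22 ms) even while Process 1 saturates the remaining bandwidth, which matches the latency split reported in the figures.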