enpower: an opencl approach for energy proportional … · tnpl t recon request for turning on the...
TRANSCRIPT
-
ENPOWER: An OpenCL Approach for Energy Proportional Computing With Heterogeneous and Reconfigurable Processors
Mohammad Hosseinabady, Jose Nunez-Yanez
Department of Electronic Engineering, University of Bristol, UKThis research is sponsored by EPSRC and done in collaboration with Queen's University Belfast
{m.hosseinabady, j.l.nunez-yanez, }@bristol.ac.uk
· Energy proportional computing applied to hybrid FPGA-ARM chip· This research proposes the concept of energy proportional computing
includes scalability of a power triplet formed by · Voltage, · Frequency and · Logic (e.g. capacitance)
· Applications in this design paradigm are written in an extended OpenCL
1- Abstract 2- Motivations
3-Goals
ApplicationSpecification
Host program(C/C++)
Kernels(OpenCL-C or HDL)
C/C++ files
bitstreamsbitstreamsbitstreams
C/C++ Compiler
HDL files
Vivado_HLS& Vivado
Binary files
PS PL
Select a bitstream
Kernels extraction
1
2
3 4
OpenCL-C
OpenCL compiler
Binary filesBinary
filesBinary
files
Multi-core Processors
Ru
n
Ru
n
Map
5
6 7
8
9
12
10
11
4-Contributions
5-Framework
6-Power & Energy Modelling
(prerequisite research)
SOC Consumer Portable Power Consumption Trends
Sou
rce:
ITR
S 2
01
1-S
yste
m D
rive
rs
Xilinx ZYNQ
Host program
OpenCL Runtime Accelerator Accelerator Accelerator
ARM-OpenCL
PS PL
ARM-OpenCL
· Propose a new design paradigm in order to expose the concept of energy proportional computing to the application software
· Considering the logic and power scalability in a variation-aware closed-loop configuration
· Using a software centric approach based on OpenCL for describing applications · Using a processor-centric platform such as Xilinx All Programmable SoCs to implement
applications
7-Power Gating
(prerequisite research)
Turning off PL (tfpl) PL turned off (pltf) Turning on PL (tnpl) PL reconfiguration (reconf)
Request for turning off the PL
ttfpl tpltf ttnpl treconf
Request for turning on the PL
Restore state (rs)Store State (ss)
tss trs
0
0.2
0.4
0.6
0.8
1
1.2
0 1000 2000 3000 4000 5000 6000
Vo
ltag
e (
V)
Time (µs)
0
0.05
0.1
0.15
0.2
0.25
0.3
00.20.40.60.811.2
VCCINT (volt)
Power on VCCINT RailPower on VCCAUX Rail
Po
we
r on
(W)
Power versus VCCINT voltage on ZC702
PL shut-down slop
Accelerator configuration
freq
uen
cy
volta
ge
A simple motivation example is considered in this subsection.The example consists of a MicroBlaze soft processor coreconfigured in the PL and the ARM processor available in thePS. The MicroBlaze runs a 32 x 32 floating point matrix multiplication.
0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
Running Software
Configuration
Before Configuration
VCCINT VCCAUX
Po
we
r (W
)
VCCINT
VCCAUX
Other power linesClock Lo
gic
Sign
als
MMCMs
Leakage
Xilinx XPower Analyzer Real power measurement on ZC702
MicroBlaze power consumption
System-level issues· Power modelling · Energy modelling· Power gating· Power scaling· Clock gating· Frequency scaling· Logic scaling
RuntimeSystem
12
PL Power for ME = a*V2+b*f*V2 +c*V3
High-Level Models should be:
This section investigates the viability of physical power gating FPGA devices that incorporate a hardened processor in a different power domain. The run-time power gating approach is applied to Xilinx ZYNQ devices that incorporate a hardened Cortex A9 multi-processor.
UCD9248UCD9248
PS
PL
UCD9248
I2C
PS-PL level shifter
PS power rails
PL power rails
ZYNQ all Programmable SOC
VIV
AD
O &
V
IVA
DO
-HLS
System-Level Power Control
Role of OpenCL runtime Choosing the best configuration and voltage/
frequency at run time. · scheduling · mapping algorithm.
· Covers PL, PS and memories· Simple· Compositional· Descriptive
Specially the PL High-Level Models should:· Describes different thread hardware and cores· Describes different accelerator modes
· Idle · Data transfer· Active· Computational
The results show that the minimum time that the FPGA fabric must remain in power-off state for the technique to be energy efficient is in the order of milliseconds and up to 96% power reduction occurs when the fabric voltage is lowered below critical level.
Dynamic power caused by measurement modules with
own frequency
ME dynamic power
Static power
Motion Estimation
(ME)
MB
PS
PL
ZYNQ (FPGA+ARM)
(FPGA+ARM)
iwocl2014.vsdPage-1