meterpu: a generic measurement abstraction api · meterpu: a generic measurement abstraction api...

14
MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15 workshop at ISPA’15, Helsinki, Finland. IEEE. Lu Li , Christoph Kessler [email protected], [email protected] 1 / 13

Upload: vankhue

Post on 01-May-2019

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

MeterPU:A Generic Measurement Abstraction API

– Enabling Energy-tuned Skeleton Backend Selection

To appear in Proc. REPARA’15 workshop at ISPA’15, Helsinki, Finland. IEEE.

Lu Li, Christoph [email protected], [email protected]

1 / 13

Page 2: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

Motivation

Energy MeasurementNecessary for energy modeling and optimization.Difficult, especially for GPUs

Power visualization (Li et al., 2015 [4]) for a CUDA program.Illustrating "capacitor effect". Data obtained by Nvidia NVML.

6

6.5

7

7.5

8

8.5

9

·104

Pow

er(m

W)

40

42

44

46

48

50Tem

perature

30

32

34

36

Fan

speed

65 70 75 80 85

Time(s)

Requires numerical postprocessing(Burtscher et al., 2014 [1]) to calculate real energy cost

Goal: as simple as measuring time in good old days.2 / 13

Page 3: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

MeterPU

A software multimeterA generic measurement abstraction,hiding platform-specific measurement details.Four simple functions to unify measurement interface forvarious metrics on different hardware components.

Time, Energy on CPU, GPUEasy to extend: FLOPS, cache misses etc.

On top of native measurement libraries (plug-ins).CPU time: clock_gettime()GPU time: cudaEventRecord();CPU and DRAM energy: Intel PCM library.GPU energy: Nvidia NVML library....

3 / 13

Page 4: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

MeterPU API

template<cl ass T ype> / / analogous to swi tch/ / on a r eal mul t imeter

cl ass Meter{

publ i c :void st ar t ( ) ; / / st ar t a measurementvoid stop ( ) ; / / stop a measurementvoid cal c ( ) ; / / cal cul at e a met r i c value

/ / of a code regi ontypename Meter_ T raits<Type>:: R esultT ype const &

get_ value( ) const ;/ / get the cal cul at ed met r i c value

pr i vat e :/ / P lat form speci f i c measurement det ai l s hidden . . .

}

4 / 13

Page 5: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

An MeterPU Application: Measure CPU Time

#incl ude <MeterPU .h>

i nt main ( ){

cpu_ func ( ) ; / / Do sth here

}

#incl ude <MeterPU .h>

i nt main ( ){

using namespace MeterPU ;Meter<CPU_ Time> meter ;

cpu_ func ( ) ; / / Do sth here

}

#incl ude <MeterPU .h>

i nt main ( ){

using namespace MeterPU ;Meter<CPU_ Time> meter ;

meter . st ar t ( ) ; / / Measurement Star t !

cpu_ func ( ) ; / / Do sth here

meter . stop ( ) ; / / Measurement Stop !

}

#incl ude <MeterPU .h>

i nt main ( ){

using namespace MeterPU ;Meter<CPU_ Time> meter ;

meter . st ar t ( ) ; / / Measurement Star t !

cpu_ func ( ) ; / / Do sth here

meter . stop ( ) ; / / Measurement Stop !

meter . cal c ( ) ;

}

#incl ude <MeterPU .h>

i nt main ( ){

using namespace MeterPU ;Meter<CPU_ Time> meter ;

meter . st ar t ( ) ; / / Measurement Star t !

cpu_ func ( ) ; / / Do sth here

meter . stop ( ) ; / / Measurement Stop !

meter . cal c ( ) ;BUILD_ CPU_ TIME_ MODEL( meter . get_ value( ) ) ;

}

5 / 13

Page 6: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

An MeterPU Application: Measure GPU Energy

#incl ude <MeterPU .h>

i nt main ( ){

using namespace MeterPU ;/ / Only one l i ne d i f f er s ! ! ! !Meter<NVML_ Energy<> > meter ;

meter . st ar t ( ) ; / / Measurement Star t !

cuda_ func<<< . . . , . . . >>>(...) ;cudaDeviceSynchronize( ) ;

meter . stop ( ) ; / / Measurement Stop !

meter . cal c ( ) ;BUILD_GPU_ ENERGY_MODEL( meter . get_ value( ) ) ;

}

6 / 13

Page 7: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

Measure Combinations of CPUs and GPUs

#incl ude <MeterPU .h>

i nt main ( ){

using namespace MeterPU ;/ / Only one l i ne d i f f er s ! ! ! !Meter< System_ Energy<0> > meter ;

meter . st ar t ( ) ; / / Measurement Star t !

async_ cpu_ func ( ) ;cuda_ func<<< . . . , . . . >>>(...) ;wait_ for_ cpu_ func_ to_ finish ( ) ;cudaDeviceSynchronize( ) ;

meter . stop ( ) ; / / Measurement Stop !

meter . cal c ( ) ;BUILD_ SYSTEM_ ENERGY_MODEL( meter . get_ value( ) ) ;

}

7 / 13

Page 8: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

Unification → Reuse Legacy Autotuning Framework

Figure 1 : Unification allows empirical autotuning framework to switch tomultiple meter types on different hardware components.

8 / 13

Page 9: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

SkePU Reduce Skeleton (A Single Skeleton Call)

2e+02 1e+03 5e+03 5e+04 5e+05

Problem size for reduce

1

10

100

1000

10000T

ime(

us):

ave

rage

of 1

00 r

uns ●

CPUOMPCUDASelection

(a) Time tuning forReduce.

●●

●●

● ● ● ●

●●

2e+03 1e+04 5e+04 2e+05

Problem size for reduce

100

200

500

1000

Ene

rgy(

mill

iJ):

ave

rage

of 1

000

runs

CPUOMPCUDASelection

(b) Energy tuning forReduce.

Further results see paperEmpirical autotuning framework for time optimizationreused for energy optimization.

9 / 13

Page 10: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

LU decomposition (Multiple Skeleton Calls)

20 30 40 50

Problem size for LU decomposition

200

500

1000

2000

5000

10000

Tim

e(us

): a

vera

ge o

f 300

0 ru

ns ●

CPUOMPCUDASelection

(a) Time tuning for LUdecomposition

● ●●

● ●●

5 10 20 50 100

Problem size for LU decomposition

100

200

500

1000

2000

5000

10000

Ene

rgy(

mill

iJ):

ave

rage

of 1

000

runs

CPUOMPCUDASelection

(b) Energy tuning for LUdecomposition

Easy switching for optimization goals.

10 / 13

Page 11: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

Related Work

Other measurement abstraction software:EML (Cabrera et al., 2015 [2])REPARA (Akos et al., 2015 [3])

MeterPU:Requires much fewer lines of code (LOC) to use.Very low overhead, negligible.

Only one function callCan even be inlined for some meter types.

11 / 13

Page 12: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

Pluggable to Other Work

Some MeterPU Use CasesEnergy-tuned SkePUEnergy comparison between SkePU’s Myriad Backend andNvidia GPUGlobal Composition FrameworkTunePU...

12 / 13

Page 13: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

Summary of Contributions

Hide complexity of platform-specific energy measurement,especially for GPUs.MeterPU enables to reuse legacy empirical autotuningframeworks, such as the one in SkePU.With MeterPU, SkePU offers the first energy-tunedskeletons, as far as we know.Switching optimization goal can be easy,facilitates building time and energy models formulti-objective optimization.

, Open source, download at:http://www.ida.liu.se/labs/pelab/meterpu

13 / 13

Page 14: MeterPU: A Generic Measurement Abstraction API · MeterPU: A Generic Measurement Abstraction API – Enabling Energy-tuned Skeleton Backend Selection To appear in Proc. REPARA’15

Bibliography

Martin Burtscher, Ivan Zecena, and Ziliang Zong.Measuring GPU power with the K20 built-in sensor.In Proc. Workshop on General Purpose Processing Using GPUs (GPGPU-7). ACM, March 2014.

Alberto Cabrera, Francisco Almeida, Javier Arteaga, and Vicente Blanco.Energy measurement library (eml) usage and overhead analysis.In 23rd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP),pages 554–558, March 2015.

Akos Kiss, Marco Danuletto, Zoltan Herczeg, Akos Kiss, Peter Molnar, Robert Sipka, Massimo Torquati, andLaszlo Vidacs.D6.4: REPARA performance and energy monitoring library.Technical report, c©The REPARA Consortium, March 2015.

Lu Li and Christoph Kessler.Validating Energy Compositionality of GPU Computations.In Proc. HIPEAC Workshop on Energy Efficiency with Heterogeneous Computing (EEHCO-2015), January2015.

13 / 13