High-level Power Simulation for DVS-aware Processors

Traineeship under supervision of: Prof. H. Corporaal, M.Sc. S.V. Gheorghita

by Hans Giesen

Post on 21-Dec-2015




18-04-23

Overview

• Introduction

• Design

• Implementation

• Experiments

• Future Work


Introduction

DVS principle

• Energy depends quadratically on supply voltage: $E \propto N V^2$

• Clock frequency depends almost linearly on supply voltage (formula in $f$, $V$ and $V_T$)
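The quadratic relation can be made concrete with a small calculation. The sketch below assumes only $E \propto N V^2$ with the same number of cycles $N$ before and after scaling, so $N$ cancels out; the function name is hypothetical and not part of the toolset.

```c
/* Energy at voltage v_new relative to energy at voltage v_old for the same
 * number of cycles, assuming E is proportional to N * V^2 (N cancels). */
double relative_energy(double v_old, double v_new)
{
    double ratio = v_new / v_old;
    return ratio * ratio;
}
```

Under this assumption, halving the supply voltage, `relative_energy(1.0, 0.5)`, leaves a quarter of the original energy.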

Real-time embedded systems

[Figure: two power-versus-time plots; labels: Deadline, Idle]

• Real-time systems have timing constraints

Simulation toolset

[Figure: toolset diagram with a power simulator and a DVS simulator; remaining labels: Cycles, Combine, Schedule, Remove, Deadline, Calculate, Max, Statistics, Sample]

Simulation levels

System-level analytical models

Abstract performance simulation

Instruction set simulation

Cycle-accurate simulation

HDL / RTL simulation

Synthesis

[Figure: the levels above are placed along two axes, "Accuracy / Detail" and "Simulation speed", each running between Low and High]


Design

XTREM power simulator

Input: ARM binary, architecture information

Output: performance + energy trace

Intel XScale architecture

• Based on ARM architecture

• Used in e.g. Intel PXA255 and PXA270 processors

Simple tracefile example

Cycle    PC        IPC   DEC      BTB
0        02000100  0     6.0e-16
200000   02002164  0.34  0.034    0.024
400000   0200210C  0.36  0.040    0.028
600000   020021B4  0.37  0.040    0.028
800000   020002A0  0.38  0.040    0.028
1000000  020021C0  0.38  0.040    0.028
1200000  0200217C  0.38  0.040    0.028
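A trace line in this simplified five-column layout could be read as sketched below. The struct and function names are hypothetical, the layout is taken only from the column headers of the example, and a real XTREM tracefile may well carry more columns.

```c
#include <stdio.h>

/* One sample line of the simplified trace; fields follow the column headers. */
struct trace_sample {
    unsigned long cycle; /* Cycle column */
    unsigned long pc;    /* PC column, hexadecimal in the file */
    double ipc;          /* IPC column */
    double dec;          /* DEC column */
    double btb;          /* BTB column */
};

/* Returns 1 if all five fields were read from the line, 0 otherwise. */
int parse_trace_line(const char *line, struct trace_sample *out)
{
    int fields = sscanf(line, "%lu %lx %lf %lf %lf",
                        &out->cycle, &out->pc, &out->ipc,
                        &out->dec, &out->btb);
    return fields == 5;
}
```

The first line of the example holds only four values, so this parser would reject it; a caller would need to decide how to treat such incomplete samples.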

Problems for DVS simulation

• Trace is only valid for one combination of frequency and voltage

• Sample periods have fixed length

Adapting XTREM

Before adaptation: the code is divided into sample periods of fixed length $N$; period $i$ yields a power value $P_i$ ($P_1 \dots P_4$ in the figure).

After adaptation: marks in the code delimit the sample periods; period $i$ yields $P_i(f_i, V_i)$ and its own length $N_i$.

DVS simulator

Input: architecture information, DVS algorithm, trace from XTREM

Output: performance + energy trace

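DVS simulator

[Figure: the code is split into segments 1 to 3 with DVS blocks between them. For segment $i$ the input is $P_i(f_i, V_i)$ and $N_i$, the DVS algorithm supplies $f_i$ and $V_i$, and the output is $E_i$ and $t_i$. The DVS blocks themselves are labelled $E_a$, $t_a$.]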
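The per-segment outputs $E_i$ and $t_i$ can be related to the inputs with two elementary identities: time is cycles divided by frequency, and energy is power times time. The sketch below assumes the power value has already been evaluated for the chosen $f_i$ and $V_i$; the function names are hypothetical, and the $E_a$, $t_a$ entries of the DVS blocks are not modelled.

```c
/* Execution time in seconds of a segment of n_cycles at frequency f_hz. */
double segment_time(double n_cycles, double f_hz)
{
    return n_cycles / f_hz;
}

/* Energy in joules: average power in watts times execution time in seconds. */
double segment_energy(double power_w, double n_cycles, double f_hz)
{
    return power_w * segment_time(n_cycles, f_hz);
}
```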


Implementation

Deriving power formulas

Before (result has the form cV):

double senseamp_power(int cols)
{
  return((double) cols * Vdd / 8 * .5e-3);
}

After (result is the constant c):

double senseamp_power(int cols)
{
  return((double) cols / 8 * .5e-3);
}

Deriving power formulas

Before:

power->btb_datapower =
  ram_decoder_power(logtwo(rowsb), 2) +                               /* c f V^2 */
  ram_wordline_power(rowsb, colsb, 1, 1, CACHE) +                     /* c f V^2 */
  BTB_DATA_BITLINE_AF * ram_bitline_power(rowsb, colsb, 1, 1, CACHE) + /* c f V^2 */
  senseamp_power(colsb);                                              /* c V */

After:

power->btb_datapower_fV2 =
  ram_decoder_power(logtwo(rowsb), 2) +
  ram_wordline_power(rowsb, colsb, 1, 1, CACHE) +
  BTB_DATA_BITLINE_AF * ram_bitline_power(rowsb, colsb, 1, 1, CACHE);

power->btb_datapower_V = senseamp_power(colsb);

Power formulas

Functional unit               Format of formula
Instruction decoder           $c_1 f V^2$
Branch target buffer          $c_1 f V^2 + c_2 V$
Fill buffer                   $c_1 f V^2 + c_2 V$
Write buffer                  $c_1 f V^2 + c_2 V$
Pend buffer                   $c_1 f V^2 + c_2 V$
Register file                 $c_1 f V^2 + c_2 V$
Instruction cache             $c_1 f V^2 + c_2 V$
Data cache                    $c_1 f V^2 + c_2 V$
Arithmetic-logic unit         [formula not recoverable]
Shift unit                    [formula not recoverable]
Multiplier-accumulator unit   [formula not recoverable]
Internal memory bus           [formula not recoverable]
Memory control unit           [formula not recoverable]
Clock                         [formula not recoverable]
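Once a unit's power is split into a part proportional to $f V^2$ and a part proportional to $V$, as done for `btb_datapower_fV2` and `btb_datapower_V`, it can be evaluated for any operating point. The sketch below assumes that two-coefficient form; the struct and function names are hypothetical, and units with other formula formats would need additional terms.

```c
/* Coefficients of one functional unit, assuming the form
 * P(f, V) = c1 * f * V^2 + c2 * V. */
struct power_coeffs {
    double c1; /* multiplies f * V^2 */
    double c2; /* multiplies V */
};

/* Power in watts at clock frequency f_hz and supply voltage v. */
double unit_power(const struct power_coeffs *c, double f_hz, double v)
{
    return c->c1 * f_hz * v * v + c->c2 * v;
}
```

With made-up coefficients `c1 = 1e-9` and `c2 = 0.01`, a call at 400 MHz and 1.5 V evaluates both terms in one expression, so the same coefficients can be reused for every frequency and voltage pair a DVS algorithm selects.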

Example with DVS marks

#include <stdio.h>
#include "DVS.h"

int main()
{
  DVS("Deadline=%u RWEC=%u", 0.026, 3425256);
  puts("This is a piece of code");
  DVS("RWEC=982428");
  puts("This is another piece of code");
  DVS("%s=%u", "Deadline", 0);
}

DVS marks

C source of simulated program
  ↓ Call
void DVS(const char *iFormat, …)
  ↓ Call
int syscall(int number, …)
  ↓ swi instruction
XTREM system call interface (syscall.c)
  ↓ DVS_parameters variable
XTREM tracefile output (xtrem.c)
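The first two steps of this chain might look like the sketch below. It assumes that `DVS` formats its arguments printf-style and hands the resulting string to the system call wrapper, which the signature suggests but the slides do not confirm. `fake_syscall`, `last_mark` and `DVS_SYSCALL_NUMBER` are hypothetical stand-ins so that the sketch compiles without the simulator.

```c
#include <stdarg.h>
#include <stdio.h>

#define DVS_SYSCALL_NUMBER 0x900 /* hypothetical; the real number is not given */

static char last_mark[128]; /* stand-in for what the simulator would receive */

/* Stand-in for the syscall wrapper; a real build would issue swi here. */
static int fake_syscall(int number, const char *parameters)
{
    (void)number;
    snprintf(last_mark, sizeof last_mark, "%s", parameters);
    return 0;
}

/* Formats the mark parameters and passes them on as one string. */
void DVS(const char *iFormat, ...)
{
    char parameters[128];
    va_list args;

    va_start(args, iFormat);
    vsnprintf(parameters, sizeof parameters, iFormat, args);
    va_end(args);
    fake_syscall(DVS_SYSCALL_NUMBER, parameters);
}
```

In this sketch every `%u` must be matched by an unsigned integer argument, as with `printf`.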


Experiments

• Comparison of total values of original XTREM and adapted XTREM

• Simulation of MP3 decoder
  – 20 DVS marks
  – 3 DVS algorithms
  – 4 test files

DVS algorithms

• Constant algorithm
  – Similar to using no DVS

• Worst Case Execution Path (WCEP) algorithm
  – At each DVS mark, the lowest f and V are calculated for which the deadline is still reached in all cases

• Oracle algorithm
  – f and V are calculated using the execution path, which must be known in advance
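The WCEP rule can be sketched as a frequency selection at each mark. The sketch assumes that RWEC denotes the remaining worst-case execution cycles and that the processor offers a discrete, ascending list of frequency levels. Both are assumptions, as are the function name and the levels used in the usage note; the matching voltage selection is left out.

```c
#include <stddef.h>

/* Lowest available frequency (Hz) at which rwec_cycles still finish within
 * time_left_s. Falls back to the highest level if none is fast enough.
 * levels_hz must be sorted in ascending order and n must be at least 1. */
double lowest_safe_frequency(const double *levels_hz, size_t n,
                             double rwec_cycles, double time_left_s)
{
    double required_hz = rwec_cycles / time_left_s;
    size_t i;

    for (i = 0; i < n; i++) {
        if (levels_hz[i] >= required_hz)
            return levels_hz[i];
    }
    return levels_hz[n - 1];
}
```

With hypothetical levels of 100, 200, 300 and 400 MHz, 3425256 remaining cycles and 0.026 s until the deadline require about 132 MHz, so the 200 MHz level would be chosen.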

Frequency graph

[Figure: frequency (Hz, axis from 0 to 5.00E+08) versus time (s, axis from 0 to 0.07) for the Constant, WCEP and Oracle algorithms, with two deadlines marked]

Power graph

[Figure: power (W, axis from 0 to 5) versus time (s, axis from 0 to 0.07) for the Constant, WCEP and Oracle algorithms, with two deadlines marked]

Total energy consumption

Algorithm   Energy (J)   Energy saved (%)
Constant    345.5        0.0
WCEP        334.3        3.2
Oracle      310.9        10.0
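The "Energy saved" column follows from the energy column when the Constant algorithm is taken as the baseline. A minimal sketch of that calculation, with a hypothetical function name:

```c
/* Percentage of energy saved relative to a baseline, both in joules. */
double energy_saved_percent(double baseline_j, double energy_j)
{
    return 100.0 * (baseline_j - energy_j) / baseline_j;
}
```

Applied to the values above with 345.5 J as the baseline, this gives about 3.2% for WCEP and about 10.0% for Oracle, matching the table after rounding to one decimal.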

Power distribution

DEC   14.6%
BTB   5.5%
MM    28.6%
CLK   15.4%
I$    8.6%
MEM   10.3%
D$    6.3%
FB    0.0%
PB    0.0%
WB    1.9%
SHF   0.0%
MAC   0.0%
REG   2.7%
ALU   5.9%


Future Work

• Other DVS algorithms

• Add more variables to power formulas

• Improve accuracy of power simulator