physical-aware task migration algorithm for dynamic ...€¦ · • process variation, •...

37
Physical-Aware Task Migration Algorithm for Dynamic Thermal Management of SMT Multi-Core Processors 1 Bagher Salami †, Mohammadreza Baharani § , Hamid Noori †, Farhad Mehdipour † School of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran § School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran ‡ E-JUST Center, Graduate School of Information Science and Electrical Eng., Kyushu University, Fukuoka, Japan

Upload: others

Post on 19-Jun-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Physical-Aware Task Migration Algorithm for

Dynamic Thermal Management of

SMT Multi-Core Processors

1

Bagher Salami †, Mohammadreza Baharani § , Hamid Noori †,

Farhad Mehdipour ‡

† School of Engineering, Ferdowsi University of Mashhad, Mashhad, Iran

§ School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran

‡ E-JUST Center, Graduate School of Information Science and Electrical Eng., Kyushu University,

Fukuoka, Japan

Page 2: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Outline

Introduction

SMT-multi core processors

Problem description

Contributions

Proposed algorithm

Experimental results

Summary

2

Page 3: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Outline

Introduction

SMT-multi core processors

Problem description

Contributions

Proposed algorithm

Experimental results

Summary

3

Page 4: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

4

More computing

power

Shrinking feature

size

Increase power density

Temperature Increment

Page 5: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

5

Temperature increment

Performance

Implementation Cost

Lifetime Reliability

Expensive Cooling

Equipment

Page 6: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

6

Thermal management techniques

Dynamic

Hardware

Stop-go

DVFS

Software

Task scheduling

Task migration

Static

Hardware Thermal-aware floor planning

Software Thermal-aware

compilation

Page 7: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Outline

Introduction

SMT-multi core processors

Problem description

Contributions

Proposed algorithm

Experimental results

Summary

7

Page 8: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Simultaneous Multi Threading (SMT) (1/2)

8

• General purpose registers

• Control registers

• Machine state registers Dedicated resources

• Caches

• Execution units

• Buses Shared resources

A single physical processor appears as two logical

processors

The physical execution resources are shared and the

architecture state is duplicated

Page 9: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Simultaneous Multi Threading (SMT) (2/2)

9

A two-context SMT pipeline*

* Marr, Deborah T, “Hyper-threading technology architecture and microarchitecture: A hypertext history,“ Intel

Technology Journal, 2002

Page 10: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Outline

Introduction

SMT-multi core processors

Problem description

Contributions

Proposed algorithm

Experimental results

Summary

10

Page 11: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Problem Description

11

Target

System Specification

• An SMT multi-core processor with N cores

• There are N+1 to 2×N tasks for execution

• Each core can execute up to two threads

simultaneously (two-context SMT cores)

• Each of N cores has a physical thermal sensor

The problem:

How to schedule tasks among cores dynamically so that:

• The average and peak temperature under minimum

performance loss is minimized

• Temperature does not violate Tmax.

Page 12: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Outline

Introduction

SMT-multi core processors

Problem description

Contributions

Proposed algorithm

Experimental results

Summary

12

Page 13: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Contributions

13

• Considering features of SMT-multicores in task scheduling,

that leads to optimal thermal management

• Task migration based on physical features of the cores

• Thermal management of off-the-shelf SMT multi-core

products

Page 14: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Physical Features of Cores (1/3)

14

Different cores of a processor do not have similar thermal

behavior due to:

• Process variation,

• Packaging, and

• The temperature effect of neighbor components,

The temperature difference between cores of a processor

running the same application can be as much as 10∼15◦C*

* Inchoon Yeo, Chih Chun Liu, and Eun Jung Kim, “Predictive dynamic thermal management for multicore

systems,” DATE, 2008.

Page 15: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Physical Features of Cores (2/2)

15

Benchmark

Executed

on core 0

Executed

on core 1

Executed

on core 2

Executed

on core 3

gcc 59◦C 58◦C 61◦C 64◦C

hmmer 69◦C 62◦C 63◦C 66◦C

bzip2 69◦C 67◦C 69◦C 71◦C

Peak temperature differential between cores on Intel Core i7-3770

Benchmark

Executed

on core 0

Executed

on core 1

Executed

on core 2

Executed

on core 3

gcc 56◦C 56◦C 57◦C 55◦C

hmmer 60◦C 60◦C 62◦C 58◦C

bzip2 59◦C 59◦C 60◦C 58◦C

Peak temperature differential between cores on Intel Core i7-2600

Page 16: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

16

Peak temperature differential between cores on AMD's 8-core Bulldozer

Be

nc

hm

ark

Ex

ec

ute

d

on

co

re 0

Ex

ec

ute

d

on

co

re 1

Ex

ec

ute

d

on

co

re 2

Ex

ec

ute

d

on

co

re 3

Execu

ted

on

co

re 4

Ex

ec

ute

d

on

co

re 5

Ex

ec

ute

d

on

co

re 6

Execu

ted

on

co

re 7

gcc 41.5 42 45 44.5 42 42 43 43.5

hmmer 42 43.5 44.5 44.5 42 41.5 43.5 42.5

bzip2 41 41.5 43.5 43.5 41 41 41.5 42.5

Physical Features of Cores (3/3)

Page 17: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Task Scheduling of SMT Multi-Core

Processors

17

Main challenge in task scheduling of SMT multi-core processors:

• Co-scheduling of complementary threads on individual SMT cores*

• Better use of shared pipeline resources

• Improving performance

• Higher heat generation due to higher utilization of pipeline resources

* Donald, James, and Margaret Martonosi. “Techniques for multicore thermal manage-ment: Classification

and new exploration,” ACM SIGARCH Computer Architecture News 34, pp. 78-88, 2006.

In this work:

• Five different strategies are studied,

• Purpose: finding the most suitable pairs of tasks that should be co-scheduled

Page 18: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Complimentary task scheduling Higher performance

Execution on cold cores Lower temperature

Different Strategies of Task Scheduling

of SMT Multi-Core Processors (1/4)

18

First Strategy: Sorting cores based on the temperature of physical sensors

• Sort tasks based on performance counter values,

• Hottest and coolest tasks are co-scheduled to the coldest core,

• The second hottest and coolest tasks are co-scheduled to the second coolest core,

• The above process continues.

Page 19: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Complimentary task scheduling Higher performance

Execution on cold cores Lower temperature

Different Strategies of Task Scheduling

of SMT Multi-Core Processors (2/4)

19

Second Strategy: Sorting cores based on the physical features of cores

• Sort tasks based on performance counter values,

• Hottest and coolest tasks are co-scheduled to the coldest core,

• The second hottest and coolest tasks are co-scheduled to the second coolest core,

• The above process continues.

Page 20: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

20

• Third Strategy:

• After sorting cores according to their physical features, the first two hottest tasks are co-scheduled to the coldest core.

• Fourth Strategy:

• Similar to the third strategy, except

• the first two coldest tasks are assigned to the coldest core.

• Fifth Strategy:

• Reschedules tasks between cores only in critical situation (the core that has tr<tres)

Different Strategies of Task Scheduling

of SMT Multi-Core Processors (3/4)

According to experiments: The second strategy leads to the best result

the predicted time when the core reaches temperature threshold

the response time for the algorithm to decrease the core temperature

Page 21: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

21

More details on the second strategy

Different Strategies of Task Scheduling

of SMT Multi-Core Processors (4/4)

Page 22: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Outline

Introduction

SMT-multi core processors

Problem description

Contributions

Proposed algorithm

Experimental results

Summary

22

Page 23: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Analyze the migration frequency(Migration#)

Start

Migration#= 0

Increment(Tthr) Decrement(Tthr)

Predict tr (required time to reach Tthr) for all cores

Is there at least one core: tr <= tres

Reschedule and migrate tasks

Decrement(fcur)

fcur <=fmin

And(Migration#) = 0

Increment(fcur)

YES YES

YES YES

Threshold Management

Temperature Management

Performance Management

Predict tr (required time to reach Tthr) for all cores

Is there at least one core: tr <= tresYES

Migration# >=

Migrationlimit

&&Tthr<Tmax

Tmax=70(◦C), Tthr=Tmax

tres=2 (sec)Migrationlimit=5

Analyze the migration frequency(Migration#)

Migration#= 0

Increment(Tthr) Decrement(Tthr)

YES YES

Threshold Management

Migration# >=

Migrationlimit

&&Tthr<Tmax

Predict tr (required time to reach Tthr) for all cores

Is there at least one core: tr <= tres

Reschedule and migrate tasks

Decrement(fcur)

YES

Temperature Management

Predict tr (required time to reach Tthr) for all cores

Is there at least one core: tr <= tresYES

fcur <=fmin

And(Migration#) = 0

Increment(fcur)

YES

Performance Management

The Proposed Algorithm

23

Page 24: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Temperature Prediction*

24

0 50 100 150 200 250 300 350 40060

65

70

75

80

85

90

95

100

105

110

Tem

pera

ture

( C

)

Time(sec)

(a)

Processor Temp

Tss

𝑑𝑇

𝑑𝑡= 𝑐 × 𝑇𝑠𝑠 − 𝑇 ,

Steady state temperature of hmmer on intel Core i7-2600

* Inchoon Yeo and Eun Jung Kim, “Temperature-aware scheduler based on thermal behavior grouping in

multicore systems,” DATE, 2009.

a core-specific constant

steady state temperature

current temperature

initial processor temperature

Page 25: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Improved Temperature Prediction

25 𝑑𝑇

𝑑𝑡= 𝑐 × w × 𝑇𝑠𝑠 − 𝑇 , 𝑇 𝑡 = 𝑇𝑆𝑆 − ( 𝑇𝑆𝑆−𝑇𝑖𝑛𝑖𝑡) × 𝑒−𝑐×𝑤×𝑡.

(a) (b)

(a) 𝑇𝑠𝑠of hmmer on core 3 and other cores are idle.

(b) 𝑇𝑠𝑠of hmmer on core 3 and core 2 and core 1 executing gcc.

0 50 100 150 200 250 3000

10

20

30

40

50

60

70

80

90

100

Tem

pera

ture

( C

)

Time(sec)

Processor Temp

Tss

0 50 100 150 200 250 3000

10

20

30

40

50

60

70

80

90

100

Tem

pera

ture

( C

)

Time(sec)

Processor Temp

Tss

core activity

Page 26: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Outline

Introduction

SMT-multi core processors

Problem description

Contributions

Proposed algorithm

Experimental results

Summary

26

Page 27: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Experimental Setup

27

Benchmarks hmmer libquantum sjeng

perlbenc

h gobmk gcc mcf bzip2

Avg. Temp

(◦C) 68.2 67 65.7 65 63.9 63.9 63 62.9

System Specifications Value

Number of cores 4

Processor Intel Core i7-3770

Main memory size GB 8

OS Linux 3.2.0

Page 28: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

28

In order to decrease performance overhead using

performance counter:

Seeking performance counter with the greatest

correlation with temperature of programs

Performance Counter Analysis (1/3)

We use

performance counters to:

1) Distinguish the cold thread from the hot

thread on a two-context SMT core, and

2) To sort applications according to their

temperature.

Page 29: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

29

Events Correlation

stalled-cycles-backend -0.37

cache-references -0.35

stalled-cycles-frontend -0.35

cache-misses -0.33

Cycles -0.29

task-clock -0.24

context-switches -0.03

branches -0.03

page-faults -0.01

branch-misses 0.02

CPU-migrations 0.04

Instructions 0.29

IPC 0.30

Correlation between events and applications temperature

Performance Counter Analysis (2/3)

greatest correlation with temperature

Page 30: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

30

0 10 20 30 40 50 60 70 8035

40

45

50

55

60

Tem

pera

ture

( C

)

Time(sec)

(d)

PATM-Low Correlation

PATM-High Correlation

Linux

PATM (Physical-Aware Task Migration - our proposed method)’s

average temperature using high and low correlation counter for

application ordering vs. Linux standard scheduler.

Performance Counter Analysis (3/3)

Page 31: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

31

Strategy 1 Strategy 2 Strategy 3 Strategy 4 Strategy 5-2

0

2

4

6

8

10

12

14

Impro

vem

ent(

%)

AverageTemperature

PeakTemperature

Performance

Aggregated Objective Function

Performance, and average and peak temperature improvement of

different strategies compared to Linux standard scheduler

Comparison of Task Scheduling Strategies

Best improvement

Page 32: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

32

Experimental Results

Compared methods:

(1) PDTM (Predictive Dynamic Thermal Management):

One of the first attempts that predicts core temperature and

migrates applications from the possible overheated core to the

future coolest core in order to maintain the processor

temperature below a threshold temperature.

(2) TAS (Thermal Aware Scheduler):

Similar PDTM but categorizes applications according to their

thermal behavior.

Page 33: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

33

Temperature management

Comparison of different algorithms with various number of programs in

terms of (a) average temperature (b) peak temperature, and (c) run-time.

Experimental Results

• Running 5 to 8 program simultaneously with four different algorithms: Linux,

TAS, PDTM, and our proposed method PATM.

• Comparing average and peak temperature and run-time.

Page 34: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

34

Temperature management

DTM Algorithm Average

Temp.

Peak

Temp.

Run

Time(Second)

Linux 50.9(◦C) 64(◦C) 912.5(Sec)

PDTM 47.9(◦C) 59(◦C) 924.8(Sec)

TAS 47.8(◦C) 60(◦C) 916.8(Sec)

PATM 47.3(◦C) 56(◦C) 928.3(Sec)

Improvement PATM vs. PDTM 1.3% 5.8% -0.4%

Improvement PATM vs. TAS 1.1% 8.1% -1.3%

Improvement PATM vs. Linux 7.7% 13.9% -1.7%

Total comparison results of PATM against Linux, TAS, and PDTM.

Experimental Results

Page 35: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Outline

Introduction

SMT-multi core processors

Problem description

Contributions

Proposed algorithm

Experimental results

Summary

35

Page 36: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

Summary

36

• A dynamic thermal management algorithm for SMT multicore

processors.

• The algorithm manages processor temperature taking into

account the workload phase (being hot or cold) and

physical features of cores.

• Experimental results indicate that:

• Our algorithm outperforms Linux standard scheduler, TAS,

and PDTM in terms of thermal management,

• Performance overhead is negligible.

Page 37: Physical-Aware Task Migration Algorithm for Dynamic ...€¦ · • Process variation, • Packaging, and • The temperature effect of neighbor components, The temperature difference

37

Thank You!