
Automatic Calibration of Performance Models on Heterogeneous Multi-core Architectures
Authors: Cedric Augonnet, Samuel Thibault, and Raymond Namyst
INRIA Bordeaux, LaBRI, University of Bordeaux
Workshop on Highly Parallel Processing on a Chip (HPPC 2009)




Page 2: Outline

1) Introduction
2) What is StarPU?
3) How to define and build performance models?
4) Build history-based performance models dynamically
5) Experimental validation
6) Conclusion

Page 3: Introduction

Multi-core architectures featuring specialized accelerators
◦ These are receiving an increasing amount of attention.
◦ This success will probably influence the design of future High Performance Computing hardware.

Homogeneous multi-core systems → heterogeneous multi-core systems

Static prediction → dynamic prediction

Page 4: Introduction (cont.)

Auto-tuning performance prediction approach
◦ Based on performance history tables built dynamically during the application run.

Page 5: What is StarPU?

StarPU is a runtime system for task scheduling on heterogeneous multi-core architectures.

The design of StarPU is organized around three main components:
◦ A unified execution model.
◦ A data management library.
◦ A scheduling framework.


Page 7: Scheduling strategies based on performance models
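
As a minimal sketch of how a scheduling strategy could use a performance model, the snippet below assigns a task to the processing unit with the earliest expected finish time. The slide only names the idea, so this policy, the function name `pick_worker`, the worker names, and all timings are hypothetical and are not taken from StarPU.

```python
def pick_worker(ready_at, predicted):
    """Return (worker, finish_time) for the earliest expected finish.

    ready_at:  worker name -> time at which the worker becomes idle
    predicted: worker name -> predicted duration of the task on that worker
    """
    best = min(predicted, key=lambda w: ready_at[w] + predicted[w])
    return best, ready_at[best] + predicted[best]


# Hypothetical numbers: the GPU is faster for this task but stays busy longer.
ready_at = {"cpu0": 0.0, "gpu0": 5.0}
predicted = {"cpu0": 8.0, "gpu0": 2.0}
worker, finish = pick_worker(ready_at, predicted)
print(worker, finish)
```

The choice depends on both the predicted duration and the current load, which is why an inaccurate model can lead directly to a poor placement.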

Page 8: How to define and build performance models?

Defining a performance model:
◦ Decide which parameters the model should depend on.
◦ Find the relationship between these parameters.

Page 9: How to define and build performance models? (cont.)

Building a performance model:
◦ It is common to use a dedicated pre-calibration program to build such models.
◦ It is, however, possible to design a model based on the amount of computation per task and to calibrate its parameters by means of a regression.
◦ StarPU can therefore calibrate parametric models automatically, either at runtime for linear regression models or offline for non-linear models.
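
Runtime calibration of a linear model can be sketched as follows, assuming the model has the form time = a + b * work and that each finished task contributes one (work, time) sample. The class `OnlineLinearModel` and the sample values are illustrative assumptions, not StarPU's implementation. Only running sums are stored, so each update costs constant time.

```python
class OnlineLinearModel:
    """Least-squares fit of time = a + b * work, updated one sample at a time."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def add(self, work, seconds):
        # Accumulate the sums needed by the closed-form least-squares solution.
        self.n += 1
        self.sx += work
        self.sy += seconds
        self.sxx += work * work
        self.sxy += work * seconds

    def coefficients(self):
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or denom == 0:
            raise ValueError("need at least two samples with distinct work values")
        b = (self.n * self.sxy - self.sx * self.sy) / denom
        a = (self.sy - b * self.sx) / self.n
        return a, b

    def predict(self, work):
        a, b = self.coefficients()
        return a + b * work


model = OnlineLinearModel()
# Made-up samples that follow time = 1 + 2 * work exactly.
for work, seconds in [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]:
    model.add(work, seconds)
print(model.coefficients(), model.predict(4.0))
```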

Page 10: Regression analysis

Regression analysis builds a model relating a dependent variable to one or more independent variables.

With this model, the value of the dependent variable can be predicted from the independent variables.

The common cases are linear regression and non-linear regression.
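
To illustrate the non-linear case, here is a hedged sketch that fits a power law time = a * size**b by running a linear regression on the logarithms of both variables. The power-law form, the function name `fit_power_law`, and the synthetic data are assumptions chosen for the example. The slides do not state which non-linear model is used.

```python
import math


def fit_power_law(sizes, times):
    """Fit time = a * size**b by least squares on (log size, log time)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope in log space is the exponent
    a = math.exp(mean_y - b * mean_x)  # intercept in log space is log(a)
    return a, b


sizes = [1.0, 2.0, 4.0, 8.0]
times = [2.0 * s ** 3 for s in sizes]  # synthetic data generated from a = 2, b = 3
a, b = fit_power_law(sizes, times)
print(a, b)
```

Fitting in log space minimizes relative rather than absolute error, which is a design choice of this sketch and not something stated in the slides.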

Page 11: Build history-based performance models dynamically

Measuring task durations.

Identifying task kinds.

Feeding the model and looking up predictions from it.
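
The first of these steps, measuring a task's duration, can be sketched on the CPU side with a monotonic clock. The helper `timed_call` is hypothetical. How a runtime such as StarPU actually times tasks on accelerators is not described in the slides.

```python
import time


def timed_call(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()  # monotonic, high-resolution clock
    result = fn(*args)
    return result, time.perf_counter() - start


result, elapsed = timed_call(sum, range(1000))
print(result, elapsed)
```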


Page 13: Identifying task kinds
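
One way to identify a task kind is to hash the layout of the data the task works on. The sketch below assumes that two tasks are of the same kind when their buffers have the same shapes. The slides only say that a hash is computed for each task, so the function `task_footprint` and the use of CRC-32 are illustrative choices.

```python
import zlib


def task_footprint(buffer_shapes):
    """Hash the shapes of a task's data buffers into a 32-bit footprint."""
    crc = 0
    for shape in buffer_shapes:
        for dim in shape:
            crc = zlib.crc32(int(dim).to_bytes(8, "little"), crc)
        # Separator so that [(2, 3)] and [(2,), (3,)] hash differently.
        crc = zlib.crc32(b"|", crc)
    return crc


same_a = task_footprint([(1024, 1024), (1024,)])
same_b = task_footprint([(1024, 1024), (1024,)])
other = task_footprint([(512, 512), (512,)])
print(same_a == same_b, same_a == other)
```

Because the footprint depends only on the data layout, it can be computed when the task is submitted, before any timing information exists.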

Page 14: Feeding and looking up from the model

Each computational kernel is associated with one hash table per architecture.

Steps:
1. A task is submitted to StarPU.
2. StarPU computes the task's hash.
3. StarPU consults the hash table for the corresponding kernel-architecture pair to retrieve the average execution time previously measured for this kind of task.
4. StarPU updates the hash table and saves it to a file. (These performance models persist across runs.)
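
The steps above can be sketched as a small history table. This is a minimal illustration under stated assumptions: the class `HistoryModel`, the JSON file format, and the kernel and architecture names are invented for the example and do not reflect StarPU's actual data structures or file layout.

```python
import json
import os
import tempfile
from collections import defaultdict


class HistoryModel:
    """One table per (kernel, architecture) pair, keyed by task hash."""

    def __init__(self):
        # (kernel, arch) -> {task_hash: [sample_count, mean_seconds]}
        self.tables = defaultdict(dict)

    def lookup(self, kernel, arch, task_hash):
        """Return the average measured time, or None if this kind is unknown."""
        entry = self.tables.get((kernel, arch), {}).get(task_hash)
        return None if entry is None else entry[1]

    def update(self, kernel, arch, task_hash, seconds):
        entry = self.tables[(kernel, arch)].setdefault(task_hash, [0, 0.0])
        entry[0] += 1
        entry[1] += (seconds - entry[1]) / entry[0]  # incremental mean

    def save(self, path):
        rows = [[k, a, h, n, mean]
                for (k, a), table in self.tables.items()
                for h, (n, mean) in table.items()]
        with open(path, "w") as f:
            json.dump(rows, f)

    @classmethod
    def load(cls, path):
        model = cls()
        with open(path) as f:
            for k, a, h, n, mean in json.load(f):
                model.tables[(k, a)][h] = [n, mean]
        return model


history = HistoryModel()
history.update("gemm", "gpu", 42, 2.0)  # hypothetical kernel, arch, and hash
history.update("gemm", "gpu", 42, 4.0)
path = os.path.join(tempfile.mkdtemp(), "model.json")
history.save(path)
restored = HistoryModel.load(path)
print(restored.lookup("gemm", "gpu", 42), restored.lookup("gemm", "cpu", 42))
```

Storing a sample count next to the mean lets the average be updated in constant time without keeping every past measurement.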

Page 15: Experimental validation

Environment:
◦ The authors implemented these automatic model calibration mechanisms in StarPU.
◦ Multi-core CPU, GPU, Cell processor (SPU)

Page 16: Sharpness of the performance prediction


Page 18: Performance feedback tools


Page 20: Conclusion

We have proposed a generic approach to seamlessly build history-based performance models.

It has been implemented within the StarPU runtime system with the support of its integrated data management library, and we have shown how StarPU's performance feedback tools help the programmer analyze whether the resulting performance predictions are relevant.