tmpa-2017: using functional directives to analyze code complexity and communication

32
:: :: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :: Daniel Rubio Bonilla TMPA 2017 Using Funconal Direcves to Analyze code Complexity and Communicaon Daniel Rubio Bonilla HLRS – University of Stugart

Upload: iosif-itkin

Post on 11-Apr-2017

144 views

Category:

Technology


0 download

TRANSCRIPT

Page 1: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: Daniel Rubio BonillaTMPA 2017

Using Functional Directives to Analyze code Complexity and Communication

Daniel Rubio BonillaHLRS – University of Stuttgart

Page 2: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 2

CPU Evolution

Daniel Rubio BonillaTMPA 2017

Page 3: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 3

Hazel Hen

CPU

E5-2680 v312 Cores30MiB Cache2.5 GhZ

Node 2 CPUs – 24C128 GB

Comp. Nodes 7712

Total Cores 185,088

Performance 7420 TFlops

Storage ~10 PB

Weight 61.5 T

Power 3200 KW

Daniel Rubio BonillaTMPA 2017

Page 4: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 4

Amdahl Law

# processing units

speedup100% parallelizable

98% parallelizable

90% parallelizable

50% parallelizable

Daniel Rubio BonillaTMPA 2017

Page 5: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 5

Real Amdahl Law

# processing units

speedup100% parallelizable

98% parallelizable

90% parallelizable

50% parallelizable

Daniel Rubio BonillaTMPA 2017

Page 6: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 6

The Problem

Daniel Rubio Bonilla

In High Performance Computing…

• Performance is increased by• Integrating more cores (millions!?)• Using heterogeneous accelerators (GPU, FPGA, ...)

• Issues• Programmability• Portability

TMPA 2017

Page 7: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 7

New Approaches

Different Programming Model

• Focused on mathematical problems• Engineering • Science

• To enable:• Parallelization and concurrency• Portability across different hardware and accelerators

Daniel Rubio BonillaTMPA 2017

Page 8: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 8

Our Approach

To obtain the structural information of the application by annotating the imperative code with a functional-like

directives (mathematical / algorithmic structure)

• The main difficulty in this approach are:• “deriving” the structure of the application• matching the structure to the source code

Daniel Rubio BonillaTMPA 2017

Page 9: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 9

Higher Order Functions

• Higher Order functions are mathematical functions • Takes one or more function as an argument• Can return a function as a result

• Clear repetitive execution structure

• These structures can be transformed to equivalent ones• But with different non-functional properties

Daniel Rubio BonillaTMPA 2017

Page 10: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 10

map :: (a -> b) -> [a] -> [b]

map (*2) [1,2,3,4] = [2,4,6,8]

Higher Order Functions

• Apply to all:

Daniel Rubio BonillaTMPA 2017

Page 11: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 11

foldl :: (a -> b -> a) -> a -> [b] -> a

foldl (+) 0 [1,2,3,4] = 10

map :: (a -> b) -> [a] -> [b]

map (*2) [1,2,3,4] = [2,4,6,8]

Higher Order Functions

• Apply to all:

• Reduction:

Daniel Rubio BonillaTMPA 2017

Page 12: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 12

Other Higher Order Functions

Daniel Rubio BonillaTMPA 2017

Page 13: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 13

Higher Order Functions Transformations

total = foldl (+) 0 vs

One possible transformation

Only if the operation is associative and we knowits neutral element

Daniel Rubio BonillaTMPA 2017

Page 14: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 14

Transformations

• Changes in the mathematical formulation• Or the algorithm execution

• Produce equivalent code• Change computing load• Change memory distribution• Modify communication

• Allow adaptation to different architectures

• While maintaining correctness!

Daniel Rubio BonillaTMPA 2017

Page 15: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 15

Hierarchical Structures

• Functional annotations allow the construction of multiple structural levels:• Emerging complexity of the structural information

• We distinguish between:• Output of one Higher Order Function is input of another

• This can be achieved by analyzing the data dependencies between the functions

• The operator of one (Higher Order) Function is composed of other functions

Daniel Rubio BonillaTMPA 2017

Page 16: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 16

Flat Structure

Graph of a Complex Structure of two same level Higher Order Functions (HOFs)• The output of one HOF is the input for another HOF

foldl (+) 0 (map (*2) [0..n-1])

foldl :: (a -> b -> a) -> a -> [b] -> amap :: (a -> b) -> [a] -> [b]

Daniel Rubio BonillaTMPA 2017

Page 17: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 17

Hierarchical Structure

Graph of a Complex Hierarchical Structure of two different level Higher Order Functions (HOF)• The operator of one HOF is another HOF

map (foldl (+) 0) [[..]..[..]]

foldl :: (a -> b -> a) -> a -> [b] -> amap :: (a -> b) -> [a] -> [b]

Daniel Rubio BonillaTMPA 2017

Page 18: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 18

Other requirements

• Strong binding between directives and code

• Description of memory organization

Daniel Rubio BonillaTMPA 2017

Page 19: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 19

Example - Heat

 

t

Daniel Rubio Bonilla

 

1-D heat dissipation function

Discretization

TMPA 2017

Page 20: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 20

Complexity

Daniel Rubio BonillaTMPA 2017

O(N_ELEM)

O(N_ITER)

O(N_ELEM)

O(1)

O(1)

O(1)

Page 21: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 21

Complexity

Daniel Rubio BonillaTMPA 2017

O(N_ELEM)

O(N_ITER)

O(N_ELEM)

O(1)

O(1)

O(1)

O(1) + O(1) + O(N_ITER) * (O(N_ELEM)*O(1) + O(N_ELEM))

O(N_ITER * N_ELEM)

Page 22: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 22

Transformations – Partitioning 1

let heatDiffusion = itn HEATTIMESTEP hm_array N_ITERPAR1 v = stencil1D TKernel 1 v where TKernel x y z = y + K * (x - 2*y + z)HEATTIMESTEP vs = map PAR1 vs

Daniel Rubio BonillaTMPA 2017

Page 23: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 23

Transformations – Partitioning 2

let heatDiffusion = itn HEATTIMESTEP hm_array N_ITERPAR2 v = stencil1D TKernel 1 v where TKernel x y z = y + K * (x - 2*y + z)PAR1 vs = map PAR2 vsHEATTIMESTEP vss = map PAR1 vss

Daniel Rubio BonillaTMPA 2017

Page 24: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 24

Platform Specific Transformations

• OpenMP:• Relatively straightforward

• MPI:• Communication• Halos

Daniel Rubio BonillaTMPA 2017

Page 25: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 25

Transformed Code

if (rank < size - 1) MPI_Send(&hm[LOCAL_N_ELEM],1, MPI_FLOAT, rank + 1, 0, MPI_COMM_WORLD);if (rank > 0) MPI_Recv(&hm[0], 1, MPI_FLOAT, rank-1, 0, MPI_COMM_WORLD, &status);if (rank > 0) MPI_Send(&hm[1], 1, MPI_FLOAT, rank-1, 1, MPI_COMM_WORLD );if (rank < size - 1) MPI_Recv(&hm[LOCAL_N_ELEM+1],1,MPI_FLOAT, rank+1, 1, MPI_COMM_WORLD, \ &status);

#pragma polca stencil1D 1 G hm hm_tmp#pragma omp parallel forfor(i=1; i<LOCAL_N_ELEM+1; i++){#pragma polca G#pragma polca input (hm[i-1] hm[i] hm[i+1]) output(hm_tmp[i]) hm_tmp[i] = hm[i] + K * (hm[i-1] + hm[i+1] - 2 * hm[i]);}

Daniel Rubio BonillaTMPA 2017

Page 26: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 26

Example - NBody

t

Daniel Rubio Bonilla

N-Body Problem

TMPA 2017

Three steps1) Calculate Forces2) Update Velocities3) Calculate Position

Page 27: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 27

Structure

Daniel Rubio BonillaTMPA 2017

O(1)

O(1)

O(1)

O(nIters)

O(nBodies)

O(nBodies)

O(nBodies)

O(nBodies)

Page 28: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 28

Structure

Daniel Rubio BonillaTMPA 2017

O(nIters)

O(nBodies2)

O(nBodies)

O(nBodies)

Page 29: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 29

Structure

Daniel Rubio BonillaTMPA 2017

O(nIters) * (O(nBodies2) + 2*O(nBodies))

O(nIters * nBodies2)

O(nBodies2)

O(nBodies)

O(nBodies)

Page 30: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 30

Communication

Daniel Rubio BonillaTMPA 2017

Parallel

Parallel

Sequential

Parallel (with caution)

Page 31: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 31

Conclusion

• Functional semantics can enable code:• Transformation• Adaptation

• But also...• Algorithmic complexity analysis• Communication patterns

• This information helps to predict application’s behavior

Daniel Rubio BonillaTMPA 2017

Page 32: TMPA-2017: Using Functional Directives to Analyze Code Complexity and Communication

:: ::

::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: ::::: :::::

:: 32

Questions

Thank you!

Contact:[email protected]

Projects:POLCA www.polca-project.euSmart-Dash www.dash-project.orgCλaSH www.clash-lang.org

Daniel Rubio BonillaTMPA 2017