score-p and scalasca and scalasca... · portable open-source tools for scalable performance...

Post on 16-Oct-2020

4 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Score-P and ScalascaPortable open-source tools for scalableperformance analysis

February 1, 2014 Alexandre Otto Strube

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Outline

Going Exascale

Scalasca

We’re not alone

Things got messy

Unification

Who uses/develops Score-P

What is ours

Extreme scalability

The future

February 1, 2014 Alexandre Otto Strube Slide 2

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Going Exascale

February 1, 2014 Alexandre Otto Strube Slide 3

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

TL;DR

Single core perfomance peaking

# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow

February 1, 2014 Alexandre Otto Strube Slide 5

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

TL;DR

Single core perfomance peaking# of cores increasing

Hybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow

February 1, 2014 Alexandre Otto Strube Slide 5

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

TL;DR

Single core perfomance peaking# of cores increasingHybrid environments

That affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow

February 1, 2014 Alexandre Otto Strube Slide 5

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

TL;DR

Single core perfomance peaking# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOW

HPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow

February 1, 2014 Alexandre Otto Strube Slide 5

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

TL;DR

Single core perfomance peaking# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearhead

We only find the problems before the othersSupercomputers of today → notebooks of tomorrow

February 1, 2014 Alexandre Otto Strube Slide 5

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

TL;DR

Single core perfomance peaking# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the others

Supercomputers of today → notebooks of tomorrow

February 1, 2014 Alexandre Otto Strube Slide 5

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

TL;DR

Single core perfomance peaking# of cores increasingHybrid environmentsThat affects YOU - TODAY - RIGHT NOWHPC is just the spearheadWe only find the problems before the othersSupercomputers of today → notebooks of tomorrow

February 1, 2014 Alexandre Otto Strube Slide 5

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

It doesn’t get easier

Increasing machine complexity (gpu, accelerators, etc)

Every doubling of scale reveals a new bottleneckPerturbation and data volumeDrawing insight from measurements

February 1, 2014 Alexandre Otto Strube Slide 6

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

It doesn’t get easier

Increasing machine complexity (gpu, accelerators, etc)Every doubling of scale reveals a new bottleneck

Perturbation and data volumeDrawing insight from measurements

February 1, 2014 Alexandre Otto Strube Slide 6

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

It doesn’t get easier

Increasing machine complexity (gpu, accelerators, etc)Every doubling of scale reveals a new bottleneckPerturbation and data volume

Drawing insight from measurements

February 1, 2014 Alexandre Otto Strube Slide 6

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

It doesn’t get easier

Increasing machine complexity (gpu, accelerators, etc)Every doubling of scale reveals a new bottleneckPerturbation and data volumeDrawing insight from measurements

February 1, 2014 Alexandre Otto Strube Slide 6

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Example: Sweep3d Wait States on BG/P (2010)

February 1, 2014 Alexandre Otto Strube Slide 7

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

This is an old song

Several performance tools exist, for many years

Most cease to work in huge processor/core countsKOJAK performance tool was created 16 years ago.

February 1, 2014 Alexandre Otto Strube Slide 8

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

This is an old song

Several performance tools exist, for many yearsMost cease to work in huge processor/core counts

KOJAK performance tool was created 16 years ago.

February 1, 2014 Alexandre Otto Strube Slide 8

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

This is an old song

Several performance tools exist, for many yearsMost cease to work in huge processor/core countsKOJAK performance tool was created 16 years ago.

February 1, 2014 Alexandre Otto Strube Slide 8

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

Started in 2006 (following KOJAK from ’98)

Goals:

Scalable performance analysis toolsetSpecifically targeting large-scale parallel applications such asthose running on IBM Blue Gene or Cray XT with 10,000s or100,000s of processes

February 1, 2014 Alexandre Otto Strube Slide 9

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

Started in 2006 (following KOJAK from ’98)Goals:

Scalable performance analysis toolsetSpecifically targeting large-scale parallel applications such asthose running on IBM Blue Gene or Cray XT with 10,000s or100,000s of processes

February 1, 2014 Alexandre Otto Strube Slide 9

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

Started in 2006 (following KOJAK from ’98)Goals:

Scalable performance analysis toolset

Specifically targeting large-scale parallel applications such asthose running on IBM Blue Gene or Cray XT with 10,000s or100,000s of processes

February 1, 2014 Alexandre Otto Strube Slide 9

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

Started in 2006 (following KOJAK from ’98)Goals:

Scalable performance analysis toolsetSpecifically targeting large-scale parallel applications such asthose running on IBM Blue Gene or Cray XT with 10,000s or100,000s of processes

February 1, 2014 Alexandre Otto Strube Slide 9

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)

PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:

scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)Portable

IBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:

scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10

Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:

scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:

scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++

MPI 2.2, basic OpenMP & hybrid MPI+OpenMPUnique:

scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:

scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:

scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:scalable trace analysis

Automatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:scalable trace analysisAutomatic wait-state search

Parallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalability

INSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca: Features

Open source (New BSD license)PortableIBM Blue Gene, Cray XT, SGI Altix, IBM SP, blade clusters,Solaris, Linux clusters, NEC SX, K Computer, Fujitsu FX10Supports common parallel programming paradigms &languages

Fortran, C, C++MPI 2.2, basic OpenMP & hybrid MPI+OpenMP

Unique:scalable trace analysisAutomatic wait-state searchParallel replay exploits memory & processors to deliverscalabilityINSIGHTFUL

February 1, 2014 Alexandre Otto Strube Slide 10

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

This looks understandable...

February 1, 2014 Alexandre Otto Strube Slide 11

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

... but this is a real code.

February 1, 2014 Alexandre Otto Strube Slide 12

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

and this.

February 1, 2014 Alexandre Otto Strube Slide 13

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

... it can get really confusing.

February 1, 2014 Alexandre Otto Strube Slide 14

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

February 1, 2014 Alexandre Otto Strube Slide 15

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

February 1, 2014 Alexandre Otto Strube Slide 16

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

February 1, 2014 Alexandre Otto Strube Slide 17

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

February 1, 2014 Alexandre Otto Strube Slide 18

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

February 1, 2014 Alexandre Otto Strube Slide 19

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Scalasca

February 1, 2014 Alexandre Otto Strube Slide 20

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

We’re not alone

Several tools exist

Different goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training

February 1, 2014 Alexandre Otto Strube Slide 21

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

We’re not alone

Several tools existDifferent goals, similar needs

Separate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training

February 1, 2014 Alexandre Otto Strube Slide 21

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

We’re not alone

Several tools existDifferent goals, similar needsSeparate measurement systems and output formats

Complementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training

February 1, 2014 Alexandre Otto Strube Slide 21

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

We’re not alone

Several tools existDifferent goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionality

Redundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training

February 1, 2014 Alexandre Otto Strube Slide 21

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

We’re not alone

Several tools existDifferent goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenance

Limited or expensive interoperabilityComplications for user experience, support, training

February 1, 2014 Alexandre Otto Strube Slide 21

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

We’re not alone

Several tools existDifferent goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperability

Complications for user experience, support, training

February 1, 2014 Alexandre Otto Strube Slide 21

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

We’re not alone

Several tools existDifferent goals, similar needsSeparate measurement systems and output formatsComplementary features and overlapping functionalityRedundant effort for development and maintenanceLimited or expensive interoperabilityComplications for user experience, support, training

February 1, 2014 Alexandre Otto Strube Slide 21

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Things got messy

February 1, 2014 Alexandre Otto Strube Slide 22

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Unification

February 1, 2014 Alexandre Otto Strube Slide 23

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Score-P project idea

Communnity project with common infrastructure

What we share:

Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4

Single development effort, testing, supportSingle installation, interoperability, etc

So, Score-P is the base instrumentation/measurement forseveral projects

February 1, 2014 Alexandre Otto Strube Slide 24

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Score-P project idea

Communnity project with common infrastructureWhat we share:

Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4

Single development effort, testing, supportSingle installation, interoperability, etc

So, Score-P is the base instrumentation/measurement forseveral projects

February 1, 2014 Alexandre Otto Strube Slide 24

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Score-P project idea

Communnity project with common infrastructureWhat we share:

Single instrumentation and measurement system

Common data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4

Single development effort, testing, supportSingle installation, interoperability, etc

So, Score-P is the base instrumentation/measurement forseveral projects

February 1, 2014 Alexandre Otto Strube Slide 24

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Score-P project idea

Communnity project with common infrastructureWhat we share:

Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortraces

Performance report: Cube4

Single development effort, testing, supportSingle installation, interoperability, etc

So, Score-P is the base instrumentation/measurement forseveral projects

February 1, 2014 Alexandre Otto Strube Slide 24

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Score-P project idea

Communnity project with common infrastructureWhat we share:

Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4

Single development effort, testing, supportSingle installation, interoperability, etc

So, Score-P is the base instrumentation/measurement forseveral projects

February 1, 2014 Alexandre Otto Strube Slide 24

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Score-P project idea

Communnity project with common infrastructureWhat we share:

Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4

Single development effort, testing, support

Single installation, interoperability, etc

So, Score-P is the base instrumentation/measurement forseveral projects

February 1, 2014 Alexandre Otto Strube Slide 24

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Score-P project idea

Communnity project with common infrastructureWhat we share:

Single instrumentation and measurement systemCommon data formats: Open Trace Format 2 (OTF2) fortracesPerformance report: Cube4

Single development effort, testing, supportSingle installation, interoperability, etc

So, Score-P is the base instrumentation/measurement forseveral projects

February 1, 2014 Alexandre Otto Strube Slide 24

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Who uses/develops Score-P?

Scalasca (Fz-Juelich, RTWH Aachen)

Vampir (TU Dresden)Periscope (Tu Munich)Tau (U. Oregon)

February 1, 2014 Alexandre Otto Strube Slide 25

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Who uses/develops Score-P?

Scalasca (Fz-Juelich, RTWH Aachen)Vampir (TU Dresden)

Periscope (Tu Munich)Tau (U. Oregon)

February 1, 2014 Alexandre Otto Strube Slide 25

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Who uses/develops Score-P?

Scalasca (Fz-Juelich, RTWH Aachen)Vampir (TU Dresden)Periscope (Tu Munich)

Tau (U. Oregon)

February 1, 2014 Alexandre Otto Strube Slide 25

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Who uses/develops Score-P?

Scalasca (Fz-Juelich, RTWH Aachen)Vampir (TU Dresden)Periscope (Tu Munich)Tau (U. Oregon)

February 1, 2014 Alexandre Otto Strube Slide 25

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

And why we did it?

February 1, 2014 Alexandre Otto Strube Slide 26

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Cleaning the house

February 1, 2014 Alexandre Otto Strube Slide 27

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functions

Generation of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profiles

Needs recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profiles

Needs recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinking

Minimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profiles

Needs recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overhead

Times each function was calledTime spent in each functionAmount of data transferred

Call-path profiles

Needs recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was called

Time spent in each functionAmount of data transferred

Call-path profiles

Needs recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each function

Amount of data transferredCall-path profiles

Needs recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profiles

Needs recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profiles

Needs recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profilesNeeds recompilation

Some overhead - might need filteringTrace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profilesNeeds recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profilesNeeds recompilationSome overhead - might need filtering

Trace analysis

Identifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profilesNeeds recompilationSome overhead - might need filtering

Trace analysisIdentifies inneficiency patterns in communication andsynchronization

Traces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

What do we measure?

Measurement of MPI, OpenMP, User-level functionsGeneration of flat MPI profiles

Only relinkingMinimum overheadTimes each function was calledTime spent in each functionAmount of data transferred

Call-path profilesNeeds recompilationSome overhead - might need filtering

Trace analysisIdentifies inneficiency patterns in communication andsynchronizationTraces can quickly get huge - better filter that

February 1, 2014 Alexandre Otto Strube Slide 28

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Extreme scalability

All parallel:

Data collection/reduction

Analysis:

Pattern searchDelay analysisCritical-path analysis

Visualization

February 1, 2014 Alexandre Otto Strube Slide 29

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Extreme scalability

All parallel:

Data collection/reductionAnalysis:

Pattern searchDelay analysisCritical-path analysis

Visualization

February 1, 2014 Alexandre Otto Strube Slide 29

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Extreme scalability

All parallel:

Data collection/reductionAnalysis:

Pattern search

Delay analysisCritical-path analysis

Visualization

February 1, 2014 Alexandre Otto Strube Slide 29

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Extreme scalability

All parallel:

Data collection/reductionAnalysis:

Pattern searchDelay analysis

Critical-path analysis

Visualization

February 1, 2014 Alexandre Otto Strube Slide 29

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Extreme scalability

All parallel:

Data collection/reductionAnalysis:

Pattern searchDelay analysisCritical-path analysis

Visualization

February 1, 2014 Alexandre Otto Strube Slide 29

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Extreme scalability

All parallel:

Data collection/reductionAnalysis:

Pattern searchDelay analysisCritical-path analysis

Visualization

February 1, 2014 Alexandre Otto Strube Slide 29

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Some MPI patterns

time

pro

cess

ENTER EXIT SEND RECV COLLEXIT

(a) Late Sendertime

pro

cess

(b) Late Receiver

time

pro

cess

(d) Wait at N x Ntime

pro

cess

(c) Late Sender / Wrong Order

February 1, 2014 Alexandre Otto Strube Slide 30

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Late sender

February 1, 2014 Alexandre Otto Strube Slide 31

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Late sender and application topology

February 1, 2014 Alexandre Otto Strube Slide 32

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Direct wait time analysis

A

B

C Recv

Send

Send

foo

foo

foo

bar

bar Recv

bar

time

Recv

Recv

Direct wait

February 1, 2014 Alexandre Otto Strube Slide 33

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Indirect wait time analysis

Recv

Send

Send

foo

foo

foo

bar

bar Recv

A

B

C

bar

time

Recv

Direct waitIndirect wait

Recv

February 1, 2014 Alexandre Otto Strube Slide 34

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Direct wait time

February 1, 2014 Alexandre Otto Strube Slide 35

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Indirect wait time analysis

February 1, 2014 Alexandre Otto Strube Slide 36

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

Root cause analysis

A

B

C Recv

Send

Send

foo

foo

foo

bar

bar Recv

barDELAY

time

Recv

Recv

Direct waitIndirect wait

Recv

cause

February 1, 2014 Alexandre Otto Strube Slide 37

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

6D Hardware topology

February 1, 2014 Alexandre Otto Strube Slide 38

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

The Future

February 1, 2014 Alexandre Otto Strube Slide 39

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

The Future

Energy awareness

Bring performance analysis to YOU!There’s a bunch of experts craving for users and parallelapplication developers!support@score-p.orghttp://www.scalasca.org

February 1, 2014 Alexandre Otto Strube Slide 40

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

The Future

Energy awarenessBring performance analysis to YOU!

There’s a bunch of experts craving for users and parallelapplication developers!support@score-p.orghttp://www.scalasca.org

February 1, 2014 Alexandre Otto Strube Slide 40

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

The Future

Energy awarenessBring performance analysis to YOU!There’s a bunch of experts craving for users and parallelapplication developers!

support@score-p.orghttp://www.scalasca.org

February 1, 2014 Alexandre Otto Strube Slide 40

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

The Future

Energy awarenessBring performance analysis to YOU!There’s a bunch of experts craving for users and parallelapplication developers!support@score-p.org

http://www.scalasca.org

February 1, 2014 Alexandre Otto Strube Slide 40

Mem

bero

fthe

Hel

mho

ltz-A

ssoc

iatio

n

The Future

Energy awarenessBring performance analysis to YOU!There’s a bunch of experts craving for users and parallelapplication developers!support@score-p.orghttp://www.scalasca.org

February 1, 2014 Alexandre Otto Strube Slide 40

top related