High Performance Computing: Concepts, Methods & Means Introduction to Libraries Hartmut Kaiser PhD Center for Computation & Technology Louisiana State University April 17 th , 2007


Page 1

High Performance Computing: Concepts, Methods & Means

Introduction to Libraries

Hartmut Kaiser, PhD

Center for Computation & Technology

Louisiana State University

April 17th, 2007

Page 2

Outline

• Why libraries
• What is a library
• How to use a library
• Standard library support
• Summary - Materials for Test


Page 4

Why libraries?

The following C program is expected to print the elements of the array. But when actually run, it doesn't do so.

#include <stdio.h>

#define TOTAL_ELEMENTS (sizeof(array) / sizeof(array[0]))

int array[] = { 23, 34, 12, 17, 204, 99, 16 };

int main()
{
    int d;
    for (d = -1; d <= (TOTAL_ELEMENTS-2); ++d)
        printf("%d\n", array[d+1]);
    return 0;
}

Find out what's going wrong!

References: [3]
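One way to investigate the puzzle is to look at the types that meet in the loop condition. The sketch below is C++ rather than the slide's C, and the function names are made up for illustration. It writes the implicit conversion out as an explicit cast and assumes a typical platform where `std::size_t` is at least as wide as `int`.

```cpp
#include <cstddef>

static const int array[] = { 23, 34, 12, 17, 204, 99, 16 };

// sizeof yields std::size_t, which is an unsigned type.
static const std::size_t total_elements = sizeof(array) / sizeof(array[0]);

// Counts loop iterations using the comparison the slide's code performs:
// the int d is converted to std::size_t before comparing, so -1 becomes
// a very large value (the conversion is spelled out as a cast here).
int iterations_with_unsigned_compare()
{
    int count = 0;
    for (int d = -1; static_cast<std::size_t>(d) <= total_elements - 2; ++d)
        ++count;
    return count;
}

// Counts iterations when the element count is converted to int first,
// so that both sides of the comparison are signed.
int iterations_with_signed_compare()
{
    int count = 0;
    const int n = static_cast<int>(total_elements);
    for (int d = -1; d <= n - 2; ++d)
        ++count;
    return count;
}
```

Under the stated assumption the first function never enters its loop, while the second visits all seven elements.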

Page 5

Why libraries?

• How can we dare to assume that we are able to write correct code?
• Reuse, reuse, reuse!
  – Lets you concentrate on the science
  – Leverages the knowledge and skills of others
  – Offloads part of your work to the library maintainers
• But: the libraries you use should be
  – High quality
  – Flexible and generic
  – Combinable
  – Preferably available with source code
• Use the right tool for the right job
  – Having a new and shiny hammer doesn't mean everything is a nail


Page 7

What is a library?

• Short history of software libraries
• Different perspectives
• Classification of libraries

Page 8

Short history of software libraries

Bouchon's Loom (1725)

Basile Bouchon, Jean Falcon, Jacques Vaucanson, Joseph Marie Jacquard (1801):

An automated loom that transformed the 18th century textile industry and became the inspiration for future calculating and tabulating machines.

The binary principle embodied in the punched-card operation of the loom was the inspiration for the data processing machines to come.

Picture of Jacquard loom (1830)

References: [1]

Page 9

Short history of software libraries

Charles Babbage's Analytical Engine (1830)

Every set of cards made for any formula will at any future time recalculate that formula with whatever constants may be required. Thus the Analytical Engine will possess a library of its own. Every set of cards once made will at any future time reproduce the calculations for which it was first arranged.

— Passages from the Life of a Philosopher, Charles Babbage (1864)

References: [1]

Page 10

Short history of software libraries

Harvard Mark I: Grace Murray Hopper and Howard Aiken (1944)

Some sequences that were used again and again were permanently wired into the Mark I’s circuits... Since the Mark I was not a stored-program computer, Hopper had no choice for other sequences than to code the same pattern in successive pieces of tape. It did not take long for her to realize that if a way could be found to reuse the pieces of tape already coded for another problem, a lot of effort would be saved. The Mark I did not allow that to be easily done, but the idea had taken root and later modifications did permit multiple tape loops to be mounted.


References: [1]

Page 11

Alan Turing and the ACE (1946)

Subroutine, call stack, jump and return: When we wish to start on a subsidiary operation [subroutine] we need only make a note of where we left off the major operation [return address] and then apply the first instruction of the subsidiary. When the subsidiary is over we look up the note and continue with the major operation. Each subsidiary operation can end with instructions for this recovery of the note. How is the burying and disinterring [push and pop] of the note to be done? There are of course many ways. One is to keep a list of these notes in one or more standard size delay lines (1024), with the most recent last [a stack]... the burying being done through a standard instruction table BURY, and the disinterring by the table UNBURY.

— Proposals for the development in the Mathematics Division of an Automatic Computing Engine (ACE), Alan M. Turing (1946)


Short history of software libraries

References: [1]

Page 12

Calling subroutines

• Hardware support for stack operations (LIFO: last in, first out)
• Special stack pointer or general register
  – Call:
    • Put parameters on top of stack
    • Put return address on top of stack
    • Jump to subroutine
  – Return:
    • Retrieve address from top of stack
    • Jump to this address
• Modern compilers use the stack for local data as well

Figure label: the stack grows downwards
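The call and return steps can be sketched with a toy model. Everything here (`ToyMachine`, the integer "addresses", the single parameter) is invented for illustration; real hardware and compilers differ in many details, including who removes the parameters.

```cpp
#include <vector>

// A toy machine: one LIFO stack shared by parameters and return addresses.
struct ToyMachine {
    std::vector<int> stack;   // back() is the top of the stack
    int pc = 0;               // "program counter": where execution currently is

    // Call: put the parameter on the stack, put the return address on
    // the stack, then jump to the subroutine.
    void call(int subroutine_address, int parameter)
    {
        stack.push_back(parameter);
        stack.push_back(pc + 1);      // resume right after the call
        pc = subroutine_address;
    }

    // Return: retrieve the address from the top of the stack and jump to
    // it. The parameter is discarded here as well; which side does this
    // is a calling-convention choice.
    void ret()
    {
        pc = stack.back();
        stack.pop_back();
        stack.pop_back();             // discard the parameter
    }
};

// Runs one call/return pair and reports where execution resumes.
int resume_address_after_call()
{
    ToyMachine m;
    m.pc = 10;
    m.call(100, 42);   // jump to the subroutine at "address" 100
    m.ret();
    return m.pc;
}
```

After the call/return pair the stack is empty again and execution resumes at the address following the call.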

Page 13

Calling conventions

FORTRAN
• Parameters are put on the stack from left to right
• Called code is responsible for cleaning parameters from the stack (unwinding)
• Local data on the stack is handled by the called code
• Caller and callee must agree on the number of arguments
• Names are not changed (sometimes all capitals)

C
• Parameters are put on the stack from right to left
• Calling code is responsible for cleaning parameters from the stack (unwinding)
• Local data on the stack is handled by the called code
• Subroutines may have a variable parameter count (printf)
• Names get a prepended '_' (sometimes appended)

Page 14

Short history of software libraries

EDSAC - Electronic Delay Storage Automatic Calculator (1951)

“The library of tapes on which subroutines are punched is contained in the steel cabinet shown on the left. The operator is punching a program tape on a keyboard perforator. She can copy mechanically tapes taken from the library on to the tape she is preparing by placing them in the tape reader shown in the center of the photograph.”


References: [1]

Page 15

Short history of software libraries

• Key ideas of the EDSAC library (David Wheeler, Maurice Wilkes)
  – Library of subroutines
  – Reuse of reliable components to shorten development time and reduce defects
  – Linking, relocatable objects
  – Multiple versions of a subroutine, each with a clearly indicated tradeoff of time, space, accuracy
  – Unit testing
  – Open subroutines (inline/intrinsic functions), nonstandard semantics ('interpretive subroutines')
  – Pure vs. impure functions
  – Debugging interpreter
  – Passing functions as parameters (e.g., to integration subroutines)

References: [1]
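The idea of passing a function to an integration subroutine still looks much the same today. The following is a modern C++ illustration, not EDSAC code; `integrate`, `square` and the choice of the trapezoidal rule are assumptions made for the example.

```cpp
// Trapezoidal rule on [a, b] with n equal steps.
// f can be any callable that maps double to double.
template <typename F>
double integrate(F f, double a, double b, int n)
{
    const double h = (b - a) / n;
    double sum = 0.5 * (f(a) + f(b));
    for (int i = 1; i < n; ++i)
        sum += f(a + i * h);
    return sum * h;
}

// The function handed to the integration routine.
double square(double x) { return x * x; }

// Approximates the integral of x^2 over [0, 1], whose exact value is 1/3.
double integral_of_square()
{
    return integrate(square, 0.0, 1.0, 1000);
}
```

The integration routine is written once and reused with any integrand, which is the reuse argument the EDSAC library already made.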

Page 16

What is a library?

• No clear answer, could be many things:
  – A library is a reuse repository
    • “A library is a bunch of code I don’t have to write.”
  – A library is a knowledge base
    • A library is a knowledge base about a problem domain.
  – A library is a language extension
    • Different languages have differently sharp borders between language and libraries
    • “In effect, designing a class library is like designing part of a programming language, and should be approached with commensurate respect.”
  – A library is a notation

References: [1]

Page 17

What is a library?

– A library is an expert-in-a-box
  • Lets you concentrate on the science without having to worry about implementation details
– A library is an abstraction
  • APIs hide the details; we can use libraries knowing what they do without needing to know how they do it.
– A library is a de-facto standard
  • Widely used, open source, well tested
  • Full standards: C99 Standard library, C++ Standard Template Library
– A library is a defect management strategy
  • “The only error free code I ever write is the code I do not have to write”

References: [1]

Page 18

What is a library?

– A library is a tool for software compression
  • Especially in the context of shared libraries
– A library is a stable platform
  • Implementation can change without breaking code relying on it
– A library is a vehicle for technology adoption
  • A certain technology (way of doing things) may be encapsulated behind an API, simplifying its adoption
  • New technologies may be encapsulated behind old APIs
– A library is a communication medium
  • Lets people communicate at a higher level using concepts

References: [1]

Page 19

Libraries (by locality)

Page 20

Libraries (by domain)

Page 21

Application domains

• General parallelization, load balancing
  – MPI, Chombo
• Mesh manipulation and management
  – METIS
• Graph manipulation
  – Boost.Graph library
• Vector/Signal/Image processing
  – VSIPL, PSSL
• Linear algebra
  – BLAS, ATLAS, LAPACK, LINPACK, Slatec, pim
• Ordinary and partial differential equations
  – PETSc


Page 23

How to use a library?

• Compile single source file
• Compile multiple source files
• Create a library
• Compile multiple source files written using different languages
• Combining different languages

Page 24

Compile single source file

Page 25

Compile single source file

Page 26

Compile multiple source files

Page 27

Create a library

• Static library (.a)
  – Created using ar (archiver)
  – Just a collection of object modules and a table of entry points
  – Used by the linker to add referred code to the created executable
• Dynamic library (.so)
  – Created by ld (linker)
  – Executable binary code with resolved externals
  – Used by the linker to add a reference to the created executable

Page 28

Create a library

• Static library (.a)
  – No additional runtime dependencies
  – Beneficial in simple scenarios
  – If used in more than one module, code will be duplicated
• Dynamic library (.so)
  – Code loaded only once
  – Beneficial in complex binary applications
  – Additional runtime dependency
  – Difficult version control
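As a rough sketch of both variants, the file below holds a routine that could be placed in a library. The build commands in the comments assume a GNU toolchain on Linux, and the file names (`say_hello.cpp`, `main.cpp`, `libhello.a`, `libhello.so`) are hypothetical; other platforms use different tools and suffixes.

```cpp
// say_hello.cpp (hypothetical file name)
//
// Static library, assuming g++ and ar are available:
//   g++ -c say_hello.cpp -o say_hello.o
//   ar rcs libhello.a say_hello.o
//   g++ main.cpp -L. -lhello -o main_static
//
// Dynamic library, assuming g++ on Linux:
//   g++ -fPIC -shared say_hello.cpp -o libhello.so
//   g++ main.cpp -L. -lhello -o main_dynamic
//   LD_LIBRARY_PATH=. ./main_dynamic
#include <cstdio>

// Prints msg followed by a newline; returns 0 on success, -1 on failure.
int say_hello(char const* msg)
{
    return std::puts(msg) >= 0 ? 0 : -1;
}
```

Here `g++ -shared` drives the linker on the user's behalf. If both `libhello.a` and `libhello.so` sit in the same directory, the GNU linker prefers the shared one by default, so the two variants are best tried in separate directories.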

Page 29

Compile multiple source files

main.c

int main()
{
    say_hello("Hello");
}

say_hello.c

int say_hello(char const* msg)
{
    puts(msg);
    return 0;
}

• The interface of subroutines must be known
• Different languages have different means of interface specification for modules, subroutines and functions

Page 30

Compile multiple source files

main.c

#include "say_hello.h"

int main()
{
    say_hello("Hello");
}

say_hello.c

#include <stdio.h>
#include "say_hello.h"

int say_hello(char const* msg)
{
    puts(msg);
    return 0;
}

say_hello.h

int say_hello(char const* msg);

Page 31

Demo 1

Page 32

Multi language programming

• Need to account for
  – Different calling conventions
    • C, FORTRAN calling conventions
    • Parameter passing (by value/by reference)
    • Parameter types (strings)
  – Naming conventions
    • FORTRAN: all uppercase, C: case is significant
  – Data types
    • Memory layout (row major, column major, strings)
• Most of this is done by providing a correct interface description to the FORTRAN and/or C compilers
• All of this is highly compiler specific, but the GNU compiler suite (gcc, f77 etc.) is well suited

Page 33

Why C++

• Multiparadigm language
  – Object orientation, functional programming, template meta-programming
• Better maintainability of programs
• More frequent code re-use
• More efficient software development in groups
• Higher adaptability of software to new demands
  – Huge number of libraries, from simple data structures and algorithms to modules in highly specialized domains
  – But you don’t pay for what you don’t use
• C++ is available and supported by vendors on almost all supercomputers like Cray, NEC SX, Hitachi SR8000 …
• With a few minor exceptions C++ is a better C: this allows a smooth migration from C to C++.

Page 34

C/C++

• Since C and C++ are 'siblings', interfacing is easy: extern "C" {…}
  – Adjusts naming and calling conventions
• C data types are generally compatible with C++
• C++ data types (classes) are not generally compatible with C, except POD (plain old data) types

Page 35

Calling C from C++

main.cpp

extern "C" {
#include "say_hello.h"
}

int main()
{
    say_hello("Hello");
}

say_hello.c

#include <stdio.h>
#include "say_hello.h"

int say_hello(char const* msg)
{
    puts(msg);
    return 0;
}

say_hello.h

int say_hello(char const* msg);

Page 36

Demo 2

Page 37

Calling C++ from C

main.c

#include "say_hello.h"

int main()
{
    say_hello("Hello");
}

say_hello.cpp

#include <stdio.h>
#include "say_hello.h"

int say_hello(char const* msg)
{
    puts(msg);
    return 0;
}

say_hello.h

#ifdef __cplusplus
extern "C"
#endif
int say_hello(char const* msg);
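When a header declares more than one function, a common variation is to wrap all declarations in a guarded `extern "C"` block. The sketch below shows that pattern in a single file; the split into a header part and a source part is only indicated by comments, and the return-value convention is an assumption made for the example.

```cpp
#include <cstdio>

// --- what would live in the shared header ---
#ifdef __cplusplus
extern "C" {            // C++ compilers: use C naming and calling conventions
#endif

int say_hello(char const* msg);

#ifdef __cplusplus
}                       // a C compiler never sees either brace
#endif

// --- what would live in the C++ source file ---
// The definition picks up C linkage from the declaration above.
int say_hello(char const* msg)
{
    return std::puts(msg) >= 0 ? 0 : -1;
}
```

Because `__cplusplus` is only defined by C++ compilers, the same header text can be included from both C and C++ translation units.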

Page 38

Demo 3

Page 39

FORTRAN/C

• Data types:

FORTRAN               C/C++
integer*2             short int
integer               long int or int
integer iabc(2,3)     int iabc[3][2];
logical               long int or int
logical*1             bool (C++, one byte)
real                  float
real*8                double
complex               struct { float r, i; }
double complex        struct { double dr, di; }
character*6 abc       char abc[6];
parameter             #define PARAMETER value
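The swapped dimensions in the `iabc` row follow from the different memory layouts. A small sketch of the index arithmetic, with function names invented for the example:

```cpp
#include <cstddef>

// Offset of Fortran element a(i, j) (1-based, column-major) in an array
// declared with `rows` rows: the first index varies fastest.
std::size_t fortran_offset(std::size_t i, std::size_t j, std::size_t rows)
{
    return (i - 1) + (j - 1) * rows;
}

// Offset of C element a[r][c] (0-based, row-major) in an array declared
// with `cols` columns: the last index varies fastest.
std::size_t c_offset(std::size_t r, std::size_t c, std::size_t cols)
{
    return r * cols + c;
}

// Fortran iabc(2,3) and C iabc[3][2] share one layout when the indices
// are swapped: Fortran iabc(i, j) is the same cell as C iabc[j-1][i-1].
bool same_cell(std::size_t i, std::size_t j)
{
    return fortran_offset(i, j, 2) == c_offset(j - 1, i - 1, 2);
}
```

For every valid pair of indices the two offsets agree, which is why the C declaration reverses the order of the extents.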

Page 40

Multi language programming

main.f

      INTERFACE TO SUBROUTINE SAY_HELLO [C.ALIAS: '_say_hello'] (msg)
      CHARACTER(*) msg
      END

      PROGRAM MAIN
      CALL SAY_HELLO("Hello")
      END

say_hello.c

#include <stdio.h>

int say_hello(char const* msg, int len)
{
    printf("%.*s", len, msg);
    return 0;
}

• Interface of subroutines must be declared in a language specific way
• Tooling support available
  • SWIG: http://www.swig.org
  • FLIB2C: http://www.mycplus.com/utilitiesdetail.asp?iPro=7
• More information: http://arnholm.org/software/cppf77/cppf77.htm
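Fortran character data is usually not NUL-terminated, so the receiving side has to rely on a length that the Fortran compiler passes along. How and where that length is passed is compiler specific. The sketch below simply assumes a pointer followed by an `int` length, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Builds a C++ string from exactly len characters of a Fortran buffer.
std::string from_fortran(char const* msg, int len)
{
    return std::string(msg, static_cast<std::size_t>(len));
}

// Hypothetical C-linkage entry point for a Fortran caller. Assumes the
// string arrives as a pointer plus an explicit length argument.
extern "C" int say_hello_f(char const* msg, int len)
{
    return std::puts(from_fortran(msg, len).c_str()) >= 0 ? 0 : -1;
}
```

Copying exactly `len` characters avoids reading past the end of a buffer that has no terminating NUL.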

Page 41

Demo 4


Page 43

Standard library support

• Standard (runtime) libraries
• C++ Standard Template Library

Page 44

Standard (runtime) libraries

• Libraries needed by almost any application
  – Provide often used functions (data types, algorithms, I/O, support routines)
  – No need to explicitly specify the library
    • Compiler and linker usually 'know' what runtime libraries to use
• Different languages have different levels and amounts of standard library support (system level and support)
  – F77: very small standard library
    • Filesystem, math, auxiliary
  – C99: large standard library aimed at portability over a wide range of platforms
    • Operating system, filesystem, math, string handling, basic data types (complex, integer types)
  – C++: everything in C99 plus the Standard Template Library
    • Adds data structures, algorithms

Page 45

C++ Standard Template Library

• Set of generic components:
  – Different data structures
    • (vector, list, set, map, deque etc.)
  – More than 100 algorithms
    • (for_each, transform, copy, sort, unique etc.)
  – Iterators
    • (forward, bidirectional, random_access etc.)
  – Adaptors
    • (stack, queue, inserter etc.)
  – Function objects
    • (mem_fn etc.)

Page 46

C++ Standard Template Library

• Algorithms and data structures are generic and orthogonal
  – Each algorithm usable with each data structure
  – Algorithms usable with your data structures
  – Your algorithms can use standard data structures
• Iterators connect the two
  – General pointer concept, i.e. used by algorithms to refer to the data items
  – Allow the algorithms to be abstracted from the data structures

Page 47

Simple example

• Copy a vector of integers into a list

  std::vector<int> v;   // v = 1, 2, 3, 4;
  std::list<int> l;

  std::copy(v.begin(), v.end(), std::inserter(l, l.end()));

• Or vice versa:

  std::copy(l.begin(), l.end(), std::inserter(v, v.end()));

• How is it implemented?

  template <typename InIter, typename OutIter>
  void copy(InIter f, InIter l, OutIter o)
  {
      for (/**/; f != l; ++f, ++o)
          *o = *f;
  }
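A self-contained version of the copy example might look as follows. The two helper function names are made up, and `std::back_inserter` is used for the vector direction as one reasonable choice next to `std::inserter`.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// Copies all elements of a vector into a new list, keeping their order.
std::list<int> vector_to_list(std::vector<int> const& v)
{
    std::list<int> l;
    std::copy(v.begin(), v.end(), std::inserter(l, l.end()));
    return l;
}

// The reverse direction: the same algorithm, only the containers differ.
std::vector<int> list_to_vector(std::list<int> const& l)
{
    std::vector<int> v;
    std::copy(l.begin(), l.end(), std::back_inserter(v));
    return v;
}
```

Neither function contains a hand-written loop; the iterators passed to `std::copy` are the only thing that ties the algorithm to a particular container.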

Page 48

Genericity leads to Concepts

• No strict interface anymore (as known from Fortran or C)
• Rather, components are required to expose concepts, i.e. satisfy a set of requirements
• In the copy example:
  – First two parameters must be at least input iterators
    • Implement operator++(), operator*(), operator!=()
  – Last parameter must be at least an output iterator
    • Implement operator++(), operator*()
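To illustrate what satisfying a set of requirements means, the sketch below feeds a hand-written type into a simplified copy of the shape shown on the previous slide. `my_copy`, `SummingIterator` and `sum_with_my_copy` are hypothetical names. The type implements only the operations that this particular copy uses and is not a complete standard-conforming output iterator.

```cpp
// Simplified copy, renamed so that it does not clash with std::copy.
template <typename InIter, typename OutIter>
void my_copy(InIter f, InIter l, OutIter o)
{
    for (/**/; f != l; ++f, ++o)
        *o = *f;
}

// Hypothetical output iterator: it stores nothing and only adds up the
// values that are assigned through it.
struct SummingIterator {
    int* total;
    SummingIterator& operator++() { return *this; }   // nothing to advance
    SummingIterator& operator*() { return *this; }    // acts as its own target
    SummingIterator& operator=(int value) { *total += value; return *this; }
};

// Sums the range [first, last) by "copying" it into a SummingIterator.
int sum_with_my_copy(int const* first, int const* last)
{
    int total = 0;
    SummingIterator out = { &total };
    my_copy(first, last, out);
    return total;
}
```

No base class or declared interface links `SummingIterator` to `my_copy`; the code compiles because the required expressions are valid for that type.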

Page 49

Iterator categories

Hierarchy shown in the figure: input and output, then forward, then bidirectional, then random access (each category extends the ones before it).

Category         Capabilities
Input            Read one item at a time, in forward direction only
Output           Write one item at a time, in forward direction only
Forward          Read and write one item at a time, in forward direction only
Bidirectional    Read and write one item at a time, either in forward or backward direction
Random access    Read and write one item at a time, either in forward or backward direction; can jump any distance

Page 50

Iterator behavior

• Common operations
  ++i: Advance one element and return a reference to i
  i++: Advance one element and return the previous value of i
• Input iterator operations
  *i: Return a read-only reference to the element at i's current position
  i == j: Return true if i and j are positioned at the same element (i != j: at different elements)
• Output iterator operations
  *i: Return a write-only reference to the element at i's current position
  i = j: Set i's position to the same as j's
• Bidirectional iterator operations
  --i: Retreat one element and return i's new value
  i--: Retreat one element and return i's previous value
• Random access iterator operations
  i + n: Return an iterator positioned n elements ahead of i's current position
  i - n: Return an iterator positioned n elements behind i's current position
  i[n]: Return a reference to the n-th element from i's current position
• A plain C pointer is a random access iterator
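Since a plain pointer is a random access iterator, standard algorithms work directly on C arrays. A minimal sketch with made-up helper names:

```cpp
#include <algorithm>
#include <cstddef>

// Sorts a plain C array in place; the two pointers form the iterator pair,
// and data + n uses the "i + n" random access operation.
void sort_c_array(int* data, std::size_t n)
{
    std::sort(data, data + n);
}

// Returns the element n positions after i, using the i[n] operation.
int element_at(int const* i, std::size_t n)
{
    return i[n];
}
```

No adaptor or wrapper type is involved: the pointer already offers every operation that `std::sort` asks of its iterators.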

Page 51

Containers and iterators

Container    Iterator         Container         Iterator
vector       random access    map               bidirectional
deque        random access    multimap          bidirectional
list         bidirectional    stack             none
set          bidirectional    queue             none
multiset     bidirectional    priority_queue    none

• Every container
  – Has typedefs for this (no need to remember the above):
    • iterator, const_iterator, reverse_iterator, const_reverse_iterator
  – Exposes functions returning iterators:
    • begin(), end() (non-const and const variants)
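The typedefs make it possible to write code that works with a container without naming its iterator type. A short sketch follows; the function names are invented, and `rbegin()` is the standard member that returns a reverse iterator.

```cpp
#include <list>
#include <vector>

// Sums the elements of any standard container of int by walking it with
// the container's own const_iterator typedef.
template <typename Container>
int sum(Container const& c)
{
    int total = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it)
        total += *it;
    return total;
}

// Returns the last element by dereferencing the first reverse iterator;
// the container must not be empty.
template <typename Container>
int last_element(Container const& c)
{
    typename Container::const_reverse_iterator it = c.rbegin();
    return *it;
}
```

The same two templates serve `std::vector<int>` and `std::list<int>` alike, although the underlying iterator types are unrelated.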

Page 52

What's that all about?

• Orthogonality:

  std::vector<int> v;   // v = 3, 1, 4, 2;
  std::sort(v.begin(), v.end());
  std::list<int> l;     // l = 3, 1, 4, 2;
  l.sort();             // std::sort needs random access iterators, which std::list lacks

• Any algorithm is usable with any container whose iterators meet the algorithm's requirements
  – Still optimal code, because the STL contains an efficient implementation for each container and iterator type
  – Optimal code with your data structures as well
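How far the orthogonality reaches depends on the iterator category an algorithm requires. The sketch below contrasts `std::sort`, which needs random access iterators, with `std::find`, which needs only input iterators; the wrapper function names are made up.

```cpp
#include <algorithm>
#include <list>
#include <vector>

// std::sort requires random access iterators, which std::vector provides.
void sort_vector(std::vector<int>& v)
{
    std::sort(v.begin(), v.end());
}

// std::list offers only bidirectional iterators, so it has its own sort().
void sort_list(std::list<int>& l)
{
    l.sort();
}

// std::find needs only input iterators, so one template covers both.
template <typename Container>
bool contains(Container const& c, int value)
{
    return std::find(c.begin(), c.end(), value) != c.end();
}
```

The member function `std::list::sort` exists precisely because the generic algorithm's requirements are not met by list iterators.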

Page 53

Conclusions

• Talked about the importance of libraries
• How to use and create libraries
• Standard libraries in modern languages


Page 55

Summary – Material for the Test

• Why Libraries: (Slides 4, 5)
• Calling Subroutines: (Slide 12)
• What is a Library: (Slides 16-18)
• Library (by locality): (Slide 19)
• Library (by domain): (Slide 20)
• Application domains: (Slide 21)
• Creating a library: (Slides 27, 28)
• Multi language programming: (Slides 32, 33)
• Standard runtime libraries: (Slide 44)
• Iterator Categories & Behavior: (Slides 49, 50)
• Containers & Iterators: (Slide 51)
• What is all that about: (Slide 52)
