
Multidimensional Data in C++ (and other languages)

Janus Weil

Frankfurt Institute for Advanced Studies (FIAS)

C++ Seminar, 12.11.2014

Janus Weil Multidimensional Data in C++


Outline

array basics

1-dim arrays: different options

multi-dim arrays: different options

libraries:

  boost::multi_array

  eigen

  Blitz++

arrays in Fortran


Motivation

multi-dim data is very common (especially in scientific codes)

examples:

  matrices

  grid structures

  image processing

  ...

but:

  no standardized concept of dimensionality in C++

  no standard multi-dim container

  (although: several std representations of one-dim data exist)


Intro: arrays and pointer arithmetic [1D]

let's start with the basics ...

static one-dim array: float A[10];

uses a contiguous memory block of size 10*sizeof(float)

in most expressions the array name decays to a pointer to the first element

can be accessed with the [] operator: float element3 = A[3]; A[3] = A[4];

actually the same can be done with any pointer: float *p = (float*)malloc(10*sizeof(float)); p[4] = 1.0f; p[3] = *(p+4);

p[3] is the same as *(p+3)
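The equivalence between subscripting and pointer arithmetic can be checked directly. The following is a minimal sketch; the function names are made up for illustration, and the block size is passed in by the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Returns true if p[i] and *(p + i) refer to the same element for every i
// in a heap block of n floats (hypothetical helper, for illustration only).
bool subscript_matches_pointer_arithmetic(std::size_t n) {
    assert(n > 0);
    // In C++ the void* returned by malloc needs an explicit cast.
    float* p = static_cast<float*>(std::malloc(n * sizeof(float)));
    if (p == nullptr) return false;

    for (std::size_t i = 0; i < n; ++i)
        p[i] = static_cast<float>(i);       // write via subscript

    bool ok = true;
    for (std::size_t i = 0; i < n; ++i) {
        ok = ok && (&p[i] == p + i);        // same address
        ok = ok && (p[i] == *(p + i));      // same value
    }
    std::free(p);
    return ok;
}

// A static array decays to a pointer to its first element.
float element3_via_decay() {
    float A[10] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
    float* first = A;        // array-to-pointer decay
    return *(first + 3);     // same element as A[3]
}
```

Calling subscript_matches_pointer_arithmetic(10) mirrors the float A[10] case from the slide, only with heap storage.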


different ways to represent 1-dim data

compile-time size:

intrinsic C++ array: float A[10];

since C++11: std::array<float, 10> A;

(std. container features: iterators, size() method, etc)

run-time-determined (but constant) size:

C99: variable-length arrays (VLA, not legal in standard C++, although some compilers accept them as an extension):

    float my_function(int n) { float A[n]; ... }

C++14 proposal: std::dynarray (dropped from the C++14 draft and never standardized)

dynamic size (can grow/shrink):

  std::vector (contiguous, dynamically allocated storage)

  std::valarray (only for numeric types, but with extra goodies)

  std::list (non-contiguous, doubly linked)

  std::forward_list (non-contiguous, singly linked)

  ...
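A short sketch of the two most common choices from the list above: std::array for a compile-time size and std::vector for a run-time size. Here std::vector stands in for the C99 VLA, which is one possible substitute and not the only one. The function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Compile-time size: the extent is part of the type.
float sum_fixed() {
    std::array<float, 10> a{};                 // ten zero-initialized floats
    std::iota(a.begin(), a.end(), 1.0f);       // 1, 2, ..., 10
    return std::accumulate(a.begin(), a.end(), 0.0f);
}

// Run-time size: std::vector replaces the VLA `float A[n]`.
float sum_dynamic(std::size_t n) {
    std::vector<float> v(n);                   // contiguous, heap-allocated
    std::iota(v.begin(), v.end(), 1.0f);       // 1, 2, ..., n
    v.push_back(0.5f);                         // unlike a VLA, a vector can grow
    assert(v.size() == n + 1);
    return std::accumulate(v.begin(), v.end(), 0.0f);
}
```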


ways to represent multi-dim data

fixed size: "array of arrays"

  float A[3][4];

  std::array<std::array<float,4>,3> A;

dynamic size: "vector of vectors"

  std::vector<std::vector<float>> A;

  each sub-vector is contiguous, but not the whole

both are a bit 'ugly' and only 'aggregated' (better: one coherent container for the whole array)

library solutions (non-std):

  boost::multi_array

  eigen

  Blitz++

std: proposal for C++14 (array_view and bounds_iterator) was canceled!
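The contiguity difference between the two 'aggregated' forms can be made visible with a small sketch. The helper names are hypothetical. The native array stores its rows back to back, while each row of a vector of vectors is its own allocation and may even have a different length.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Byte distance between the starts of two consecutive rows of float A[3][4].
std::ptrdiff_t native_row_distance_bytes() {
    float A[3][4] = {};
    assert(sizeof(A) == 3 * 4 * sizeof(float));   // no gaps between rows
    const char* row0 = reinterpret_cast<const char*>(&A[0][0]);
    const char* row1 = reinterpret_cast<const char*>(&A[1][0]);
    return row1 - row0;
}

// Builds a "vector of vectors": one separate heap block per row.
std::vector<std::vector<float>> make_rows(std::size_t n_rows, std::size_t n_cols) {
    return std::vector<std::vector<float>>(n_rows, std::vector<float>(n_cols, 0.0f));
}

// Rows are independent objects, so the total size has to be summed up.
std::size_t total_elements(const std::vector<std::vector<float>>& rows) {
    std::size_t n = 0;
    for (const auto& row : rows) n += row.size();
    return n;
}
```

Because the rows are independent, nothing stops one of them from growing on its own, which is usually not what a matrix should allow.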


multi-dim arrays for pedestrians

memory addressing is one-dimensional

solution: 'linearize' multiple dimensions into one

float A[3][3] has the following memory layout:

  A[0][0] A[0][1] A[0][2] A[1][0] A[1][1] A[1][2] A[2][0] A[2][1] A[2][2]

(C++ is 'row-major': the most significant index is put first, and the last index varies fastest in memory)

to represent an array float A[N1][N2][N3], you can simply:

  allocate a sufficiently large memory block (N1*N2*N3*sizeof(float))

  access the element A[i][j][k] via a linear index

$$ l = (i \cdot N_2 + j) \cdot N_3 + k = \sum_{d=1}^{3} \mathrm{index}[d] \cdot \mathrm{stride}[d] $$

for efficiency: do loops over i,j,k in the right order!
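The linearization can be sketched as follows, assuming a flat std::vector as the memory block; the function names are hypothetical. The strides for the row-major case are (N2*N3, N3, 1), so the innermost loop runs over k and touches consecutive addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major linear index of A[i][j][k] in an N1 x N2 x N3 array:
// l = (i*N2 + j)*N3 + k
std::size_t linear_index(std::size_t i, std::size_t j, std::size_t k,
                         std::size_t N2, std::size_t N3) {
    assert(j < N2 && k < N3);
    return (i * N2 + j) * N3 + k;
}

// Fills a flat block so that each element stores its own linear index.
std::vector<float> fill_with_linear_index(std::size_t N1, std::size_t N2,
                                          std::size_t N3) {
    std::vector<float> A(N1 * N2 * N3);
    for (std::size_t i = 0; i < N1; ++i)
        for (std::size_t j = 0; j < N2; ++j)
            for (std::size_t k = 0; k < N3; ++k)   // k innermost: stride 1
                A[linear_index(i, j, k, N2, N3)] =
                    static_cast<float>(linear_index(i, j, k, N2, N3));
    return A;
}
```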


Characteristics of multi-dim data

"rank": number of dimensions

"size": total number of elements

"contiguity": is the data represented in a coherent memory block or 'fragmented' in memory?

for each dimension:

  "bounds": lowest & highest index

  "extent": number of elements in that dimension

  "stride": memory distance between two neighboring elements in one dimension

"shape": set of all extents
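These terms map onto a few lines of code. The sketch below uses a hypothetical Shape struct that derives rank, size and strides from the extents, assuming a contiguous row-major layout and strides counted in elements rather than bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Shape {                               // hypothetical helper type
    std::vector<std::size_t> extents;        // the "shape": one extent per dimension

    std::size_t rank() const { return extents.size(); }

    std::size_t size() const {               // total number of elements
        std::size_t n = 1;
        for (std::size_t e : extents) n *= e;
        return n;
    }

    // Strides (in elements) of a contiguous row-major array.
    std::vector<std::size_t> row_major_strides() const {
        std::vector<std::size_t> s(extents.size());
        std::size_t step = 1;
        for (std::size_t d = extents.size(); d-- > 0; ) {
            s[d] = step;                     // distance between neighbors in dim d
            step *= extents[d];
        }
        assert(s.empty() || s.back() == 1);  // last dimension is contiguous
        return s;
    }
};
```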


Example: Matrix Multiplication (plain C++)

    #include <vector>

    int main() {
        const int N = 1000;
        // three N x N matrices, each stored row-major in one flat vector
        std::vector<float> vA(N*N), vB(N*N), vC(N*N);

        for (int row = 0; row < N; row++) {
            for (int col = 0; col < N; col++) {
                float sum = 0;
                for (int i = 0; i < N; i++)
                    sum += vA[row*N + i] * vB[i*N + col];   // vB is read with stride N
                vC[row*N + col] = sum;
            }
        }
    }
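In the loop nest above, the innermost loop walks vB with stride N. One common variant, shown here as a sketch and not as the author's code, swaps the two inner loops so that all innermost accesses have stride 1. Whether this is actually faster depends on compiler and hardware; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C = A * B for row-major n x n matrices stored in flat vectors,
// with the loops ordered row, i, col.
std::vector<float> matmul_ikj(const std::vector<float>& A,
                              const std::vector<float>& B, std::size_t n) {
    assert(A.size() == n * n && B.size() == n * n);
    std::vector<float> C(n * n, 0.0f);
    for (std::size_t row = 0; row < n; ++row) {
        for (std::size_t i = 0; i < n; ++i) {
            const float a = A[row * n + i];
            for (std::size_t col = 0; col < n; ++col)
                C[row * n + col] += a * B[i * n + col];   // stride-1 access
        }
    }
    return C;
}
```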


eigen

http://eigen.tuxfamily.org

template library for linear algebra

limited to 1 or 2 dimensions (vectors and matrices)

has many specialized types for vectors & matrices with fixed or dynamic sizes: Matrix3d, MatrixXd, Vector4f, RowVector2i, ...

operators for addition/multiplication of vectors/matrices

support for dense and sparse matrices

geometry/transformations


eigen example

    #include <iostream>
    #include <Eigen/Dense>
    using Eigen::MatrixXd;

    int main() {
        MatrixXd m(2,2);
        MatrixXd n = MatrixXd::Random(2,2);
        m(0,0) = 3;
        m(1,0) = 2.5;
        m(0,1) = -1;
        m(1,1) = m(1,0) + m(0,1);

        m = (m + n) * 2;
    }


Blitz++

http://blitz.sourceforge.net

high-performance vector mathematics library

utilizes advanced C++ template metaprogramming techniques like expression templates

"speed comparable to Fortran implementations"

supports vectors, matrices and tensors (up to 11 dimensions)

boundary checks, arbitrary ordering, slicing, ...
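To give an idea of what 'expression templates' means, here is a deliberately tiny sketch. It is not Blitz++ code and does not use the Blitz++ API; the types Expr, Sum and Vec are made up for illustration. The point is that operator+ only builds a lightweight node, and the whole right-hand side is evaluated in a single loop on assignment, without temporary vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: marks a type as an expression and gives access to the real type.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Node for "lhs + rhs": stores references, computes nothing until indexed.
template <typename L, typename R>
struct Sum : Expr<Sum<L, R>> {
    const L& lhs;
    const R& rhs;
    Sum(const L& l, const R& r) : lhs(l), rhs(r) {}
    float operator[](std::size_t i) const { return lhs[i] + rhs[i]; }
    std::size_t size() const { return lhs.size(); }
};

struct Vec : Expr<Vec> {
    std::vector<float> data;
    explicit Vec(std::size_t n, float v = 0.0f) : data(n, v) {}
    float operator[](std::size_t i) const { return data[i]; }
    std::size_t size() const { return data.size(); }

    // Assignment from any expression: one fused loop over all operands.
    template <typename E>
    Vec& operator=(const Expr<E>& e) {
        const E& expr = e.self();
        assert(expr.size() == size());
        for (std::size_t i = 0; i < data.size(); ++i) data[i] = expr[i];
        return *this;
    }
};

// Builds a Sum node; nested sums reference temporaries, so an expression
// should be assigned in the same statement and not stored with `auto`.
template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}
```

With three vectors a, b, c of equal length, r = a + b + c; then runs exactly one loop, much like a hand-written Fortran-style loop would.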


boost::multi_array

http://www.boost.org/doc/libs/1_57_0/libs/multi_array/doc/user.html

class template for multidimensional arrays

    typedef boost::multi_array<double,3> array_3d;
    typedef boost::multi_array_types::extent_range range;
    array_3d A(boost::extents[3][4][2]);              // 3x4x2
    array_3d B(boost::extents[3][range(-1,3)][2]);    // 3x4x2, second index runs from -1 to 2

element access via familiar syntax of native C++ arrays

    A[1][1][1] = B[2][-1][1];

or via index array:

    boost::array<array_3d::index,3> idx = {{2,1,0}};
    A(idx) = 3.14;


Example: Matrix Multiplication (Boost)

    #include "boost/multi_array.hpp"

    int main () {
        const int N = 1000;
        typedef boost::multi_array<float, 2> array_2d;
        boost::array<int,2> extents = {{N, N}};
        array_2d vA(extents), vB(extents), vC(extents);

        for (int row = 0; row < N; row++) {
            for (int col = 0; col < N; col++) {
                float sum = 0;
                for (int i = 0; i < N; i++)
                    sum += vA[row][i] * vB[i][col];
                vC[row][col] = sum;
            }
        }
    }


boost::multi_array (II)

additional features ...

reshaping (rank & size must stay the same):

    array_3d A(boost::extents[3][4][2]);
    boost::array<int,3> dims = {{2, 3, 4}};
    A.reshape(dims);

resizing (rank must be equal, elements are preserved)

    A.resize(boost::extents[3][4][6]);

changing the bases (lower bounds):

    A.reindex(1);                                // change all bases to 1
    boost::array<int,3> bases = {{0, 1, -1}};
    A.reindex(bases);                            // use different values

select storage ordering


boost::multi_array (III)

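create sub-views via index ranges (rank: less or equal)

sectioning:

    // for views, range must be an index_range (not the extent_range used for construction)
    typedef boost::multi_array_types::index_range range;
    array_2d A(boost::extents[5][5]);
    array_2d::array_view<2>::type sect =
        A[ boost::indices[range(1,4)][range(2,4)] ];

slicing:

    array_2d A(boost::extents[5][5]);
    array_2d::array_view<1>::type slice =
        A[ boost::indices[3][range(0,5)] ];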
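Conceptually, such a view owns no data: it is a base pointer plus an extent and a stride into the original block. The following plain C++ sketch illustrates the idea for 1-D slices of a row-major 2-D array. StridedView and the two helper functions are hypothetical and are not part of Boost.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Non-owning 1-D view: element i lives at base[i * stride].
struct StridedView {                 // hypothetical type, not the Boost view class
    float* base;
    std::size_t extent;
    std::size_t stride;
    float& operator[](std::size_t i) const {
        assert(i < extent);
        return base[i * stride];
    }
};

// Row r of a row-major block with `cols` columns: contiguous, stride 1.
StridedView row_view(std::vector<float>& data, std::size_t cols, std::size_t r) {
    assert((r + 1) * cols <= data.size());
    return StridedView{data.data() + r * cols, cols, 1};
}

// Column c of a rows x cols block: one element per row, stride = cols.
StridedView column_view(std::vector<float>& data, std::size_t rows,
                        std::size_t cols, std::size_t c) {
    assert(rows * cols <= data.size() && c < cols);
    return StridedView{data.data() + c, rows, cols};
}
```

Writing through a view modifies the underlying array, which is also how the Boost views behave.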


Arrays in Fortran

Fortran 90+: (multi-dim) arrays are a core feature of the language (also with dynamic size)

    real, dimension(3,4) :: A, B, C
    real, dimension(-5:5) :: D
    real, allocatable, dimension(:,:,:) :: E, F

differences to C++:

  'column-major' memory layout

  indices start with 1 by default!

goodies:

element-wise operations: C=A+B

array sections: B(:,1:3)=A(:,2:4)

elemental functions: B=sin(A)

contiguous attribute
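For comparison, the closest thing in standard C++ is std::valarray, which covers the 1-D versions of some of these goodies. The sketch below is only a rough analogue with hypothetical function names, and it is limited to one dimension.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <valarray>

// Element-wise operation, like C = A + B in Fortran.
std::valarray<float> elementwise_sum(const std::valarray<float>& A,
                                     const std::valarray<float>& B) {
    assert(A.size() == B.size());
    return A + B;
}

// Elemental function, like B = sin(A) in Fortran.
std::valarray<float> elementwise_sin(const std::valarray<float>& A) {
    return std::sin(A);
}

// 1-D section: `count` consecutive elements starting at 0-based index `start`.
std::valarray<float> section(const std::valarray<float>& A,
                             std::size_t start, std::size_t count) {
    assert(start + count <= A.size());
    return A[std::slice(start, count, 1)];
}
```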


Example: Matrix Multiplication (Fortran)

    program main
      integer, parameter :: N = 1000
      real, dimension(N,N) :: vA, vB, vC
      integer :: row, col

      do row = 1, N
        do col = 1, N
          vC(row,col) = sum (vA(row,:) * vB(:,col))
        end do
      end do
      ! or even simpler:
      vC = matmul (vA, vB)
    end

dynamic arrays: completely analogous, except for

    real, dimension(:,:), allocatable :: vA, vB, vC
    allocate(vA(N,N), vB(N,N), vC(N,N))


Fortran Arrays: Implementation

in order to support dynamic multi-dim arrays at runtime, it is not sufficient to store just a base address, but one needs additional info to characterize the array

stored in a structure called 'array descriptor' (sometimes 'dope vector')

set up internally by the compiler to represent arrays at run-time

contains fields like:

  base address

  element size

  rank

  for each dim: lower bound, extent, stride

  allocation status

  dynamic type information (OOP)
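A descriptor along these lines can be sketched in C++. The layout below is hypothetical: real compilers use their own, differing structures, and the allocation status and type information are left out. It assumes a contiguous column-major array, as in Fortran, with strides counted in elements.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Dim {
    long lower_bound;       // lowest valid index in this dimension
    std::size_t extent;     // number of elements in this dimension
    std::size_t stride;     // distance (in elements) between neighbors
};

struct ArrayDescriptor {    // hypothetical layout, for illustration only
    float* base_address;
    std::size_t element_size;
    std::vector<Dim> dims;  // rank == dims.size()

    std::size_t rank() const { return dims.size(); }

    // Offset (in elements) of the element with the given indices.
    std::size_t offset(const std::vector<long>& index) const {
        assert(index.size() == dims.size());
        std::size_t off = 0;
        for (std::size_t d = 0; d < dims.size(); ++d) {
            assert(index[d] >= dims[d].lower_bound);
            const std::size_t rel =
                static_cast<std::size_t>(index[d] - dims[d].lower_bound);
            assert(rel < dims[d].extent);
            off += rel * dims[d].stride;
        }
        return off;
    }
};

// Descriptor for a contiguous column-major array: first index varies fastest.
ArrayDescriptor describe_column_major(float* base,
                                      const std::vector<long>& lower,
                                      const std::vector<std::size_t>& extents) {
    assert(lower.size() == extents.size());
    ArrayDescriptor d{base, sizeof(float), {}};
    std::size_t stride = 1;
    for (std::size_t k = 0; k < extents.size(); ++k) {
        d.dims.push_back(Dim{lower[k], extents[k], stride});
        stride *= extents[k];
    }
    return d;
}
```

For a 3x4 array with the default lower bound 1, the element (1,2) ends up at offset 3, directly after the three elements of the first column.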


Outlook: Coarrays

"concurrent arrays", standardized in Fortran 2008

concept for parallelization using array-like syntax

PGAS model (partitioned global address space)

several versions ('images') of a program running in parallel, either on the same machine or distributed over a cluster

    real, dimension(10), codimension[*] :: x, y
    integer :: num_img, me
    num_img = num_images()
    me = this_image()
    ! Some code here
    x(2) = x(3)[7]            ! get value from image 7
    x(3)[me] = 2*x(3)[me+1]   ! get value from neighbor image
    sync all
    x(6)[4] = x(1)            ! put value on image 4
    x(:)[2] = y(:)            ! put array on image 2
    sync images (*)


Summary/Conclusions

C++ has no 'standard' treatment of multi-dimensional data, but several library implementations are available (for different purposes)

if your application relies heavily on multi-dim data, a modern Fortran dialect might not be the worst alternative ;)


The End

Thanks for your attention!
