Multidimensional Data in C++ (and other languages)
Janus Weil
Frankfurt Institute for Advanced Studies (FIAS)
C++ Seminar, 12.11.2014
Janus Weil Multidimensional Data in C++
Outline
array basics
1-dim arrays: different options
multi-dim arrays: different options
libraries:
boost::multi_array
eigen
Blitz++
arrays in Fortran
Motivation
multi-dim data is very common (especially in scientific codes)
examples:
matrices
grid structures
image processing
...
but:
no standardized concept of dimensionality in C++
no standard multi-dim container
(although: several std representations of one-dim data exist)
Intro: arrays and pointer arithmetic [1D]
let's start with the basics ...
static one-dim array: float A[10];
uses a contiguous memory block of size 10*sizeof(float);
the array name decays to a pointer to the first element in most expressions
can be accessed with the [] operator: float element3 = A[3]; A[3] = A[4];
the same works with any pointer: float *p = (float*) malloc(10*sizeof(float)); p[3] = *(p+4);
p[3] is the same as *(p+3)
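The equivalence of subscripting and pointer arithmetic can be sketched as follows. This is only an illustration; the function names are invented here and do not come from the talk.

```cpp
#include <cstddef>

// Sum n floats starting at p, using explicit pointer arithmetic.
float sum_via_pointer(const float* p, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += *(p + i);   // dereference the address p + i
    return total;
}

// The same sum written with the subscript operator.
float sum_via_subscript(const float* p, std::size_t n) {
    float total = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        total += p[i];       // p[i] is defined as *(p + i)
    return total;
}

// True if p[i] and *(p + i) refer to the same object.
bool same_element(const float* p, std::size_t i) {
    return &p[i] == p + i;
}
```

Passing a static array such as float A[10] to these functions works because the array name decays to a pointer to its first element.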
different ways to represent 1-dim data
compile-time size:
intrinsic C++ array: float A[10];
since C++11: std::array<float, 10> A;
(std. container features: iterators, size() method, etc)
run-time-determined (but constant) size:
C99: variable-length arrays (VLA, not legal in C++):
    float my_function(int n) { float A[n]; ... }
C++14 proposal: std::dynarray (dropped from the C++14 draft and never standardized)
dynamic size (can grow/shrink):
std::vector (contiguous, dynamically allocated storage)
std::valarray (only for numeric types, but with extra goodies)
std::list (non-contiguous, doubly linked)
std::forward_list (non-contiguous, singly linked)
...
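As a rough illustration of the difference between a compile-time size and a dynamic size, the sketch below contrasts std::array and std::vector. The helper names make_fixed and make_dynamic are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <numeric>
#include <vector>

// Compile-time size: the extent 10 is part of the type.
std::array<float, 10> make_fixed() {
    std::array<float, 10> a{};            // value-initialized to 0
    std::iota(a.begin(), a.end(), 0.0f);  // fill with 0, 1, ..., 9
    return a;
}

// Dynamic size: the vector can still grow after construction.
std::vector<float> make_dynamic(std::size_t n) {
    std::vector<float> v(n, 1.0f);  // n elements, all 1
    v.push_back(2.0f);              // now n + 1 elements
    return v;
}
```

Both containers offer size() and iterators, but only the vector's size can change at run time.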
ways to represent multi-dim data
fixed size: "array of arrays"
float A[3][4];
std::array<std::array<float,4>,3> A;
dynamic size: "vector of vectors"
std::vector<std::vector<float>> A;
each sub-vector is contiguous, but not the whole
both are a bit 'ugly' and only 'aggregated' (better: one coherent container for the whole array)
library solutions (non-std):
boost::multi_array
eigen
Blitz++
std: proposal for C++14 (array_view and bounds_iterator) was canceled!
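What "one coherent container for the whole array" could look like is sketched below for the 2-dim case. Grid2D is a hypothetical class invented for this illustration, not part of the standard or of any of the libraries named above; it assumes a row-major layout in a single std::vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal 2-dim container: one contiguous block, row-major.
class Grid2D {
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0f) {}

    // Element access with a bounds check in debug builds.
    float& operator()(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }
    float operator()(std::size_t r, std::size_t c) const {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    const float* data() const { return data_.data(); }  // whole block

private:
    std::size_t rows_, cols_;
    std::vector<float> data_;
};
```

Unlike a vector of vectors, all rows*cols elements live in one allocation, so the whole array can be handed to code that expects a plain float pointer.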
multi-dim arrays for pedestrians
memory addressing is one-dimensional
solution: 'linearize' multiple dimensions into one
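float A[3][3] has the following memory layout:
A[0][0] A[0][1] A[0][2] A[1][0] A[1][1] A[1][2] A[2][0] A[2][1] A[2][2]
(C++ is 'row-major': the most significant index is put first, the last index varies fastest)
to represent an array float A[N1][N2][N3], you can simply:
allocate a sufficiently large memory block (N1*N2*N3*sizeof(float))
access the element A[i][j][k] via a linear index
$$ l = (i \cdot N_2 + j) \cdot N_3 + k = \sum_{d=1}^{3} \mathrm{index}[d] \cdot \mathrm{stride}[d] $$
with index = (i, j, k) and stride = (N2*N3, N3, 1)
for efficiency: loop over i, j, k in the right order (last index in the innermost loop)!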
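The linearization above can be sketched in a few lines. The function names linear_index and fill_sequential are made up for this example; the index formula itself is the one from the slide.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major linear index for an array with extents N1 x N2 x N3.
// N1 is only needed for the bounds check.
std::size_t linear_index(std::size_t i, std::size_t j, std::size_t k,
                         std::size_t N1, std::size_t N2, std::size_t N3) {
    assert(i < N1 && j < N2 && k < N3);
    return (i * N2 + j) * N3 + k;
}

// Fill a flat N1*N2*N3 block so that each element stores its own linear index.
// Loop order i, j, k (last index innermost) walks memory sequentially.
std::vector<float> fill_sequential(std::size_t N1, std::size_t N2,
                                   std::size_t N3) {
    std::vector<float> data(N1 * N2 * N3);
    for (std::size_t i = 0; i < N1; ++i)
        for (std::size_t j = 0; j < N2; ++j)
            for (std::size_t k = 0; k < N3; ++k) {
                std::size_t l = linear_index(i, j, k, N1, N2, N3);
                data[l] = static_cast<float>(l);
            }
    return data;
}
```

For extents 2 x 3 x 4, the element (1, 2, 3) maps to (1*3 + 2)*4 + 3 = 23, the last slot of the 24-element block.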
Characteristics of multi-dim data
"rank": number of dimensions
"size": total number of elements
"contiguity": is the data represented in a coherent memory block or 'fragmented' in memory?
for each dimension:
"bounds": lowest & highest index
"extent": number of elements in that dimension
"stride": memory distance between two neighboring elements in one dimension
"shape": set of all extents
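To make the terms rank, shape, size and stride concrete, here is a small sketch that derives the size and the strides of a contiguous row-major array from its shape. The helper names are hypothetical, and strides are counted in elements rather than bytes.

```cpp
#include <cstddef>
#include <vector>

// Total number of elements ("size") for a given shape (list of extents).
// The rank is simply shape.size().
std::size_t total_size(const std::vector<std::size_t>& shape) {
    std::size_t n = 1;
    for (std::size_t extent : shape) n *= extent;
    return n;
}

// Strides of a contiguous row-major array: the last dimension has stride 1,
// each earlier dimension the product of all later extents.
std::vector<std::size_t> row_major_strides(
        const std::vector<std::size_t>& shape) {
    std::vector<std::size_t> strides(shape.size(), 1);
    for (std::size_t d = shape.size(); d-- > 1; )
        strides[d - 1] = strides[d] * shape[d];
    return strides;
}
```

For the shape (3, 4, 2) this gives a size of 24 and strides (8, 2, 1).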
Example: Matrix Multiplication (plain C++)
    #include <vector>

    int main() {
        const int N = 1000;
        std::vector<float> vA(N*N), vB(N*N), vC(N*N);

        for (int row = 0; row < N; row++) {
            for (int col = 0; col < N; col++) {
                float sum = 0;
                for (int i = 0; i < N; i++)
                    sum += vA[row*N + i] * vB[i*N + col];
                vC[row*N + col] = sum;
            }
        }
    }
eigen
http://eigen.tuxfamily.org
template library for linear algebra
limited to 1 or 2 dimensions (vectors and matrices)
has many specialized types for vectors & matrices with fixed or dynamic sizes: Matrix3d, MatrixXd, Vector4f, RowVector2i, ...
operators for addition/multiplication of vectors/matrices
support for dense and sparse matrices
geometry/transformations
eigen example
    #include <iostream>
    #include <Eigen/Dense>
    using Eigen::MatrixXd;

    int main() {
        MatrixXd m(2,2);
        MatrixXd n = MatrixXd::Random(2,2);
        m(0,0) = 3;
        m(1,0) = 2.5;
        m(0,1) = -1;
        m(1,1) = m(1,0) + m(0,1);

        m = (m + n) * 2;
    }
Blitz++
http://blitz.sourceforge.net
high-performance vector mathematics library
utilizes advanced C++ template metaprogramming techniques like expression templates
"speed comparable to Fortran implementations"
supports vectors, matrices and tensors (up to 11 dimensions)
boundary checks, arbitrary ordering, slicing, ...
boost::multi_array
http://www.boost.org/doc/libs/1_57_0/libs/multi_array/doc/user.html
class template for multidimensional arrays
    typedef boost::multi_array<double,3> array_3d;
    typedef boost::multi_array_types::extent_range range;
    array_3d A(boost::extents[3][4][2]);            // 3x4x2
    array_3d B(boost::extents[3][range(-1,3)][2]);  // 3x4x2, second index runs from -1 to 2
element access via familiar syntax of native C++ arrays
    A[1][1][1] = B[2][-1][1];
or via index array:
    boost::array<array_3d::index,3> idx = {{2,1,0}};
    A(idx) = 3.14;
Example: Matrix Multiplication (Boost)
    #include "boost/multi_array.hpp"

    int main () {
        const int N = 1000;
        typedef boost::multi_array<float, 2> array_2d;
        boost::array<int,2> extents = {{N, N}};
        array_2d vA(extents), vB(extents), vC(extents);

        for (int row = 0; row < N; row++) {
            for (int col = 0; col < N; col++) {
                float sum = 0;
                for (int i = 0; i < N; i++)
                    sum += vA[row][i] * vB[i][col];
                vC[row][col] = sum;
            }
        }
    }
boost::multi_array (II)
additional features ...
reshaping (rank & size must stay the same):
    array_3d A(boost::extents[3][4][2]);
    boost::array<int,3> dims = {{2, 3, 4}};
    A.reshape(dims);
resizing (rank must be equal, elements are preserved)
    A.resize(boost::extents[3][4][6]);
changing the bases (lower bounds):
    A.reindex(1);  // change all bases to 1
    boost::array<int,3> bases = {{0, 1, -1}};
    A.reindex(bases);  // use different values
select storage ordering
boost::multi_array (III)
create sub-views via index ranges (rank: less or equal)
sectioning:
    typedef boost::multi_array_types::index_range range;  // index_range, not extent_range
    array_2d A(boost::extents[5][5]);
    array_2d::array_view<2>::type sect =
        A[ boost::indices[range(1,4)][range(2,4)] ];
slicing:
    array_2d A(boost::extents[5][5]);  // must be allocated before slicing
    array_2d::array_view<1>::type slice =
        A[ boost::indices[3][range(0,5)] ];
Arrays in Fortran
Fortran 90+: (multi-dim) arrays are a core feature of the language (also with dynamic size)
    real, dimension(3,4) :: A, B, C
    real, dimension(-5:5) :: D
    real, allocatable, dimension(:,:,:) :: E, F
differences from C++:
'column-major' memory layout
indices start with 1 by default!
goodies:
element-wise operations: C=A+B
array sections: B(1:3,:)=A(2:4,:)
elemental functions: B=sin(A)
contiguous attribute
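For comparison, std::valarray offers rough C++ counterparts to some of these goodies, though only for one-dim data. The sketch below is an illustration under that assumption; the wrapper names add, elementwise_sin and row_of are invented here, and the arguments of add are assumed to have equal length.

```cpp
#include <cmath>
#include <cstddef>
#include <valarray>

// Element-wise sum, analogous to Fortran's C = A + B.
// Both arguments must have the same size.
std::valarray<double> add(const std::valarray<double>& a,
                          const std::valarray<double>& b) {
    return a + b;
}

// Elemental function, analogous to B = sin(A).
std::valarray<double> elementwise_sin(const std::valarray<double>& a) {
    return std::sin(a);
}

// One row of a row-major matrix with `cols` columns stored in a flat
// valarray, roughly analogous to the array section A(row,:).
std::valarray<double> row_of(const std::valarray<double>& m,
                             std::size_t row, std::size_t cols) {
    return m[std::slice(row * cols, cols, 1)];
}
```

The multi-dim bookkeeping (here the cols argument) still has to be done by hand, which is exactly what Fortran takes care of in the language itself.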
Example: Matrix Multiplication (Fortran)
    program main
      integer, parameter :: N = 1000
      real, dimension(N,N) :: vA, vB, vC
      integer :: row, col

      do row = 1, N
        do col = 1, N
          vC(row,col) = sum (vA(row,:) * vB(:,col))
        end do
      end do
      ! or even simpler:
      vC = matmul (vA, vB)
    end
dynamic arrays: completely analogous, except for
    real, dimension(:,:), allocatable :: vA, vB, vC
    allocate(vA(N,N), vB(N,N), vC(N,N))
Fortran Arrays: Implementation
in order to support dynamic multi-dim arrays at runtime, it is not sufficient to store just a base address, but one needs additional info to characterize the array
stored in a structure called 'array descriptor' (sometimes 'dope vector')
set up internally by the compiler to represent arrays at run-time
contains fields like:
base address
element size
rank
for each dim: lower bound, extent, stride
allocation status
dynamic type information (OOP)
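A descriptor of this kind could be modeled in C++ roughly as follows. This is a hypothetical sketch: the struct layout and all names are invented for illustration and do not correspond to the descriptor format of any particular Fortran compiler. It assumes a contiguous column-major array of float and omits the type information field.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-dimension record; field names are illustrative only.
struct Dim {
    std::ptrdiff_t lower_bound;  // lowest valid index
    std::ptrdiff_t extent;       // number of elements in this dimension
    std::ptrdiff_t stride;       // distance between neighbors, in elements
};

// Hypothetical descriptor for an array of float.
struct ArrayDescriptor {
    float* base = nullptr;                  // nullptr: not allocated
    std::size_t elem_size = sizeof(float);  // element size in bytes
    std::vector<Dim> dims;                  // rank == dims.size()
};

// Build a descriptor for a contiguous column-major array
// (first index varies fastest, as in Fortran).
ArrayDescriptor describe(float* base,
                         const std::vector<std::ptrdiff_t>& lower,
                         const std::vector<std::ptrdiff_t>& extent) {
    assert(lower.size() == extent.size());
    ArrayDescriptor d;
    d.base = base;
    std::ptrdiff_t stride = 1;
    for (std::size_t k = 0; k < extent.size(); ++k) {
        d.dims.push_back({lower[k], extent[k], stride});
        stride *= extent[k];
    }
    return d;
}

// Element offset from the base address for a given index tuple.
std::ptrdiff_t offset(const ArrayDescriptor& d,
                      const std::vector<std::ptrdiff_t>& index) {
    assert(index.size() == d.dims.size());
    std::ptrdiff_t off = 0;
    for (std::size_t k = 0; k < index.size(); ++k) {
        const Dim& dim = d.dims[k];
        assert(index[k] >= dim.lower_bound &&
               index[k] < dim.lower_bound + dim.extent);
        off += (index[k] - dim.lower_bound) * dim.stride;
    }
    return off;
}
```

With lower bounds (1, 1) and extents (3, 4), as in real, dimension(3,4) :: A, the element A(1,2) lies at offset 3 and A(3,4) at offset 11; for dimension(-5:5), index 0 lies at offset 5.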
Outlook: Coarrays
"concurrent arrays", standardized in Fortran 2008
concept for parallelization using array-like syntax
PGAS model (partitioned global address space)
several versions ('images') of a program running in parallel, either on the same machine or distributed over a cluster
    real, dimension(10), codimension[*] :: x, y
    integer :: num_img, me
    num_img = num_images()
    me = this_image()
    ! Some code here
    x(2) = x(3)[7]           ! get value from image 7
    x(3)[me] = 2*x(3)[me+1]  ! get value from neighbor image
    sync all
    x(6)[4] = x(1)           ! put value on image 4
    x(:)[2] = y(:)           ! put array on image 2
    sync images (*)
Summary/Conclusions
C++ has no 'standard' treatment of multi-dimensional data, but several library implementations are available (for different purposes)
if your application relies heavily on multi-dim data, a modern Fortran dialect might not be the worst alternative ;)
The End
Thanks for your attention!