sparse code optimization
DESCRIPTION
Sparse code optimization. Automatic transformation of linked list pointer structures. Sven Groot. 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - PowerPoint PPT PresentationTRANSCRIPT
SPARSE CODE OPTIMIZATION
Automatic transformation of linked list pointer structures
Sven Groot
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1
Sven Groot 3
RESTRUCTURING COMPILER Special type of optimizing compiler Restructures source code (e.g. to enable
vectorization or parallelization) Techniques:
Loop interchangeStrip miningLoop collapsingLoop fusion and fissionData structure transformation
Sparse compiler is a type of restructuring compiler
Sven Groot 4
RESTRUCTURING COMPILER (CONT’D) Loop interchange example
void MatrixMultiply1(float **result, float **left, float **right, int size, int rightWidth){ int row, col, x; for( row = 0; row < size; ++row ) { for( col = 0; col < rightWidth; ++col ) { for( x = 0; x < size; ++x ) { result[row][col] += left[row][x] * right[x][col]; } } }}
Sven Groot 5
RESTRUCTURING COMPILER (CONT’D) Loop interchange result
void MatrixMultiply2(float **result, float **left, float **right, int size, int rightWidth){ int row, col, x; for( row = 0; row < size; ++row ) { for( x = 0; x < size; ++x ) { for( col = 0; col < rightWidth; ++col ) { result[row][col] += left[row][x] * right[x][col]; } } }}
Sven Groot 6
RESTRUCTURING COMPILERS (CONT’D) Pointers pose problems
void MatrixMultiply3(float **result, float **left, float **right, int size, int rightWidth){ int row, col; float **tempRight; float *tempLeft;
for( row = 0; row < size; ++row ) { for( col = 0; col < rightWidth; ++col ) { tempRight = right; tempLeft = left[row]; while( tempRight < right + size ) { result[row][col] += *tempLeft * (*tempRight)[col]; ++tempRight; ++tempLeft; } } }}
Sven Groot 7
LINKED LIST MATRIX
Matrix ColHeadIndex=1
Col ColHeadIndex=2
Next ColHeadIndex=3
Next
RowHeadIndex=1
Row
RowHeadIndex=3
Next
CellCell
Cell
CellColNext
Cell
CellCell
Cell
CellColNext
RowNext
110 000 101
Sven Groot 8
LINKED LIST MATRIX (CONT’D)struct Cell { float Value; int ColIndex; int RowIndex; struct Cell *RowNext; // Cell in the next row struct Cell *ColNext; // Cell in the next column};struct RowHead{ int RowIndex; struct Cell *Cell; struct RowHead *Next;};struct ColHead{ int ColIndex; struct Cell *Cell; struct ColHead *Next;};struct Matrix{ int Dimensions; struct ColHead *Col; struct RowHead *Row;};
Sven Groot 9
LINKED LIST MATRIX (CONT’D) Matrix multiplication using linked lists
void MatrixMultiply(struct Matrix left, float **right, float **result, int rightWidth){ struct RowHead *leftRow = left.Row; struct Cell *leftCell; int dimensions = left.Dimensions; int col, row, x; for( col = 0; col < rightWidth; ++col ) { leftRow = left.Row; for( row = 0; row < dimensions; ++row ) { if( leftRow != NULL && leftRow->RowIndex < row ) leftRow = leftRow->Next; if( leftRow != NULL && leftRow->RowIndex == row ) { leftCell = leftRow->Cell; for( x = 0; x < dimensions; ++x ) { if( leftCell != NULL && leftCell->ColIndex < x ) leftCell = leftCell->ColNext; if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) { result[row][col] += leftCell->Value * right[x][col]; } } } } }}
Sven Groot 10
LINKED LIST MATRIX (CONT’D) Alternative matrix multiplication
int MatrixMultiplyAlternative(struct Matrix left, float **right, float **result, int rightWidth){ struct RowHead *leftRow = left.Row; struct Cell *leftCell; int dimensions = left.Dimensions; int col; for( col = 0; col < rightWidth; ++col ) { leftRow = left.Row; while( leftRow != NULL ) { leftCell = leftRow->Cell; while( leftCell != NULL ) { result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col]; leftCell = leftCell->ColNext; } leftRow = leftRow->Next; } } return 0;}
Sven Groot 11
TRANSFORMATION The goal: remove all references to the
linked list from the loop The means: move linked list references
into initialization loop Initialization copies linked list contents
into array Transformed loop uses array Two methods, sublimation and
annihilation Must be done automatically
Sven Groot 12
SUBLIMATION Transforming the innermost loop
for( x = 0; x < dimensions; ++x ) { if( leftCell != NULL && leftCell->ColIndex < x ) leftCell = leftCell->ColNext; if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) { result[row][col] += leftCell->Value * right[x][col]; } }
Sven Groot 13
SUBLIMATION (CONT’D) Initialization
Transformed main loop
leftCellArray = malloc(sizeof(float) * dimensions); for( x = 0; x < dimensions; ++x ) { if( leftCell != NULL && leftCell->ColIndex < x ) leftCell = leftCell->ColNext; if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) { leftCellArray[x] = leftCell->Value; } else leftCellArray[x] = 0; }
for( x = 0; x < dimensions; ++x ) { result[row][col] += leftCellArray[x] * right[x][col]; }
Sven Groot 14
SUBLIMATION (CONT’D) Transforming the inner loop (alternative)
Initialization
Transformed main loop
while( leftCell != NULL ) { result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col]; leftCell = leftCell->ColNext; }
leftCellArray = malloc(sizeof(float) * dimensions); memset(leftCellArray, 0, sizeof(float) * dimensions); while( leftCell != NULL ) { leftCellArray[leftCell->ColIndex] = leftCell->Value; leftCell = leftCell->ColNext; }
for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) { result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * right[leftCellCounter][col]; }
Sven Groot 15
LOOP EXTRACTION Putting it in context for( row = 0; row < dimensions; ++row ) { if( leftRow != NULL && leftRow->RowIndex < row ) leftRow = leftRow->Next; if( leftRow != NULL && leftRow->RowIndex == row ) { leftCell = leftRow->Cell;
leftCellArray = malloc(sizeof(float) * dimensions); for( x = 0; x < dimensions; ++x ) { if( leftCell != NULL && leftCell->ColIndex < x ) leftCell = leftCell->ColNext; if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) { leftCellArray[x] = leftCell->Value; } else leftCellArray[x] = 0; }
for( x = 0; x < dimensions; ++x ) { result[row][col] += leftCellArray[x] * right[x][col]; }
free(leftCellArray); } }
init
ializ
ati
on
Main
lo
op
Sven Groot 16
LOOP EXTRACTION (CONT’D) Initialization for( row = 0; row < dimensions; ++row ) { if( leftRow != NULL && leftRow->RowIndex < row ) leftRow = leftRow->Next;
leftCellArrayArray[row] = malloc(sizeof(float*) * dimensions); memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions);
if( leftRow != NULL && leftRow->RowIndex == row ) { leftCell = leftRow->Cell; for( x = 0; x < dimensions; ++x ) { if( leftCell != NULL && leftCell->ColIndex < x ) leftCell = leftCell->ColNext; if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) { leftCellArrayArray[row][x] = leftCell->Value; } else leftCellArrayArray[row][x] = 0; } } }
Sven Groot 17
LOOP EXTRACTION (CONT’D) Transformed main loop for( row = 0; row < dimensions; ++row ) { for( x = 0; x < dimensions; ++x ) { result[row][col] += leftCellArrayArray[row][x] * right[x][col]; } }
Sven Groot 18
LOOP EXTRACTION (CONT’D) Putting it in context (alternative) while( leftRow != NULL ) { leftCell = leftRow->Cell;
leftCellArray = malloc(sizeof(float) * dimensions); memset(leftCellArray, 0, sizeof(float) * dimensions); while( leftCell != NULL ) { leftCellArray[leftCell->ColIndex] = leftCell->Value; leftCell = leftCell->ColNext; }
for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) { result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * right[leftCellCounter][col]; }
free(leftCellArray);
leftRow = leftRow->Next; }
Sven Groot 19
LOOP EXTRACTION (CONT’D) Initialization (alternative)
Transformed main loop
leftCellArrayArray = malloc(sizeof(float*) * dimensions); for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) { leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float)); memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float));
if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) { leftCell = leftRow->Cell; while( leftCell != NULL ) { leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value; leftCell = leftCell->ColNext; } leftRow = leftRow->Next; } }
for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) { for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) { result[leftRowCounter][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col]; } }
Sven Groot 20
LOOP EXTRACTION (CONT’D) Once more, in context
Sven Groot 21
for( col = 0; col < rightWidth; ++col ) { leftRow = left.Row;
leftCellArrayArray = malloc(sizeof(float*) * dimensions); for( row = 0; row < dimensions; ++row ) { if( leftRow != NULL && leftRow->RowIndex < row ) leftRow = leftRow->Next; leftCellArrayArray[row] = malloc(sizeof(float*) * dimensions); memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions); if( leftRow != NULL && leftRow->RowIndex == row ) { leftCell = leftRow->Cell; for( x = 0; x < dimensions; ++x ) { if( leftCell != NULL && leftCell->ColIndex < x ) leftCell = leftCell->ColNext; if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) { leftCellArrayArray[row][x] = leftCell->Value; } else leftCellArrayArray[row][x] = 0; } } }
for( row = 0; row < dimensions; ++row ) { for( x = 0; x < dimensions; ++x ) { result[row][col] += leftCellArrayArray[row][x] * right[x][col]; } }
for( row = 0; row < dimensions; ++row ) free(leftCellArrayArray[row]); free(leftCellArrayArray); }
init
ializ
ati
on
main
loop
Sven Groot 22
for( col = 0; col < dimensions; ++col ) { leftRow = left.Row;
leftCellArrayArray = malloc(sizeof(float*) * dimensions); for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) { leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float)); memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float)); if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) { leftCell = leftRow->Cell; while( leftCell != NULL ) { leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value; leftCell = leftCell->ColNext; } leftRow = leftRow->Next; } }
for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) { for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) { result[leftRow->RowIndex][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col]; } }
for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) free(leftCellArrayArray[leftRowCounter]); free(leftCellArrayArray); }
init
ializ
ati
on
main
loop
Sven Groot 23
TRANSFORMATION RESULT
Sven Groot 24
void MatrixMultiplySublimation(struct Matrix left, float** right, float **result, int rightWidth) { struct RowHead *leftRow = left.Row; struct Cell *leftCell; float **leftCellArrayArray; int dimensions = left.Dimensions; int col, row, x; leftCellArrayArray = malloc(sizeof(float*) * dimensions); for( row = 0; row < dimensions; ++row ) { if( leftRow != NULL && leftRow->RowIndex < row ) leftRow = leftRow->Next; leftCellArrayArray[row] = malloc(sizeof(float*) * dimensions); memset(leftCellArrayArray[row], 0, sizeof(float) * dimensions); if( leftRow != NULL && leftRow->RowIndex == row ) { leftCell = leftRow->Cell; for( x = 0; x < dimensions; ++x ) { if( leftCell != NULL && leftCell->ColIndex < x ) leftCell = leftCell->ColNext; if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) { leftCellArrayArray[row][x] = leftCell->Value; } else leftCellArrayArray[row][x] = 0; } } } for( col = 0; col < rightWidth; ++col ) { for( row = 0; row < dimensions; ++row ) { for( x = 0; x < dimensions; ++x ) { result[row][col] += leftCellArrayArray[row][x] * right[x][col]; } } } for( row = 0; row < dimensions; ++row ) free(leftCellArrayArray[row]); free(leftCellArrayArray);}
Generateddeclaratio
n
init
ializ
ati
on
main
loop
Sven Groot 25
void MatrixMultiplyAlternativeSublimation(struct Matrix left, float **right, float **result, int rightWidth){ struct RowHead *leftRow = left.Row; struct Cell *leftCell; int dimensions = left.Dimensions; int col; float **leftCellArrayArray; int leftCellCounter, leftRowCounter;
leftCellArrayArray = malloc(sizeof(float*) * dimensions); for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) { leftCellArrayArray[leftRowCounter] = malloc(dimensions * sizeof(float)); memset(leftCellArrayArray[leftRowCounter], 0, dimensions * sizeof(float)); if( leftRow != NULL && leftRowCounter == leftRow->RowIndex ) { leftCell = leftRow->Cell; while( leftCell != NULL ) { leftCellArrayArray[leftRowCounter][leftCell->ColIndex] = leftCell->Value; leftCell = leftCell->ColNext; } leftRow = leftRow->Next; } }
for( col = 0; col < rightWidth; ++col ) { for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) { for( leftCellCounter = 0; leftCellCounter < dimensions; ++leftCellCounter ) { result[leftRowCounter][col] += leftCellArrayArray[leftRowCounter][leftCellCounter] * right[leftCellCounter][col]; } } }
for( leftRowCounter = 0; leftRowCounter < dimensions; ++leftRowCounter ) free(leftCellArrayArray[leftRowCounter]); free(leftCellArrayArray);}
Generateddeclaration
s
init
ializ
ati
on
main
loop
Sven Groot 26
ANNIHILATION Alternative method of transformation No fill-in: omitted values stay omitted Sublimation:
Sparse loop: more iterationsSemi-dense loop: same number of iterations
AnnihilationSparse loop: same number of iterationsSemi-dense loop: less iterations
Can require other transformations
Sven Groot 27
ANNIHILATION (CONT’D) Recall the innermost loop
for( x = 0; x < dimensions; ++x ) { if( leftCell != NULL && leftCell->ColIndex < x ) leftCell = leftCell->ColNext; if( leftCell != NULL && leftCell->ColIndex == x && leftCell->RowIndex == row ) { result[row][col] += leftCell->Value * right[x][col]; } }
Sven Groot 28
ANNIHILATION (CONT’D) Initialization
leftCellArraySize = 100; leftCellArray = malloc(sizeof(float) * leftCellArraySize); newDimensions = 0; leftCellCopy = leftCell; for( x = 0; x < dimensions; ++x ) { if( newDimensions >= leftCellArraySize ) { leftCellArraySize *= 2; leftCellArray = realloc(leftCellArray, sizeof(float) * leftCellArraySize); } if( leftCellCopy != NULL && leftCellCopy->ColIndex < x ) leftCellCopy = leftCellCopy->ColNext; if( leftCellCopy != NULL && leftCellCopy->ColIndex == x && leftCellCopy->RowIndex == row ) { leftCellArray[newDimensions] = leftCellCopy->Value; ++newDimensions; } }
Sven Groot 29
ANNIHILATION (CONT’D) Initialization (cont’d)
rightArraySize = 100; rightArray = malloc(sizeof(float*) * rightArraySize); newDimensions = 0; leftCellCopy = leftCell; for( x = 0; x < dimensions; ++x ) { if( newDimensions >= rightArraySize ) { rightArraySize *= 2; rightArray = realloc(rightArray, sizeof(float) * rightArraySize); } if( leftCellCopy != NULL && leftCellCopy->ColIndex < x ) leftCellCopy = leftCellCopy->ColNext; if( leftCellCopy != NULL && leftCellCopy->ColIndex == x && leftCellCopy->RowIndex == row ) { rightArray[newDimensions] = right[x]; ++newDimensions; } }
Sven Groot 30
ANNIHILATION (CONT’D) Transformed main loop
for( x = 0; x < newDimensions; ++x ) { result[row][col] += leftCellArray[x] * rightArray[x][col]; }
Sven Groot 31
ANNIHILATION (CONT’D) Inner loop (alternative)
while( leftCell != NULL ) { result[leftRow->RowIndex][col] += leftCell->Value * right[leftCell->ColIndex][col]; leftCell = leftCell->ColNext; }
Sven Groot 32
ANNIHILATION (CONT’D) Initialization (alternative) leftCellArraySize = 100; leftCellArray = malloc(sizeof(float) * leftCellArraySize); newDimensions = 0; leftCellCopy = leftCell; while( leftCellCopy != NULL ) { if( newDimensions >= leftCellArraySize ) { leftCellArraySize *= 2; leftCellArray = realloc(leftCellArray, sizeof(float) * leftCellArraySize); } leftCellArray[newDimensions] = leftCellCopy->Value; ++newDimensions; leftCellCopy = leftCellCopy->ColNext; }
rightArraySize = 100; rightArray = malloc(sizeof(float*) * rightArraySize); newDimensions = 0; leftCellCopy = leftCell; while( leftCellCopy != NULL ) { if( newDimensions >= rightArraySize ) { rightArraySize *= 2; rightArray = realloc(rightArray, sizeof(float) * rightArraySize); } rightArray[newDimensions] = right[leftCell->ColIndex]; ++newDimensions; leftCellCopy = leftCellCopy->ColNext; }
left
Cel
lri
ght
Sven Groot 33
ANNIHILATION (CONT’D) Transformed main loop (alternative) for( leftCellCounter = 0; leftCellCounter < newDimensions; ++leftCellCounter ) { result[leftRow->RowIndex][col] += leftCellArray[leftCellCounter] * rightArray[leftCellCounter][col]; }
Sven Groot 34
POST-INITIALIZATION Pre-initialization: before the main loop Post-initialization: after the main loop Needed when an expression that needs
to be transformed is written to Needs to use index expression Fill-in value not needed
Sven Groot 35
POST-INITIALIZATION (CONT’D) Example
Result
while( node != NULL ) { node->Value = node->Value * 2; node = node->Next; }
nodeArray = malloc(size * sizeof(int)); nodeCopy = node; memset(nodeArray, 0, size * sizeof(int)); while( nodeCopy != NULL ) { nodeArray[nodeCopy] = nodeCopy->Value; nodeCopy = nodeCopy->Next; }
for( nodeCounter = 0; nodeCounter < size; ++ nodeCounter ) { nodeArray[nodeCounter] = nodeArray[nodeCounter] * 2; }
nodeCopy = node; while( nodeCopy != NULL ) { nodeCopy->Value = nodeArray[nodeCopy->Index]; nodeCopy = nodeCopy->Next; }
Pre
-init
Main
loop
Post
-init
Sven Groot 36
AUTOMATED TRANSFORMATION Seven steps
1. Find candidate structures2. Analyze usage of these structures in the code3. Determine transformation safety4. Identify data members5. Generate dense data structures6. Transform7. Loop extraction
Code must be normalized
Sven Groot 37
CONDITIONS The linked list expression must not have side effects Loop termination control must be trivial The linked list iteration statement may be the only statement
in the loop body that modifies the linked list expression The “next” pointer member may not be a data member Any expression, other than the linked list iteration statement,
that might be moved to an initialization loop may not have side effects, and use only constants, loop-invariant values, linked list members and loop control variables
If the linked list expression is guarded, it must be possible to move that entire guard, including both the true and false parts, to the initialization loop.
When performing annihilation on a semi-dense loop, there must be a single guard that covers all statements in the loop body except for the linked list iteration statement and its guard, and statements related to loop control (such as those that increment the counter).
Sven Groot 38
TRANSFORMATION DIRECTIVES Fill in gaps in the compiler’s knowledge Embedded in source code as comments Examples
SAFE_CODE, UNSAFE_CODESAFE_LOOP, UNSAFE_LOOPDENSE_INDEXDENSE_DIMENSIONSFILL_INEtc.
Sven Groot 39
TRANSFORMATION DIRECTIVES (CONT’D) Example
/***SAFE_CODE***/ /***DENSE_INDEX(node, node->Index)***/ /***DENSE_DIMENSION(node, size)***/ while( node != NULL ) { node->Value = node->Value * 2; node = node->Next; } /***UNSAFE_CODE***/
Sven Groot 40
EXPERIMENTATION Tested on three matrices Sublimation code: ran through MT1 Annihilation code: loop interchange Used tools: Intel C Compiler, Intel
FORTRAN Compiler Test system: Dual Intel Xeon 3.06GHz,
1GB RAM
Sven Groot 41
EXPERIMENTATION (CONT’D)
sherman3 e40r5000 af23560
Size 5005x5005 17281x17281 23560x23560
Non-zero elements 20033 553965 484256
Density 0.080% 0.186% 0.087%
Results (seconds)
Original algorithm 22.813 s 266.404 s 443.982 s
Alternative algorithm 2.138 s 46.108 s 29.940 s
Annihilation 0.195 s 1.742 s 2.744 s
MT1 (Fortran) 0.095 s 5.228 s 2.216 s
MT1 (C) 0.124 s 5.153 s 3.314 s
Sven Groot 42
EXPERIMENTATION (CONT’D)
sherman3 e40r5000 af235600
50
100
150
200
250
300
Alternative al-gorithmAnnihilationMT1 (fortran)MT1 (C)
SPARSE CODE OPTIMIZATION
Automatic transformation of linked list pointer structures
Sven Groot