“Matrix Multiply ― in parallel”
DESCRIPTION

“Matrix Multiply ― in parallel”. Joe Hummel, PhD, U. of Illinois, Chicago / Loyola University Chicago, [email protected]. Background: Class: “Introduction to CS for Engineers”; Lang: C/C++; Focus: programming basics, vectors, matrices; Timing: present this after introducing 2D arrays.

TRANSCRIPT
“Matrix Multiply ― in parallel”
Joe Hummel, PhD
U. of Illinois, Chicago / Loyola University Chicago
Background…
Class: “Introduction to CS for Engineers”
Lang: C/C++
Focus: programming basics, vectors, matrices
Timing: present this after introducing 2D arrays…
Matrix multiply
Yes, it’s boring, but…
◦ everyone understands the problem
◦ good example of triply-nested loops
◦ non-trivial computation

```cpp
for (int i = 0; i < N; i++)
  for (int j = 0; j < N; j++)
    for (int k = 0; k < N; k++)
      C[i][j] += (A[i][k] * B[k][j]);
```

1500x1500 matrix: 2.25M elements » 32 seconds…
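The triple loop above can be wrapped in a small self-contained sketch. The flat row-major `std::vector` storage and the function name `matmul_ijk` are assumptions for illustration; the slides use plain 2D arrays:

```cpp
#include <vector>

// Classic i-j-k matrix multiply: C = A * B for square N x N matrices.
// Matrices are stored as flat row-major vectors (an assumption made
// here for a self-contained sketch; the slides use 2D arrays).
std::vector<double> matmul_ijk(const std::vector<double>& A,
                               const std::vector<double>& B, int N) {
    std::vector<double> C(N * N, 0.0);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                C[i * N + j] += A[i * N + k] * B[k * N + j];
    return C;
}
```

For example, multiplying `[[1,2],[3,4]]` by `[[5,6],[7,8]]` yields `[[19,22],[43,50]]`.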
Multicore
Matrix multiply is a great candidate for multicore:
◦ embarrassingly parallel
◦ easy to parallelize via the outermost loop

```cpp
#pragma omp parallel for
for (int i = 0; i < N; i++)
  for (int j = 0; j < N; j++)
    for (int k = 0; k < N; k++)
      C[i][j] += (A[i][k] * B[k][j]);
```

1500x1500 matrix: quad-core CPU » 8 seconds…
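The OpenMP version drops into the same sketch with one pragma. Parallelizing the outermost loop gives each thread its own block of rows of C, so no two threads ever write the same element and no locking is needed. Compile with `-fopenmp` (GCC/Clang); without it the pragma is ignored and the code runs serially. Storage layout and the name `matmul_omp` are assumptions:

```cpp
#include <vector>

// i-j-k multiply with the outermost loop parallelized via OpenMP.
// Each thread handles a disjoint range of rows i, so writes to C
// never overlap -- no synchronization required.
std::vector<double> matmul_omp(const std::vector<double>& A,
                               const std::vector<double>& B, int N) {
    std::vector<double> C(N * N, 0.0);
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                C[i * N + j] += A[i * N + k] * B[k * N + j];
    return C;
}
```

The result is bit-identical to the serial version: each C[i][j] is still accumulated by exactly one thread, in the same order.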
Designing for HPC
Parallelism alone is not enough…

HPC == Parallelism + Memory Hierarchy − Contention

Expose parallelism
Maximize data locality: network, disk, RAM, cache, core
Minimize interaction: false sharing, locking, synchronization
Data locality
What’s the other half of the chip? Cache!

Implications?
◦ No one implements MM this way
◦ Rewrite to use loop interchange, and access B row-wise…

```cpp
#pragma omp parallel for
for (int i = 0; i < N; i++)
  for (int k = 0; k < N; k++)
    for (int j = 0; j < N; j++)
      C[i][j] += (A[i][k] * B[k][j]);
```

1500x1500 matrix: quad-core + cache » 2 seconds…
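The interchanged i-k-j version can be sketched the same way. With the j loop innermost, both C[i][j] and B[k][j] are scanned left to right along a row, so every access is sequential in row-major memory and cache lines are fully used. Hoisting A[i][k] into a local is an extra touch not on the slide; layout and the name `matmul_ikj` are assumptions:

```cpp
#include <vector>

// Loop-interchanged (i-k-j) multiply: the innermost loop walks row i
// of C and row k of B sequentially, giving unit-stride access to both.
std::vector<double> matmul_ikj(const std::vector<double>& A,
                               const std::vector<double>& B, int N) {
    std::vector<double> C(N * N, 0.0);
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        for (int k = 0; k < N; k++) {
            double a = A[i * N + k];   // invariant across the j loop
            for (int j = 0; j < N; j++)
                C[i * N + j] += a * B[k * N + j];
        }
    return C;
}
```

The interchange only reorders independent additions into each C[i][j], so the result matches the original i-j-k loop order.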