high-performance gunrock graph analytics for the gpu · 2020-01-31 · nvidia ai laboratory. uc...

23
Gunrock High-Performance Graph Analytics for the GPU Muhammad Osama — University of California, Davis

Upload: others

Post on 24-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Gunrock High-Performance Graph Analytics for the GPU

Muhammad Osama — University of California, Davis

Page 2: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Why use GPUs for graph processing?

2FOSDEM 2020

Page 3: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

GPUs and GraphsGraphs

● Found everywhere○ Road & social networks, web, etc.

● Require fast processing○ Memory bandwidth, computing

power and GOOD software

● Becoming very large○ Billions of edges

● Irregular data access pattern and control flow

○ Limits performance and scalability

GPUs

● Found everywhere○ Data center, desktops, mobiles, etc.

● Very powerful○ High memory bandwidth (900 GBps)

and computing power (15.7 TFlops)

● Limited memory size○ 32 GB per NVIDIA V100

● Difficult to program○ Harder to optimize

3FOSDEM 2020

Page 4: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

What is Gunrock?

4FOSDEM 2020

Page 5: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

A CUDA-based graph processing

library, aims for

PerformanceState-of-the-art graph processing library

5FOSDEM 2020

Page 6: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

A CUDA-based graph processing

library, aims for

GeneralityCovers a broad range of graph algorithms

6FOSDEM 2020

Page 7: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

A CUDA-based graph processing

library, aims for

Programmability Makes it easy to implement and extend graph algorithms from 1-GPU to multi-GPUs

7FOSDEM 2020

Page 8: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

A CUDA-based graph processing

library, aims for

ScalabilityFits in (very) limited GPU memory space performance scales when using many GPUs

8FOSDEM 2020

Page 9: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Where can you find Gunrock?

9FOSDEM 2020

Page 10: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

gunrock.github.io10FOSDEM 2020

Page 11: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Project’s Workflow

Release (master) branch

Development (dev) branch

Git Forking Workflow

Contribute GitHub Issues &

Pull Requests

Apache 2.0 License

Code Coverage codecov.io

Central Integration jenkins.io

Documentation slate & doxygen

11FOSDEM 2020

Page 12: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Project’s Workflow (cont.)

Gunrock's Roadmap12FOSDEM 2020

Page 13: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Some Stats and Stuff! (as of 01/30/2020)

● 32 Contributors (over 2500 commits)● ~600 stars 148 forks● NVIDIA CUDA-X: GPU Accelerated Library● RAPIDS

13FOSDEM 2020

Page 14: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

How does Gunrock work?

14FOSDEM 2020

Page 15: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Programming Model

● Data-centric abstraction● Bulk-synchronous programming

15FOSDEM 2020

Page 16: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

FrontierA frontier; group of vertices or edges

An example graph {G}

Frontier of vertices of graph {G}16FOSDEM 2020

Page 17: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Parallel OperatorsManipulation of frontiers

is an operation

● Advance● Filter● For● Intersection● Neighbor-Reduce● … and more.

Generates new frontier by visiting the neighbors.

Illustration of Advance Operator

17FOSDEM 2020

Page 18: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Bulk-Synchronous Programming

Serial loop until convergence

Series of parallel operations separated by global barriers

Parallel advance operator

18FOSDEM 2020

Page 19: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

{Gunrock Algorithms}A* Search

Betweenness Centrality

Breadth-First Search

Connected Components

Graph Coloring

Geolocation

RMAT Graph Generator

Graph Trend Filtering

Graph Projections

Random Walk

Hyperlink-Induced Topic Search

K-Nearest Neighbors

Louvain Modularity

Label Propagation

MaxFlow

Minimum Spanning Tree

PageRank

Local Graph Clustering

GraphSAGE

Stochastic Approach for Link-Structure Analysis

Subgraph Matching

Shared Nearest Neighbors

Scan Statistics

Single Source Shortest Path

Triangle Counting

Top K

Vertex Nomination

Who To Follow19FOSDEM 2020

Page 20: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Example application in Gunrock.

20FOSDEM 2020

Page 21: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Single-Source Shortest Path

Implement the advance and filter C++ lambdas for

SSSP

{complete code}

auto advance_op =

[distances, weights] __host__ __device__ (...) -> bool {

auto distance = distances[vertex_id] + weights[edge_id];

auto old_distance = atomicMin(distances + neighbor_id, distance);

if (distance < old_distance) return true;

return false;

};

auto filter_op =

[labels, iteration] __host__ __device__ (...) -> bool {

if (!util::isValid(neighbor_id)) return false;

return true;

};

21FOSDEM 2020

Page 22: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Single-Source Shortest Path

Launch the lambdas within

the operator call

{complete code}

while (!frontier.isEmpty()) {

oprtr::Advance<oprtr::OprtrType_V2V>(

graph.csr(), frontier, oprtr_parameters,

advance_op, filter_op);

}

22FOSDEM 2020

Page 23: High-Performance Gunrock Graph Analytics for the GPU · 2020-01-31 · NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics. Department of Defense Advanced Research Projects

Acknowledgements & Thanks!

NVIDIA AI Laboratory. UC Davis Center for GPU Graph Analytics.

Department of Defense Advanced Research Projects Agency (DARPA). SYMPHONY: Orchestrating Sparse and Dense Data for Efficient Computation. Award HR0011-18-3-0007.

Department of Defense Advanced Research Projects Agency (DARPA). A Commodity Performance Baseline for HIVE Graph Applications. Award FA8650-18-2-7835.

Adobe Data Science Research Award. Scalability and Mutability for Large Streaming Graph Problems on the GPU.

National Science Foundation (Award OAC-1740333) SI\textln{2-SSE: Gunrock: High-Performance GPU Graph Analytics.

National Science Foundation (Award CCF-1637442) Theory and implementation of dynamic data structures for the GPU. Program: AitF---Algorithms in the Field.

National Science Foundation (Award CCF-1629657) PARAGRAPH: Parallel, Scalable Graph Analytics. XPS---Exploiting Parallelism & Scalability.

Department of Defense Advanced Research Projects Agency (DARPA) SBIR SB152-004. Many-Core Acceleration of Common Graph Programming Frameworks. Phase II: award W911NF-16-C-0020.

Department of Defense Advanced Research Projects Agency (DARPA) STTR ST13B-004 (“Data-Parallel Analytics on Graphics Processing Units (GPUs)”). A High-Level Operator Abstraction for GPU Graph Analytics. Awards D14PC00023 and D15PC00010.

Department of Defense (XDATA program). An XDATA Architecture for Federated Graph Models and Multi-Tier Asymmetric Computing. Oct. 2012--Sept. 2017. Prime contractor: Sotera Defense Solutions, Inc., US Army award W911QX-12-C-0059.

23FOSDEM 2020