HPC Advisory Council Singapore October 7, 2014 Marc Hamilton, Vice President, Solution Architecture and Engineering VISUAL COMPUTING FOR CLOUD MOBILE


VISUAL COMPUTING FOR CLOUD MOBILE

Marc Hamilton, Vice President, Solution Architecture and Engineering

HPC Advisory Council Singapore, October 7, 2014


THREE TRENDS CONVERGING

Torrent of Data

[Chart: exabytes of unstructured data, 2010 to 2015. Source: IDC]

Deep Neural Networks

GPU Computing


MACHINE LEARNING

Branch of Artificial Intelligence

Computers that learn from data

[Example images with object labels: person, car, helmet, motorcycle, bird, frog, person, dog, chair, person, hammer, flower pot, power drill]


DEEP LEARNING IN A LARGER CONTEXT

Data Science (“Big Data”): Data Analysis • Data Management

Data Management: Distributed Storage (e.g. HDFS) • Queries & Indexing (e.g. Map-D, GISFederal, SQream)

Data Analysis: Data Mining (e.g. Statistics) • Machine Learning

Strong GPU value: Deep Learning • Deep Neural Nets • Convolutional Neural Nets

Some GPU value: SVM • K-Means • Clustering

More research to prove GPU value: Recommender Systems • Collaborative Filtering • Regression • Bayesian Networks • Decision Trees • Random Forests • Semantic Analysis


GPUS FOR DEEP LEARNING

Image Recognition CHALLENGE

1.2M training images • 1000 object categories

[Chart: GPU usage for ILSVRC, 2010 to 2014, plotting winning % error and % of teams using GPUs]


NUS WINS IMAGENET 2014 CHALLENGE


MACHINE LEARNING USE CASES

Face Detection

Autonomous Driving

Image / Video Tagging

Speech Recognition

Product Recommendations

Object Recognition

Situational Awareness

…machine learning is pervasive


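EFFICIENT CONVOLUTIONS ON GPUS

Convolution as GEMM (matrix-matrix product) => Great on GPUs

Image:

A B C

D E F

G H I

Kernel:

a b c

d e f

g h i

Image patches unrolled into rows, one row per output position (x, y), with "-" marking positions outside the image:

- A B - D E - G H

A B C D E F G H I

B C - E F - H I -

Each row is multiplied by the flipped kernel, stored as a column (i h g f e d c b a), to give the output α at (x, y).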
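The convolution-as-GEMM idea on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not cuDNN's or Caffe's implementation: the function names `im2col`, `matmul`, and `conv2d_gemm` are chosen here for illustration, zero padding is assumed at the image border, and numeric values stand in for the letters A to I.

```python
def im2col(image, k):
    """Unroll each zero-padded k x k patch of `image` into one row."""
    h, w = len(image), len(image[0])
    pad = k // 2
    rows = []
    for y in range(h):
        for x in range(w):
            row = []
            for dy in range(-pad, pad + 1):
                for dx in range(-pad, pad + 1):
                    yy, xx = y + dy, x + dx
                    inside = 0 <= yy < h and 0 <= xx < w
                    row.append(image[yy][xx] if inside else 0)
            rows.append(row)
    return rows


def matmul(a, b):
    """Plain matrix-matrix product (the GEMM step)."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def conv2d_gemm(image, kernels):
    """Convolve `image` with several k x k kernels using one GEMM."""
    k = len(kernels[0])
    patches = im2col(image, k)  # shape: (h*w) x (k*k)
    # Flip each kernel (true convolution) and flatten it.
    flipped = [[v for row in reversed(kern) for v in reversed(row)]
               for kern in kernels]
    # One column per kernel: shape (k*k) x len(kernels).
    weights = [[f[p] for f in flipped] for p in range(k * k)]
    return matmul(patches, weights)  # shape: (h*w) x len(kernels)


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]      # stands in for A..I
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]   # leaves the image unchanged
corner = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]     # only kernel entry "a" is set
out = conv2d_gemm(image, [identity, corner])
```

Each row of `out` holds the results of all kernels for one output position, so adding kernels only widens the second matrix. That is why the whole layer reduces to a single large matrix product, an operation GPUs handle well.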


INTRODUCING NVIDIA CUDNN

Lets DNN researchers focus on DNNs

We provide expertly tuned computational components

Accelerate, don’t replace, existing popular DNN frameworks

Forward and backward convolution routines tuned for NVIDIA GPUs

Optimized for all future NVIDIA GPU generations

Arbitrary dimension ordering, striding, and subregions for 4d tensors means easy integration into any neural net implementation

Download: http://www.nvidia.com/cudnn

Contact: [email protected]
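The point about arbitrary dimension ordering, striding, and subregions for 4d tensors can be illustrated with stride arithmetic. The sketch below is plain Python and does not call cuDNN; `strides_for` and `offset` are hypothetical helper names, and the tensor sizes are made up for the example.

```python
def strides_for(order, dims):
    """Strides of a dense tensor whose dimensions are stored in `order`.

    `order` lists dimension names from outermost to innermost,
    for example "nchw" or "nhwc".
    """
    strides = {}
    step = 1
    for name in reversed(order):
        strides[name] = step
        step *= dims[name]
    return strides


def offset(index, strides, base=0):
    """Linear memory offset of one element, given per-dimension strides."""
    return base + sum(index[name] * strides[name] for name in index)


dims = {"n": 2, "c": 3, "h": 4, "w": 5}   # illustrative sizes only
nchw = strides_for("nchw", dims)          # {'w': 1, 'h': 5, 'c': 20, 'n': 60}
nhwc = strides_for("nhwc", dims)          # {'c': 1, 'w': 3, 'h': 15, 'n': 60}

element = {"n": 0, "c": 1, "h": 0, "w": 0}
print(offset(element, nchw))  # 20
print(offset(element, nhwc))  # 1
```

Because a layout is fully described by four strides, a library that accepts strides can read a framework's tensors in place, whatever their dimension order. A subregion is the same strides with a non-zero `base` pointing at the subregion's first element.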


USING CAFFE WITH CUDNN

Accelerate Caffe layer types by 1.2 – 3x

Example: AlexNet Layer 2 forward: 1.9x faster convolution, 2.7x faster pooling

Integrated into Caffe dev branch today! (targeting official release with Caffe 1.0)

Comparison against SOL: ~50% headroom

(still trying to figure this out)

CPU could probably get within ~3x

Overall AlexNet training time: baseline Caffe compared to Caffe accelerated by cuDNN on K40

Caffe (CPU*): 1x

Caffe (GPU): 11x

Caffe (cuDNN): 14x

*CPU is 24 core E5-2697v2 @ 2.4GHz, Intel MKL 11.1.3
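As a quick check on the chart, the overall gain of cuDNN over the baseline GPU path follows from the two speedup figures. The snippet assumes the 1x, 11x, and 14x values are all speedups relative to the same CPU baseline; the variable names are illustrative.

```python
# Speedups relative to the CPU baseline, as read from the slide.
speedup_vs_cpu = {
    "Caffe (CPU)": 1.0,
    "Caffe (GPU)": 11.0,
    "Caffe (cuDNN)": 14.0,
}

# Overall training-time gain of cuDNN over the baseline GPU path.
cudnn_over_gpu = speedup_vs_cpu["Caffe (cuDNN)"] / speedup_vs_cpu["Caffe (GPU)"]
print(f"cuDNN vs. baseline GPU: {cudnn_over_gpu:.2f}x")
```

The result is roughly 1.27x, at the low end of the 1.2 to 3x per-layer range quoted on the slide. An end-to-end gain below the best per-layer gains is what one would expect when not every layer speeds up equally.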


Deep Learning with COTS HPC Systems

A. Coates, B. Huval, T. Wang, D. Wu, A. Ng, B. Catanzaro

Stanford / NVIDIA • ICML 2013

GOOGLE BRAIN: 1,000 CPU Servers • 2,000 CPUs • 16,000 cores • 600 kWatts • $5,000,000

STANFORD AI LAB: 3 GPU-Accelerated Servers • 12 GPUs • 18,432 cores • 4 kWatts • $33,000

"Now You Can Build Google’s $1M Artificial Brain on the Cheap" -Wired
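The scale of the difference between the two systems is easier to see as ratios. The snippet below only divides the figures quoted on the slide; the dictionary keys are names chosen here for illustration.

```python
# Figures as quoted on the slide.
google_brain = {"servers": 1000, "cores": 16000, "kilowatts": 600, "cost_usd": 5_000_000}
stanford_lab = {"servers": 3, "cores": 18432, "kilowatts": 4, "cost_usd": 33_000}

# How many times larger the CPU cluster is on each measure.
ratios = {key: google_brain[key] / stanford_lab[key] for key in google_brain}

for key, value in ratios.items():
    print(f"{key}: {value:.1f}x")

# CUDA cores per GPU in the Stanford system (18,432 cores across 12 GPUs).
cores_per_gpu = stanford_lab["cores"] // 12
```

On these numbers the GPU system draws about 150 times less power and costs about 150 times less. The core counts are listed for completeness only, since a CPU core and a CUDA core are not directly comparable.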


Mobile - More Than Just Phones


UNIFIED ARCHITECTURE

TEGRA K1 – MOBILE SUPER CHIP

MOBILE ARCHITECTURE: Tegra 3 • Tegra 4 • Tegra K1

GPU ARCHITECTURE: Tesla • Fermi • Kepler • Maxwell

BREAKTHROUGH EXPERIENCES

TEGRA TK1


JETSON TK1 DEV KIT

1ST MOBILE SUPERCOMPUTER FOR EMBEDDED SYSTEMS

192 CUDA cores

326 GFLOPS

VisionWorks SDK
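The two headline numbers are linked by simple arithmetic. The sketch below assumes each CUDA core retires one fused multiply-add (counted as 2 floating-point operations) per clock cycle; that assumption is not stated on the slide, and the variable names are illustrative.

```python
cuda_cores = 192
peak_gflops = 326.0

# Assumption: one fused multiply-add per core per cycle, counted as 2 FLOPs.
flops_per_core_per_cycle = 2

# Clock rate implied by the quoted peak throughput.
implied_clock_ghz = peak_gflops / (cuda_cores * flops_per_core_per_cycle)
print(f"Implied GPU clock: {implied_clock_ghz * 1000:.0f} MHz")
```

Under that assumption the quoted peak corresponds to a GPU clock of roughly 850 MHz.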


DIGITAL COCKPIT

EVOLUTION OF COMPUTING IN THE CAR

Tegra 4 • Tegra 3 • Tegra K1

Virtual Cockpit • Autonomous Driving • Infotainment


COMPUTER VISION ON CUDA

Feature Detection / Tracking ~30 GFLOPS @ 30 Hz

Object Recognition / Tracking ~180 GFLOPS @ 30 Hz

3D Scene Interpretation ~280 GFLOPS @ 30 Hz
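These workload figures can be turned into a per-frame compute budget and compared with a device's peak throughput. The sketch assumes the GFLOPS values are sustained rates at 30 frames per second, and it uses the 326 GFLOPS peak quoted for the Jetson TK1 as the reference device; both choices are assumptions made for illustration.

```python
frame_rate_hz = 30
reference_peak_gflops = 326.0  # Jetson TK1 peak, used here as an assumed reference

workloads_gflops = {
    "Feature Detection / Tracking": 30.0,
    "Object Recognition / Tracking": 180.0,
    "3D Scene Interpretation": 280.0,
}

budget = {}
for name, gflops in workloads_gflops.items():
    budget[name] = {
        # Work available to each frame at the given frame rate.
        "gflop_per_frame": gflops / frame_rate_hz,
        # Share of the reference device's peak throughput.
        "fraction_of_peak": gflops / reference_peak_gflops,
    }

for name, row in budget.items():
    print(f"{name}: {row['gflop_per_frame']:.1f} GFLOP/frame, "
          f"{row['fraction_of_peak']:.0%} of peak")
```

On these assumptions, 3D scene interpretation would need most of the reference device's peak throughput, while feature detection would need under a tenth of it. Sustained throughput is normally below peak, so the real headroom would be smaller.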


NIGHT AND DAY DIFFERENCE

Without GPU • With GPU

HTTP://NVIDIA.COM/TRYGRID


Thank You