supercharge performance using gpus in the cloud

Supercharge performance using GPUs in the Cloud John Barrus GPU Product Manager

Upload: etcenter

Post on 11-Apr-2017

TRANSCRIPT

Page 1: Supercharge performance using GPUs in the cloud

Supercharge performance using GPUs in the Cloud
John Barrus
GPU Product Manager

Page 2: Supercharge performance using GPUs in the cloud

#NABShow

Agenda

● Why GPUs?

● GPUs for Google Compute Engine

● No more HWOps!

● Provision a GPU instance

● Looking ahead: Remote Workstations for animation and production

Page 3: Supercharge performance using GPUs in the cloud

Linear Algebra

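Example calculation: $b_1 = a_{11} x_1 + a_{12} x_2 + \dots + a_{1n} x_n$

Multiply each $a_{ij} x_j$ in parallel: $n^2$ parallel threads.

To calculate each $b_i$ you must gather $n$ results: $n$ parallel threads.

$$
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}
=
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
$$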
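The two phases on this slide can be sketched in a few lines of Python. This is an illustration only: it runs sequentially on a CPU, and the function name `matvec_two_phase` is made up for this example. The point is that every product in phase 1, and every row sum in phase 2, is independent of the others, which is what lets a GPU spread the work across threads.

```python
def matvec_two_phase(a, x):
    """Compute b = A x in the two phases described on the slide."""
    n = len(x)
    # Phase 1: all n*n products a[i][j] * x[j]. Each product is independent,
    # so a GPU could assign one thread per product.
    products = [[a[i][j] * x[j] for j in range(n)] for i in range(n)]
    # Phase 2: gather the n products of each row into b[i]. The n row sums
    # are independent of each other as well.
    return [sum(row) for row in products]


b = matvec_two_phase([[1, 2], [3, 4]], [5, 6])
print(b)  # [17, 39]
```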

Page 4: Supercharge performance using GPUs in the cloud

CPU vs. GPU

                    Intel® Xeon® Processor E7-8890 v4 CPU   NVIDIA K80 GPU (per GPU)   AMD S9300x2 GPU (per GPU)           NVIDIA P100 GPU
Cores               24 (48 threads)                         2496 stream processors     4096 stream processors              3584 stream processors
Memory Bandwidth    85 GBps                                 240 GBps                   512 GBps                            732 GBps
Frequency (boost)   2.2 (3.4) GHz                           562 MHz (875 MHz)          850 MHz                             1.13 (1.30) GHz
Other               -                                       -                          FP16 support for machine learning   FP16 support for machine learning

Page 5: Supercharge performance using GPUs in the cloud


Page 6: Supercharge performance using GPUs in the cloud


Page 7: Supercharge performance using GPUs in the cloud
Page 8: Supercharge performance using GPUs in the cloud
Page 9: Supercharge performance using GPUs in the cloud
Page 10: Supercharge performance using GPUs in the cloud
Page 11: Supercharge performance using GPUs in the cloud

AMBER Simulation of CRISPR

AMBER 16 Pre-release, CRISPR based on PDB ID 5f9r, 336,898 atoms
CPU: Dual Socket Intel E5-2680v3 12 cores, 128 GB DDR4 per node, FDR IB

Page 12: Supercharge performance using GPUs in the cloud

GPU Computing has reached a tipping point...

Page 13: Supercharge performance using GPUs in the cloud

Computing with GPUs

● Machine Learning Training and Inference - TensorFlow
● Frame Rendering and image composition - V-Ray by Chaos Group
● Physical Simulation and Analysis (CFD, FEM, Structural Mechanics)
● Real-time Visual Analytics and SQL Database - MapD
● FFT-based 3D Protein Docking - MEGADOCK
● Faster than real-time 4K video transcoding - Colorfront Transkoder
● Open Source Video Transcoding - FFmpeg, libav
● Open Source Sequence Mapping/Alignment - BarraCUDA
● Subsurface Analysis for the Oil & Gas industry - Reverse Time Migration
● Risk Management and Derivatives Pricing - Computational Finance

Workloads that require compute-intensive processing of massive amounts of data can benefit from the parallel architecture of the GPU.

Page 14: Supercharge performance using GPUs in the cloud

Use hundreds of K80 GPUs to ray-trace massive models in real-time in Google Cloud.

V-Ray is Academy Award-winning software optimized for photorealistic rendering of imagery and animation. V-Ray’s ray tracing technology is used in multiple industries – from Architecture to Visual Effects. V-Ray RT GPU is built to scale in the Google Cloud, offering an exponential increase in speed to benefit individual artists and designers, as well as the largest studios and firms.

V-Ray by Chaos Group

Page 15: Supercharge performance using GPUs in the cloud

"Scalability—inherent in modern V-Ray GPU raytrace rendering on NVIDIA K80s and in conjunction with cloud rendering on GCP—enables real time interaction with complex photorealistic scenes. It's on GCP where I've seen the dawn of this ideal creative workflow which will certainly have tremendous benefits to the filmmaking community in years to come."

— Kevin Margo, Director, blur studio

Page 16: Supercharge performance using GPUs in the cloud

Real-time visual analytics - MapD

Using the parallel processing power of GPUs, MapD has crafted a SQL database and visual analytics layer capable of querying and rendering billions of rows with millisecond latency

Page 17: Supercharge performance using GPUs in the cloud

Software optimized for the fastest hardware

MapD Core: An in-memory, relational, column store database powered by GPUs. 100x Faster Queries.

MapD Immerse: A visual analytics engine that leverages the speed + rendering capabilities of MapD Core. Speed of Thought Visualization.

Page 18: Supercharge performance using GPUs in the cloud

GPUs on GCP

On Feb 21st, Google Cloud Platform introduced K80 GPUs in the US, Europe and Asia.

NVIDIA Tesla K80s AMD FirePro S9300 x2 NVIDIA Tesla P100

Page 19: Supercharge performance using GPUs in the cloud

Accelerated cloud computing

● GCP offers teraflops of performance per instance by attaching GPUs to virtual machines

● Machine learning, engineering simulations, and molecular modeling will take hours instead of days on AMD FirePro and NVIDIA Tesla GPUs

● Regardless of the size and scale of your workload, GCP will provide you with the perfect GPU for your job

● Scientists, artists, and engineers who run compute-intensive jobs require access to massively parallel computation

Page 20: Supercharge performance using GPUs in the cloud

Up to 8 GPUs per Virtual Machine

On any VM shape with at least 1 vCPU, you can attach 1, 2, 4 or 8 GPUs along with up to 3 TB of Local SSD.

GPUs are now available in 4 regions, including us-west1

Page 21: Supercharge performance using GPUs in the cloud

Features

Bare Metal Performance
• GPUs are offered in passthrough mode to provide bare metal performance

Attach GPUs to Any Machine Type
• You can mix and match different GCP compute resources, such as vCPUs, memory, local SSD, GPUs and persistent disk, to suit the needs of your workloads

Flexible GPU Counts Per Instance
• Attach up to 8 GPU dies to your instance to get the power that you need for your applications

Page 22: Supercharge performance using GPUs in the cloud

Why GPUs in the Cloud?

Speed Up Complex Compute Jobs

● Offers the breadth of GPU capability for speeding up compute-intensive jobs in the Cloud as well as for the best interactive graphics experience with remote workstations

GPUs in the Cloud Optimize Time and Cost

● No capital investment
● Custom machine types: Configure an instance with exactly the number of CPUs, GPUs, memory and local SSD that you need for your workload
● Thanks to per minute pricing, you can choose the GPU that best suits your needs and pay only for what you use

Page 23: Supercharge performance using GPUs in the cloud

K80 Pricing (Beta)

Location   SKU                      On demand price per GPU / hour (USD)
US         GpuNvidiaTeslaK80        $0.700
Europe     GpuNvidiaTeslaK80_Eu     $0.770
Asia       GpuNvidiaTeslaK80_Apac   $0.770

Billed in per-minute increments with a 10-minute minimum.
2 GPUs per board, up to 4 boards / 8 GPUs per VM.
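As a rough illustration of the billing rule on this slide, the sketch below estimates the GPU-only charge for one VM run. The prices and the 10-minute minimum come from the pricing slide. The function name `k80_gpu_cost` and the choice to round a partial minute up to a whole minute are assumptions made for this example, and the charge for the VM itself (vCPUs, memory, disk) is not included.

```python
import math

# On-demand K80 prices per GPU per hour (USD), from the pricing slide.
K80_PRICE_PER_GPU_HOUR = {"US": 0.700, "Europe": 0.770, "Asia": 0.770}
MINIMUM_BILLED_MINUTES = 10


def k80_gpu_cost(runtime_minutes, gpu_count, location="US"):
    """Estimate the GPU-only charge for one VM run.

    Assumption: a partial minute is rounded up to a whole minute.
    """
    billed_minutes = max(MINIMUM_BILLED_MINUTES, math.ceil(runtime_minutes))
    hourly_rate = K80_PRICE_PER_GPU_HOUR[location] * gpu_count
    return hourly_rate * billed_minutes / 60


# A 3-minute run with 1 GPU is billed as 10 minutes.
print(f"{k80_gpu_cost(3, 1):.4f}")             # 0.1167
# A 90-minute run with 8 GPUs in Europe.
print(f"{k80_gpu_cost(90, 8, 'Europe'):.2f}")  # 9.24
```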

Page 24: Supercharge performance using GPUs in the cloud

Cloud GPUs - no need to worry about...

...system research

...upfront hardware purchase and shipping

...physical space and racks

...assembly and test

...hardware failures and debugging

...power and cooling

Page 25: Supercharge performance using GPUs in the cloud

Provision a GPU instance using the console

https://console.cloud.google.com/

Page 26: Supercharge performance using GPUs in the cloud

Choose “Customize”

Page 27: Supercharge performance using GPUs in the cloud

Click on “GPUs”

Page 28: Supercharge performance using GPUs in the cloud

Choose the number of GPUs desired

Page 29: Supercharge performance using GPUs in the cloud

Press “Create”

Page 30: Supercharge performance using GPUs in the cloud

Provisioning a GPU instance

gcloud beta compute instances create gpu-instance-1 \
    --machine-type n1-standard-16 \
    --zone asia-east1-a \
    --accelerator type=nvidia-tesla-k80,count=2 \
    --image-family ubuntu-1604-lts \
    --image-project ubuntu-os-cloud \
    --maintenance-policy TERMINATE \
    --restart-on-failure \
    --metadata startup-script='#!/bin/bash
    echo "Checking for CUDA and installing."
    # Check for CUDA and try to install.
    if ! dpkg-query -W cuda; then
      curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
      apt-get update
      apt-get install cuda -y
    fi'

Page 31: Supercharge performance using GPUs in the cloud

High performance GPUs do not support Live Migration

GPUs offered in high-performance “pass-through” mode—VM owns the entire GPU

It’s not possible to migrate the state and contents of the GPU chip and memory.

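VMs with attached GPUs must have their host maintenance policy set to terminate (“terminateOnHostMaintenance”), for example with --maintenance-policy TERMINATE in gcloud.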

One hour notice is provided for the system to checkpoint and save state to be restored.

Page 32: Supercharge performance using GPUs in the cloud

VM metadata provides notice

curl http://metadata.google.internal/computeMetadata/v1/instance/maintenance-event \
    -H "Metadata-Flavor: Google"

Returns either “NONE” or a timestamp at which time your instance will be forcefully terminated.

See: https://cloud.google.com/compute/docs/gpus/add-gpus#host-maintenance
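A job could watch for this notice by polling the same metadata URL. The sketch below is one possible way to do that with the Python standard library, not an official client. The function names, the 60-second interval, and the rule "anything other than NONE means a termination is pending" are assumptions for this example. The fetch only works from inside a Compute Engine VM, so the fetch and sleep functions are injectable for testing elsewhere.

```python
import time
import urllib.request

METADATA_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/maintenance-event"
)


def fetch_maintenance_event():
    """Read the maintenance-event value (only works from inside a GCE VM)."""
    request = urllib.request.Request(
        METADATA_URL, headers={"Metadata-Flavor": "Google"}
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.read().decode("utf-8").strip()


def maintenance_pending(value):
    """Assumption: any value other than "NONE" means termination is pending."""
    return value.strip().upper() != "NONE"


def wait_for_maintenance(fetch=fetch_maintenance_event, interval_s=60,
                         sleep=time.sleep, max_polls=None):
    """Poll until an event is announced and return its value.

    Returns None if max_polls is reached without seeing an event.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        value = fetch()
        if maintenance_pending(value):
            return value
        polls += 1
        sleep(interval_s)
    return None
```

When `wait_for_maintenance()` returns a value, the job has the notice period described on the previous slide to checkpoint and save its state.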

Page 33: Supercharge performance using GPUs in the cloud

TensorFlow Supervisor

https://www.tensorflow.org/programmers_guide/supervisor

● Handles shutdowns and crashes cleanly.

● Can be resumed after a shutdown or a crash.

● Can be monitored through TensorBoard.
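The resume-after-shutdown idea can be shown without TensorFlow. The sketch below is not the Supervisor API; it is a plain-Python stand-in with hypothetical names (`save_checkpoint`, `load_checkpoint`, `train`) and a JSON file as the checkpoint format. It shows a loop that saves progress periodically, stops cleanly when asked, and picks up from the last saved step on the next start.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: write a temp file, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh one if no checkpoint exists."""
    if os.path.exists(path):
        with open(path) as handle:
            return json.load(handle)
    return {"step": 0}


def train(path, total_steps, checkpoint_every=100,
          stop_requested=lambda: False):
    """Run from the last checkpoint; save and stop early when asked to."""
    state = load_checkpoint(path)
    while state["step"] < total_steps:
        if stop_requested():
            break
        state["step"] += 1  # stand-in for one real training step
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["step"]
```

In practice `stop_requested` could be wired to a check of the maintenance-event metadata value, so that the job saves its state as soon as a termination is announced.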

Page 34: Supercharge performance using GPUs in the cloud

Rendering on a GPU farm in the Cloud
Adrian Graham
Cloud Solutions Architect

Page 35: Supercharge performance using GPUs in the cloud

Render pipeline requirements

■ Remote workstation with sufficient CPU, GPU and memory.
■ Project-based cloud storage.
■ Interactive and render licenses served on cloud, or from on-premises.
■ Color-accurate display capability.
■ As many render workers as possible.

Page 37: Supercharge performance using GPUs in the cloud

Demo video

Page 38: Supercharge performance using GPUs in the cloud

Architecture: Using display and compute GPUs

[Architecture diagram. Labels on the on-premise side: On-premise infrastructure, Asset Management Database, File Server, Zero Client, Users & Admins. Labels on the cloud side: Remote Desktop (Compute Engine), License Server (Compute Engine), Instance Group of multiple Render Instances (Compute Engine), Assets (Cloud Storage), Users (Cloud IAM), Cloud Directory Sync. Connection labels: Teradici PCoIP; APIs: gcloud, gsutil, ssh, rsync, etc.]

Page 39: Supercharge performance using GPUs in the cloud

Creating a workstation

For this job, we needed to run project-specific software (Autodesk 3DS Max) that only runs on Windows.

# Create a workstation.
gcloud compute instances create "remote-work" \
    --zone "us-central1-a" \
    --machine-type "n1-standard-32" \
    --accelerator [type=,count=1] \
    --can-ip-forward --maintenance-policy "TERMINATE" \
    --tags "https-server" \
    --image "windows-server-2008-r2-dc-v20170214" \
    --image-project "windows-cloud" \
    --boot-disk-size 250 \
    --no-boot-disk-auto-delete \
    --boot-disk-type "pd-ssd" \
    --boot-disk-device-name "remote-work-boot"

1 (--zone) Choose from zones in us-east1, us-west1, europe-west1, and asia-east1.

2 (--accelerator) Choose type and number of attached GPUs.

3 (--image) Attach a GPU to an instance with any public image.

Page 40: Supercharge performance using GPUs in the cloud

Creating a render worker

We'll be interacting with Windows, but our render workers will be running CentOS 7. Here, we build a base image to deploy.

# Create a render worker.
gcloud compute instances create "vray-render-base" \
    --zone "us-central1-a" \
    --machine-type "n1-standard-32" \
    --accelerator type="nvidia-tesla-k80",count=4 \
    --maintenance-policy "TERMINATE" \
    --image "centos-7-v20170227" \
    --boot-disk-size 100 \
    --no-boot-disk-auto-delete \
    --boot-disk-type "pd-ssd" \
    --boot-disk-device-name "vray-render-base-boot"

1 (--zone) We will keep the render workers in the same zone for maximum throughput.

2 (--no-boot-disk-auto-delete) Once the instance is set up to our liking, we will delete the instance, leaving the disk.

Page 41: Supercharge performance using GPUs in the cloud

Deploying an instance group

Once we have our base Linux image, we create an instance template which we can deploy as part of a managed instance group.

# Create the image.
gcloud compute images create "vrayrt-cent7-boot" \
    --source-disk "vray-render-base-boot" \
    --source-disk-zone "us-central1-a"

# Create the template.
gcloud compute instance-templates create \
    "vray-render-template" \
    --image "vrayrt-cent7-boot" \
    --machine-type "n1-standard-32" \
    --accelerator type="nvidia-tesla-k80",count=4 \
    --maintenance-policy "TERMINATE" \
    --boot-disk-size 100 \
    --boot-disk-type "pd-ssd" \
    --restart-on-failure \
    --metadata startup-script='#! /bin/bash
runuser -l adriangraham -c "/usr/ChaosGroup/V-Ray/Standalone_for_linux_x64/bin/linux_x64/gcc-4.4/vray -server -portNumber 20207"'

1 (startup-script) On boot, we need each worker to launch the V-Ray Server command.

Page 42: Supercharge performance using GPUs in the cloud

Release the hounds!

Managed instance groups can be deployed quickly, based on an instance template.

This group will launch 32 instances, respecting constraints such as quota and IAM roles at the project and organization levels.

# Launch a managed instance group.
gcloud compute instance-groups managed create \
    "vray-render-grp" \
    --base-instance-name "vray-render" \
    --size 32 \
    --template "vray-render-template" \
    --zone "us-central1-a"

Page 43: Supercharge performance using GPUs in the cloud

Useful commands

Once running, it's helpful to be able to access the state of your instances, manage the group's size, or even deploy an updated instance template.

# Listen to output from the serial port.
gcloud compute instances \
    tail-serial-port-output \
    vray-render-tk43  # ← name of managed instance

# Reduce size of instance group.
gcloud compute instance-groups managed \
    resize --size=16 "vray-render-grp"

# Kill all instances.
gcloud compute instance-groups managed \
    delete "vray-render-grp"

Page 44: Supercharge performance using GPUs in the cloud

Summary

● K80 GPUs available today on Google Cloud
● Scale up easily and quickly
● S9300x2 and P100s coming soon

Go to https://cloud.google.com/gpu to provision GPUs on Google’s Cloud today!

Page 45: Supercharge performance using GPUs in the cloud

Thank you