Seamless Prediction at the Edge Using TensorFlow on FPGAs

Page 1: Seamless Prediction at the Edge Using TensorFlow on FPGAs

©2018 Micron Technology, Inc. All rights reserved. Information, products, and/or specifications are subject to change without notice. All information is provided on an “AS IS” basis without warranties of any kind. Statements regarding products, including regarding their features, availability, functionality, or compatibility, are provided for informational purposes only and do not modify the warranty, if any, applicable to any product. Drawings may not be to scale. Micron, the Micron logo, and all other Micron trademarks are the property of Micron Technology, Inc. All other trademarks are the property of their respective owners.

Seamless Prediction at the Edge Using TensorFlow on FPGAs

Brad Spiers, Principal Solutions Architect

Linley Spring Processor Conference: April 12, 2018

Page 2: Seamless Prediction at the Edge Using TensorFlow on FPGAs

Prediction... At the Edge

▪ Limited Weight, Space and Power

▪ Very Limited External Bandwidth

▪ Cannot Move Data: Must Compute Locally

▪ FPGAs Have Speed, Efficiency & Memory Capability

▪ Now Program FPGAs – with No Code Change!

Page 3: Seamless Prediction at the Edge Using TensorFlow on FPGAs

What are Field Programmable Gate Arrays (FPGAs)?

▪ Unlike a CPU, no Pre-Defined Instructions

▪ Can be Dynamically Reprogrammed

▪ Massive Inherent Parallelism

[Figure: block diagrams comparing CPU, GPU, and FPGA architectures; labels include ALU, Control, and Cache]

Page 4: Seamless Prediction at the Edge Using TensorFlow on FPGAs

Current Customer Challenges

▪ Person and Face Recognition

▪ Body Pose Recognition

▪ Fingerprint Recognition

▪ Voice and Speaker Identification

▪ Object Categorization

▪ Time-Series Pattern Recognition (LSTM-based RNNs)

Page 5: Seamless Prediction at the Edge Using TensorFlow on FPGAs

FWDNXT Performance on FPGAs

From Just 24 Watts to Handle Power Constraints on “The Edge”

Page 6: Seamless Prediction at the Edge Using TensorFlow on FPGAs

FWDNXT’s Approach

▪ Speed up Traces, not Layers

▪ Key Idea: Hide non-essential Work Behind Long Traces

▪ Traces Stretch Across Network Layers

▪ With Long Traces, Bandwidth Becomes Key

Page 7: Seamless Prediction at the Edge Using TensorFlow on FPGAs

FWDNXT Has a Hierarchical Architecture

▪ Hierarchical Memory Design Achieves Efficiency

▪ Hidden, Long Memory Fetches Fill Buffers

▪ Full Buffers Feed Compute Units
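
The idea of hiding long memory fetches behind compute can be illustrated with a simple timing model. This is only a sketch under stated assumptions, not FWDNXT's actual design: work is split into equal tiles, two buffers are available so the next tile can be fetched while the current one is computed, and the function names and time units are hypothetical.

```python
def serial_time(n_tiles: int, fetch: float, compute: float) -> float:
    """Every tile is fetched and then computed, with no overlap."""
    return n_tiles * (fetch + compute)


def overlapped_time(n_tiles: int, fetch: float, compute: float) -> float:
    """Tile i+1 is fetched into a second buffer while tile i is computed."""
    if n_tiles <= 0:
        return 0.0
    # The first fetch and the last compute have nothing to overlap with.
    # Every step in between takes as long as the slower of the two activities.
    return fetch + (n_tiles - 1) * max(fetch, compute) + compute


if __name__ == "__main__":
    # Assumed, illustrative numbers: 10 tiles, arbitrary time units.
    print(serial_time(10, fetch=2, compute=3))
    print(overlapped_time(10, fetch=2, compute=3))  # compute-bound case
    print(overlapped_time(10, fetch=4, compute=3))  # memory-bound case
```

In this model, when a fetch is no slower than a compute step, almost all memory time disappears behind compute. Once fetches take longer than compute, memory bandwidth sets the pace, which is consistent with the point that bandwidth becomes key with long traces.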

Page 8: Seamless Prediction at the Edge Using TensorFlow on FPGAs

Micron Hybrid Memory Cube

April 6, 2018

Low-Power Bandwidth to Feed Long Traces

8.5x more bandwidth than DDR4

70% less energy per bit

How?

▪ Stacked DRAM

▪ Multiple “banks” per layer

▪ “Light up” a smaller bank: less energy
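
The two headline figures can be combined with some simple arithmetic. The sketch below is an illustration only: it assumes "8.5x more bandwidth" means 8.5 times the DDR4 bandwidth, works purely in ratios against an unspecified DDR4 baseline, and uses hypothetical names.

```python
# Ratios taken from the slide, relative to a DDR4 baseline of 1.0.
BANDWIDTH_RATIO = 8.5                # assumption: read as "8.5 times DDR4"
ENERGY_PER_BIT_RATIO = 1.0 - 0.70    # "70% less energy per bit"


def relative_transfer_time(bandwidth_ratio: float = BANDWIDTH_RATIO) -> float:
    """Time to move a fixed amount of data, relative to DDR4."""
    return 1.0 / bandwidth_ratio


def relative_transfer_energy(energy_ratio: float = ENERGY_PER_BIT_RATIO) -> float:
    """Energy to move a fixed amount of data, relative to DDR4."""
    return energy_ratio


def relative_power_at_full_rate(
    bandwidth_ratio: float = BANDWIDTH_RATIO,
    energy_ratio: float = ENERGY_PER_BIT_RATIO,
) -> float:
    """Memory power when both run at peak rate: bits per second times energy per bit."""
    return bandwidth_ratio * energy_ratio


if __name__ == "__main__":
    print(f"time for the same data:   {relative_transfer_time():.3f}x DDR4")
    print(f"energy for the same data: {relative_transfer_energy():.2f}x DDR4")
    print(f"power at peak bandwidth:  {relative_power_at_full_rate():.2f}x DDR4")
```

Under these assumptions, moving the same data costs about 30% of the energy, while running at peak bandwidth draws roughly 2.55 times the DDR4 memory power for 8.5 times the data rate.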

Page 9: Seamless Prediction at the Edge Using TensorFlow on FPGAs

Problem: How to Program FPGAs?

▪ Programming Has Been a Barrier in the Past

− Verilog, HDL --> Months to Deploy

▪ FWDNXT’s Snowflake Compiler & Micron FPGA Modules: ML for IoT

[Figure: Your Network in Your Framework → Network Description → Snowflake Compiler → Micron FPGA Module → Machine Learning at the Edge]

Page 10: Seamless Prediction at the Edge Using TensorFlow on FPGAs

What Model Types Can FWDNXT Handle?

▪ Any Model

− CNN

− RNN

− LSTM

− …

▪ Any Framework

− PyTorch

− Caffe

− TensorFlow

− …

Page 11: Seamless Prediction at the Edge Using TensorFlow on FPGAs

FWDNXT Representations

▪ Now, 16-bit Fixed Point Used for Inputs

▪ Fixed Point: 5-bit Integer, 11-bit Fraction

▪ Moving to 16-bit Floating Point

▪ Now, 32-bit Fixed Point Used for Multiplication Output and Adds

[Figure: Fixed Point Representation]
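
The number formats above can be made concrete with a small sketch. It is an illustration under assumptions, not FWDNXT's implementation: the 5 integer bits are taken to include the sign (two's complement), out-of-range values saturate, the accumulator is a signed 32-bit value with 22 fraction bits, and all names are hypothetical.

```python
FRAC_BITS = 11                       # 11-bit fraction (from the slide)
TOTAL_BITS = 16                      # 5 integer bits + 11 fraction bits
SCALE = 1 << FRAC_BITS               # 2048 steps per unit
Q_MIN = -(1 << (TOTAL_BITS - 1))     # -32768, i.e. -16.0
Q_MAX = (1 << (TOTAL_BITS - 1)) - 1  # 32767, i.e. just under 16.0
ACC_MIN = -(1 << 31)                 # assumed signed 32-bit accumulator
ACC_MAX = (1 << 31) - 1


def to_fixed(x: float) -> int:
    """Quantize a float to 16-bit fixed point, saturating at the range limits."""
    return max(Q_MIN, min(Q_MAX, round(x * SCALE)))


def to_float(q: int, frac_bits: int = FRAC_BITS) -> float:
    """Convert a fixed-point integer with `frac_bits` fraction bits back to float."""
    return q / (1 << frac_bits)


def dot_fixed(xs, ws) -> int:
    """Dot product with 16-bit inputs and a saturating 32-bit accumulator.

    Each 16-bit x 16-bit product carries 22 fraction bits, so the result
    must be read back with frac_bits=2 * FRAC_BITS.
    """
    acc = 0
    for x, w in zip(xs, ws):
        acc += to_fixed(x) * to_fixed(w)
        acc = max(ACC_MIN, min(ACC_MAX, acc))
    return acc


if __name__ == "__main__":
    print(to_fixed(1.0))                 # 2048
    print(to_float(to_fixed(100.0)))     # saturates just below 16.0
    acc = dot_fixed([1.0, 2.0], [0.5, 0.25])
    print(to_float(acc, 2 * FRAC_BITS))  # 1.0
```

Keeping products and sums in the wider format avoids rounding after every multiplication. With these assumptions the representable input range is about -16 to 16 with a step of 1/2048.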

Page 12: Seamless Prediction at the Edge Using TensorFlow on FPGAs

Steps to Deploy Models on FPGAs

1. Define Model in PyTorch, Caffe, or TensorFlow

2. Train Model with Data on GPUs

3. Input Framework-Trained Model into Snowflake Compiler

4. Deploy Snowflake Output Directly onto Micron FPGA Module

NO CODE CHANGE

Page 13: Seamless Prediction at the Edge Using TensorFlow on FPGAs

What New Problems Can We Solve?

[Figure labels: Hybrid Memory Cube; Up to 512GB DDR Footprints; Advanced FPGAs (Xilinx UltraScale+, Intel Stratix 10)]

▪ Some Domains Have Problems that Require Larger Memory Footprints

− Medical Imaging

− Oil Exploration

− Videos

− Government

▪ Need both High-Bandwidth and High-Capacity Memory

▪ Micron FPGA Cards Plus FWDNXT Snowflake Compiler Provide Missing Links

Page 14: Seamless Prediction at the Edge Using TensorFlow on FPGAs

Summary

▪ The Edge Poses Challenges in Power and Bandwidth

▪ FPGAs Can Help, but Programming Was a Challenge—Until Now

▪ Memory Bandwidth Is Now Key to Machine Learning Performance

▪ Plus, Solve Larger Problems on Boards with up to 512GB of Memory

www.micron.com/tensorflow
