![Page 1: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/1.jpg)
©2018 Micron Technology, Inc. All rights reserved. Information, products, and/or specifications are subject
to change without notice. All information is provided on an “AS IS” basis without warranties of any kind.
Statements regarding products, including regarding their features, availability, functionality, or
compatibility, are provided for informational purposes only and do not modify the warranty, if any,
applicable to any product. Drawings may not be to scale. Micron, the Micron logo, and all other Micron
trademarks are the property of Micron Technology, Inc. All other trademarks are the property of their
respective owners.
Seamless Prediction at the Edge Using TensorFlow on FPGAs
Brad Spiers, Principal Solutions Architect
Linley Spring Processor Conference: April 12, 2018
![Page 2: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/2.jpg)
Prediction... At the Edge
▪ Limited Weight, Space and Power
▪ Very Limited External Bandwidth
▪ Cannot Move Data: Must Compute Locally
▪ FPGAs Have Speed, Efficiency & Memory Capability
▪ Now Program FPGAs – with No Code Change!
Micron Confidential
![Page 3: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/3.jpg)
What are Field Programmable Gate Arrays (FPGAs)?
▪ Unlike a CPU, no Pre-Defined Instructions
▪ Can be Dynamically Reprogrammed
▪ Massive Inherent Parallelism
[Figure: block diagrams of CPU (ALUs, Control, Cache), GPU and FPGA]
![Page 4: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/4.jpg)
Current Customer Challenges
▪ Person and Face Recognition
▪ Body Pose Recognition
▪ Fingerprint Recognition
▪ Voice and Speaker Identification
▪ Object Categorization
▪ Time-Series Pattern Recognition (LSTM-based RNNs)
![Page 5: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/5.jpg)
FWDNXT Performance on FPGAs
From Just 24 Watts to Handle Power Constraints on “The Edge”
![Page 6: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/6.jpg)
FWDNXT’s Approach
▪ Speed up Traces, not Layers
▪ Key Idea: Hide non-essential Work Behind Long Traces
▪ Traces Stretch Across Network Layers
▪ With Long Traces, Bandwidth Becomes Key
![Page 7: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/7.jpg)
FWDNXT Has a Hierarchical Architecture
▪ Hierarchical Memory Design Achieves Efficiency
▪ Hidden, Long Memory Fetches Fill Buffers
▪ Full Buffers Feed Compute Units
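The benefit of hiding memory fetches behind compute can be illustrated with a simple timing model. This is a minimal sketch of generic double buffering, not FWDNXT's actual design: the function names, the two-buffer assumption and the fixed per-tile fetch and compute times are all hypothetical.

```python
def serial_time(n_tiles: int, fetch: float, compute: float) -> float:
    """Time when every fetch must finish before its compute starts,
    and the next fetch waits for that compute (no overlap)."""
    return n_tiles * (fetch + compute)


def double_buffered_time(n_tiles: int, fetch: float, compute: float) -> float:
    """Time with two buffers (assumed): while the compute unit drains one
    buffer, the memory system fills the other.

    Only the first fetch is exposed. After that, each step costs the
    slower of fetch and compute, and the last tile's compute finishes
    the run.
    """
    if n_tiles == 0:
        return 0.0
    return fetch + (n_tiles - 1) * max(fetch, compute) + compute


if __name__ == "__main__":
    # Hypothetical numbers: 4 tiles, 3 time units to fetch, 5 to compute.
    print(serial_time(4, 3, 5))           # no overlap
    print(double_buffered_time(4, 3, 5))  # fetches hidden behind compute
```

When fetch time is at most compute time, the fetches disappear from the critical path entirely, apart from the first one. When fetch time exceeds compute time, memory bandwidth sets the pace, which is one reading of the earlier point that bandwidth becomes key.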
![Page 8: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/8.jpg)
Micron Hybrid Memory Cube
April 6, 2018
Low-Power Bandwidth to Feed Long Traces
8.5x more bandwidth than DDR4
70% less energy per bit
How?
▪ Stacked DRAM
▪ Multiple “banks” per layer
▪ “Light up” a smaller bank: less energy
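The two headline ratios can be applied to any measured DDR4 baseline. The sketch below is plain arithmetic on the slide's figures, under two assumptions: "8.5x more bandwidth" is read as 8.5 times the DDR4 bandwidth, and the transfer is purely bandwidth-bound. The function name and the baseline values in the usage lines are hypothetical.

```python
HMC_BANDWIDTH_RATIO = 8.5  # from the slide, read as 8.5x the DDR4 bandwidth (assumption)
HMC_ENERGY_SAVING = 0.70   # from the slide: 70% less energy per bit


def hmc_transfer_estimate(ddr4_seconds: float, ddr4_joules: float):
    """Scale a DDR4 transfer time and energy by the slide's ratios.

    Assumes the same number of bits is moved and that the transfer is
    limited only by memory bandwidth.
    """
    hmc_seconds = ddr4_seconds / HMC_BANDWIDTH_RATIO
    hmc_joules = ddr4_joules * (1.0 - HMC_ENERGY_SAVING)
    return hmc_seconds, hmc_joules


if __name__ == "__main__":
    # Hypothetical baseline: a transfer that takes 8.5 s and 10 J on DDR4.
    seconds, joules = hmc_transfer_estimate(8.5, 10.0)
    print(seconds, joules)
```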
![Page 9: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/9.jpg)
Problem: How to Program FPGAs?
▪ Programming has Been a Barrier in the Past
− Verilog, HDL --> Months to Deploy
▪ FWDNXT’s Snowflake Compiler & Micron FPGA Modules: ML for IoT
[Figure: Your Network, Your Framework → Network Description → Snowflake Compiler → Micron FPGA Module → Machine Learning At the Edge]
![Page 10: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/10.jpg)
What Model Types Can FWDNXT Handle?
▪ Any Model
− CNN
− RNN
− LSTM
− …
▪ Any Framework
− PyTorch
− Caffe
− TensorFlow
− …
![Page 11: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/11.jpg)
FWDNXT Representations
▪ Now, 16-bit Fixed Point Used for Inputs
▪ Fixed Point: 5-bit integer, 11-bit fraction
▪ Moving to 16-bit Floating Point
▪ Now, 32-bit Fixed Point Used for Multiplication Outputs and Adds
Fixed Point Representation
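The 5-bit integer, 11-bit fraction format can be sketched in a few lines. This is an illustration only, not the FWDNXT implementation. It assumes the 5 integer bits include the sign (two's complement), that out-of-range values saturate, and that the accumulator clamps at 32 bits; the slide specifies none of these. All function names are hypothetical.

```python
FRAC_BITS = 11                      # from the slide: 11-bit fraction
WORD_BITS = 16                      # from the slide: 16-bit inputs
SCALE = 1 << FRAC_BITS              # 2048 steps per unit

WORD_MIN = -(1 << (WORD_BITS - 1))  # assumes a signed, two's-complement word
WORD_MAX = (1 << (WORD_BITS - 1)) - 1
ACC_MIN = -(1 << 31)                # 32-bit accumulator, from the slide
ACC_MAX = (1 << 31) - 1


def to_fixed(x: float) -> int:
    """Quantize a float to a 16-bit word with 11 fractional bits.
    Saturation on overflow is an assumption."""
    raw = round(x * SCALE)
    return max(WORD_MIN, min(WORD_MAX, raw))


def to_float(q: int, frac_bits: int = FRAC_BITS) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / (1 << frac_bits)


def dot_fixed(a, b) -> int:
    """Multiply 16-bit words and sum the products in a 32-bit accumulator.
    Each product carries 2 * 11 = 22 fractional bits."""
    acc = 0
    for x, y in zip(a, b):
        acc = max(ACC_MIN, min(ACC_MAX, acc + x * y))
    return acc


if __name__ == "__main__":
    weights = [to_fixed(1.5), to_fixed(-2.0)]
    inputs = [to_fixed(2.0), to_fixed(0.5)]
    acc = dot_fixed(weights, inputs)
    print(to_float(acc, 2 * FRAC_BITS))  # 1.5 * 2.0 + (-2.0) * 0.5
```

Under these assumptions the representable input range is -16 up to just under 16, in steps of 1/2048. The wider accumulator matters because the product of two such words needs up to 32 bits.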
![Page 12: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/12.jpg)
Steps to Deploy Models on FPGAs
1. Define Model in PyTorch, Caffe or TensorFlow
2. Train Model with Data on GPUs
3. Input Framework-Trained Model into Snowflake Compiler
4. Deploy Snowflake Output Directly onto Micron FPGA Module
NO CODE CHANGE
![Page 13: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/13.jpg)
What New Problems Can We Solve?
[Figure: Hybrid Memory Cube; Up to 512GB in DDR Footprints; Advanced FPGAs (Xilinx UltraScale+, Intel Stratix 10)]
▪ Some Domains Have Problems that Require Larger Memory Footprints
− Medical Imaging
− Oil Exploration
− Videos
− Government
▪ Need both High-Bandwidth and High-Capacity Memory
▪ Micron FPGA Cards Plus FWDNXT Snowflake Compiler Provide Missing Links
![Page 14: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/14.jpg)
Summary
▪ The Edge Poses Challenges in Power and Bandwidth
▪ FPGAs Can Help, but Programming Was a Challenge—Until Now
▪ Memory Bandwidth now Key to Machine Learning Performance
▪ Plus, Solve Larger Problems on Boards with up to 512GB of Memory
www.micron.com/tensorflow
![Page 15: Seamless Prediction at the Edge Using TensorFlow on FPGAs](https://reader031.vdocuments.us/reader031/viewer/2022020621/61eab6ae32f85d3d184dbf92/html5/thumbnails/15.jpg)