analog compute-in- memory for inference...© 2019 mythic. all rights reserved. confidential. for...

30
© 2019 MYTHIC. ALL RIGHTS RESERVED. © 2019 MYTHIC. ALL RIGHTS RESERVED. Analog Compute-in- Memory for Inference Mike Henry, CEO & Founder

Upload: others

Post on 05-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Analog Compute-in-Memory for InferenceMike Henry, CEO & Founder

Page 2: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Large Growth Opportunity for AI Inference

2

Semiconductor TAM for AI ($B) – Barclay’s Research

0

5

10

15

20

25

30

2016 2017 2018 2019 2020 2021 2022 2023

Training Data center inference Edge inference

AUTOMOTIVEINDUSTRIAL

AEROSPACE VIDEO ANALYTICS

DATACENTER

CONSUMER

Page 3: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Tough Requirements for Inference

▪ Accuracy requirements dictatemany millions of weights per model

▪ Trillions of operations per second

▪ Real-time, low-latency, with high levels of concurrency

▪ Affordable, scalable, low-power package

3

DATASET SIZE

AC

CU

RA

CY

Traditional ML algorithms/Statistical Learning

Small NN

Medium NN

Large NN

Large models needed for edge cases

Page 4: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED.

On-Chip Compute-and-Memory

4 © 2019 MYTHIC. ALL RIGHTS RESERVED.

Page 5: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Principles of On-Chip Compute-and-Memory

▪ Minimize journey from memory to processing units

▪ Many parallel compute units to maximize throughput and minimize latency

▪ Maximize memory bandwidth for entire set of DNNs

5

CPU

DRAM

L1 L2 L3

Traditional Architecture

Page 6: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Principles of On-Chip Compute-and-Memory

6

Compute-and-Memory Architecture

Mem

CPU

Mem

CPU

Mem

CPU

Mem

CPU

Mem

CPU

Mem

CPU

Mem

CPU

Mem

CPU

. . .

. . .

. . .. . .. . .. . .

▪ Minimize journey from memory to processing units

▪ Many parallel compute units to maximize throughput and minimize latency

▪ Maximize memory bandwidth for entire set of DNNs

Page 7: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Principles of On-Chip Compute-and-Memory

7

Weights are Stationary for Inference

Model A

Model B

Model C Model D

▪ Minimize journey from memory to processing units

▪ Many parallel compute units to maximize throughput and minimize latency

▪ Maximize memory bandwidth for entire set of DNNs

Page 8: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Innovate Faster and Easily Build Complex Applications

8D

ET

EC

TIO

NS

Final Layers

conv1

pool

1

conv3_x

conv4_x conv5_x

conv2_x

▪ Easily trade-off performance, power, latency, accuracy

▪ Enable concurrent and dynamic applications with deterministic execution

▪ Scalability from Very Large to Very Small

Page 9: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Innovate Faster and Easily Build Complex Applications

9

▪ Easily trade-off performance, power, latency, accuracy

▪ Enable concurrent and dynamic applications with deterministic execution

▪ Scalability from Very Large to Very Small

NeuralNetwork A

NeuralNetwork B

FAA 4563

Page 10: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Scalability from Very Large to Very Small

10

GPU

On-Chip Stationary Weights = OK Off-Chip Weights = Major BottlenecksWeight bandwidth = 10-20X data bandwidth

HD Video30 fps

186MB/s

HD Video30 fps

100M WeightsDistributed

20M WeightsDistributed

186MB/s 186MB/s

HD Video30 fps

20MWeights

HD Video30 fps

100MWeights

186MB/s6 GB/s 30 GB/s

CPU

Page 11: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Best-in-Class Performance and Power for Inference

11

Traditional

CPU / GPU

Compute-and-

Memory

Total model capacity ✓

System Simplicity ✓

Low Latency ✓

Energy Efficiency ✓

High Performance ✓

Deterministic Execution ✓

Page 12: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Implementation Challenges in Digital

▪ True compute-in-memory not possible

▪ Large die-area from compute and memory

▪ High power-consumption

▪ High development costs due to state-of-the-art process nodes

12

Page 13: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED.

Enter Analog

13 © 2019 MYTHIC. ALL RIGHTS RESERVED.

Page 14: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Unique Analog Compute-in-Memory

14

▪ Mythic computes directly inside the flash memory array– Weights stored in flash– Inputs provided as voltages– Outputs collected as currents

▪ Result: High-performance, yet amazingly efficient processing

X0

X1

X2

V0

V1

V2

YA YB YC

ADC ADC ADC

DAC

DAC

DAC

Page 15: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Power, Performance, and Latency

15

Incredibly high parallelization and “effective” memory

bandwidth

1000 x 1000vector-matrix product in

a few microseconds

Energy efficiency of4 TOPS/W for convolutional

and dense networks

Accounts for all system aspects including processor control

cores and PCI express

Page 16: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

▪ NOR Flash: Up to 50x denser weight storage

▪ No need to add more compute – the storage is the compute

Analog Compute Gives Us Model Capacity

16

SRAM = 48 transistors per 8-bit weight Analog = 1 flash transistor per 8-bit weight

Page 17: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED.

Rich Roadmap for Analog Compute Improvements

17

▪ More on this later

© 2019 MYTHIC. ALL RIGHTS RESERVED.

Page 18: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

DNN Tile - Ultra-dense Storage + Matrix Multiplication

18

Tiles Connected in a Grid

Expandable Grid of Tiles

Inp

ut

Da

ta

Ac

tiv

ati

on

s

Weight Storage

+Analog Matrix

MultiplierDig

ita

l to

An

alo

g

An

alo

g T

o D

igit

al

Single Tile

Network Connections

SRAM Processor

SIMD Router

Scene Segmentation

ObjectTracking

Camera Enhancement

Page 19: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

AI Inference Processor Landscape

19

On-Chip Compute-and-MemoryTraditional Von-Neuman-like Architectures

AnalogCompute-In-Memory

DigitalCompute-Near-MemorySoC CPU/GPU TPU

(IP)

Mythic is a trademark of Mythic. All other product or company names may be trademarks of their respective owners.

Page 20: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED.

The Mythic IPU

20 © 2019 MYTHIC. ALL RIGHTS RESERVED.

Page 21: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Mythic Initial Product Lineup

21

Mythic PCIe CardsMythic IPU

PCIe x16 FHFLM.2 x4 2280

Mythic IPU x1 Mythic IPU x4 Mythic IPU x16

PCIe x4 HHHL

PCIeSwitchPCIe

Switch

Up to 120M Weights

Page 22: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Low-Power DNN Inference Solution

22

Inference Capability per Watt for Common DNN ProcessorsResNet-50, INT8, and Batch=1

Frames per Second/Watt

Mythic IPU (est)

Habana Goya HL – 100

nVidia Tesla T4

nVidia Jetson Xavier

Qualcomm 835

nVidia Jetson TX2

Page 23: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Mythic SDK - Develop with Latest Networks / Frameworks

23

▪ Post-training quantization▪ Retraining libraries

▪ Graph decomposition and mapping

▪ Code generation

▪ Profiling and logging

▪ Power, speed and memory estimates

Performance Estimator

Runtime API Visualizer▪ Host runtime API and OS drivers

▪ Annotated

Graph Compiler

Optimization Suite

Page 24: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED.

The Next 10 years

24 © 2019 MYTHIC. ALL RIGHTS RESERVED.

Page 25: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

What the Hardware Industry is Used to

25

$ / Transistor (Intel) (Normalized)

130nm 90nm 65nm 45nm 32nm 22nm 14nm 10nm

1

0.1

0.01

2001 2014 ??

$10,000

$1,000

$100

$10

$1

$0

NAND Flash: Cost per GB

1999 2002 2005 2008 2011

Page 26: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Reality: Cost Improvements are Flat-Lining

26

https://wccftech.com/apple-5nm-3nm-cost-transistors/

$-

$1.00

$2.00

$3.00

$4.00

$5.00

$6.00

Cost per 1B Transistors

This should be a huge concern

16nm 10nm 7nm 5nm 3nm

Chip area (mm2) 125.00 87.66 83.27 85.00 85.00

No. of transistors (BU) 3.3 4.3 6.9 10.5 14.1

Gross die per wafer 478 686 721 707 707

Net die per wafer 359.74 512.44 545.65 530.25 509.04

Wafer price ($) 5,912.00 8,389.00 9,965.00 12,500 15,500

Die cost ($) 16.43 16.37 18.26 23.57 30.45

Transistor cost per 1B transistors ($)

4.98 3.81 2.65 2.25 2.16

Page 27: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

NVM Scalability With Analog Compute-In-Memory

27

340

290

240

190

140

90

40

Density Improvements

2009 2012 2013 2015 2016 2018 2020 2021 2022

Core Transistors NAND Flash

340

290

240

190

140

90

40

Density Improvements

2009 2012 2013 2015 2016 2018 2020 2021 2022

Core Transistors NAND Flash

Page 28: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.

Mythic Analog Compute is the Clear Path Forward

28

Mythic Analog Compute-in-Memory Digital

Performance per Dollar

+ 10 years

100x

On-chip Model Capacity

+ 10 years

100x

Performance per Watt

+ 10 years

100x

Page 29: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY.© 2019 MYTHIC. ALL RIGHTS RESERVED.29

Mythic Analog Compute is the Clear Path Forward

Page 30: Analog Compute-in- Memory for Inference...© 2019 MYTHIC. ALL RIGHTS RESERVED. CONFIDENTIAL. FOR INTERNAL USE ONLY. Large Growth Opportunity for AI Inference 2 Semiconductor TAM for

© 2019 MYTHIC. ALL RIGHTS RESERVED.© 2019 MYTHIC. ALL RIGHTS RESERVED.