algorithms+and+specializers+for+provably+op:mal+ ... · pipe&filter map-reduce computational...

41
A lgorithms and S pecializers for P rovably Op:mal I mplementa:ons with R esiliency and E fficiency Krste Asanović, Elad Alon, Jonathan Bachrach, Jim Demmel, Armando Fox, Kurt Keutzer, Borivoje Nikolić, David Pa@erson, Koushik Sen, Vladimir Stojanović, John Wawrzynek [email protected] http:// aspire.eecs.berkeley.edu ASPIRE End of Project Party UC Berkeley December 5, 2017

Upload: others

Post on 28-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

Algorithms+and+Specializers+for+Provably+Op:mal+Implementa:ons+with+Resiliency+and+Efficiency+

Krste&Asanović,&Elad&Alon,&&Jonathan&Bachrach,&Jim&Demmel,&Armando&Fox,&Kurt&Keutzer,&Borivoje&Nikolić,&David&Pa@erson,&

Koushik&Sen,&Vladimir&Stojanović,&John&Wawrzynek&[email protected]&

http://aspire.eecs.berkeley.edu&ASPIRE&End&of&Project&Party&

UC&Berkeley&December&5,&2017&

Page 2: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley It#all#began#back#at#end#of#2011…#

! We&were&searching&for&what&would&come&aRer&Par&Lab&(2008V2013)&

Page 3: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

3/56&

Today’s#so7ware#paradigm:#“Apps”#

Page 4: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

4/56&

What’s#the#problem:#FuncEonality#Encased#in#App#Silos#

Client

Data

Cloud

Client +

Cloud App 1

Client +

Cloud App 2

Client +

Cloud App N

Each application has a different: user interface, account to register, password, software updates, payment model, different subset of supported devices…

This model cannot scale to

hundreds of devices per

person!

Each task involves the manual invocation of a manually scheduled series of applications

Page 5: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

5/56&

What#we#want#to#do#in#future:#Interact#with#Intelligent#Personal#Assistant#

!  Have&a&dialogue&with&assistant&who&you&can&give&high&level&goals&vs.&simple&steps,&who&remembers&past&tasks&and&can&find&informa`on&for&you&

Like Ms. Potts in Iron Man

Or like Jeeves in P.G. Wodehouse’s Aunts Aren’t Gentlemen

Page 6: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

6/56&

PostIParLab#Vision#

! Goal: “Computing continuum”: users’ perception is always connected, always available

!  From mobile client " sensors + mobile client + cloud

!  From GUI " Virtual Assistant via Natural User Interfaces

!  From a giant menu of applications " set of integrated services

!  From local-client scheduling " intelligent Client+Cloud partitioning

!  Enable productive development of an integrated set of services that meet end-user requirements of speed, privacy, and reliability through intelligent use of client and cloud computing.

Page 7: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley Sponsor’s#Feedback:#

“Work&on&EnergyVEfficient&Compu`ng”&

Page 8: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley Upheaval#in#Computer#Design#! Most&of&last&50&years,&Moore’s&Law&ruled&!  Last&decade,&technology&scaling&slowed&- Dennard&scaling&over&(supply&voltage&~fixed)&- Moore’s&Law&(cost/transistor)&over?&- No&compe``ve&replacement&for&CMOS&any`me&soon&

! Users’&imagina`on&unlimited,&crea`ng&new&useful&applica`ons&with&endlessly&growing&compute&demands&

!  Energy&efficiency&constrains&everything&- Mobile&- HighVperformance&embedded&compu`ng&- WarehouseVscale&compu`ng&

! Parallel&compu`ng&going&mainstream,&but&only&oneV`me&improvement&in&energy&efficiency&

! What&next?&

Page 9: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

EECS Electrical Engineering and

Computer Sciences BERKELEY PAR LAB

ASIC/SoC

UCB HW/SW Specializer Stack

9

Audio Recognition

Object Recognition Applications/Domains

FPGA Computer

C++ Simulation

Communication-Avoiding Algorithms

DenseC-A GEMM

Core and Interconnect

PatternsHardware-pattern-specific resiliency

FPGA Emulation

Resilient VregfileVector/GPUVector/GPUVector/GPU

Hardware Target Technologies

Softw

are

Hard

war

e

Pipe&Filter Map-ReduceComputational and Structural Patterns

Scene Analysis

Idempotent restartsGraph EngineGraph EngineGraph Engine

Hardware Cache CoherenceLocal Stores + DMA

GraphC-A BFS

Retry/ReplaySuperscalarSuperscalarSuperscalar

Hardware Specializers using Chisel HDL

Validation/Verification/Modeling

SparseC-A SpMV

Specializers with SEJITS Implementations and Autotuning

Deep HW/SW Design-Space

Exploration

… …

DARPA PERFECT Proposer’s Day Feb 9, 2012

Page 10: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley

[Graph+from+“Advancing+Computers+without+Technology+Progress”,+Hill,+Kozyrakis,+et+al.,++DARPA+ISAT+2012+]++

ASPIRE#Goal#Keep+computers’+performance+and+energy+efficiency+improving+past+end+of+CMOS+transistor+scaling+un:l+

new+switch+technology+deployed+

Page 11: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley ASPIRE#Explained#

! Algorithms&with&Provably&minimal&data&movement&!  Specializa`on&of&soRware&and&hardware&based&on&recurring&pa@erns&of&computa`on&and&communica`on&

! Autotuning&&&designVspace&explora`on&of&soRware&&&hardware&Implementa`ons&for&best&Efficiency&

! Resiliency&from&holis`c&wholeVstack&design&

Page 12: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley

Algorithms0and0Specializers0for0Provably0Op=mal0Implementa=ons0with0Resiliency0and0Efficiency0

SoRware&Tools&

Applica`ons&

Algorithms&

Architecture&

Circuits&

Simula`

on&and

&Mod

eling&

12

Page 13: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley ASPIRE#System#Targets#

Systems& Racks&(10s&kW)& Embedded&(10&kWV1W)& Mobiles&(1W)&

#

13

Page 14: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley

Performan

ce/Ene

rgy/Error#

SimulaE

on#and

#Mod

eling#

&Chisel&Pa@ernflow&Hardware&Pa@erns&

Chisel&HDL&

SoRware&Radio&

Computer&Vision&

Machine&Learning&

Cancer&Genomics&

Graph&Processing&

Computa`onal&and&&Structural&Pa@erns&Communica`onVAvoiding&Algorithms&

Produc`vity&Languages&(Python,&Scala)&with&Pa@ern&Frameworks&

COTS&CPU/GPU&

Vendor&Compilers&

ESP:&Ensembles&of&Specialized&Processors&

Hurricane&Spa`al&Compu`ng&Fabric& FPGA&

COTS&Tools&

Racks&(10s&kW)& Mobiles&(1W)&

Memories,&Interconnects,&I/O&

ESP&LLVM&Compiler&

ASIC&Archite

cture&

Systems&

Efficiency&

Layer&

Produc`vity&

Layer&

Algorithms&

Applica`ons

&

Pa@ernVSpecific&VMs&

Efficiency&Languages&(C++,&CUDA/OpenCL,&JVM)&

Pa@ern&Specializers&(SEJITS,&Spade)&

EnergyVEfficient&Resilient&Circuit&Design&Circuit

s&

Mul`media&Analysis&

Run`mes,&OS,&Hypervisor,&Cluster&Manager&OS&

Embedded&(10&kWV1W)&

Interac`ve&Cloud&

ASPIRE#Lasagna#

14

Page 15: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley Driving#ApplicaEons#SoRware&Radio&

Computer&Vision&

Machine&Learning&

Cancer&Genomics&

Graph&Processing&

Mul`media&Analysis&

Interac`ve&Cloud&

Applica`ons

&

15

Squeezdet&exceeded&our&DARPA&PhaseVIII&goals&from&2012&proposal:&57&fps&at&1.4&J/frame&&(>2000&improvement&in&efficiency)&

&&

Page 16: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley CommunicaEonIAvoiding#Algorithms#

Computa`onal&and&&Structural&Pa@erns&Applica`ons

&

CommunicaEonIAvoiding#Algorithms#Algorit

hms&

SoRware&Radio&

Computer&Vision&

Machine&Learning&

Cancer&Genomics&

Graph&Processing&

Mul`media&Analysis&

Interac`ve&Cloud&

16

Idea##1:&read&a&piece&of&a&sparse&matrix&(=&graph)&into&fast&memory&and&take&mul`ple&steps&of&higherVlevel&algorithm&

Idea##2:&replicate&data&(including&leRVhand&side&arrays,&as&in&C&in&C=A*B)&and&compute&par`al&results,&reduce&later&&

Seemingly&endless&“Best&Paper”&and&other&awards&for&this&work,&due&to&very&real&speedups&on&wellVstudied&kernels&

&&

Page 17: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley CompuEng#with#Specialized#Hardware?#Well&known,&for&some&applica`ons,&custom&hardware&100V1000x&performance&and&energy&efficiency&over&generalVpurpose&processors,&but&how&to&deploy?&!  Custom#chip#per#applicaEon#(ASIC)?#

-  NRE&costs&exploding,&~$50V100M&for&cutngVedge&ASIC&-  Development&`me&too&long,&months&or&years,&high&risk&-  Number&of&ASIC&starts&dropping&over&`me&

!  Custom#“Chiplets”?#(aka#Lego,#Bricks&Mortar,#MoChi,…)#-  i.e.,&small&custom&chips&connect&to&standard&parts&as&SystemVinVPackage&-  No&industry&standards,&packaging&cost,&yield&-  Data&movement&energy/performance&costs&

!  Standard#programmable#logic#parts,#FPGAs?#-  Lower&NRE&and&risk&-  Historically,&poor&energy&efficiency&and&terrible&programming&produc`vity&

!  Standard#programmable#SystemsIonIChip#(SoC)?#-  Common&for&both&server&and&mobile&plauorms&today&-  Specialized&hardware&as&heterogeneous&accelerators&

&

Page 18: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley EvaluaEng#Hardware#Choices#

CPU& FPGA&

Memories,&Interconnects,&I/O&

ASIC&Archite

cture& GPU&

SoRware&Radio&

Computer&Vision&

Machine&Learning&

Cancer&Genomics&

Graph&Processing&

Applica`ons

& Mul`media&Analysis&

Interac`ve&Cloud&

Racks&(10s&kW)& Mobiles&(1W)&System

s& Embedded&(10&kWV1W)&

What&to&target?&What&are&produc`vity/cost/performance&tradeoffs?&

18

Page 19: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley

Chisel:#ConstrucEng#Hardware#In#a#Scala#Embedded#Language#

!  Embed&hardwareVdescrip`on&language&in&Scala,&using&Scala’s&extension&facili`es:&Hardware&module&is&just&data&structure&in&Scala&

!  Different&output&rou`nes&generate&different&types&of&output&(C,&FPGAVVerilog,&ASICVVerilog)&from&same&hardware&representa`on&

!  Full&power&of&Scala&for&wri`ng&hardware&generators&-  ObjectVOriented:&Factory&objects,&traits,&overloading&etc&-  Func`onal:&HigherVorder&funcs,&anonymous&funcs,&currying&-  Compiles&to&JVM:&Good&performance,&Java&interoperability&

Chisel Program

C++ code FPGA

Verilog ASIC Verilog

Software Simulator

C++ Compiler

Scala/JVM

FPGA Emulation

FPGA Tools

GDS Layout

ASIC Tools

Chisel&3.0/&FIRRTL&1.0&released.&&Extern

al&industry&+&

academia&uptake&accelera`ng.&

&

Page 20: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley RISCIV#ISA#

www.riscv.org ! A&new&completely&free&and&open&ISA&&- Already&GCC,&Linux,&glibc,&LLVM,&upstreamed!&- RV32,&RV64,&and&RV128&variants&for&32b,&64b,&and&128b&address&spaces&defined&

! Base&ISA&only&<50&integer&instruc`ons,&but&supports&compiler,&linker,&OS,&etc.&

!  Extensions&provide&full&generalVpurpose&ISA,&including&IEEEV754/2008&floa`ngVpoint&

! Comparable&ISAVlevel&metrics&to&other&RISCs&! Designed&for&extension,&customiza`on&!  Fourteen&64Vbit&silicon&prototype&implementa`ons&completed&at&Berkeley&so&far&(45nm,&28nm,&16nm)&

20

RISCVV&experiencing&strong&momentum&as&legi`mate&

alterna`ve&to&other&proprietary&ISAs.&

70+&companies&are&members&of&RISCVV&Founda`on&

Nvidia&&&Western&Digital&commi@ed&to&using&RISCVV.&

Linley&group&“Best&Technology&in&2016

”.&

Becoming&the&standard&ISA&for&educa`on.&

&

Page 21: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley Rocket#Chip#Generator#

21

RocketTile

Rocket L1I$

L1D$

RoCCAccel.

CSRFile

L1 Network

L2$ Bank

RocketTile

Rocket L1I$

L1D$

RoCCAccel.CSR

File

JTAGDebug

L2 Network

Cache-CoherentDevice

L2$ BankL2$ Bank

L2$ BankCache-CoherentDevice

TileLink/AXI4Bridge

TileLink/AXI4Bridge

AXI4 Crossbar

DRAMController

DRAMController

High-Speed

IO Device

High-Speed IO

Device

AXI4/AHBBridge

AHB-Lite Bus

Z-scale

Low-Speed

IO DeviceScratch

PadSCRFile

AHB/APBBridge

APB Bus

Peripheral Peripheral Peripheral

1. Change Parameters 2. Develop New Accelerators 3. Develop Own RISC-V Core 4. Develop Own Device

OpenVsource&Rocket&Chip&generator&w

idely&used&in&

both&academia&and&industry.&Rocket&Chip&being&

used&in&several&products.&

&

Page 22: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley Agile#Hardware#Development#

22

Specification

Design

Implementation

Verification

Tape-inSpecification

DesignImplementation

C++

FPGA

ASIC flow

Tape-in

Small tape-out

Big tape-out

Spec

ifica

tion

Des

ign

Impl

emen

tatio

n

Verifi

catio

n

Tape

-out

A. Waterfall Model

B. Agile Model

C. Validation Through Agile Iteration

F1 F2 F3 F4

F1

F2

F3

F4 Valid

atio

n

Tape-out

Validation

F1 F2 F3 F4

F1

F1

F2

F2

F1

F1

F1

F2 F1

F1/F2

F1/F2

F1

F3 F4

F3 F4

F3 F4

F4

F4

F1/F2/F4

F1/F2

F1/F2

Always+have+a+fabricatable+working+prototype,+incrementally+add+features+and+push+to+“tapein”+oSen+

Page 23: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley RISCIV#Chips#Designed#at#Berkeley#

Raven-1

Raven-2

Raven-3 Raven-4

EOS14

EOS16

EOS18

EOS20

EOS22

EOS24

2011 2012 2013 2014 2015 May Apr Aug Feb Jul Sep Mar Nov Mar

Raven, Hurricane: ST 28nm FDSOI SWERVE, BROOM: TSMC 28nm EOS: IBM 45nm SOI, CRAFT: 16nm TSMC

SWERVE

Apr

Hurricane-1

2016 Mar Jul Dec

Hurricane-2

CRAFT

23

BROOM

Page 24: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley

1 Terabit/sec optical fibers FireBox Overview

High Radix Switches

SoC

SoC

SoC

SoC

SoC

SoC

SoC SoC SoC

SoC

SoC

SoC

SoC

SoC

SoC

SoC

Up to 1000 SoCs + High-BW Mem

(100,000 core total)

NVM

NVM

NVM

NVM

NVM

NVM

NVM

NVM NVM NVM

NVM

NVM

NVM

NVM

NVM

NVM

Up to 1000 NonVolatile Memory Modules (100PB total)

InterVBox&Network&

Many&Short&Paths&Thru&HighVRadix&Switches&

“SingleVchip&microprocessor&

that&communicates&

via&light”,&Nature,&December&2015&

&&

Page 25: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley ASPIRE#Summary#

! Architect&and&program&whole&applica=ons&using&computa`onal&and&structural&pa@erns&

! Develop&algorithms&that&minimize&data&movement&! Pa@ernVspecific&compilers&to&specialize&and&autotune&soRware&for&best&hardware&performance&

! New&“generalVpurpose&specialized”&architectures&employing&pa@ernVspecific&accelera`on&

! Parameterized&generators&and&designVspace&explora`on&to&op`mize&hardware&µarchitecture&

! Resilient&circuit&design,&new&tech.&(photonics,&NVM)&!  Fast&flexible&accurate&wholeVsystem&FPGA0simula=on0! Real&silicon0prototypes0to&validate&ideas&

Page 26: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley ASPIRE Timeline

“Next&Project”&Mee`ngs&

Par&Lab&End&of&Project&Party!&

You&are&here&

ASPIRE&End&of&Project&Party!&

DARPA&award&

ASPIRE&launch&

DARPA&PERFECT&BAA&

RISE&launch&Start&AMP+ASPIRE&Successor&Mee`ngs&

26

ADEPT&(quiet)&launch&

Page 27: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley ASPIRE#Sponsors#! DARPA&PERFECT&program&! DARPA&POEM&program&(Si&photonics)&! DARPA&CRAFT&program&(new)&!  STARnet&Center&for&Future&Architectures&(CVFAR)&!  Lawrence&Berkeley&Na`onal&Laboratory&!  Industrial&sponsors&- Intel&

!  Industrial&affiliates&- Google&- HPE&- Huawei&- LGE&- NVIDIA&- Oracle&- Samsung&

Page 28: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley

28

Page 29: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

Agile&Design&of&Efficient&Processing&Technologies&

Krste&Asanovic,&Bora&Nikolic,&Elad&Alon,&Jonathan&Bachrach,&Jonathan&RaganVKelley,&

Koushik&Sen,&Sanjit&Seshia&

Page 30: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

ADEPT:#New#5IYear#Project#adept.eecs.berkeley.edu#

! Moore’s&Law&has&effec`vely&ended&- But&silicon&manufacturing&is&already&wonderful&technology&

! GeneralVpurpose&compu`ng&stuck&in&plateau&- SoRware’s&free&ride&is&over&

!  Specialized&silicon&needed&for&new&applica`ons&- Machine&learning/inference,&IoT&sensing/processing,&automo`ve,…&

! But&too&expensive&to&design&custom&chips&today!&- semiconductor&industry&model&broken&for&lots&of&specialized&chips.&

!  Solu`ons:&&Reduce&NRE!&- Develop&nextVgenera`on&of&chipVdesign&automa`on&- Back&to&the&future:&use&right&node&instead&of&next&node&- New&industry&business&models:&share&risks,&increase&rewards&

- OpenVsource&hardware:&RISCVV&ISA&is&one&element&& 30

Page 31: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

CMOS#at#End#of#Moore’s#Law##

! Wonderful,&almost&magical,&technology&!  Fabricate&billions&of&transistors&! Connect&them&with&billions&of&wires&! Clock&at&a&few&GHz&! Dissipate&<100W&! Near&100%&yield,&cost&a&few&$/die&&

! Manufacturing&is&least&of&our&problems&! Good,&because&nothing&wai`ng&in&wings&to&replace&any`me&soon&(20+&years&to&compe`tor?)&

&

Page 32: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

Custom#Chip#Costs#

! NRE:&NonVRecurring&Engineering&costs&- Design&+&tooling&for&produc`on&

! Manufacturing&cost&- Cost&of&each&chip&made&once&in&produc`on&- Silicon&manufacture&plus&other&costs&including&package&&&test,&which&get&worse&rela`vely&with&smaller&nodes&

32

Page 33: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

Overall#Cost/Project#

33 Total&#transistors&shipped&

Total&cost&

Intercept&set&by&NRE& Expected&life`me&shipments&

Expected&life`me&cost&

Page 34: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

34 [Source: IBS]

~$5/chip&manufacturing&cost&

~$300M&development&cost&&

Page 35: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

Overall#Cost/Project#

35 Total&#transistors&

Total&cost&

Older&node&

NRE+doubling+each+genera:on.+But+slope+not+changing,+or+increasing,+each+genera:on!+

<28nm&

Newer&nodes&

Page 36: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

Old#Semiconductor#Business#Model#

! Design&standard&part&for&the&“killer&socket”&- Was&PC,&now&smartphone&

!  Sell&100s&millions&parts&

!  Ideally,&~0&customers,&~infinite&volume&

! Not&working&so&well&now&with&high&NREs,&end&of&Moore’s&Law,&and&fragmen`ng&markets&

36

Page 37: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

No#good#ideas,#so#merge#to#cut#costs#

37

7&of&top&8&in&last&2&years&

Page 38: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

New#Silicon#Model#emerging#

Instead&of&chip&company’s&standard&product,&chip&customers&want&own&differen`ated&designs:&! Apple,&Samsung&for&phones&! Google,&Amazon,&MicrosoR,&for&client/cloud&!  Tesla?,&Uber?&for&cars?&!  IoT&products,&FitBit?&maybe?&!  EndVsystem&value/profit&jus`fies&chip&NRE&

But&what&about&the&nonVhuge&firms&that&can’t&risk&NRE,&or&don’t&have&chip&experience?&- Can&we&cut&NRE&to&<<$100M,&or&<<$10M&or&<<$1M?&

38

Page 39: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

Acacking#all#the#NRE#components#

Using&data&center&(FireBox)&and&embedded&vision&chips&(Raven/Hurricane/Eagle)&as&driving&applica`ons:&! Reduce&soRware&cost&with&highVlevel&programming&tools&(Halide,serverless)&and&standard&ISA&(RISCIV)&

! Reduce&architecture&design&and&verifica`on&with&reusable&openVsource&chip&generators&(Chisel/FIRRTL)&

!  Share&costs&of&analog/mixedVsignal&IP&development&by&s`cking&to&a&few&nodes,&analog&generators&(BAG),&maybe&openVsource&analog!&

! Automate&backend&physical&design,&save&on&tool&costs&(HAMMER/FICL)&

!  Improve&verifica`on&and&valida`on,&accelerate&with&cloudVhosted&FPGAs&

39

Page 40: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

ADEPT#Sponsors#

! Cornerstone&funding:&Intel#Science#and#Technology#Center#on#Agile#Design&(ADEPT:+Agile+Development+of+Efficient+Processing+Technologies)&- PIs:&Asanovic,&Nikolic,&Bachrach,&RaganVKelley,&Seshia&- Funding&for&3&years&ini`ally,&started&in&December&2016&

! Also&AHDEPT:+Agile+Hardware+Design+in+Extreme+Process+Technologies+- Part&of&DARPA#CRAFT&program,&started&in&May&2016&- Pis:&Alon,&Nikolic,&Bachrach,&Sen&- Partners:&NorthropVGrumman,&Cadence&

!  Industrial&sponsors:&Siemens#and&SK#Hynix#!  Looking&for&more!&!  First&retreat:&Monterey,&January&8V10,&2018&&

40

Page 41: Algorithms+and+Specializers+for+Provably+Op:mal+ ... · Pipe&Filter Map-Reduce Computational and Structural Patterns Scene Analysis Idempotent restarts Graph EngineGraph EngineGraph

UC Berkeley ASPIRE#End#of#Project#

!  It’s&been&a&great&adventure&!  Thank&you&all&for&feedback&over&the&years!&! Many&impacuul&results&- SqueezeDet&V>&DeepScale&startup&- CA&Algorithms&V>&Many&awards&- FireBox&photonics&V>&Nature&ar`cle&on&first&photonic&micro,&Ayar&Labs&startup&

- RISCVV&ISA&V>&widespread&industry&and&academia&adop`on&- Chisel&+&Agile&Hardware&Design&V>&SiFive,&JITx&startups&-  https://aspire.eecs.berkeley.edu/aspire-best-of-papers/

! Many&amazing&graduated/gradua`ng&students&!  Strong&founda`on&for&the&next&5Vyear&ADEPT&project&

41