an analog accelerator for linear algebrayipenghuang.com/wp-content/uploads/2018/09/8b-3.pdf · •...

59
An Analog Accelerator for Linear Algebra Yipeng Huang, Ning Guo, Mingoo Seok, Yannis Tsividis, Simha Sethumadhavan Columbia University

Upload: others

Post on 21-Jun-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

An Analog Acceleratorfor Linear Algebra

Yipeng Huang, Ning Guo, Mingoo Seok,Yannis Tsividis, Simha Sethumadhavan

Columbia University

Page 2: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Why Analog?

Digital hardware

Digital algorithms

Supports

•Binary numbers

• Step-by-step operation

Page 3: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Why Analog?

Faster, more efficient digital hardware

Scaling & architectureDigital hardware

Digital algorithms

Supports

Page 4: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Why Analog?

Faster, more efficient digital hardware

More optimaldigital algorithms

Supports

Scaling & architecture

Research & development

Digital hardware

Digital algorithms

Supports

Page 5: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Why Analog?

Gates’ Simplification #1:Dennard scaling ended, Moore scaling will end

—Monday keynote speaker Doug Carmean

Faster, more efficient digital hardware

More optimaldigital algorithms

Supports

Scaling & architecture

Research & development

Digital hardware

Digital algorithms

Supports

Page 6: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Why Analog?

Analog hardware

Faster, more efficient digital hardware

More optimaldigital algorithms

Supports

Scaling & architecture

Research & development

Digital hardware

Digital algorithms

Supports

Page 7: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Why Analog?

Analog hardware

Analog algorithms

Supports

Faster, more efficient digital hardware

More optimaldigital algorithms

Supports

Scaling & architecture

Research & development

Digital hardware

Digital algorithms

Supports

Page 8: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Why Analog?

Analog hardware

Analog algorithms

Supports

Faster, more efficient digital hardware

More optimaldigital algorithms

Supports

Scaling & architecture

Research & development

Digital hardware

Digital algorithms

Supports

•No binary numbers•Continuous

operation

Page 9: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

A continuous-time, analog computing model• step-by-step algorithm → continuous-time algorithm• continuous-time algorithm → analog accelerator hardware

Analog drawbacks: how to fix them• limited applications: tackle key linear algebra kernel• limited accuracy: calibration & exceptions• limited precision: build precision with digital help• limited scalability: divide & conquere sparse matrix

A prototype analog accelerator & evaluation• microarchitecture• architecture & programming• performance• energy

Page 10: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Analog computing solves ordinary differential equationsScientific computation phrased problems as ODEs

Modern problems are converted to linear algebra, not ODEsCan we accelerate linear algebra using analog?

Continuous-time algorithm

Page 11: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

11

𝑨𝒙 = 𝒃

Page 12: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

𝑨𝒙 = 𝒃

%𝑎00𝑥0 + 𝑎01𝑥1 = 𝑏0𝑎10𝑥0 + 𝑎11𝑥1 = 𝑏1

𝑥0

𝑥1

Page 13: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Direct methods• E.g., Gaussian elimination

Iterative methods

𝑨𝒙 = 𝒃

%𝑎00𝑥0 + 𝑎01𝑥1 = 𝑏0𝑎10𝑥0 + 𝑎11𝑥1 = 𝑏1

𝑥0

𝑥1

Page 14: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

𝑨𝒙 = 𝒃

%𝑎00𝑥0 + 𝑎01𝑥1 = 𝑏0𝑎10𝑥0 + 𝑎11𝑥1 = 𝑏1

Direct methods• E.g., Gaussian elimination

Iterative methods• E.g., steepest gradient descent• E.g., conjugate gradients

𝑥0

𝑥1

Page 15: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Solve %𝑎00𝑥0 + 𝑎01𝑥1 = 𝑏0𝑎10𝑥0 + 𝑎11𝑥1 = 𝑏1

Step-by-stepsteepest gradient descent

𝑥0

𝑥1

Page 16: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Solve %𝑎00𝑥0 + 𝑎01𝑥1 = 𝑏0𝑎10𝑥0 + 𝑎11𝑥1 = 𝑏1

Step-by-stepsteepest gradient descent

recurrence relation

,𝑥-./0 = 𝑥-. − 𝑠 𝑎--𝑥-. + 𝑎-0𝑥0. − 𝑏- 𝑥0./0 = 𝑥0. − 𝑠 𝑎0-𝑥-. + 𝑎00𝑥0. − 𝑏0

𝑥0

𝑥1

Page 17: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Solve %𝑎00𝑥0 + 𝑎01𝑥1 = 𝑏0𝑎10𝑥0 + 𝑎11𝑥1 = 𝑏1

𝑥0

𝑥1

Continuous-timecontinuous steepest descent

𝑥0

𝑥1

Step-by-stepsteepest gradient descent

recurrence relation

,𝑥-./0 = 𝑥-. − 𝑠 𝑎--𝑥-. + 𝑎-0𝑥0. − 𝑏- 𝑥0./0 = 𝑥0. − 𝑠 𝑎0-𝑥-. + 𝑎00𝑥0. − 𝑏0

Page 18: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Solve %𝑎00𝑥0 + 𝑎01𝑥1 = 𝑏0𝑎10𝑥0 + 𝑎11𝑥1 = 𝑏1

𝑥0

𝑥1

Continuous-timecontinuous steepest descentordinary differential equation

%d𝑥- d𝑡⁄ = −𝑎--𝑥- − 𝑎-0𝑥0 + 𝑏- d𝑥0 d𝑡⁄ = −𝑎0-𝑥- − 𝑎00𝑥0 + 𝑏0

𝑥0

𝑥1

Step-by-stepsteepest gradient descent

recurrence relation

,𝑥-./0 = 𝑥-. − 𝑠 𝑎--𝑥-. + 𝑎-0𝑥0. − 𝑏- 𝑥0./0 = 𝑥0. − 𝑠 𝑎0-𝑥-. + 𝑎00𝑥0. − 𝑏0

Page 19: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

CONTINUOUS-TIMEPotentially fast: not limited by step-by-step algorithm

A continuous-time, analog computing model

Page 20: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

A continuous-time, analog computing model• step-by-step algorithm → continuous-time algorithm• continuous-time algorithm → analog accelerator hardware

Analog drawbacks: how to fix them• limited applications: tackle key linear algebra kernel• limited accuracy: calibration & exceptions• limited precision: build precision with digital help• limited scalability: divide & conquere sparse matrix

A prototype analog accelerator & evaluation• microarchitecture• architecture & programming• performance• energy

Page 21: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Datapath: explicit data flow

Values: represented as analog current & voltage

Functional units: analog arithmetic operators

Analog accelerator hardware

Page 22: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

−𝑎10

−𝑎00

∫d𝑥1d𝑡 𝑥1(𝑡) ⍦

ADCDAC

∫d𝑥0d𝑡 𝑥0(𝑡)

ADCDAC

−𝑎11

−𝑎01

𝑏1

𝑏0 ⍦

∫d𝒙𝟏d𝒕 𝒙𝟏(𝒕)

∫d𝒙𝟎d𝒕 𝒙𝟎(𝒕)

Integrators

Solve %𝑎00𝒙𝟎 + 𝑎01𝒙𝟏 = 𝑏0𝑎10𝒙𝟎 + 𝑎11𝒙𝟏 = 𝑏1

ordinary differential equation

%d𝒙𝟎 d𝒕⁄ = −𝑎--𝒙𝟎 − 𝑎-0𝒙𝟏 + 𝑏- d𝒙𝟏 d𝒕⁄ = −𝑎0-𝒙𝟎 − 𝑎00𝒙𝟏 + 𝑏0

Page 23: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

−𝑎10

−𝑎00

∫d𝑥1d𝑡 𝑥1(𝑡) ⍦

ADCDAC

∫d𝑥0d𝑡 𝑥0(𝑡)

ADCDAC

−𝑎11

−𝑎01

𝑏1

𝑏0 ⍦

ordinary differential equation

%d𝑥- d𝑡⁄ = −𝑎--𝑥- − 𝑎-0𝑥0 + 𝑏- d𝑥0 d𝑡⁄ = −𝑎0-𝑥- − 𝑎00𝑥0 + 𝑏0

∫d𝑥1d𝑡 𝑥1(𝑡) ⍦

∫d𝑥0d𝑡 𝑥0(𝑡) ⍦

Current mirrorfanout blocks

Page 24: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

−𝑎10

−𝑎00

∫d𝑥1d𝑡 𝑥1(𝑡) ⍦

ADCDAC

∫d𝑥0d𝑡 𝑥0(𝑡)

ADCDAC

−𝑎11

−𝑎01

𝑏1

𝑏0 ⍦

ordinary differential equation

%d𝑥- d𝑡⁄ = −𝒂𝟎𝟎𝑥- − 𝒂𝟎𝟏𝑥0 + 𝑏- d𝑥0 d𝑡⁄ = −𝒂𝟏𝟎𝑥- − 𝒂𝟏𝟏𝑥0 + 𝑏0

−𝒂𝟏𝟎

−𝒂𝟎𝟎

∫d𝑥1d𝑡 𝑥1(𝑡) ⍦

∫d𝑥0d𝑡 𝑥0(𝑡) ⍦

−𝒂𝟏𝟏

−𝒂𝟎𝟏

Variablegain

amplifiers

Page 25: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

−𝑎10

−𝑎00

∫d𝑥1d𝑡 𝑥1(𝑡) ⍦

ADCDAC

∫d𝑥0d𝑡 𝑥0(𝑡)

ADCDAC

−𝑎11

−𝑎01

𝑏1

𝑏0 ⍦

ordinary differential equation

%d𝑥- d𝑡⁄ = −𝑎--𝑥- − 𝑎-0𝑥0 + 𝒃𝟎 d𝑥0 d𝑡⁄ = −𝑎0-𝑥- − 𝑎00𝑥0 + 𝒃𝟏

−𝑎10

−𝑎00

∫𝑥1(𝑡) ⍦

DAC

∫𝑥0(𝑡) ⍦

DAC

−𝑎11

−𝑎01

𝒃𝟏

𝒃𝟎

Digital to analog converters

Page 26: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

−𝑎10

−𝑎00

∫d𝑥1d𝑡 𝑥1(𝑡) ⍦

ADCDAC

∫d𝑥0d𝑡 𝑥0(𝑡)

ADCDAC

−𝑎11

−𝑎01

𝑏1

𝑏0 ⍦

ordinary differential equation

%d𝒙𝟎 d𝒕⁄ = −𝑎--𝑥- − 𝑎-0𝑥0 + 𝑏- d𝒙𝟏 d𝒕⁄ = −𝑎0-𝑥- − 𝑎00𝑥0 + 𝑏0

−𝑎10

−𝑎00

∫d𝒙𝟏d𝒕 𝑥1(𝑡) ⍦

DAC

∫d𝒙𝟎d𝒕 𝑥0(𝑡) ⍦

DAC

−𝑎11

−𝑎01

𝑏1

𝑏0

Sum currents

by joining wires

Page 27: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

ordinary differential equation

%d𝒙𝟎 d𝒕⁄ = −𝑎--𝒙𝟎 − 𝑎-0𝒙𝟏 + 𝑏- d𝒙𝟏 d𝒕⁄ = −𝑎0-𝒙𝟎 − 𝑎00𝒙𝟏 + 𝑏0

−𝑎10

−𝑎00

∫d𝑥1d𝑡 𝒙𝟏(𝒕) ⍦

DAC

∫d𝒙𝟎d𝒕 𝒙𝟎(𝒕) ⍦

DAC

−𝑎11

−𝑎01

𝑏1

𝑏0

𝑥0

𝑥1

ADC

ADC

Page 28: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

ordinary differential equation

%d𝑥- d𝑡⁄ = −𝑎--𝒙𝟎 − 𝑎-0𝒙𝟏 + 𝑏- d𝑥0 d𝑡⁄ = −𝑎0-𝒙𝟎 − 𝑎00𝒙𝟏 + 𝑏0

−𝑎10

−𝑎00

∫d𝑥1d𝑡 𝒙𝟏(𝒕) ⍦

ADCDAC

∫d𝑥0d𝑡 𝒙𝟎(𝒕) ⍦

ADCDAC

−𝑎11

−𝑎01

𝑏1

𝑏0

𝑥0

𝑥1

Page 29: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

ordinary differential equation

%d𝑥- d𝑡⁄ = −𝑎--𝒙𝟎 − 𝑎-0𝒙𝟏 + 𝑏- d𝑥0 d𝑡⁄ = −𝑎0-𝒙𝟎 − 𝑎00𝒙𝟏 + 𝑏0

−𝑎10

−𝑎00

∫d𝑥1d𝑡 𝒙𝟏(𝒕) ⍦

ADCDAC

∫d𝑥0d𝑡 𝒙𝟎(𝒕) ⍦

ADCDAC

−𝑎11

−𝑎01

𝑏1

𝑏0

Solve %𝑎00𝒙𝟎 + 𝑎01𝒙𝟏 = 𝑏0𝑎10𝒙𝟎 + 𝑎11𝒙𝟏 = 𝑏1

𝑥0

𝑥1

Page 30: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

CONTINUOUS-TIMEPotentially fast: not limited by step-by-step algorithm

ANALOG VALUESPotentially efficient: one wire carries real number

A continuous-time, analog computing model

Page 31: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

A continuous-time, analog computing model• step-by-step algorithm → continuous-time algorithm• continuous-time algorithm → analog accelerator hardware

Analog drawbacks: how to fix them• limited applications: tackle key linear algebra kernel• limited accuracy: calibration & exceptions• limited precision: build precision with digital help• limited scalability: divide & conquere sparse matrix

A prototype analog accelerator & evaluation• microarchitecture• architecture & programming• performance• energy

Page 32: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Accuracy

Digital Analog

Intermediate values unambiguously interpreted as 1 or 0

Process & temperature variation → computation result variation!

Page 33: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Accuracy

Digital Analog

Intermediate values unambiguously interpreted as 1 or 0

Process & temperature variation → computation result variation!

Discrete math error correction possible

Within purely analog execution, no error correction

Page 34: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Components are inaccurate:• Gain• Offset• Nonlinearity (clipping)

Calibrate all components using additional DACs

Exceptions catch values exceeding valid input range

input

output

+Vs

-Vs

Accuracy: calibration & exceptions

Page 35: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Components are inaccurate:• Gain• Offset• Nonlinearity (clipping)

Calibrate all components using additional DACs

Exceptions catch values exceeding valid input range

input

output

+Vs

-Vs

Accuracy: calibration & exceptions

Page 36: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Components are inaccurate:• Gain• Offset• Nonlinearity (clipping)

Calibrate all components using additional DACs

Exceptions catch values exceeding valid input range

input

output

+Vs

-Vs

Accuracy: calibration & exceptions

Page 37: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Components are inaccurate:• Gain• Offset• Nonlinearity (clipping)

Calibrate: all components using additional DACs

Exceptions catch values exceeding valid input range

input

output

+Vs

-Vs

Accuracy: calibration & exceptions

Page 38: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Components are inaccurate:• Gain• Offset• Nonlinearity (clipping)

Calibrate: all components using additional DACs

Exceptions: catch values exceeding valid input range

input

output

+Vs

-Vs

validrange

Accuracy: calibration & exceptions

Page 39: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

A continuous-time, analog computing model• step-by-step algorithm → continuous-time algorithm• continuous-time algorithm → analog accelerator hardware

Analog drawbacks: how to fix them• limited applications: tackle key linear algebra kernel• limited accuracy: calibration & exceptions• limited precision: build precision with digital help• limited scalability: divide & conquere sparse matrix

A prototype analog accelerator & evaluation• microarchitecture• architecture & programming• performance• energy

Page 40: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Precision: build precision w/ digital help

Limitation in sampling resolution→ limited precision result

𝑥0

𝑥1

Page 41: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Precision: build precision w/ digital help

Limitation in sampling resolution→ limited precision result

Find residual, rescale problem,& solve again in analog for precision

𝑥0

𝑥1𝑥0

𝑥1

Page 42: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

A continuous-time, analog computing model• step-by-step algorithm → continuous-time algorithm• continuous-time algorithm → analog accelerator hardware

Analog drawbacks: how to fix them• limited applications: tackle key linear algebra kernel• limited accuracy: calibration & exceptions• limited precision: build precision with digital help• limited scalability: divide & conquere sparse matrix

A prototype analog accelerator & evaluation• microarchitecture• architecture & programming• performance• energy

Page 43: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Without time multiplexing, analog hardware needed for all variables• Impractical to build analog

hardware for whole problems

Solve subproblems of smaller size in analog• Possible because many

problems have sparse matrices

Scalability: divide & conquer sparse matrix

Page 44: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Without time multiplexing, analog hardware needed for all variables• Impractical to build analog

hardware for whole problems

Solve subproblems of smaller size in analog• Possible because many

problems have sparse matrices

Scalability: divide & conquer sparse matrix

Page 45: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

A continuous-time, analog computing model• step-by-step algorithm → continuous-time algorithm• continuous-time algorithm → analog accelerator hardware

Analog drawbacks: how to fix them• limited applications: tackle key linear algebra kernel• limited accuracy: calibration & exceptions• limited precision: build precision with digital help• limited scalability: divide & conquere sparse matrix

A prototype analog accelerator & evaluation• microarchitecture• architecture & programming• energy• performance

Page 46: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Prototype analog accelerator: why & how

Why prototype?• validate analog circuits• physical measurement of components• prototype helps developing

analog applications

Engineering process• 2 full-time PhD students, 4 project students• 3 years• full custom analog• synthesized digital• I worked on validation,

digital interface, applications

Page 47: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Prototype analog accelerator: μArch.

Components• 20 KHz analog bandwidth• 4 integrators• 8 multipliers• Other features: nonlinear lookup

Fabrication• 1.2 V 65nm TSMC• Low power density• 0.06 W/cm2 at full power• 2.0 mm2 active area• 1.2 mW at full power

Page 48: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Prototype analog accelerator: interface

Architecture• 8-bit A/D/A conversion• Configurable analog crossbar• Calibration & exceptions on all analog

Programming• Library for configuration• ODE syntax parser and compiler

Page 49: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

A continuous-time, analog computing model• step-by-step algorithm → continuous-time algorithm• continuous-time algorithm → analog accelerator hardware

Analog drawbacks: how to fix them• limited applications: tackle key linear algebra kernel• limited accuracy: calibration & exceptions• limited precision: build precision with digital help• limited scalability: divide & conquere sparse matrix

A prototype analog accelerator & evaluation• microarchitecture• architecture & programming• energy• performance

Page 50: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

50

Solution energy: analog HW vs. digital SW

Page 51: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

0

100

200

300

0 100 200 300 400 500

Tim

e to

sol

utio

n (μ

s)

Number of variables

Digital solution time

Analog solution time

Solution time: analog HW vs. digital SW

10x less

Page 52: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

0

100

200

300

0 100 200 300 400 500

Tim

e to

sol

utio

n (μ

s)

Number of variables

Digital solution time

Analog solution time

Solution time: analog HW vs. digital SW

Analog:• Explicit

dataflow architecture• Continuous-

time speed• Analog

efficiency• High analog

area cost

Page 53: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

0

100

200

300

0 100 200 300 400 500

Tim

e to

sol

utio

n (μ

s)

Number of variables

Digital solution time

Analog solution time

Solution time: analog HW vs. digital SW

Analog:• Explicit

dataflow architecture• Continuous-

time speed• Analog

efficiency• High analog

area cost

Page 54: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

0

100

200

300

0 100 200 300 400 500

Tim

e to

sol

utio

n (μ

s)

Number of variables

Digital solution time

Analog solution time

Solution time: analog HW vs. digital SW

Analog:• Explicit

dataflow architecture• Continuous-

time speed• Analog

efficiency• High analog

area cost

Page 55: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

0

100

200

300

0 100 200 300 400 500

Tim

e to

sol

utio

n (μ

s)

Number of variables

Digital solution time

Analog solution time

Solution time: analog HW vs. digital SW

Analog:• Explicit

dataflow architecture• Continuous-

time speed• Analog

efficiency• High analog

area cost

Digital:• Optimal

digital algorithms

Page 56: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Continuous-time + analog offers alternative abstractions to digital

Tackled analog drawbacks: generality, accuracy, precision, scalability

Analog prototype: ISA & hardware; speed, area, energy analysis

Should we use analog to accelerate linear algebra?Some gains, but digital algorithms are optimized!Analog promises greater advantage in other problems: nonlinear?

An Analog Accelerator for Linear Algebra

Page 57: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

57

Page 58: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

58

Page 59: An Analog Accelerator for Linear Algebrayipenghuang.com/wp-content/uploads/2018/09/8B-3.pdf · • step-by-step algorithm → continuous-time algorithm • continuous-time algorithm

Analog Digital

Continuous timeOrdinary differential equationAdvantage: fastAdvantage: low power b/c no clock

Discrete timeRecurrence relationAdvantage: allows complex algorithmsAdvantage: allows time multiplexing

Continuous valueCurrent & voltageAdvantage: fast and efficient operationsAdvantage: one wire carries real number

Discrete valueIntegers & floating pointAdvantage: high dynamic rangeAdvantage: high signal-to-noise ratio

59

𝑥0

𝑥1

𝑥0

𝑥1