deluceva: delta-based neural network inference …...deluceva: delta-based neural network inference...

Deluceva:Delta-BasedNeuralNetworkInferenceforFastVideoAnalytics

Jingjing WangandMagdalenaBalazinskaUniversity of Washington

Deluceva:Delta-BasedNeuralNetworkInferenceforFastVideoAnalytics

• Largevolumeofimages/videoswithvaluableinformation• Alargesetofneuralnetworkmodelsforimages• Objectclassification,detection,…

• Nextstep:videoanalytics• Largervolume• Efficiencycritical,liveoutput

VideoAnalyticsUsingNeuralNetworks

… …

DeepNeuralNetworkRichsetofestablishedimagemodels

KeyObservation:TemporalRedundancy

ProcessDeltasInsteadofFullFrames

… …

“IncrementalQueryEvaluation”

ProcessDeltasInsteadofFullFrames

DataStream

“IncrementalQueryEvaluation”

…newboatdiscoveredwithboundingboxB

…pixel[x,y,z]has

changedby(0,0,1)…

Delta-BasedInferenceforVideos

• Problem:• Input: a videostream,areferencemodel• Output: similartothereferencemodel’sresult

• Approach:• Acceleratemodelinferencebyperforminglesscomputation

Delta-BasedInferenceforVideos:Overview

• Modifyneuralnetworktotakedeltasasinputs• Decidewhichdeltasaresignificantenoughtoprocess• Generate a network of mixed-type (dense or delta-based) operators

Delta-BasedInferenceforVideos:Overview

Frame2– Frame1

Frame1

Neural Network with SparseDeltas

Model(original)

Frame2

Model(delta)

ExampleNeuralNetwork Model

Model(original)

OpOp ……

• Exampleoperators:convolution,maxpooling,ReLU,…

ExampleNeuralNetwork Model

Model(delta)

OpOp ……

• Modifyoperatorstooperateondeltas

DeltaOperatorUnit

FilterOperator

DeltaOperatorUnit

• Sparseoperator:takessparsedeltas&outputsdelta• Saves#ofFLOPsbyprocessingdeltascalarsonly

• Filter:sendonlysignificantdeltastotheoperator• Buildshistogram,keepssmalldeltas&outputslargedeltas

Indices,Values

Accum.deltas

Kernel

SparseHow much to filterforthecurrentframe?

DeltaImpactsOutputQuality

Conv …

Frame1 Frame2

Frame2– Frame1

Groundtruths:

DynamicFilteringPercentage

• Filteringpercentage• Higherisfasterbutrisky• Lowerissafer butslower

• Targetfilteringpercentage:largestpercentagethatgeneratesgoodresult• Appliestoalloperators• Two approaches: PI controller / Machine learning

Safer,slower Risky,faster

abs(delta)

MixedNetwork

OpOp ……

Filteringpercentageis:low(e.g.0)high(e.g.99)

medium(e.g.50)

OpOp ……

DeltaOpUnit

MixedNetwork• Logical plan: a DAG of operators• Physical plan: choose between delta opunit/ originaldenseimplementation• Profileeachoperatorwithdifferentfilteringpercentages• Pickthefastervariant

Op2(sparse)Filter

Op2(dense)Filteringpercentage=99

Op1… …Op3

Evaluation:Setup

• Threeobjectdetectionmodels

• Six10-minutevideosfromthreeYouTubelivestreams– Takenatdifferenttimes(e.g.day/night)for eachstream– Typicalobjects:people,cars,buses,boats,…– One frame per second

• TensorFlow,oneCPUthread, AmazonEC2r3.2xlarge

Model Abbrv. #ofFLOPs

SSD-VGG16 ssd-vgg 123B

FRCNN-RESNET101 frcnn-res 550B

FRCNN-INCEPTION-RESNET-V2 frcnn-incep 1395B

Evaluation:End-to-EndComparison

• HighestruntimesavingsbyPIcontroller– Whenerrorlessthanathreshold

Model / Datasetssd-vgg

je jn kd kn vd vnfrcnn-res

je jn kd kn vd vnfrcnn-incep

je jn kd kn vd vn

. of S

videos

Deluceva:Conclusion

• Observerichtemporalredundancyinvideos• Acceleratemodelinferencebyprocessingsignificantdeltasonly

• ModifyNNmodelstoconsistofsparse&denseops• Adjustthefilteringgranularityadaptively• Generateanetwork of mixed-type operatorsbasedoncostmodels

• Improveruntimeupto67%with lowerror• Appliestoconvolutionalneuralnetworkmodels

• Ongoing:GPUimplementation,comparetootherwork

deluceva: delta-based neural network inference …...deluceva: delta-based neural network inference...

Documents

complex inference in neural circuits with probabilistic...

an on-line self-constructing neural fuzzy inference ......

inference of gas-liquid flowrate using neural networks

property inference attacks on neural networks using

variational inference for bayesian neural...

inference of other's internal neural models from active...

energy proportional neural network inference with adaptive...

delphi: a cryptographic inference service for neural...

memory-augmented dynamic neural relational inference

gina: neural relational inference from independent snapshots

neural implementation of bayesian inference in...

neural networks cortical circuits for perceptual inference

artificial neural network fuzzy inference system(1)

application of neural network and fuzzy inference …

delphi cryptographic inference for neural networks

1 characterizing the deep neural networks inference ... ·...

bayesian inference & neural networks

neural network and adaptive neuro- fuzzy inference system...

deep convolutional neural network inference with floating...

adaptive neural fuzzy inference systems (anfis): analysis...