TETRACOM success story: technology transfer between cTuning foundation and ARM
Collective Knowledge: a framework for reproducible experimentation and optimization knowledge sharing
We enable efficient, reliable and cheap computing – everywhere!
Dr. Grigori Fursin, dividiti (UK) / cTuning (France)
Computing is critical to innovation & wellbeing everywhere: from tiny computers in “smart things” to supercomputers.
Hence the perpetual need for faster, cheaper, smaller, more energy efficient and reliable computer systems.
We need cheaper, faster, more energy efficient & more reliable computing!
How to perform a given computation in the most efficient way given available resources, user requirements and constraints (performance, energy, accuracy, network utilization, resource usage, price, etc.)?
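The question above is a constrained multi-objective selection problem. One common way to frame it (an illustration, not something stated on the slides) is to discard configurations that violate a user constraint and keep only the Pareto-optimal ones among the rest. The sketch below assumes two objectives (execution time and energy) and one constraint (minimum accuracy); the `Config` type, the configuration names and all numbers are invented for illustration and are not measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str        # hypothetical label for a compiler/runtime configuration
    time_s: float    # execution time in seconds (illustrative value)
    energy_j: float  # energy in joules (illustrative value)
    accuracy: float  # result accuracy in [0, 1] (illustrative value)


def dominates(a: Config, b: Config) -> bool:
    """True if a is no worse than b in time and energy, and strictly better in one."""
    no_worse = a.time_s <= b.time_s and a.energy_j <= b.energy_j
    strictly_better = a.time_s < b.time_s or a.energy_j < b.energy_j
    return no_worse and strictly_better


def pareto_front(configs, min_accuracy):
    """Drop configurations below the accuracy constraint, then drop dominated ones."""
    feasible = [c for c in configs if c.accuracy >= min_accuracy]
    return [c for c in feasible if not any(dominates(o, c) for o in feasible)]


# Invented example data: four candidate configurations of one program.
CANDIDATES = [
    Config("O2", 10.0, 50.0, 1.00),
    Config("O3", 8.0, 55.0, 1.00),
    Config("O3-fast-math", 6.0, 45.0, 0.90),
    Config("Os", 12.0, 52.0, 1.00),
]

if __name__ == "__main__":
    for c in pareto_front(CANDIDATES, min_accuracy=0.99):
        print(c.name, c.time_s, c.energy_j)
```

With a strict accuracy requirement, the fastest candidate is excluded and the user is left with a time/energy trade-off between the remaining non-dominated configurations; relaxing the constraint changes the answer, which is why the requirements have to be part of the question.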
Many actors contribute to computer systems:
• End-users
• Academia
• Hardware providers
• Software developers and service providers
• Tool developers
Problem: ever growing complexity and poor understanding of trade-offs
Technological chaos!
• ever changing software and hardware;
• rising number of design and optimization choices;
• non-representative benchmarks and datasets;
• highly stochastic behavior;
• no knowledge sharing and no common experimental methodology.
[Figure: word cloud of the tools, techniques and parameters involved. Compilers: GCC 4.1.x, 4.3.x, 4.4.x, 4.5.x, 4.6.x, 5.x, 6.x; ICC 11.0, 11.1, 12.0, 2015; LLVM 2.6, 3.x, 3.9; MVS 2013; XLC; Open64; Jikes. Programming models: OpenMP, MPI, OpenCL, OpenACC, CUDA 7.x. Tools: perf, PAPI, TAU, Scalasca, SimpleScalar. Libraries and workloads: ATLAS, MKL, OpenBLAS, cuDNN, cuFFT, CAFFE, OpenCV, CNN. Hardware: ARMv8, SSE4, big/little. Techniques: function-level, hardware counters, pass reordering, LTO, polyhedral transformations, predictive scheduling, genetic algorithms, KNN. Parameters: frequency, bandwidth, memory size, cache size, threads, execution time, algorithm precision.]
Only incremental advances lead to overly-expensive, under-performing and energy-hungry computer systems!
Must stop wasting expensive resources and energy! It’s time to revisit computer engineering!
Our solution: Collective Knowledge framework and repository
Collective Knowledge, a disruptive approach to designing and optimising computer systems in a collaborative way:
• Similar to Wikipedia, invites a broad community to share representative programs, datasets, tools and predictive models as reusable components.
• Allows the community to crowdsource and reproduce experiments across diverse computer systems.
• Applies predictive analytics (machine learning, data mining) to continuously grow knowledge about optimising computer systems.
[Diagram labels: Code and data sharing; Auto-tuning, machine learning; Systematization and unification of collected knowledge (big data); Interdisciplinary community]
cKnowledge.org/repo
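The loop described on this slide (share measurements, aggregate them, predict good optimisations for a new system) can be illustrated with a toy sketch. It does not use the Collective Knowledge API: the record layout, the feature tuple (core count, clock in GHz), the flag sets and all timings are assumptions made up for this example. Repeated runs are summarised by their median to dampen noisy measurements, and the prediction is a plain 1-nearest-neighbour lookup on unscaled features.

```python
from statistics import median

# Hypothetical shared records: platform features (cores, GHz), compiler flags,
# and repeated execution times in seconds. All values are invented.
SHARED_RESULTS = [
    {"features": (4, 1.2), "flags": "-O2", "times": [2.0, 2.1, 1.9]},
    {"features": (4, 1.2), "flags": "-O3", "times": [1.5, 1.6, 3.0]},
    {"features": (8, 2.0), "flags": "-O2", "times": [1.0, 1.1, 1.0]},
    {"features": (8, 2.0), "flags": "-O3", "times": [1.2, 1.3, 1.2]},
]


def best_flags_per_platform(records):
    """Map each platform to its (flags, median time) with the lowest median time."""
    best = {}
    for rec in records:
        t = median(rec["times"])  # the median is robust to an occasional outlier run
        current = best.get(rec["features"])
        if current is None or t < current[1]:
            best[rec["features"]] = (rec["flags"], t)
    return best


def predict_flags(records, new_features):
    """Suggest flags for an unseen platform from its nearest known platform."""
    best = best_flags_per_platform(records)

    def squared_distance(known):
        # Features are used unscaled here; a real model would normalise them.
        return sum((a - b) ** 2 for a, b in zip(known, new_features))

    nearest = min(best, key=squared_distance)
    return best[nearest][0]


if __name__ == "__main__":
    print(best_flags_per_platform(SHARED_RESULTS))
    print(predict_flags(SHARED_RESULTS, (4, 1.0)))
```

Each additional shared record either confirms or overturns the current best choice for a platform, which is the sense in which a growing repository keeps improving its predictions.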
How did the SAE project TETRACOM help us?
Provided know-how and funding (€50K) to the cTuning foundation (a non-profit research organization and an outcome of the EU FP6 MILEPOST project) to mature the Collective Knowledge (CK) technology and release it under a permissive license (cknowledge.org).
Allowed us to validate our approach at ARM, the world-leading supplier of microprocessor technology (arm.com):
✓ CK provided valuable insights into the performance of ARM products in a fraction of the time required by conventional analysis...
✓ ...which demonstrated the potential of CK to spur the design of next-generation, high-performance and energy-efficient systems.
Industrial and academic impact
✓ dividiti, a UK startup co-founded by Dr Grigori Fursin (cTuning, ex-INRIA, ex-Intel) and Dr Anton Lokhmotov (ex-ARM):
optimizing computing; reinventing computer engineering; accelerating knowledge discovery;
crusading for reproducible and collaborative R&D (including with ACM SIGs and artifact evaluation).
✓ Customers already include a cloud company and a car manufacturer on the Fortune 50 list.
✓ 2016 estimates: revenue of €300K; headcount of 4.
✓ 2017+ year-over-year growth: 4x revenue; 2x headcount.
✓ Customer savings: €1-10M in 2 years; €10-100M+ in 5 years.
✓ 2-3x faster time to market for new products.
Acknowledgments and suggestions
Provide extra funding (6-12 months) to help create startups after successful TTP or EU projects?
TETRACOM (FP7), HiPEAC, CARP (FP7), MILEPOST (FP6)
Discussion?
Further info: dividiti.com, cTuning.org, cKnowledge.org
Any comments and questions? Please get in touch!
[email protected]@dividiti.com