canopus: enabling extreme-scale data analytics on big hpc ... · tao lu*, eric suchyta, jong choi,...
TRANSCRIPT
TaoLu*, EricSuchyta,JongChoi,NorbertPodhorszki,andScottKlasky, QingLiu*, DavePugmire andMattWolf, andMarkAinsworth+
* NewJerseyInstituteofTechnologyOakRidgeNationalLaboratory+ BrownUniversity
Canopus:EnablingExtreme-ScaleDataAnalyticsonBigHPCStorageviaProgressiveRefactoring
Overview
• Backgroundandrelatedwork
• Progressivedatarefactoring
• Conclusion
2
HPCsystems•Mission:illuminatephenomenathatareoftenimpossibletostudyinalaboratory[OakRidgeNationalLab,2013]• Climateimpactsofenergyuse• Fusioninareactornotyetbuilt• Galaxyformation
•Methodology:Modelingandsimulation,alongwithdataexploration• Datageneration• Storage• Analysis• Visualization 3
HPCsystems(cont’d)
• Thebigdatamanagementchallenge [Shalf et.al.,2014]
•WorseningI/Obottleneckforexascale systems• Exponentiallyincreasingmulti-scale,multi-physicsdata
4
Datacompressioninexascale (next-generationHPC)systems
5
•Goal:10xto100xdatareductionratio [IanFosteret.al.,2017]
• Reducedatabyatleast90%
•Datafeatures• Temporalandspatial• High-dimensional• Floating-point
Datacompressors
6
• Losslesscompression• Deduplication• GZIP• FPC[Burtscher et.al.,2009]
• Lossy compression• ZFP [Lindstromet.al.,2014]• ISABELA[Lakshminarasimhan et.al.,2011]• SZ[Shenet.al.,2016]
Lossy compression
7
1. Array linearization
Original Data
Compressed data2. Data
transformation3. Curvefitting
4. Optimize theunpredictable
Floating-pointBlock:
0.368178-1.269298-0.904911-0.242216
1. Align to a common exponent
Mantissa and exponent :
0.184089 * 21
-0.684649 * 21
-0.452456 * 21
-0.121108 * 21
2a. Encode exponent
2b. Convertmantissa to
integer
Integers:848960939002912256
-3157387122695243776-2086581739634989568-558511819984848000
3. Orthogonal transform, reorder, and convertto unsigned integer Integers:
3733945919058534944395516959657970744
4457920553677607264580309805578545072
4. Encode coefficients
Compressed low bits
Compressed high bits
Workflowofcurvefittingbasedcompression(e.g.ISABELAandSZ)
Workflowofquantizationandtransformationbasedcompression(e.g. ZFP)
Canfloating-pointcompressorsachieveanear100xcompressionratio?
8
Performanceofcompressors
9
0102030
0100200300400
Compressio
nratio
Dataset
fpc gzip isb zfp sz
Relativeerrorbound 0.000001
0102030
0100200300400
Compressio
nratio
Dataset
fpc gzip isb zfp sz
Relativeerrorbound 0.001
Canfloating-pointcompressorsachieveanear100xcompressionratio?
Yes.IfDatasetcontainsalotofidenticalvalues;
Or,datavalueshighlyskewwithmoderatecompressionerrorbounds.
No.Formostdatasets.
10
Canfloating-pointcompressorsachievea10xcompressionratio?
Yes.Foralotofdatasetswithmoderatecompressionerrorbounds.
Whatifthecompressionratioisrushedto100xbylooseningerrorbounds?
11
VisualizationandblobdetectiononcompressedDpot data.
Limitationsofdatacompressionbyreducingfloating-pointaccuracy
13
•Near100xcompressionratioishardlyachievable
• Lostdataaccuracycannotberestored
Overview
• Backgroundandrelatedwork
• Progressivedatarefactoring
• Conclusion
14
WeproposeCanopus
15
• CompressingHPCdatainanotherdimension(resolution)
• Enablingprogressivedatarefactoring
• Usertransparentimplementation
CanopusI/Oconfiguration
Canopus:basicidea
17
• Refactorthesimulationresults(viadecimation)intoabasedatasetalongwithaseriesofdeltas• Basedatasetissavedinfastdevices,deltasinslowdevices• Basedatasetcanbeusedseparately(atalowerresolution)foranalysis• Selectedsubsetofdeltastoberetrievedtorestoredatatoatargetaccuracy
Simulation
ADIOS Write API
Canopus(I/O, refactoring, compression, placement, retrieval, restoration )
I/O Transport
MPI MPI_LUSTREPOSIX Dataspaces
Data Analytics
ADIOS Query API
MPI_AGGREGATE FLEXPATH
Node-local Storage (NVRAM, SSDs) Burst Buffer
ADIOS Kernel (buffering, metadata, scheduling, etc.)
Remote Parallel File SystemCampaign Storage
Storage Tiers
CanopusinHPCSystems
Canopusworkflow
18
HPC Simulations (high accuracy)
Base ST1
Delta2x ST2
Deltafull ST3Analytics Pipeline n (high accuracy)
Analytics Pipeline 2 (medium accuracy)
Analytics Pipeline 1 (low accuracy)
Storage Hierarchy
base = L4x
base + delta2x
base + delta2x + deltafull
Refactoring (decimation, compression)
Retrieving &Reconstruction
Datarefactoring
19
Mesh (Full) Mesh (4x reduction)
Full L4x deltafull
delta2x
Delta calculationDecimationOriginal
1.Meshdecimation
2.Deltacalculation
3.Floating-pointcompression
Meshdecimation
20
Vl+1i =½(Vl
i +Vlj)
Vli
Vlj
Vlk
Vlh
Vl+1k
Vl+1h
Vl+1i
DeltaCalculation• Formeshdata,it’scommonthateachvertexcorrespondstoavalue(floating-point)• Aftertriangularmeshdecimation:
deltaln = F(Vln) - 1/3*F(Vl+1
i) - 1/3*F(Vl+1
j) - 1/3*F(Vl+1k)
Compression
21
• Thefloating-pointvaluescorrespondingtovertexesarecompressedusingZFPcompressor
•Apotentialoptimizationtoourframeworkissupportingadaptivecompressorsbasedondatasetfeatures
Progressivedataexploration(reversethedatarefactoringprocedures)• I/O(readthebasedatasetanddeltas)•Decompression•Restoration
22
PerformancegainofCanopusfordataanalytics
23
ImpactonDataAnalytics
24
Original 2xreduction 4xreduction
8xreduction 16xreduction 32xreduction
Aquantitativeevaluationofblobdetection
25
Overview
• StoragestacksofHPCsystems
• Progressivedatarefactoring
• Conclusionandfuturework
26
Conclusion• Lossy compressionmaydevastatetheusefulnessofdatatoachievehighcompressionratio(suchas100x)
• Itiscriticaltocompressdatainmultipleorthogonaldimensionssuchasaccuracyandresolution
•Canopuscombinesmeshcompressionandfloating-pointcompression,possiblydeliveringahighcompressionratio withoutdevastatetheusefulnessofdata
27
Futurework• Investigatetheimpactoflossy compressiononanalyticalapplicationsotherthanvisualization• OriginaldataA == B,compresseddataA’ == B’ ?
• F(D) == F(D’)?F isafunction
28
References• OakRidgeNationalLab,SolvingBigProblems:ScienceandTechnologyatOakRidgeNationalLaboratory,2013
• Foster,I.,Ainsworth,M.,Allen,B.,Bessac,J.,Cappello,F.,Choi,J.Y.,…Yoo,S.(2017).ComputingJustWhatYouNeed :OnlineDataAnalysisandReductionatExtremeScales,1–16.
• Shalf,J.,Dosanjh,S.,&Morrison,J.(2014).Toptenexascale researchchallenges,1–25.Retrievedfromhttp://link.springer.com/chapter/10.1007/978-3-642-19328-6_1
• Burtscher,M.,&Ratanaworabhan,P.(2009).FPC:Ahigh-speedcompressorfordouble-precisionfloating-pointdata.IEEETransactionsonComputers,58(1),18–31.
• Lindstrom,P.(2014).Fixed-ratecompressedfloating-pointarrays.IEEETransactionsonVisualizationandComputerGraphics,20(12),2674–2683.
• Lakshminarasimhan,S.,Shah,N.,Ethier,S.,Klasky,S.,Latham,R.,Ross,R.,&Samatova,N.F.(n.d.).CompressingtheIncompressiblewithISABELA :In-situReductionofSpatio-TemporalData,1–14.
29
Thanks & Questions
30