canopus: enabling extreme-scale data analytics on big hpc ... · tao lu*, eric suchyta, jong choi,...

Canopus: Enabling Extreme-Scale Data Analytics on Big HPC Storage via Progressive Refactoring

Tao Lu*, Eric Suchyta, Jong Choi, Norbert Podhorszki, Scott Klasky, Qing Liu*, Dave Pugmire, Matt Wolf, and Mark Ainsworth+

* New Jersey Institute of Technology, Oak Ridge National Laboratory, + Brown University

Post on 18-Oct-2020



Overview

• Background and related work

• Progressive data refactoring

• Conclusion

HPC systems

• Mission: illuminate phenomena that are often impossible to study in a laboratory [Oak Ridge National Lab, 2013]
  • Climate impacts of energy use
  • Fusion in a reactor not yet built
  • Galaxy formation

• Methodology: modeling and simulation, along with data exploration
  • Data generation
  • Storage
  • Analysis
  • Visualization

HPC systems (cont'd)

• The big data management challenge [Shalf et al., 2014]

• Worsening I/O bottleneck for exascale systems

• Exponentially increasing multi-scale, multi-physics data

Data compression in exascale (next-generation HPC) systems

• Goal: 10x to 100x data reduction ratio [Ian Foster et al., 2017]
  • Reduce data by at least 90%

• Data features
  • Temporal and spatial
  • High-dimensional
  • Floating-point
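The reduction ratio and the percentage of data removed are two views of the same quantity. As a minimal sketch (the helper name `fraction_removed` is made up for this illustration), a reduction ratio r removes a fraction 1 - 1/r of the original bytes:

```python
def fraction_removed(reduction_ratio: float) -> float:
    """Fraction of the original bytes eliminated at a given reduction ratio."""
    if reduction_ratio <= 0:
        raise ValueError("reduction ratio must be positive")
    return 1.0 - 1.0 / reduction_ratio


# The two ends of the goal stated on the slide.
for ratio in (10, 100):
    print(f"{ratio}x reduction removes {fraction_removed(ratio):.0%} of the data")
```

A 10x ratio corresponds to the "at least 90%" figure, and a 100x ratio to 99%.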

Data compressors

• Lossless compression
  • Deduplication
  • GZIP
  • FPC [Burtscher et al., 2009]

• Lossy compression
  • ZFP [Lindstrom et al., 2014]
  • ISABELA [Lakshminarasimhan et al., 2011]
  • SZ [Shen et al., 2016]

Lossy compression

Workflow of curve-fitting based compression (e.g. ISABELA and SZ), from original data to compressed data:

1. Array linearization
2. Data transformation
3. Curve fitting
4. Optimize the unpredictable

Workflow of quantization and transformation based compression (e.g. ZFP):

Floating-point block: 0.368178, -1.269298, -0.904911, -0.242216

1. Align to a common exponent
   Mantissa and exponent: 0.184089 * 2^1, -0.684649 * 2^1, -0.452456 * 2^1, -0.121108 * 2^1
2a. Encode exponent
2b. Convert mantissa to integer
   Integers: 848960939002912256, -3157387122695243776, -2086581739634989568, -558511819984848000
3. Orthogonal transform, reorder, and convert to unsigned integer
   Integers:
   3733945919058534944395516959657970744
   4457920553677607264580309805578545072
4. Encode coefficients (compressed high bits, compressed low bits)
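Steps 1 and 2b of the quantization-based workflow can be sketched in a few lines. This is an illustration under stated assumptions, not ZFP's actual code: the function name `align_block` is hypothetical, and the choice of 62 fractional bits is an assumption made because it reproduces the magnitude of the integers shown on the slide. The transform, reordering, and coefficient encoding of steps 3 and 4 are not covered.

```python
import math


def align_block(values, fraction_bits=62):
    """Steps 1 and 2b only: common exponent, then mantissas as signed integers."""
    # Step 1: the block exponent is the largest binary exponent in the block.
    emax = max((math.frexp(v)[1] for v in values if v != 0.0), default=0)
    # Step 2b: scale every value by 2^(fraction_bits - emax) and truncate.
    # fraction_bits=62 is an assumption inferred from the slide's integers.
    scale = math.ldexp(1.0, fraction_bits - emax)
    return emax, [int(v * scale) for v in values]


block = [0.368178, -1.269298, -0.904911, -0.242216]
emax, ints = align_block(block)
print("common exponent:", emax)
for v, q in zip(block, ints):
    print(f"{v:+.6f} -> mantissa {v / 2**emax:+.6f} -> integer {q}")
```

For this block the common exponent is 1, so the first value 0.368178 is expressed as 0.184089 * 2^1 before being converted to an integer.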

Can floating-point compressors achieve a near 100x compression ratio?

Performance of compressors

[Figure: compression ratio per dataset for fpc, gzip, isb, zfp, and sz. Two panels: relative error bound 0.000001 and relative error bound 0.001.]

Can floating-point compressors achieve a near 100x compression ratio?

Yes, if the dataset contains a lot of identical values, or if the data values are highly skewed and the compression error bounds are moderate.

No, for most datasets.

Can floating-point compressors achieve a 10x compression ratio?

Yes, for a lot of datasets with moderate compression error bounds.
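How strongly the achievable ratio depends on repeated values can be seen with a general-purpose lossless compressor. The sketch below uses Python's `zlib` (DEFLATE, the same algorithm family as GZIP) as a stand-in; the two synthetic fields and the helper name `compression_ratio` are assumptions for illustration, not the datasets or tools evaluated in the talk.

```python
import random
import struct
import zlib


def compression_ratio(values):
    """Original size divided by compressed size for a list of doubles."""
    raw = struct.pack(f"{len(values)}d", *values)
    return len(raw) / len(zlib.compress(raw, 9))


random.seed(0)
constant_field = [1.5] * 10_000                          # many identical values
noisy_field = [random.random() for _ in range(10_000)]   # no repeated structure

print(f"constant field: {compression_ratio(constant_field):.1f}x")
print(f"noisy field:    {compression_ratio(noisy_field):.2f}x")
```

The constant field compresses by far more than 100x, while the noisy field barely shrinks, which mirrors the "yes for identical values, no for most datasets" answer.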

What if the compression ratio is pushed to 100x by loosening error bounds?

[Figure: visualization and blob detection on compressed Dpot data.]

Limitations of data compression by reducing floating-point accuracy

• A near-100x compression ratio is hardly achievable

• Lost data accuracy cannot be restored

Overview

• Background and related work

• Progressive data refactoring

• Conclusion

We propose Canopus

• Compressing HPC data in another dimension (resolution)

• Enabling progressive data refactoring

• User-transparent implementation

Canopus I/O configuration

Canopus: basic idea

• Refactor the simulation results (via decimation) into a base dataset along with a series of deltas
• The base dataset is saved on fast devices, the deltas on slow devices
• The base dataset can be used separately (at a lower resolution) for analysis
• A selected subset of deltas can be retrieved to restore the data to a target accuracy

[Figure: Canopus in HPC systems. Components shown: Simulation with the ADIOS Write API; Data Analytics with the ADIOS Query API; Canopus (I/O, refactoring, compression, placement, retrieval, restoration); ADIOS Kernel (buffering, metadata, scheduling, etc.); I/O Transport (MPI, MPI_LUSTRE, POSIX, Dataspaces, MPI_AGGREGATE, FLEXPATH); Storage Tiers (node-local storage (NVRAM, SSDs), burst buffer, remote parallel file system, campaign storage).]

Canopus workflow

[Figure: HPC simulations (high accuracy) are refactored (decimation, compression) into Base, Delta2x, and Deltafull, which are placed on tiers ST1, ST2, and ST3 of the storage hierarchy. Retrieving and reconstruction then serve the analytics pipelines: pipeline 1 (low accuracy) uses base = L4x, pipeline 2 (medium accuracy) uses base + delta2x, and pipeline n (high accuracy) uses base + delta2x + deltafull.]
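The refactoring half of this workflow can be sketched on a regular 1D array instead of an unstructured mesh. That substitution is a simplifying assumption, as are all function names, the `PLACEMENT` and `PIPELINES` tables, and the sample values; the floating-point compression step is left out.

```python
def decimate(values):
    """Halve the resolution by averaging each adjacent pair of values."""
    return [(values[i] + values[i + 1]) / 2 for i in range(0, len(values) - 1, 2)]


def estimate(coarse, n):
    """Predict n finer values by repeating each coarse value across its pair."""
    return [coarse[min(i // 2, len(coarse) - 1)] for i in range(n)]


def refactor(full):
    """Split full-resolution data (at least four values) into base and deltas."""
    l2x = decimate(full)
    l4x = decimate(l2x)
    delta2x = [a - b for a, b in zip(l2x, estimate(l4x, len(l2x)))]
    deltafull = [a - b for a, b in zip(full, estimate(l2x, len(full)))]
    return {"base": l4x, "delta2x": delta2x, "deltafull": deltafull}


# Tier names follow the slide; which pieces each pipeline reads is listed below.
PLACEMENT = {"base": "ST1", "delta2x": "ST2", "deltafull": "ST3"}
PIPELINES = {
    "low accuracy": ["base"],
    "medium accuracy": ["base", "delta2x"],
    "high accuracy": ["base", "delta2x", "deltafull"],
}

pieces = refactor([1.0, 3.0, 2.0, 6.0, 5.0, 5.0, 8.0, 4.0])
for name, data in pieces.items():
    print(f"{name} -> {PLACEMENT[name]}: {data}")
for pipeline, needs in PIPELINES.items():
    print(f"{pipeline} reads {' + '.join(needs)}")
```

The base holds a quarter of the values and can be analyzed on its own; each delta only records what its coarser level failed to predict.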

Data refactoring

[Figure: the original full mesh, Mesh (Full), is decimated to Mesh (4x reduction). Decimation takes Full to L4x, and delta calculation produces delta2x and deltafull.]

1. Mesh decimation

2. Delta calculation

3. Floating-point compression

Mesh decimation

$V_i^{l+1} = \tfrac{1}{2}\,(V_i^{l} + V_j^{l})$

[Figure: vertices $V_i^{l}$, $V_j^{l}$, $V_k^{l}$, $V_h^{l}$ at level $l$ and vertices $V_i^{l+1}$, $V_k^{l+1}$, $V_h^{l+1}$ at level $l+1$.]

Delta calculation
• For mesh data, it is common that each vertex corresponds to a (floating-point) value
• After triangular mesh decimation:

$\mathrm{delta}_n^{l} = F(V_n^{l}) - \tfrac{1}{3}F(V_i^{l+1}) - \tfrac{1}{3}F(V_j^{l+1}) - \tfrac{1}{3}F(V_k^{l+1})$
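The two formulas on this slide translate directly into code. The sketch below assumes that i, j, k in the delta formula are the vertices of the coarse triangle associated with fine vertex n, which the slide does not spell out; the function names and sample numbers are made up for illustration.

```python
def collapse_edge(v_i, v_j):
    """Merge two level-l vertices into their midpoint: V_i^{l+1} = (V_i^l + V_j^l) / 2."""
    return tuple((a + b) / 2 for a, b in zip(v_i, v_j))


def delta(fine_value, coarse_values):
    """Fine value minus the mean of the values at three level-(l+1) vertices."""
    f_i, f_j, f_k = coarse_values
    return fine_value - (f_i + f_j + f_k) / 3


# Collapsing the edge between (0, 0) and (2, 1) yields the midpoint (1, 0.5).
print(collapse_edge((0.0, 0.0), (2.0, 1.0)))

# A fine vertex with value 4.0 whose coarse triangle carries 3.0, 6.0 and 6.0.
print(delta(4.0, (3.0, 6.0, 6.0)))
```

When the field is smooth, the coarse mean is a good predictor and the deltas stay small, which is what makes them easier to compress than the raw values.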

Compression

• The floating-point values corresponding to vertices are compressed using the ZFP compressor

• A potential optimization to our framework is supporting adaptive compressors based on dataset features

Progressive data exploration (reverse the data refactoring procedures)
• I/O (read the base dataset and deltas)
• Decompression
• Restoration
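The three read-side steps can be sketched as one loop. Everything here is an assumption for illustration: a dictionary stands in for the storage tiers, `zlib` stands in for the floating-point compressor (it is lossless, unlike ZFP), the data is a regular 1D array in place of a mesh, and each coarse value is assumed to predict the two finer values it was averaged from.

```python
import struct
import zlib


def pack(values):
    """Serialize doubles and compress them (stand-in for the real compressor)."""
    return zlib.compress(struct.pack(f"{len(values)}d", *values))


def unpack(blob):
    """Decompress a blob back into a list of doubles."""
    raw = zlib.decompress(blob)
    return list(struct.unpack(f"{len(raw) // 8}d", raw))


def explore(tiers, names):
    """Read the named pieces in order, decompress them, and restore step by step."""
    level = unpack(tiers[names[0]])              # I/O and decompression of the base
    for name in names[1:]:
        delta = unpack(tiers[name])              # I/O and decompression of a delta
        predicted = [level[min(i // 2, len(level) - 1)] for i in range(len(delta))]
        level = [p + d for p, d in zip(predicted, delta)]   # restoration
    return level


# Hypothetical stored pieces for an eight-value dataset.
tiers = {
    "base": pack([3.0, 5.5]),
    "delta2x": pack([-1.0, 1.0, -0.5, 0.5]),
    "deltafull": pack([-1.0, 1.0, -2.0, 2.0, 0.0, 0.0, 2.0, -2.0]),
}

print(explore(tiers, ["base"]))
print(explore(tiers, ["base", "delta2x"]))
print(explore(tiers, ["base", "delta2x", "deltafull"]))
```

An analysis that is satisfied with the coarse result stops after the first read and never touches the slower tiers.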

Performance gain of Canopus for data analytics

Impact on data analytics

[Figure: six panels showing the original data and the 2x, 4x, 8x, 16x, and 32x reductions.]

A quantitative evaluation of blob detection

Overview

• Storage stacks of HPC systems

• Progressive data refactoring

• Conclusion and future work

Conclusion

• Lossy compression may devastate the usefulness of data when it is pushed to a high compression ratio (such as 100x)

• It is critical to compress data in multiple orthogonal dimensions, such as accuracy and resolution

• Canopus combines mesh compression and floating-point compression, possibly delivering a high compression ratio without devastating the usefulness of the data

Future work

• Investigate the impact of lossy compression on analytical applications other than visualization
  • Original data A == B; compressed data A' == B'?
  • F(D) == F(D')? (F is a function)
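The question F(D) == F(D') can be made concrete with a toy analysis function. In this sketch, uniform quantization stands in for a lossy compressor with an absolute error bound, and F counts values above a threshold; both choices, the function names, and the sample data are assumptions for illustration.

```python
def quantize(values, abs_error):
    """Stand-in lossy step: snap each value to a grid of width 2 * abs_error."""
    step = 2 * abs_error
    return [round(v / step) * step for v in values]


def count_above(values, threshold):
    """Example analysis function F: number of values above a threshold."""
    return sum(1 for v in values if v > threshold)


original = [0.12, 0.49, 0.51, 0.88]
lossy = quantize(original, abs_error=0.0625)

print("max error:", max(abs(a - b) for a, b in zip(original, lossy)))
print("F(D)  =", count_above(original, 0.5))
print("F(D') =", count_above(lossy, 0.5))
```

Every value stays within the error bound, yet the count changes from 2 to 1 because 0.51 is snapped onto the threshold. A pointwise error bound therefore does not by itself guarantee that an analysis result is preserved.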

References

• Oak Ridge National Lab. (2013). Solving Big Problems: Science and Technology at Oak Ridge National Laboratory.

• Foster, I., Ainsworth, M., Allen, B., Bessac, J., Cappello, F., Choi, J. Y., … Yoo, S. (2017). Computing Just What You Need: Online Data Analysis and Reduction at Extreme Scales, 1–16.

• Shalf, J., Dosanjh, S., & Morrison, J. (2014). Top ten exascale research challenges, 1–25. Retrieved from http://link.springer.com/chapter/10.1007/978-3-642-19328-6_1

• Burtscher, M., & Ratanaworabhan, P. (2009). FPC: A high-speed compressor for double-precision floating-point data. IEEE Transactions on Computers, 58(1), 18–31.

• Lindstrom, P. (2014). Fixed-rate compressed floating-point arrays. IEEE Transactions on Visualization and Computer Graphics, 20(12), 2674–2683.

• Lakshminarasimhan, S., Shah, N., Ethier, S., Klasky, S., Latham, R., Ross, R., & Samatova, N. F. (n.d.). Compressing the Incompressible with ISABELA: In-situ Reduction of Spatio-Temporal Data, 1–14.

Thanks & Questions