HDF Performance on OpenStack - NASA · John Readey
TRANSCRIPT
Mission
• Investigate performance of using HDF5 in a cloud environment
• Measure performance using standard library
  • Compression
  • Chunk layouts
  • File aggregation
• Investigate ways to harness cloud-specific capabilities:
  • Elastic Compute – create compute instances on demand
  • Object Storage – utilize object store for persistent storage
• Comparison to use of other frameworks like Hadoop or Spark
• Determine future work that would enable HDF5 to perform better in the cloud
Evaluation Criteria
• Performance – how fast can a typical science problem be computed
• Storage – how much storage is needed for the dataset
• Usability – how easy is it to perform tasks typical of science analytics
• Scalability – how effectively can multiple cores be used
• Cost – cost metrics (storage + compute) for various solutions
Plan of Investigation
• Select test dataset
  • NCEP3 – (720, 1440) gridded data – 7980 files – 130 GB uncompressed
• Choose a science problem
  • Calculate min/max/avg/stdev for a given dataset
• Select compute platform
  • OSDC Griffin – OpenStack, 300 nodes
• Investigate HDF5 performance
  • Phase 1: using one compute node – vary chunk layout/compression filters
  • Phase 2: using multiple nodes
  • Phase 3: client/server with HDF Server
• Plan for future work
Hardware
• Using Open Science Data Cloud Griffin cluster
  • Xeon systems with 1–16 cores
  • 300 compute nodes
  • 10 Gb Ethernet
  • Ephemeral local POSIX filesystem
  • Shared persistent storage (Ceph object store, S3 API)
Software
• HDF5 library v1.8.15
• Compression libraries: MAFISC/GZIP/BLOSC
• Operating system: Ubuntu Linux
• Linux development tools
• HDF5 tools: h5dump, h5repack, etc.
• Python 3
• Python packages: h5py, NumPy, ipyparallel, PyTables
• HDF Server: https://github.com/HDFGroup/h5serv
• h5pyd: https://github.com/HDFGroup/hpd
OpenStack at OSDC in Brief
• Instances can be created either programmatically or via web console
• Compute instances initialized from snapshot or image file
• Many different instance configurations available
  • RAM 2 GB – 100 GB
  • Disk 10 GB – 2 TB
• Onboard disk is ephemeral! – will be lost when the instance is shut down
[Diagram: Python test driver + h5py + hdf5lib running on a test instance with local disk, reading from object storage via the S3 API]

Test cycle:
1. Instance is created
2. Data files copied from object storage to instance
3. Test driver is run
4. Results and performance measurements stored in object store
5. Instance is shut down
HDF5 Chunking and compression
• Chunking is one of the storage layouts for HDF5 datasets
• An HDF5 dataset's byte stream is broken up into chunks and stored at various locations in the file
• Chunks are of equal size in the dataset's dataspace, but may not be of equal byte size in the file
• HDF5 filtering works on chunks only
  • Filters for compression/decompression, scaling, checksum calculation, etc.
Determining chunk layouts
• Two different chunking algorithms:
  • Unidata's optimal chunking formula for 3-D datasets
  • h5py formula
• Three different chunk sizes chosen for the collated NCEP dataset:
  • Synoptic map: 1 × 72 × 144
  • Data rod: 7850 × 1 × 1
  • Data cube: 25 × 20 × 20
• Best layout depends on what the application's access pattern is
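A rough way to see why the best layout depends on the access pattern is to count how many chunks a read touches under each layout. A minimal pure-Python sketch for the aggregated (7850, 720, 1440) cube, assuming selections start on chunk boundaries:

```python
import math

# Chunk layouts from the slide, in (time, lat, lon) order.
LAYOUTS = {
    "synoptic map (1x72x144)": (1, 72, 144),
    "data rod (7850x1x1)":     (7850, 1, 1),
    "data cube (25x20x20)":    (25, 20, 20),
}

def chunks_touched(selection, chunk):
    """Count the chunks a hyperslab selection intersects, assuming the
    selection is aligned to a chunk boundary."""
    return math.prod(math.ceil(s / c) for s, c in zip(selection, chunk))

time_slice  = (1, 720, 1440)   # one full map at a single time step
time_series = (7850, 1, 1)     # one grid point over all time steps

for name, chunk in LAYOUTS.items():
    print(f"{name}: map read touches {chunks_touched(time_slice, chunk)} chunks, "
          f"point-series read touches {chunks_touched(time_series, chunk)} chunks")
```

Each touched chunk must be read (and decompressed) in full, which is consistent with the runtime results later in the deck: the synoptic layout is cheap for map-style reads, while the data-rod layout is pathological for them.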
Results – Compression Size
[Bar chart: compressed size (GB), y-axis 0–140, for None, blosc, gzip, and mafisc]
MAFISC performed best, but is a lossy compressor. Blosc and gzip achieve a reduction of ~60%.
Results – Runtime
• Load from S3: ~60 m
• Runtime:
  • No compression/no chunking: 11.8 m
  • (1 × 72 × 144) chunk layout/gzip: 5.2 m
  • (25 × 20 × 20) chunk layout/gzip: 68.6 m
  • (7850 × 1 × 1) chunk layout/gzip: 15.4 d
• Full results at: https://github.com/HDFGroup/datacontainer/blob/master/results.txt
Phase 2 – Utilizing multiple nodes
• One advantage of cloud environments is on-demand compute – the ability to instantiate and provision compute nodes programmatically
• Frameworks like Hadoop or Spark harness the power of multiple compute nodes to get work done faster
• How easy would it be to utilize multiple instances with OpenStack and the standard HDF5 library?
Cluster Challenge
• Other systems (e.g. Hadoop) support clusters out of the box
• HDF5 does not…
• …so create an "on-demand" cluster:
  • Wrote code to launch VMs programmatically
  • Connect using ZeroMQ
  • Run with parallel Python
  • Modify test driver to support parallel Python
  • Wrote Python module to distribute data across engines
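The real cluster distributed the test driver over VM engines via ZeroMQ/iPyParallel; as a minimal local stand-in (multiprocessing playing the role of the engines, synthetic lists standing in for NCEP3 file partitions), the per-partition summarize / controller-side reduce pattern can be sketched as:

```python
import math
from multiprocessing import Pool

def partial_stats(partition):
    # "Engine" side: one summary tuple per data partition.
    return (min(partition), max(partition),
            sum(partition), sum(v * v for v in partition), len(partition))

def reduce_stats(partials):
    # "Controller" side: merge partial summaries into global min/max/avg/stdev.
    mins, maxs, sums, sumsqs, counts = zip(*partials)
    n = sum(counts)
    mean = sum(sums) / n
    stdev = math.sqrt(sum(sumsqs) / n - mean * mean)
    return min(mins), max(maxs), mean, stdev

if __name__ == "__main__":
    # Eight synthetic partitions standing in for files spread across engines.
    partitions = [[float(i * 100 + j) for j in range(100)] for i in range(8)]
    with Pool(4) as pool:
        print(reduce_stats(pool.map(partial_stats, partitions)))
```

Carrying sum-of-squares in the partial result is what lets the global stdev be merged from partitions without a second pass over the data.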
[Architecture diagram: Jupyter Notebook GUI → HDF Controller (http) → iPython Parallel engine nodes (ZeroMQ) → object storage (S3 API)]

• Object store contains HDF5-based data collections (CMIP5/CORDEX/NCEP3)
  • Data collection storage sizes range from 100s of GBs to PBs
  • Object size in MBs to GBs (can be tuned)
  • Metadata maps time/geographic region to objects
  • HDF5 compression/chunking reduces space required
• Engine VMs created on demand by controller
  • Each VM reads a partition of data from the object store
  • Code to be run pushed by controller
  • Output returned to controller or saved to local store
• Controller runs on a VM & listens for client requests
  • Runs notebook kernels
  • Spins up engines as needed
  • Dispatches work to engines (via iPyParallel/0MQ)
• Users connect to web-based Jupyter Notebook
  • Run code via REPL or submit scripts
  • Plot results using Matplotlib or other plotting package
Application Life Cycle 1 – no users connected

• S3 data has been imported (public-readable)
• No engines running
• Controller listening for new clients
• JupyterHub listening for new notebook sessions
Application Life Cycle 2 – user launches notebook session

• No S3 transfers
• No engines running
• JupyterHub launches session
• Controller gets client request from notebook
Application Life Cycle 3 – user loads data collection, e.g. hdfcontroller.load('NCEP3')
(user doesn't need to know S3 keys, just the data collection label and any subsetting info – time or geo-region)

• Controller gets data request from notebook
  • Determines optimum type and # of engines
  • Launches engines
  • Tells engines to fetch data objects from S3
• Engines start
  • Load data partition
  • Signal to controller that data is ready
• Data transferred from S3 to engines
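Behind the `hdfcontroller.load(...)` call, the controller has to map a collection label to its S3 object keys and spread them across the engines it launched. A toy sketch of that partitioning step (key names and function are hypothetical, not the deck's actual code):

```python
def assign_partitions(keys, n_engines):
    """Round-robin assignment of object-store keys to engines.
    (A real controller might also balance by object size.)"""
    return [keys[i::n_engines] for i in range(n_engines)]

# Hypothetical keys standing in for the NCEP3 collection's objects.
keys = [f"ncep3/grid_{i:04d}.h5" for i in range(10)]
for rank, part in enumerate(assign_partitions(keys, 4)):
    print(f"engine {rank} loads {part}")
```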
Application Life Cycle 4 – data analytics, e.g. get values at a geolocation; repeat the query/analyze/plot cycle as desired

• No S3 activity
• Controller gets user request from client
  • Dispatches across all engines
  • Waits for responses
  • Returns aggregated result to client
• Engines process requests
  • Data is local to VM (SSD or RAM)
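The dispatch/aggregate step for the geolocation query can be sketched as a scatter-gather over time partitions (a pure-Python stand-in; each "engine" holds a list of 2-D grids covering its slice of the time axis):

```python
def engine_query(partition, lat_idx, lon_idx):
    # Engine side: extract the time series at one grid cell from the
    # locally held partition (a list of (lat, lon) grids).
    return [grid[lat_idx][lon_idx] for grid in partition]

def gather(responses):
    # Controller side: concatenate per-engine series in partition order.
    series = []
    for r in responses:
        series.extend(r)
    return series

# Two tiny partitions of 2x2 grids standing in for the real data.
p0 = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
p1 = [[[9, 10], [11, 12]]]
print(gather([engine_query(p, 0, 1) for p in (p0, p1)]))  # → [2, 6, 10]
```

Because each engine only touches its own in-memory partition, the per-request work scales down with the number of engines, matching the "data is local to VM" point above.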
Application Life Cycle 5 – user ends session

• No S3 activity
• Controller terminates engines
  • Continues listening for new notebook sessions
• Engines shut down
  • Any data stored locally is lost!
Results – Performance w/ 8 nodes

[Bar chart: run time with 8 engines (seconds), y-axis 0–140, for None, blosc, gzip 45×180, gzip 22×46, and mafisc]

• BLOSC has the best performance among the compressed formats
• To do: chunked but not compressed dataset
Runtime – by number of nodes – no compression

[Line charts: "Performance vs Number of Nodes" (4–64 nodes, y-axis 0–2500) and "Load and Run times, OSDC Cluster" (1–64 nodes, y-axis 0–4000, series: Load, Run)]

• Percent of time spent loading data goes up as the number of nodes increases
Conclusions - Phase II

• HDF5 with a simple cluster solution (ZeroMQ/IPyParallel) provided:
  • Excellent performance – superlinear with number of nodes
  • Did not require expansion or conversion of data (as with Hadoop, etc.)
  • Enables scientists to use standard tools/APIs for analysis
• Existing cluster solution didn't work well with large files (>10 GB)
• Cluster launch time and data loading can dominate actual compute time
Methodology – Phase III
• Aggregate NCEP data files into a 7850 × 720 × 1440 data cube
  • One file, ~100 GB
• Set up large VM with file and server (h5serv, Hyrax, or THREDDS)
• Parallel nodes access data via requests to server
• Adapt test script to use server interface
• Measure performance with different:
  • Servers
  • Chunk layouts
  • Numbers of nodes
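In Phase III the test script issues read requests to a server instead of opening the file directly. For h5serv, dataset reads go through the HDF REST API's `value` endpoint with a `select` hyperslab parameter; a sketch of building such a request (server address, dataset UUID, and domain name here are hypothetical):

```python
from urllib.parse import urlencode

def value_url(endpoint, dset_uuid, slices, domain):
    """Build an h5serv-style GET /datasets/<uuid>/value request for a
    hyperslab given as (start, stop) pairs, one per dimension."""
    select = "[" + ",".join(f"{a}:{b}" for a, b in slices) + "]"
    return f"{endpoint}/datasets/{dset_uuid}/value?" + urlencode(
        {"select": select, "host": domain})

# Read one time slice of the aggregated (7850, 720, 1440) cube.
url = value_url("http://server:5000", "d-1234",
                [(0, 1), (0, 720), (0, 1440)], "ncep3.test.hdfgroup.org")
print(url)
```

Every such read is an HTTP round trip plus server-side chunk decompression, which is where the remote-access penalty measured below comes from.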
[Architecture diagram: Jupyter Notebook GUI → HDF Controller (http) → iPython Parallel engine nodes (ZeroMQ) → data server (h5serv / Hyrax / THREDDS) via REST API/DAP → file storage]

• File storage on server node contains the aggregated data
  • Data collection storage size ~100 GB
  • Metadata maps time/geographic region to objects
  • HDF5 compression/chunking reduces space required
• Engine VMs created on demand by controller
  • Each VM submits requests to the data server
  • Code to be run pushed by controller
  • Output returned to controller or saved to local store
• Controller runs on a VM & listens for client requests
  • Runs notebook kernels
  • Spins up engines as needed
  • Dispatches work to engines (via iPyParallel/0MQ)
• Users connect to web-based Jupyter Notebook
  • Run code via REPL or submit scripts
  • Plot results using Matplotlib or other plotting package
Results – Server Access
• Test runs with one node (compute summaries over time slices)

| Chunk/Compression  | Local  | Hyrax   | THREDDS | h5serv |
|--------------------|--------|---------|---------|--------|
| None               | 148.6  | 3297.1  | 961.2   | 885.8  |
| 1×72×144 / gzip-9  | 317.6  | 8575.8  | 1264.1  | ?      |
| 25×20×20 / gzip-9  | 4131.0 | 13946.5 | 6936.8  | ?      |
Conclusions - Phase III

• Remote data access entails a performance penalty
• Allocation of a large instance running continuously is required
  • Data on server will be lost if the instance goes down
• Aggregate performance levels out with a large number of clients
  • Server processes/network I/O become the bottleneck
Future Directions – HDF Scalable Data Service

• Scalable Data Service for HDF5
  • Designed for public or private clouds
  • Uses object storage for persistent data
  • "Share-nothing" architecture
  • Supports any number of clients
  • Cost-effective
  • Efficient access to HDF5 objects without opening the file
  • Client SDKs for C/Fortran/Python enable existing applications to be used with the service
  • REST API compatible with current HDF Server (reference implementation)

HSDS Architecture

• Service Nodes (SN) handle client requests
• Data Nodes (DN) partition the object store
• Both SN and DN clusters can scale based on demand
• HDF objects (links, datasets, chunks, etc.) stored as objects
Separation of Storage and Compute Costs

• Storage
  • AWS S3 can support any size storage at affordable cost (~$0.03/GB/month)
  • AWS has built-in redundancy, so no need for backups, etc.
• Compute
  • If no active users, there is minimal compute cost (~$50/month)
  • Service nodes can scale up in response to load (costs proportional to usage)
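Plugging the slide's figures in, steady-state cost for a collection the size of the NCEP3 test set is dominated by the idle-compute floor rather than by S3 (back-of-envelope only; AWS request and transfer charges are ignored):

```python
S3_PER_GB_MONTH = 0.03   # slide's S3 storage figure
IDLE_COMPUTE = 50.0      # slide's minimal monthly compute cost
DATASET_GB = 130         # uncompressed NCEP3 test set

storage = DATASET_GB * S3_PER_GB_MONTH
print(f"storage ~${storage:.2f}/month + idle compute ~${IDLE_COMPUTE:.0f}/month")
```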
Open Questions

• S3 storage
  • Optimal object store key mapping/object sizes
  • Compression/chunking to minimize cost/increase performance
• Cost profile (for AWS)
  • Steady-state costs – S3 storage/controller VM
  • VM instance hours × number of engines
  • S3 requests?
• Best engine characteristics
  • Instance type – need enough local storage; SSD is better than rotating disk
  • vCPUs? One thread per VM?
  • Optimal # of engines for a given data collection
• Security
  • ZeroMQ doesn't have any!
  • Run in a VPC per user?
• How would an AWS implementation perform compared to OpenStack?
• Compare using Docker containers rather than VMs as engines (faster spin-up time)
• Validation of transformed results