george markomanolis io500 committee: john bent, julian...

Post on 25-Apr-2018

216 Views

Category:

Documents

3 Downloads

Preview:

Click to see full reader

TRANSCRIPT

GeorgeMarkomanolisIO500Committee:JohnBent,JulianM.Kunkel,JayLofstead

2017-11-12http://www.io500.org

IBMSpectrumScaleUserGroup,Denver,Colorado,USA

Why?

• Theincreaseofthestudieddomains,leadtolargerdataoutput,thusmorestressonfilesystem• CustomersbuyastorageonlybyevaluatingthemaxGB/sachievedbyIOR,whilemanyrealapplicationscannotachievesimilarperformance• TheI/Oefficiencycanbedowngradedbyinterferencewithmultipleusers• Arealcase,commercialapplicationusingonenode,wasconsumingmorethan15%oftheoverallmetadatacapacity• Weneedasuiteofbenchmarksinordertounderstandwhataretherealperformanceexpectations• Trackingstorageperformanceandsharingbestpractices

How?

• Communitydriveneffort,discussingthroughmailinglist,Slacketc.Everythingisingithub (https://github.com/VI4IO/io-500-dev.git )• Patterns:metadata,data,search• Easyforoptimizedpatterns• Hardfornaïvepatterns

• Reliesoncommunitybenchmarks,suchasIOR,mdtest (fornow)

WhatisIO-500?

IOREasy:Thisiswhatisusedduringtheprocurements,wherewemeasurethemostefficientI/Opattern,usercandeclaredtheparametersandwesaveonefileperMPIprocess

IORHard:Single-sharedfile,47008byterandomaccess,POSIX

MDEasy:CreaterankdirectorieswithNemptyfilesMDHard:Singleshareddirectory,filesof3901bytes,POSIXFind:Findfunctionalitysearchesforfilesof3901bytesacrossallthecreatedfiles.Svenaddedthemmfind.sh scriptforSpectrumscaleenvironment(io-500-dev/utilities/find/mmfind.sh)

Challenges&ApproachI

• Representativeofapplicationsanduserrequirements• Usingdifferentworkloadsforextractingupperandlowerperformanceinthecasesofoptimizedandnon-optimizedapplicationrespectively• Reportmeaningfulmetrics• Implementafindfunctionality(wetried3differentversions)• Libcircle isusedbyparallelfindanditisnotfriendlywithmachineswhichdonotprovidethewrappermpicc,problemissolvedwithsomemanualmodifications

Challenges&ApproachII

• Concurrentrunstobeintegrated,alreadyinitialtestsprovideinterestingresults

• 5minuteslimitperexperimenttoavoidlongruns

• ExtendedIOR/mdtest forphase-outstonewallingoptions

• Easytobuild,lessthan70secondsforthebasicversiontobeinstalled

HowtorunIO-500

• git clonehttps://github.com/VI4IO/io-500-dev• cdio-500-dev• ./utilities/prepare.sh• ./io500.sh(submitthisscriptifyouuseascheduler)• emailresultstosubmit@io500.org

DemoinstallationofIO500

ModifyIO-500

• Modifyio500.shaccordingly,forexample:

io500_mpirun="mpirun"io500_mpiargs="-np2"io500_ior_easy_params="-t 2048k-b 2g-F" io500_mdtest_easy_files_per_proc=25000

ModifyIO-500II

• Modifyio500.shaccordingly,selectwhichexperimentstobeexecuted:io500_run_ior_easy="True"io500_run_md_easy="True"…io500_run_md_hard_delete="True"

• Forvalid submission,youneedtoexecuteallthetestswhilethewritephasesshouldtakeatleast5minutes

ModifyIO-500III

• Modifyio500.shaccordingly,uncommenttheselinesanddeclarethepathtoyourpfind wrapper:

#io500_find_mpi="True"#io500_find_cmd="$PWD/bin/pfind"

Exampleofatestcase

[RESULT]BW phase1 ior_easy_write 96.133GB/s:time187.24seconds[RESULT]BW phase2 ior_hard_write 11.230GB/s:time 46.79seconds[RESULT]BW phase3 ior_easy_read 109.249GB/s:time164.76seconds[RESULT]BW phase4 ior_hard_read 7.871GB/s:time 66.74seconds[RESULT]IOPSphase1 mdtest_easy_write 49.231kiops:time 19.61seconds[RESULT]IOPSphase2 mdtest_hard_write 15.444kiops:time 17.05seconds[RESULT]IOPSphase3 find 8.120kiops:time 98.45seconds[RESULT]IOPSphase5 mdtest_easy_stat 5.313kiops:time127.18seconds[RESULT]IOPSphase6 mdtest_hard_stat 6.772kiops:time 30.43seconds[RESULT]IOPSphase7 mdtest_easy_delete 14.873kiops:time 49.98seconds[RESULT]IOPSphase8 mdtest_hard_read 45.599kiops:time 10.16seconds[RESULT]IOPSphase9 mdtest_hard_delete 30.776kiops:time 11.84seconds[SCORE]Bandwidth31.04GB/s:IOPS16.1537kiops:TOTAL501.4108

ExperiencewithIO500benchmark

• Withnotpropertuning,thebenchmarkwillfinisheithertoofastortooslow• Starttuningwithsmallvaluesandincreasethemtillyoufindtheonesthatproducetherequiredoutcome• Besurethatyouhaveenoughspacefortheoutputdata• CheckformtheIORoutputifitrecognizescorrectlythenumberofprocessesandhowmanyareusedpernode• Ifthebenchmarkistooslowwithoutreason,checkifotherusersexecuteintensiveI/Oapplications• Besurethatyoudonotharmthesystem,trytoexecutethebenchmarkwhenthesystemisnottoobusyorduringmaintenance• FortheIORHard,youcouldstripethecorrespondingfolder

KAUST– CrayDataWarp – IO-500

• 300computenodes,2400processes,268DataWarp nodes

• ior_easy_params="-t 2m-b 192616m”• ior_hard_writes_per_proc=77872• mdtest_hard_files_per_proc=1630• mdtest_easy_files_per_proc=10800

Presentingdatainradarchart

0

0.2

0.4

0.6

0.8

1Score

IO

MD

TotIOPs

RadarchartRankedsystems

#1 #2

ThebeststorageI/Osystemshouldberepresentedinafulldiamondgraph

NASA- IOPSGaloreEncore

SomeGPFSsystemsareinthefirstIO500listwhichwillbepresentedonWednesdayatIO500BOF.

Wouldyoubeinterestedinprovidingnewresults?

Conclusions

• TillnowtheIOReasyisconsideredthenormalapproachforprocurement,however,thisdoesnotcorrespondtotherealapplication• WeneedabetterwaytounderstandtheprocurementofstorageandIO500seemstobeintherightdirection• Acustomercanconcludetodecisionsbasedonhisapplicationrequirements• Weplansomefutureadditions,suchasmixworkload• Moresubmissionswehave,thebettertounderstandthevariousfilesystems

YouarewelcometoIO500BOF!

top related