The Next Generation of Data Storage
www.apeirondata.com ©2017 Apeiron Data Systems. All rights reserved. Version 1.0
Executive Summary
1 Introduction
    1.1 The Storage Revolution Continues
    1.2 The Emergence of Real-time Applications
    1.3 Complications of Scale-out Applications
    1.4 Emergence of NVMe over Ethernet
    1.5 Apeiron Storage Technology Advantages
    1.6 Apeiron Storage Performance Advantages
2 Apeiron Technology
    2.1 Storage Solution Hardware Components
    2.2 How Apeiron VLUNs Work
    2.3 Other Apeiron Innovations
    2.4 Storage Management Interfaces
3 Applications for NVMe over Ethernet
    3.1 Splunk Enterprise
    3.2 High Performance Computing (HPC)
    3.3 Hadoop
    3.4 FinTech
4 Conclusions
Executive Summary
Storage technology has evolved over time to greater levels of performance, capacity scalability and cost effectiveness. Future storage demands will require advances of another magnitude to address the emergence of a new generation of applications driven by Big Data analytics. Going forward, market success will be achieved by storage vendors that can deliver a quantum leap in throughput with higher IOPS, full bandwidth and new levels of low latency for business-critical application workloads. This new performance paradigm is required to meet rising customer service-level expectations and to make faster business decisions. Storage will be the key focal point as organizations of all sizes and vertical markets strive for a competitive advantage.
The emergence of a new technology called NVMe (Non-Volatile Memory Express) is a game-changing enabler of accelerated storage speed, but not all NVMe storage solutions are the same. In this document, we will explore the answers to the following questions:
• How did we get to where we are today with storage architectures?
• What has changed with new applications and customer workloads?
• What has Apeiron done to eliminate traditional storage bottlenecks, and what are the Apeiron architectural innovations that can achieve full bandwidth and greater throughput versus any competitive alternative?
• What are the specific use cases that require new storage architectures like Apeiron?
The future for storage infrastructure will be driven by the next wave of killer apps, which require a new storage architecture that can leverage the potential of NVMe technology in a way that eliminates I/O bottlenecks and delivers on the promise of Big Data decision support solutions.
1 Introduction
1.1 The Storage Revolution Continues
It has generally gone unnoticed that a storage revolution has been under way since the early 2000s. Enterprise storage today is a $60B market providing customers with storage-centric designs that are based on NAS or SAN architectures. Beginning in the late 1990s, Big Data and social media giants like Google, Facebook, and Yahoo began taking parallel processing techniques used for decades in high-performance computing (HPC) and applying them to clusters of consumer-grade x86 white box servers with internal hard disk drives, before progressing today to include internal solid-state drives.
They developed software to manage petabytes of data spread across hundreds or even thousands of these low-end servers, replacing or eliminating many of the basic NAS/SAN features in the industry including:
Capacity Optimization    Data Protection
Deduplication
Compression
Data Management
Replication
Snapshots
Clones
Table 1.1-1 Typical NAS and SAN Features Often Provided by Applications
By keeping disparate copies of data, this application software was able to build levels of redundancy and high availability much more cost-effectively and, in turn, end the need for SAN or NAS solutions in nascent Big Data markets.
1.2 The Emergence of Real-time Applications
While most of these applications were initially focused on analytics, by 2010 they began to find their way into real-time customer-facing applications, starting in the AdTech and web personalization areas. With the release of open source code from Google, Facebook, and Amazon, these applications quickly expanded across the Fortune 500, enabling corporations to extract value from the ever-growing amounts of machine and network data. This has resulted in differences of opinion regarding the distinction between Big Data and Fast Data, so Apeiron proposes the following descriptions for clarification.
Big Data: Data at quantities (volume) beyond the capability of common IT infrastructure to retain, manage, and process.
Fast Data: Data at speeds (velocity) beyond the capability of common IT infrastructure to capture, manage, and process.
Table 1.2-1 Descriptions of Big Data and Fast Data
Applications or analytic frameworks like Hadoop, Splunk, Spark, SAP HANA, and Security Onion quickly moved into critical business areas ranging from cybersecurity, to transaction systems, to FinTech. IDC reports that by 2020, over 70% of all Fortune 2000 corporations will have in place at least one real-time customer-facing Big Data application that is business critical. These clusters will require this new type of storage.
             Value   Variety   Velocity   Volume
Big Data     High    High      Low        High
Fast Data    High    High      High       Low
Table 1.2-2 Comparison of Big Data and Fast Data Characteristics
The key differences between Big Data and Fast Data are not the usefulness of their data (value) or diversity of data types (variety), but the rate at which data is created (velocity) and the amount of data retained (volume). Capturing current Fast Data, often in real time, for future Big Data analytics is an emerging performance (velocity) challenge for storage. Fast Data becomes Big Data, so storage capacity requirements for Big Data exceed those for Fast Data. Apeiron suggests an increasing number of Big Data applications will be provisioned by Fast Data applications.
1.3 Complications of Scale-out Applications
The rapid growth of these “scale-out” applications quickly became an issue for the enterprise storage providers, as their focus had been on reliability and richness of features, as opposed to performance and total cost of ownership (TCO). IT managers deploying scale-out applications begin by installing small clusters of x86 servers with internal drives where the storage is managed by the application, not by expensive storage head controllers. Many of the same features that allow companies like Dell/EMC, HPE, and NetApp to charge higher prices for their SAN/NAS solutions now look out of place and overly expensive in Big Data environments.
Recently, new suppliers have also shown up with advanced versions of the same tired SAN architectures, but they have had limited success since they have ignored the new tenets of Big Data scale-out storage:
• IT Architectural Advances: The application software on the server now manages storage
• Storage Management Cost Effectiveness: Customers will no longer pay for enterprise storage functionality
• Greater Service Level Expectations: Users require the same performance from any external storage as they get with internal storage
Scale-out storage using internal SSDs scales poorly. IT managers must add servers today when additional storage is required, even when additional processing horsepower is not. These problems are exacerbated with the deployment of NVMe flash drives with new NAND and 3D XPoint™ media, where external controllers actively limit the drives’ performance. These issues have directly affected the market growth of applications like Hadoop and Splunk. IT managers are hesitant to grow their clusters past a certain size due to the following challenges, with infrastructure cost the number one complaint voiced by IT managers of large scale-out solutions.
Figure 1.3-1 Infrastructure Cost
Figure 1.3-2 Performance Drop at Scale
Figure 1.3-3 Lack of Expertise
1.4 Emergence of NVMe over Ethernet
This was the reason behind Apeiron’s development of NVMe over Ethernet. We wanted to give IT managers the performance they saw with drives installed in servers, with the benefits of external pooled storage. This enables data centers to scale processing and storage resources independently. While providing the elastic management of a large pool of NVMe SSDs under software tools like OpenStack, Docker, or Hadoop, NVMe over Ethernet also can provide performance to servers that is often better than seen with SSDs installed directly, a fact that often surprises IT professionals. Now they can have the best of both worlds.
We began this design by building a lossless Ethernet architecture that can scale to thousands of external NVMe drives with latency overhead of less than 2 µs, built up of multi-port server HBAs and 2U NVMe shelves. Clusters scale with linear performance. We took a fresh approach to the design, implementing our data path using high-speed FPGAs, selecting Layer 2 Ethernet as a transport protocol, and passing NVMe commands natively across the fabric as shown in Figure 1.4-1. The native transport is critical to making sure no performance is lost in the pooled environment.
Figure 1.4-1 NVMe over Ethernet Native NVMe Protocol
1.5 Apeiron Storage Technology Advantages
Legacy solutions require the transport of data along with connection IDs. They build up tables of information on both the server and on the storage unit about the connections. Unfortunately, this limits the ability of a system to take advantage of the knowledge about the storage information. The ADS1000 can accelerate the transfer of large I/O blocks or strings of data in server memory (i.e. multiple Physical Region Pages or PRPs) by pulling I/Os ahead of normal NVMe completion queues. In legacy solutions, these commands must be rebuilt at the external storage unit by processors in the data path before being passed to the drives, and must be processed serially. This takes more and more time as these connection tables grow, adds expense, and requires additional power and equipment space to accomplish.
The effect of this additional complexity can be seen when comparing the ADS1000 with all-flash arrays serving enterprise storage. On top of often being two or three times the size, they are three times the cost, with performance that is often a third of what internal SSDs provide (see the Apeiron Storage Comparison document).
This means customers must purchase additional expensive all-flash arrays to satisfy the cluster performance needs, compared to those served by NVMe over Ethernet. In applications like Splunk, query performance can be 90x slower when using traditional NAS and SAN, as detailed in the Apeiron Reference Architecture for Splunk document. The performance differences have a dramatic effect on the cost of deployment and TCO, not just application performance.
Keeping total cost of ownership in mind, we also have embedded Ethernet switches in our equipment so that no additional equipment is needed, reducing infrastructure by up to 30% when compared to networks using InfiniBand or Fibre Channel. This collapses data center infrastructure by eliminating multiple transport protocols to manage and redundant external switches, and by enabling multi-path drive access for high availability. This allows Apeiron to deliver multiple petabytes of flash or server-class memory in one rack.
1.6 Apeiron Storage Performance Advantages
The same SSDs can deliver faster performance and more responsiveness when installed in ADS1000 systems than in a server’s internal storage bays. One reason this can occur is that server-installed NVMe SSDs connect to PCIe x4 buses, but ADS1000 systems support NVMe over Ethernet with Apeiron host adapters using PCIe x8 connections to servers. Figure 1.6-1 shows how the performance of different SSDs can improve by up to 1.5x when the SSDs are installed in an ADS1000 system instead of inside the server. Similarly, Figure 1.6-2 shows how responsiveness improves by up to 2x when the SSDs are installed in an ADS1000 system instead of inside the server.
Figure 1.6-1 SSD Performance in ADS1000 Versus Internal Server Storage Bay (DAS)
Figure 1.6-2 SSD Responsiveness in ADS1000 Versus Internal Server Storage Bay (DAS)
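The x4 versus x8 point can be made concrete with a rough link-bandwidth calculation. The sketch below assumes PCIe Gen3 signaling (8 GT/s per lane with 128b/130b encoding) on both links; the document states Gen3 only for the HBA slot, so the x4 figure is an assumption. These are theoretical one-direction ceilings, not measured throughput, and the function name is illustrative.

```python
# Theoretical PCIe Gen3 link bandwidth (assumed generation for both links).
GEN3_GT_PER_S = 8.0              # giga-transfers per second, per lane
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding used by Gen3


def pcie_gen3_bandwidth_gbs(lanes: int) -> float:
    """Return the theoretical one-direction bandwidth in GB/s."""
    usable_gbits = lanes * GEN3_GT_PER_S * ENCODING_EFFICIENCY
    return usable_gbits / 8  # bits to bytes


x4 = pcie_gen3_bandwidth_gbs(4)  # typical server-installed NVMe SSD link
x8 = pcie_gen3_bandwidth_gbs(8)  # Apeiron HBA slot as described in Section 2.1
print(f"x4: {x4:.2f} GB/s, x8: {x8:.2f} GB/s, ratio: {x8 / x4:.1f}x")
```

The doubled host-side link is only one possible contributor; the "up to 1.5x" and "up to 2x" results come from the figures, not from this arithmetic.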
2 Apeiron Technology
2.1 Storage Solution Hardware Components
Along with these issues, Apeiron resolved multiple common scale-out storage issues affecting the industry. We began by addressing storage metadata issues that limit many storage solutions as they grow. Our endpoint design allows true SSD virtualization and provides better fault isolation than normally seen with other technologies.
For HPC, we also provide a native server-to-server messaging technology that enables users to eliminate expensive, sprawling InfiniBand infrastructure, creating a converged network fabric. Having a central storage management agent allows IT managers to quickly and easily integrate Apeiron systems into common data center management systems.
Figure 2.1-1 Apeiron Storage System
Figure 2.1-2 Apeiron Storage Server
Figures 2.1-1 and 2.1-2 show the basic Apeiron solution components. We offer a dual 40G host bus adapter (ADS-40G HBA) that is installed in a standard half-height, half-length server PCIe Gen3 x8 slot. This presents itself as an SSD, or endpoint, to the server, breaking the path to the external storage and enabling true NVMe virtualization.
The Apeiron driver allows the HBA to build virtual LUNs (VLUNs) for the client applications, built up of portions of remote NVMe SSDs, or entire SSDs, which are then presented as block storage drive entries in the server /dev directory (see Figure 2.1-2). These VLUNs can be expanded on the fly through the Apeiron Storage Manager (ASM). Data stored on SSDs can be set up to be spread across drives to improve performance or mirrored for high availability.
Figure 2.1-3 ADS1000 VLUNs in a Hadoop Application
2.2 How Apeiron VLUNs Work
As these VLUNs grow, the metadata stored on each server is simply a unique MAC address assigned to each remote SSD along with a logical block range on the device. This allows the connection information to be limited to a handful of entries and allows the cluster performance to scale without performance limitations. VLUNs can be expanded or new VLUNs can be added as applications require. In NoSQL applications, which prefer to start more processor threads as workloads grow, adding more individual LUNs which can be pinned is ideal, whereas with applications storing large volumes of data, like Hadoop, growing large individual SSDs can be helpful.
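The metadata model described above (a MAC address per remote SSD plus a logical block range) can be pictured with a small sketch. This is not Apeiron's driver code. The class names, the MAC values, and the simple concatenation of extents are all assumptions made for illustration; the real layout, including striping or mirroring, is not described in this document.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Extent:
    """One metadata entry: a remote SSD (by MAC) and a block range on it."""
    mac: str         # hypothetical MAC address of the remote SSD
    start_lba: int   # first logical block of the range on that SSD
    num_blocks: int  # length of the range in blocks


class Vlun:
    """Hypothetical VLUN built by concatenating extents in order."""

    def __init__(self, extents: List[Extent]):
        self.extents = list(extents)

    @property
    def size_blocks(self) -> int:
        return sum(e.num_blocks for e in self.extents)

    def expand(self, extent: Extent) -> None:
        """Grow the VLUN by appending one more metadata entry."""
        self.extents.append(extent)

    def resolve(self, vlba: int) -> Tuple[str, int]:
        """Map a virtual block address to (SSD MAC, block on that SSD)."""
        if vlba < 0:
            raise IndexError("virtual LBA must be non-negative")
        offset = vlba
        for e in self.extents:
            if offset < e.num_blocks:
                return e.mac, e.start_lba + offset
            offset -= e.num_blocks
        raise IndexError("virtual LBA beyond end of VLUN")


vlun = Vlun([
    Extent("02:00:00:00:00:01", start_lba=0, num_blocks=1000),
    Extent("02:00:00:00:00:02", start_lba=500, num_blocks=2000),
])
print(vlun.resolve(999))   # last block of the first extent
print(vlun.resolve(1000))  # first block of the second extent
vlun.expand(Extent("02:00:00:00:00:03", start_lba=0, num_blocks=4000))
print(vlun.size_blocks)
```

The point of the sketch is that the per-server table grows with the number of extents, not with the number of I/Os or connections, which is why it stays at a handful of entries.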
In some applications like shared cache (e.g. Memcached), having multiple servers access a single drive also can be required. The ADS1000 provides permissions set up by the ASM to enable this ability; however, in cases of multiple server write access, locking and unlocking must be managed by the application. This flexible metadata management offers both performance at scale and the application flexibility needed by scale-out applications. Connection data management in large clusters has troubled engineers for decades.
SSDs mounted in the ADS1000 are tracked by both their MAC addresses and unique manufacturing serial numbers. SSDs can be hot plugged, removed, moved to other shelves, and restarted without affecting applications. This is a tremendous problem for networks based on other fabric technologies like PCIe because they force an application-stopping re-enumeration cycle. Additional “hot” SSDs can be added to the ADS1000 and assigned to servers by the ASM as needed, or through set rules (e.g. drives in use reach set storage limits).
2.3 Other Apeiron Innovations
On the storage shelf (ADS1000) there are a number of innovations. The first improvement is that each ADS1000 provides 32 individual 40G connections. Today’s storage units normally offer a handful of connections, which throttles the SSD performance passed to servers. Intel 3D XPoint™ SSDs routinely can present over 2 GB/s of both read and write bandwidth. Samsung NVMe SSDs can provide close to 3 GB/s of sustained read bandwidth. With 24 individual SSDs, seeing combined bandwidth over 50 GB/s is not unusual.
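The bandwidth figures above can be checked with simple arithmetic. The sketch below uses only the numbers quoted in this section and assumes that "40G" means 40 Gb/s of Ethernet line rate per connection. It ignores framing and protocol overhead, so the fabric figure is an upper bound.

```python
SSD_COUNT = 24         # SSDs per ADS1000 shelf
PORT_COUNT = 32        # individual 40G connections per shelf
PORT_GBITS_PER_S = 40  # assumed line rate per connection, in Gb/s


def aggregate_ssd_gbs(per_ssd_gbs: float) -> float:
    """Combined bandwidth of a full shelf of identical SSDs, in GB/s."""
    return SSD_COUNT * per_ssd_gbs


fabric_gbs = PORT_COUNT * PORT_GBITS_PER_S / 8  # raw shelf connectivity, GB/s
low = aggregate_ssd_gbs(2.0)   # drives at about 2 GB/s each
high = aggregate_ssd_gbs(3.0)  # drives at about 3 GB/s each

print(f"SSD aggregate: {low:.0f} to {high:.0f} GB/s")
print(f"Raw fabric capacity: {fabric_gbs:.0f} GB/s")
```

Under these assumptions a full shelf lands between 48 and 72 GB/s, which brackets the "over 50 GB/s" figure, and the 32 connections leave headroom above the drives' combined output.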
The ADS1000 was designed to pass all of this performance to application servers, which is a basic requirement for scale-out storage. In addition, these units can be daisy chained together with these connections to provide larger amounts of storage and to provide high availability. Creating a single rack that can provide several petabytes of SSD storage along with tens of client servers is routine. Apeiron’s innovation is built on a software-defined storage strategy, as the ADS1000 can support all commercially available NVMe SSDs.
Internal to the ADS1000 there are Ethernet switches, dual I/O modules (IOMs), and dual fans and power supplies. These are all field replaceable units. The IOMs present a server root complex to the internal 24 NVMe SSDs so they think they are installed in a server. The switches are dual purpose. They eliminate the need for external switches, limiting data center sprawl and cost. They also allow for high availability by providing multiple paths to each drive.
Figure 2.3-1 Apeiron NVMe over Ethernet (Native Internal Switches)
Figure 2.3-2 Traditional Ethernet NAS (Requires External Switches)
Figure 2.3-3 Traditional Fibre Channel SAN (Requires External Switches)
This eliminates the need for expensive multi-port SSDs. Scale-out applications typically make use of data replicas to provide redundancy. In small clusters, these replicas can be placed across the two IOMs in a single ADS1000, and as the number of ADS1000s grows the replicas can be spread across units or racks. The ADS1000 also provides multiple 40G paths internally between IOMs, allowing alternate connections to drives for redundancy.
2.4 Storage Management Interfaces
Along with the innovations in architecture and hardware, Apeiron spent a good deal of time developing a comprehensive solution to manage distributed storage. Because the clusters now incorporate the application servers along with storage elements, an Apeiron Storage Manager (ASM) was required that can reside on the network to monitor the state of all components: application servers, HBAs, ADS1000 units, internal switches, and SSDs.
Figure 2.4-1 Apeiron Storage User Interface
When configuring storage, the Apeiron CLI agent, which may be installed anywhere in the network, has access to the REST API or the AUI. Examples of the Apeiron Storage User Interface and the Apeiron CLI are shown in Figures 2.4-1 and 2.4-2, respectively.
Figure 2.4-2 Apeiron CLI
user@mars1> asmctl -e ADS1000v1-12006 show drives
Slot: 0 Enclosure Name: ADS1000v1-12006 Size: 7.64TB SN: SN35248965245 Drive ID: DRVab4de569b3ca45 Updated: 3 seconds ago
Slot: 1 Enclosure Name: ADS1000v1-12006 Size: 7.64TB SN: SN35248965395 Drive ID: DRVab4de569b3ca99 Updated: 3 seconds ago
Slot: 15 Enclosure Name: ADS1000v1-12006 Size: 7.64TB SN: SN35248965355 Drive ID: DRVab4de569b3ca59 Updated: 3 seconds ago
Slot: 16 Enclosure Name: ADS1000v1-12006 Size: 7.64TB SN: SN35248965356 Drive ID: DRVab4de569b3ca61 Updated: 3 seconds ago
user@mars1>
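Output like the listing in Figure 2.4-2 is easy to post-process in scripts. The sketch below is a hypothetical parser whose field layout is inferred only from that one sample; the actual asmctl output format is not documented here and may differ, and the regular expression and function name are illustrative.

```python
import re

# Sample text copied from Figure 2.4-2 (command prompt lines omitted).
SAMPLE = """\
Slot: 0 Enclosure Name: ADS1000v1-12006 Size: 7.64TB SN: SN35248965245 Drive ID: DRVab4de569b3ca45 Updated: 3 seconds ago
Slot: 1 Enclosure Name: ADS1000v1-12006 Size: 7.64TB SN: SN35248965395 Drive ID: DRVab4de569b3ca99 Updated: 3 seconds ago
Slot: 15 Enclosure Name: ADS1000v1-12006 Size: 7.64TB SN: SN35248965355 Drive ID: DRVab4de569b3ca59 Updated: 3 seconds ago
Slot: 16 Enclosure Name: ADS1000v1-12006 Size: 7.64TB SN: SN35248965356 Drive ID: DRVab4de569b3ca61 Updated: 3 seconds ago
"""

# Assumed record layout: the labels appear in this fixed order for each drive.
RECORD = re.compile(
    r"Slot:\s*(?P<slot>\d+)\s+"
    r"Enclosure Name:\s*(?P<enclosure>\S+)\s+"
    r"Size:\s*(?P<size_tb>[\d.]+)TB\s+"
    r"SN:\s*(?P<serial>\S+)\s+"
    r"Drive ID:\s*(?P<drive_id>\S+)"
)


def parse_drives(text: str) -> list:
    """Return one dict per drive record found in the text."""
    drives = []
    for match in RECORD.finditer(text):
        drive = match.groupdict()
        drive["slot"] = int(drive["slot"])
        drive["size_tb"] = float(drive["size_tb"])
        drives.append(drive)
    return drives


drives = parse_drives(SAMPLE)
total_tb = sum(d["size_tb"] for d in drives)
print(f"{len(drives)} drives listed, {total_tb:.2f} TB in total")
```

Because the pattern matches on field labels and not on line breaks, it works whether each drive is printed on one line or one field per line.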
3 Applications for NVMe over Ethernet
3.1 Splunk Enterprise
Apeiron’s NVMe over Ethernet architecture is ideal for a variety of scale-out, high performance computing, and parallel storage applications. Applications like the core Splunk Enterprise, and extended products including Splunk Enterprise Security (ES), Splunk IT Service Intelligence (ITSI), and Splunk User Behavior Analytics (UBA), are designed to ingest and allow queries on petabytes of machine and network data. Due to I/O performance limitations, searchable data is often constrained and passed on to alternate offline storage solutions as it ages. This dramatically limits the value of the data. Slow I/O performance means data must be spread across additional servers to provide query SLA performance.
Figure 3.1-1 Splunk Enterprise
Critical applications like the Enterprise Security module in Splunk must be turned off, or run fewer times per day. This creates more corporate exposure to attacks and means intrusions take much longer to analyze. These issues also extend the time required to stand up new Splunk installations, as Splunk certified engineers report that 80% of their time is spent today on performance tuning due to storage equipment performance inadequacies.
Splunk Search Type   Splunk Events Found   Splunk Reference Search Time   Apeiron ASA Search Time   Apeiron ASA Advantage
Rare                 23                    11.2                           2.1                       5.3x
                     115                   11.2                           6.1                       1.8x
Super Sparse         26,802                1,112.2                        12.6                      88x
                     180,850               1,112.2                        19.2                      58x
Sparse               155,459,317           31,091.9                       2,842.2                   10.9x
                     1,126,745,647         225,349.1                      14,985.3                  15.0x
Table 3.1-1 Splunk Search Performance with Apeiron
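The "Apeiron ASA Advantage" column in Table 3.1-1 is the ratio of the reference search time to the Apeiron search time, which the following sketch recomputes from the table values. It assumes that the unlabeled second row of each pair belongs to the same search type as the row above it; the time unit is not stated in the table, and the ratio does not depend on it.

```python
# (search type, events found, reference search time, Apeiron ASA search time)
ROWS = [
    ("Rare", 23, 11.2, 2.1),
    ("Rare", 115, 11.2, 6.1),
    ("Super Sparse", 26_802, 1_112.2, 12.6),
    ("Super Sparse", 180_850, 1_112.2, 19.2),
    ("Sparse", 155_459_317, 31_091.9, 2_842.2),
    ("Sparse", 1_126_745_647, 225_349.1, 14_985.3),
]


def speedup(reference_time: float, apeiron_time: float) -> float:
    """How many times faster the Apeiron search completed."""
    return reference_time / apeiron_time


speedups = [round(speedup(ref, ads), 1) for _, _, ref, ads in ROWS]

for (kind, events, _, _), factor in zip(ROWS, speedups):
    print(f"{kind:<13}{events:>15,}{factor:>8.1f}x")
```

The recomputed ratios agree with the table after rounding, and the largest (about 88x, for the smaller Super Sparse search) is the source of the "up to 90x" figure quoted elsewhere in this document.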
Over the last several years non-volatile memory pricing has gone down by 4x while SSD sizes have grown by more than 10x. This enables Splunk environments to keep months, or even years, of data active in a cost-effective manner. With the entire drive performance presented to application servers, we have demonstrated query improvements of up to 90x while accessing multiple years of data, and while cutting the number of servers used by up to 80%. See the Apeiron Reference Architecture for Splunk document for details.
3.2 High Performance Computing (HPC)
For many years, HPC was characterized by large single-purpose clusters with expensive, proprietary hardware tuned to specific applications. Over the last decade a good deal of work has been done to create more general-purpose multi-tenant clusters that can support multiple concurrent users and different applications. The ADS1000 was designed to support this movement. It works hand in hand with parallel file systems like Lustre in order to quickly warm up client data sets like those used by the world’s most powerful supercomputers as ranked by the TOP500 list (www.top500.org).
Figure 3.2-1 TOP500
Apeiron supports high-speed server-to-server messaging using the Apeiron storage fabric so that no external switching, interface cards, or alternative protocols are required. The messaging uses direct memory access with the same queues and buffers designed for NVMe over Ethernet storage in order to get single-digit microsecond transfers using a standard OpenFabrics Alliance (OFA) software stack (see www.openfabrics.org). This makes the ADS1000 particularly useful in FinTech applications using FIX messaging protocols or for scientific analysis such as genomic sequencing.
3.3 Hadoop
Key challenges of Hadoop cluster implementations are how to accelerate performance to make faster business decisions without breaking the bank, and how to effectively deal with data center sprawl. When faced with I/O storage infrastructure limitations, the answer is generally to increase the number of servers. However, Hadoop management solutions from Cloudera or Hortonworks charge on a per-server basis. When you are scaling the clusters only to get additional storage, this becomes an expensive proposition.
Figure 3.3-1 Apache Hadoop
This has limited Hadoop to applications where performance is not critical, like large data lakes built on HDDs or overnight routine analytics. However, Hadoop with building blocks like Spark is ideally suited to real-time pipelined processes for Deep Learning, or real-time analytics on petabyte-scale data sets.
Using Hadoop with Apeiron external NVMe SSDs versus internal HDDs increases Hadoop read performance by 49.5x and write performance by 11.6x while reducing the number of data nodes required by 50%. Even when performance is compared with internal SSDs, Apeiron accelerates Hadoop read performance by 8.7x and write performance by 2.6x with the same data node reduction. Finally, this level of Apeiron performance is achieved with 40% fewer Hadoop servers.
               SATA HDDs   SATA SSDs   Apeiron NVMe SSDs   Apeiron Advantage
Servers        7           7           4                   40% fewer servers required
Data nodes     6           6           3                   50% fewer data nodes required
Disks          12          12          12                  Flexible external vs. internal storage
Read (MB/s)    722         4,087       35,764              49.5x or 8.7x faster performance
Write (MB/s)   218         952         2,526               11.6x or 2.6x faster performance
Table 3.3-2 Comparison of Hadoop Deployments with HDD, SSD, and Apeiron Storage
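The "Apeiron Advantage" column of Table 3.3-2 can be reproduced from the raw values in the other columns. The sketch below uses only the numbers in the table; the dictionary keys and helper names are illustrative.

```python
DEPLOYMENTS = {
    "sata_hdd":     {"servers": 7, "data_nodes": 6, "read_mbs": 722,    "write_mbs": 218},
    "sata_ssd":     {"servers": 7, "data_nodes": 6, "read_mbs": 4_087,  "write_mbs": 952},
    "apeiron_nvme": {"servers": 4, "data_nodes": 3, "read_mbs": 35_764, "write_mbs": 2_526},
}


def ratio(metric: str, baseline: str) -> float:
    """Apeiron throughput divided by the baseline deployment's throughput."""
    return DEPLOYMENTS["apeiron_nvme"][metric] / DEPLOYMENTS[baseline][metric]


def reduction_pct(metric: str, baseline: str) -> float:
    """Percentage reduction in a count relative to the baseline deployment."""
    base = DEPLOYMENTS[baseline][metric]
    return 100 * (base - DEPLOYMENTS["apeiron_nvme"][metric]) / base


for baseline in ("sata_hdd", "sata_ssd"):
    print(f"vs {baseline}: read {ratio('read_mbs', baseline):.2f}x, "
          f"write {ratio('write_mbs', baseline):.2f}x")
print(f"servers: {reduction_pct('servers', 'sata_hdd'):.1f}% fewer")
print(f"data nodes: {reduction_pct('data_nodes', 'sata_hdd'):.1f}% fewer")
```

The throughput ratios come out at about 49.5x and 8.75x for reads and 11.6x and 2.65x for writes, matching the table. Going from 7 servers to 4 is a 42.9% reduction, which the table states as 40%.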
3.4 FinTech
A compelling Big Data application in the FinTech area which is ideal for Apeiron is pre- and post-trade analytics. NVMe over Ethernet is ideally suited to these applications since its I/O performance eliminates large numbers of unneeded servers and provides for dramatic query performance improvements. By providing different types of SSD media all in one storage environment, high performance media like Intel’s Optane™ SSDs can be used for block cache or high-performance storage pools of hot data, while standard NAND SSDs can be used to store warm data, all managed by resource managers like YARN or Mesos. This allows applications to tailor their use of storage media to the data needs. For more information, please read the Hadoop Reference Architecture document.
4 Conclusions
While the corporate world has been rapidly deploying real-time scale-out applications, the current storage products quickly become the limiting barrier as data sets grow. Simple clusters of x86 servers which work well when data sets are small become bottlenecks as storage needs outpace processor needs. With many software business models tied to the number of servers, this problem is magnified with escalating costs. IT managers are faced with sprawling hardware put in place only to get more storage capacity and then in turn are billed for servers which are only managing storage.
The ADS1000 was architected from the ground up to address the unique problems faced with scale-out applications.
• Server Consolidation: Dramatic levels of performance minimize servers.
• Total Cost of Ownership (TCO) Advantage: Storage and processing can be scaled independently.
• Business Critical Decision Making Capability: Response times seen by applications are dramatically improved and the amount of active data no longer has to be managed, allowing much richer query responses.
• Application Efficiencies: In applications like Splunk you no longer have to slow or turn off tools like Enterprise Security and your queries can provide human time answers.
• Performance Acceleration: Data can be placed on SSD storage types that make sense for the type of data being used, as opposed to the limited SSDs a vendor offers.
Scale-out deep learning and analytic applications offer businesses the ability to extract value from massive amounts of data; but as these data sets grow, a new type of scale-out storage is needed. Both enterprise SAN/NAS products and arrays of x86 servers with internal storage fall short. The Apeiron ADS1000 storage has been architected to address the specific needs of these strategic Big Data scale-out applications.