![Page 1: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/1.jpg)
© 2015 60East Technologies, Inc
High Performance on Wall Street 2016: Challenges of HPC Code Writing – Capacity, Performance, Speed and Cost
www.crankuptheamps.com
Jeffrey M. Birnbaum, CEO and Co-Founder of 60East Technologies, Inc.
Makers of AMPS: The Advanced Messaging Processing System
![Page 2: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/2.jpg)
Premise
• NoSQL: 30X Performance over MongoDB on Ingestion and Queries
• Queue: Over 25X the Throughput of RabbitMQ at up to 60X lower latency
• Reducing Footprint from 11 machines to 1
• 4X More Throughput than a Pub/Sub System with 0 message loss tolerance
• 2.5X More Durable Throughput than Hardware Messaging Appliance
By considering how our AMPS Technology outperforms popular systems, we see how much systems leave on the table in terms of H/W resources.
For more context, please see the following blog articles:
http://www.crankuptheamps.com//blog/posts/2016/02/26/rabbitmq-comparison-to-amps/
http://www.crankuptheamps.com//blog/posts/2015/07/22/reality-check-pure-software-beats-hardware/
http://www.crankuptheamps.com//blog/posts/2014/09/24/ultimate-shock-absorber/
![Page 3: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/3.jpg)
Performance Drivers
• Traditional Drivers for HPC?
– Critical to Market Making and other low latency markets
– Handle Larger and Larger Data Volumes and Load
– Handle Peak Loads by leveraging full extent of Machine resources
• Why HPC is now needed everywhere?
– Significantly Reduce H/W Footprint Costs by scaling out less
– Lowering Latency enables one to add value-added capabilities that were previously cost prohibitive (e.g. streaming analytics for HPC or content filtering)
![Page 4: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/4.jpg)
Truth or Myth: “No More Free Lunch”?
See Scale Up and Out: http://www.crankuptheamps.com/downloads/documentation/CXO-Insight-Mar-2015.pdf
![Page 5: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/5.jpg)
But, We are Leaving Too Much on the Table…
Twenty Years ago, apps were designed for slow disks and networks and single threaded. It seems like they still are.
Architectural Patterns still assume multi-threaded is too hard, disks and networks are too slow, and scaling out and out is the norm. We are learning that big data and real-time problems require better thinking: http://www.crankuptheamps.com//blog/posts/2015/10/01/nba-of-data-science/
![Page 6: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/6.jpg)
Let’s Review Some Important #s
http://static.googleusercontent.com/media/research.google.com/en//people/jeff/stanford-295-talk.pdf
![Page 7: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/7.jpg)
Evolving Elements of HPC
• CPUs – Cores from 8 to 24
• Storage – march towards SSD, NVMe, XPoint
• Network – around $3000 for a 100Gb per port (i.e. Arista 100Tb switch)
• Memory – 512GB is common; Larger cache sizes
[Diagram labels: MEMORY, STORAGE, CPU, NETWORK]
![Page 8: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/8.jpg)
CPUs, Cores and Concurrency
• Embracing Multi-core
– Multi-Threaded
– Beyond the 80-20 Rule: Tune Everything
– Lock-free Data Structures: Generation Count
– NUMA tuning: Most DB products say turn NUMA off. We keep it on because we did the hard work.
• Throughput:
– Message Pipelines
– Divide and Conquer
http://www.extremetech.com/computing/188911-intel-haswell-e-review-the-best-consumer-performance-chip-you-can-buy-with-some-caveats/2
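The slide names lock-free data structures and a generation count but does not say how AMPS combines them. One common reading is a version counter that lets readers detect an overlapping write and retry without ever taking a lock (often called a seqlock). The following is a minimal sketch under that assumption; `Quote` and `VersionedQuote` are hypothetical names for illustration, and the design assumes exactly one writer thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical payload; the names here are illustrative, not from AMPS.
struct Quote {
    double bid;
    double ask;
};

// Single-writer, multi-reader cell guarded by a generation counter.
// Even generation: stable. Odd generation: a write is in progress.
class VersionedQuote {
public:
    // Must be called from one writer thread only.
    void store(Quote q) {
        const std::uint64_t g = gen_.load(std::memory_order_relaxed);
        assert((g & 1) == 0 && "store() assumes a single writer");
        gen_.store(g + 1, std::memory_order_relaxed);         // mark "writing"
        std::atomic_thread_fence(std::memory_order_release);  // mark is ordered before the data
        bid_.store(q.bid, std::memory_order_relaxed);
        ask_.store(q.ask, std::memory_order_relaxed);
        gen_.store(g + 2, std::memory_order_release);         // publish the new generation
    }

    // Readers never block the writer; they retry if a write overlapped.
    Quote load() const {
        for (;;) {
            const std::uint64_t before = gen_.load(std::memory_order_acquire);
            if (before & 1) continue;  // write in progress, try again
            Quote q{bid_.load(std::memory_order_relaxed),
                    ask_.load(std::memory_order_relaxed)};
            std::atomic_thread_fence(std::memory_order_acquire);
            const std::uint64_t after = gen_.load(std::memory_order_relaxed);
            if (before == after) return q;  // no writer intervened
        }
    }

    std::uint64_t generation() const {
        return gen_.load(std::memory_order_acquire);
    }

private:
    std::atomic<std::uint64_t> gen_{0};
    std::atomic<double> bid_{0.0};
    std::atomic<double> ask_{0.0};
};
```

The trade-off in this sketch is that readers may spin while a write is in flight, so it suits small payloads that are read far more often than they are written.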
![Page 9: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/9.jpg)
Leaving Things on the Table
http://www.crankuptheamps.com//blog/posts/2016/02/26/rabbitmq-comparison-to-amps/
One is not taking advantage of the resources. AMPS Queue store and forward model is impacted by larger message sizes written to persistent disk. The bottleneck should be due to the h/w limits.
![Page 10: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/10.jpg)
Leaving Things on the Table
http://www.crankuptheamps.com//blog/posts/2016/02/26/rabbitmq-comparison-to-amps/
![Page 11: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/11.jpg)
Leaving Things on the Table
http://www.crankuptheamps.com/downloads/documentation/hpcws-apr2013-v4.pdf
![Page 12: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/12.jpg)
Best Practices 1
Focus on Forward Scalability: When the next chip comes with a jump from 14 to 28 cores on a single die, more concurrency can be realized.
Flip Side: Attention to Detail. Single threaded can be better than poorly or excessively implemented locking. Such systems scale only via partitioning the data, e.g. Redis, VoltDB.
![Page 13: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/13.jpg)
Best Practices 2
• Many Approaches:
– ScyllaDB/Cassandra are sending work across nodes constantly without NUMA-ness. (A Dispatch Model that partitions per core)
– Just got to do it the right way (every core is assigned a processing engine, with a wide scale dispatch (i.e. 48 cores = 48 workers (+ dispatch worker))).
• The obvious thing isn’t always the best thing in a highly concurrent context. A “B-Tree” is common for DB index schemes but it’s not the best data structure for highly concurrent activities. (30X performance gain over MongoDB in Ingestion + Querying).
• i.e. AMPS doesn’t have a single model: the model for I/O is not the same as for querying (i.e. parallel divide and conquer). We partition what is important across the machine.
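The "one processing engine per core plus a dispatch worker" idea above can be sketched as follows. This is an illustrative assumption about the general pattern, not the AMPS implementation: `PartitionedCounter` is a hypothetical class, the calling thread plays the dispatcher, and each worker exclusively owns the state for the keys hashed to it, so that state needs no locking. The per-worker inbox uses a plain mutex and condition variable for brevity; a production design would likely use a lock-free queue and pin each worker to a core.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical example: counts messages per key, one worker per partition.
class PartitionedCounter {
    struct Shard {
        std::mutex m;
        std::condition_variable cv;
        std::deque<std::string> inbox;  // filled by the dispatcher
        bool done = false;
        // Touched only by the owning worker, so no lock is needed here.
        std::unordered_map<std::string, std::uint64_t> counts;
    };

public:
    explicit PartitionedCounter(std::size_t workers) : shards_(workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this, i] { run(shards_[i]); });
    }
    ~PartitionedCounter() { shutdown(); }

    // Every message for a given key always lands on the same worker.
    std::size_t shard_of(const std::string& key) const {
        return std::hash<std::string>{}(key) % shards_.size();
    }

    // Called by the dispatch thread.
    void dispatch(const std::string& key) {
        Shard& s = shards_[shard_of(key)];
        {
            std::lock_guard<std::mutex> lk(s.m);
            s.inbox.push_back(key);
        }
        s.cv.notify_one();
    }

    // Lets workers drain their inboxes, then joins them. Safe to call twice.
    void shutdown() {
        for (Shard& s : shards_) {
            {
                std::lock_guard<std::mutex> lk(s.m);
                s.done = true;
            }
            s.cv.notify_one();
        }
        for (std::thread& t : threads_)
            if (t.joinable()) t.join();
    }

    // Only valid after shutdown(), when no worker is running.
    std::uint64_t count(const std::string& key) const {
        const auto& c = shards_[shard_of(key)].counts;
        const auto it = c.find(key);
        return it == c.end() ? 0 : it->second;
    }

private:
    static void run(Shard& s) {
        for (;;) {
            std::deque<std::string> batch;
            {
                std::unique_lock<std::mutex> lk(s.m);
                s.cv.wait(lk, [&s] { return s.done || !s.inbox.empty(); });
                if (s.inbox.empty()) return;  // done and fully drained
                batch.swap(s.inbox);          // take the whole batch at once
            }
            for (const std::string& k : batch) ++s.counts[k];
        }
    }

    std::vector<Shard> shards_;         // declared first: outlives the threads
    std::vector<std::thread> threads_;
};
```

In practice the worker count would typically come from `std::thread::hardware_concurrency()`, matching the slide's 48 cores = 48 workers example.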
![Page 14: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/14.jpg)
Storage
• How would you write your system differently if you knew there would be a 20X to 40X improvement in storage I/O?
• Disk – Fine for Log Appending (low disk head movement/seek)
• SSD – Memory Mapped Files, Key-Value Stores
• Promise of XPoint – 80ms throughput limited. Once you hit memory limit, it is stuck at 80ms due to backpressure
* What if that goes down to 8ms?
http://www.crankuptheamps.com//blog/posts/2014/12/08/extreme_storage_performance/
http://www.crankuptheamps.com//blog/posts/2014/05/01/amps-faster-than-ever-with-memory-channel-storage/
![Page 15: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/15.jpg)
Not Leaving Things on the Table
Invest in a better storage device, and the software should reward you…
![Page 16: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/16.jpg)
Networking Advancements
How would you write your system differently if you knew there would be a 10X improvement in network I/O?
[Chart: AMPS vs AMPS with Mellanox VMA, Subscriber Rate in msgs/sec. E5-2690 v3 @ 2.60GHz over 40Gb network and 10M 512 byte msgs. Y axis: msgs/sec, 0 to 1,000,000. Categories: 1P1S, 2P4S. Series: AMPS, AMPS_VMA. Data labels: 548K, 550K, 927K, 761K.]
These are preliminary #s – we haven’t optimized it, this was a simple LD_PRELOAD
![Page 17: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/17.jpg)
Memory
• Know your cache lines – and keep things in cache whenever possible
• Keep your structures and access patterns aligned (partition read/write sections)
• Minimize Heap Memory Allocation
• Leverage Modern Allocators
• Avoid Thread Bleeding
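As a small illustration of the first two bullets, the sketch below pads per-thread counters so that data written by different threads never shares a cache line (avoiding false sharing). The 64-byte line size is an assumption that holds for common x86-64 parts, and `PaddedCounter`, `Stats`, and `on_separate_lines` are hypothetical names, not taken from AMPS.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines (typical for x86-64; verify for your CPU).
constexpr std::size_t kCacheLine = 64;

// Each counter occupies a full cache line of its own.
struct alignas(kCacheLine) PaddedCounter {
    std::atomic<std::uint64_t> value{0};
};
static_assert(sizeof(PaddedCounter) == kCacheLine, "one counter per line");
static_assert(alignof(PaddedCounter) == kCacheLine, "line-aligned");

// Partitioned read/write sections: each field has a single writing thread.
struct Stats {
    PaddedCounter messages_in;   // written by the ingest thread
    PaddedCounter messages_out;  // written by the delivery thread
};

// True when the two objects start on different cache lines.
inline bool on_separate_lines(const void* a, const void* b) {
    return reinterpret_cast<std::uintptr_t>(a) / kCacheLine !=
           reinterpret_cast<std::uintptr_t>(b) / kCacheLine;
}

// Runtime sanity check of the layout assumption above.
inline void check_layout(const Stats& s) {
    assert(on_separate_lines(&s.messages_in, &s.messages_out));
}
```

Without the `alignas`, both 8-byte counters would normally sit in the same line, and every increment by one thread would invalidate the other thread's cached copy.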
![Page 18: High Performance on Wall Street 2016 Challenges of HPC ... · Premise •NoSQL: 30X Performance over MongoDB on IngesAon and Queries •Queue : Over 25X the Throughput of RabbitMQ](https://reader033.vdocuments.us/reader033/viewer/2022052006/601a15cbf649764cc02f7f0e/html5/thumbnails/18.jpg)
Wash, Rinse, Repeat
• Scale Up and then Out
• Scale Forward (“enjoy the free lunch”) and plan for advancements (i.e. 40X XPoint)
• Forget about 80-20, Optimize every last part of your code base
• Leverage Many Models/Approaches; Continually Improve