the all-in-one package for massively multicore, heterogeneous jobs with hotspots, and data streaming
Post on 14-Aug-2015
The All-In-One Package for Massively Multicore, Heterogeneous Jobs with Hotspots, and Data Streaming
2015/08/05
Marat Zhanikeev maratishe@gmail.com
SWOPP@Beppu
PDF: http://bit.do/150805
maratishe.github.io
.
Why the All-In-One Package?
• we need a new Big Data processor
• HPC, ManyCore, etc. are often used incorrectly in the Big Data context
• ManyCore is expected to replace MultiCore [12] -- but it is not good for irregular jobs
◦ InfiniBand and other ManyCore devices expect highly regular jobs and data structures
◦ in this paper, Massively Multicore is different from ManyCore
• existing Big Data processors -- Hadoop/MapReduce [01] -- are a poor fit
◦ no support for multicore and no way to exploit its advantages [03]
◦ the bottleneck is at 60Mbps [02]
◦ the key-value datatype is inefficient; this paper replaces it with data streaming
[12] R.Brightwell+0 "Workshop on Managed Many-Core Systems" 1st Workshop on Managed Many-Core Systems (2008)
[01] "Apache Hadoop" http://hadoop.apache.org/ (2015)
[03] A.Rowstron+4 "Nobody ever got fired for using Hadoop on a cluster" 1st Hot Topics in Cloud Data Proc. (2012)
[02] K.Shvachko "HDFS Scalability: the Limits to Growth" the Magazine of USENIX, vol.35, no.2 (2012)
M.Zhanikeev -- maratishe@gmail.com -- allinone: Massively Multicore, Heterogeneous Jobs, and Data Streaming -- http://bit.do/150805 2/26...
.
The Packet Traffic Story
.
Traffic -and- BigData Similarities
• volume: 10G+ bits per second
• variety=heterogeneity: new capture engines require/use variable header depth -- DPI in some cases
• variety=heterogeneity (2): various concurrent processing jobs, different targets and output datatypes
◦ example: M2M pattern detection, heavy hitters, superspreaders
.
Multicore Traffic Processor
[Diagram; labels: Gateway, Mirroring, Meter, To infrastructure proper, PF_RING (... other PF_RINGs), CPU Cores (... more CPU cores: same ring, different cores), Probing Job A, Probing Job B, Probing Job C, Shared Memory, Time, Lifespan]
[07] M.Zhanikeev+0 "A lock-free shared memory design for high-throughput multicore packet traffic capture" IJNM (2014)
.
Lockfree Shared Memory
• PF_RING is a faster capture driver for raw packets [07]
• key 1: a Lockfree Shared Memory design
• key 2: Double-Linked List (DLL) for sharing pointers across processes (zero copy) [13]
• key 3: spreading the load via stale check
• key 4: no locks, but light non-locking polling on both sides
[07] M.Zhanikeev+0 "A lock-free shared memory design for high-throughput multicore packet traffic capture" IJNM (2014)
[13] "MCoreMemory project page" https://github.com/maratishe/mcorememory (2015)
.
The Lockfree Design
• locks or MPI: both impose major overhead -- up to 70% of time
• lockfree [07]: no locking; use the DLL to push stale items to the tail -- regularly pop the stale tail
[07] M.Zhanikeev+0 "A lock-free shared memory design for high-throughput multicore packet traffic capture" IJNM (2014)
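The stale-tail discipline on this slide can be sketched in a few lines. This is only an illustration of the list mechanics, assuming a single process and a caller-supplied logical clock; it does not reproduce the cross-process shared memory or the lock-free polling of the actual design. The names `StaleTailList`, `touch`, and `pop_stale` are hypothetical.

```python
class Node:
    """One DLL item; 'stamp' is the logical time of the last update."""
    __slots__ = ("key", "value", "stamp", "prev", "next")

    def __init__(self, key, value, stamp):
        self.key, self.value, self.stamp = key, value, stamp
        self.prev = self.next = None


class StaleTailList:
    """Doubly linked list: fresh items sit at the head, stale ones drift to the tail."""

    def __init__(self):
        self.head = None
        self.tail = None
        self.index = {}  # key -> node, for O(1) lookup

    def _unlink(self, node):
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev
        else:
            self.tail = node.prev
        node.prev = node.next = None

    def _push_head(self, node):
        node.next = self.head
        if self.head:
            self.head.prev = node
        self.head = node
        if self.tail is None:
            self.tail = node

    def touch(self, key, value, now):
        """Insert or refresh an item and move it to the head."""
        node = self.index.get(key)
        if node is None:
            node = Node(key, value, now)
            self.index[key] = node
        else:
            node.value, node.stamp = value, now
            self._unlink(node)
        self._push_head(node)

    def pop_stale(self, now, max_age):
        """Pop items from the tail while they are older than max_age."""
        popped = []
        while self.tail is not None and now - self.tail.stamp > max_age:
            node = self.tail
            self._unlink(node)
            del self.index[node.key]
            popped.append(node.key)
        return popped

    def keys(self):
        """Keys from head (freshest) to tail (stalest)."""
        out, node = [], self.head
        while node is not None:
            out.append(node.key)
            node = node.next
        return out
```

Because every update moves an item to the head, the stale check only ever needs to look at the tail, which keeps the periodic cleanup cheap.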
.
Lockfree <> MPI Connection
.
Multicore for Big Data
.
Multicore of Big Data
• standard HPC: regular structures and jobs; network and storage bottlenecks are not considered
• Big Data: moves in the opposite direction; it needs to take care of all the bottlenecks first
[Diagram; labels: Network (NW), Bulk Storage (BS), Shared Memory (SM), Core Output, Big Data Processing, HPC/Simulators/Modeling, Small Data]
.
Smart Multicore for Big Data
• help (1): circuits for bulk network transfer [09]
• help (2): only one process uses bulk storage for buffering and distribution
• contention/congestion on RAM cannot be easily avoided -- this overhead has to be minimized
[Diagram; labels: Bulk Storage (BS), Network (NW), RAM-based Shared Memory (sSM), Parallel accesses, Ability to isolate, Core Output, Small Data]
[09] M.Zhanikeev+0 "Circuit Emulation for Big Data Transfers in Clouds" Networking for Big Data, CRC (2015)
.
The Big Data Replay Method
.
Traditional Hadoop
[Diagram; labels: You, Your Code, Hadoop Client, Client Machine, Name Server(s), Manager, Hadoop Space, Name Node, Storage Node (shard) with file A, file B, file C, ..., Hadoop Job (your code), MapReduce job (your code), many; steps: Start, Use, Deploy, Find, Read/parse]
• jobs travel over the network and run on shards
• the Name Server is a major bottleneck and a single point of failure (SPOF)
• the client machine is outside of the Hadoop space -- this is why Hadoop installations are not easily opened to the public
[01] "Apache Hadoop" http://hadoop.apache.org/ (2015)
.
Proposed: Big Data Replay
[Diagram; labels: You, Your Sketcher, Client, Client Machine, Manager, Storage Node (shard), Time-Aware Sub-Store(s), Replay Node, Multicore Replay, many; steps: Start, Use, Schedule]
• dumb storage: data moves by bulk transfer to the Replay Node for replay
• jobs are scheduled by clients -- easy to expose through an API
• biggest feature: full access to a massively multicore processor
• ... many other features
.
Simple Big Data Replay
• note: traditional MapReduce jobs are not time-aware!
[Diagram; labels: Replay Manager, Read/prepare, Shared Memory, Time-Aligned Big Data Cursor, Now (replay), Time Direction, Core 1 ... Core X, One Sketch per core with its own Start and End]
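A minimal sketch of the replay idea, assuming the data is a time-ordered list of (timestamp, item) records and that each sketch job only sees the records between its own start and end times. It runs in one process and does not model cores or shared memory; `SketchJob` and `replay` are hypothetical names, and the per-job counter stands in for whatever sketch a client would supply.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SketchJob:
    """A time-aware job: it consumes records only while start <= now < end."""
    name: str
    start: float
    end: float
    counts: Counter = field(default_factory=Counter)

    def active(self, now):
        return self.start <= now < self.end

    def consume(self, item):
        # Stand-in for a real sketch update.
        self.counts[item] += 1


def replay(records, jobs):
    """Move the cursor along time-ordered records and feed every active job."""
    for now, item in records:
        for job in jobs:
            if job.active(now):
                job.consume(item)
    return {job.name: dict(job.counts) for job in jobs}


records = [(0, "a"), (1, "b"), (2, "a"), (3, "c"), (4, "a")]
jobs = [SketchJob("A", start=0, end=3), SketchJob("B", start=2, end=5)]
result = replay(records, jobs)
```

The data is read once and shared by all jobs, which is the contrast with MapReduce jobs that each scan their input without any notion of time.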
.
Big Data Replay + Hetero. + Massive
[Diagram; labels: Time, Now (buffer head), Buffer tail, Manager, Controller, Job, pos, Kill, Report, Manage in realtime, One Replay Batch, One Buffer, Jobs, Replay at a scale; markers 1 and 2]
• matching jobs are packed in batches
• heterogeneity is managed by:
1. monitoring the buffer and
2. repacking on the fly
.
Data Streaming as Big Data Jobs
• jobs based on data streaming [04] are much better: (1) statistically rigorous, (2) accountable, (3) richer/free datatype, (4) ...
• since data streaming targets are based on information theory [05], performance bounds can be estimated statistically
[04] S.Muthukrishnan "Data Streams: Algorithms and Applications" Theoretical Computer Science (2005)
[05] M.Zhanikeev+0 "Methods and Algorithms for Fast Hashing in Data Streaming" Cryptography, CRC (2014)
[10] M.Sung+4 "Scalable and Efficient Data Streaming Algorithms..." ICDE Workshop (2006)
[11] S.Venkataraman+3 "New Streaming Algorithms for Fast Detection of Superspreaders" NDSS (2005)
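As one concrete example of a data streaming job, the sketch below uses a Count-Min sketch to estimate per-item counts, which is a common building block for heavy-hitter detection. It is a generic illustration, not the method of the cited papers. The class name, the default sizes, and the use of salted BLAKE2b as the row hash are assumptions made for the example.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter table; estimates never fall below the true count."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One salted hash per row selects one counter in that row.
        data = str(item).encode("utf-8")
        for row in range(self.depth):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=row.to_bytes(8, "little")
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.rows[row][col] += count

    def estimate(self, item):
        # Collisions can only inflate counters, so the minimum is the best guess.
        return min(self.rows[row][col] for row, col in self._cells(item))


cms = CountMinSketch()
for src in ["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3", "10.0.0.1"]:
    cms.add(src)
```

The memory footprint is fixed by `width` and `depth` regardless of how many distinct items arrive, which is what makes statistical performance bounds possible.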
.
Analysis
.
Analysis Setup
• 8 cores, each core is one batch
• 500 concurrent jobs, random starting times, per-item overhead is defined by the hotspot distribution
• two models of batch management: drop and grow
.
Analysis: Drop and Grow Models
[Diagram, same as on the "Big Data Replay + Hetero. + Massive" slide; labels: Time, Now (buffer head), Buffer tail, Manager, Controller, Job, pos, Kill, Report, Manage in realtime, One Replay Batch, One Buffer, Jobs, Replay at a scale]
• drop model: assume a fixed batch size; each lagging job is dropped
◦ ideally, repacked into another batch
• grow model: allow for lagging jobs by expanding the buffer
◦ ... expand = keep more and more of the DLL tail
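The two models can be compared with a toy simulation. This is a simplified sketch under stated assumptions, not the setup used in the analysis: the buffer head advances one item per tick, each job processes a fixed number of items per tick (its "speed"), and a job lags when it falls more than the buffer size behind the head. The function `simulate` and its parameters are hypothetical.

```python
def simulate(speeds, ticks, buffer_size, model):
    """Return (dropped job ids, final buffer size) for one replay batch.

    speeds: items each job processes per tick (1.0 keeps up with the head).
    model:  "drop" removes a job once its lag exceeds the buffer size;
            "grow" enlarges the buffer to cover the largest lag instead.
    """
    if model not in ("drop", "grow"):
        raise ValueError("model must be 'drop' or 'grow'")
    pos = {job: 0.0 for job in range(len(speeds))}
    dropped = []
    size = buffer_size
    for head in range(1, ticks + 1):
        for job in list(pos):
            # A job can never read past the buffer head.
            pos[job] = min(head, pos[job] + speeds[job])
            lag = head - pos[job]
            if lag > size:
                if model == "drop":
                    dropped.append(job)
                    del pos[job]
                else:
                    size = lag
    return dropped, size


drop_result = simulate([1.0, 0.5, 0.25], ticks=20, buffer_size=5, model="drop")
grow_result = simulate([1.0, 0.5, 0.25], ticks=20, buffer_size=5, model="grow")
```

In this toy run the drop model loses the two slow jobs while the buffer stays fixed, and the grow model keeps every job at the cost of a buffer that keeps expanding with the slowest job's lag.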
.
Analysis: Hotspots
[Plot: Pop/Hot/Flash distributions (increasing thickness); x-axis: ordered list (0 to 350); y-axis: CPU load, overhead, etc. (0 to 0.5); legend: an(350), am(5), av(2)]
.
Analysis: Result Visualization
[Plot: x-axis: number of dropped jobs (0 to 90); y-axis: average batch span in seconds (2.8 to 30.8); series: Drop model, Grow model; point labels: 250/1, 250/10, 300/1, 300/5, 300/10, 350/1, 350/5, 400/1, 450/1, 450/5]
• grow model: needs batches 2 to 3 times larger to avoid drops
• drop model: between 5% and 10% of jobs are dropped, depending on the hotspot distribution
• note: jobs were not repacked this time; repacking should help reduce the number of drops
.
That’s all, thank you ...
.
The Time-Aware Big Data Datatype
• time-aware Big Data sits in the mid-range between the two extremes -- key-value and traditional Hadoop shards
[Diagram; labels: KV Store, Hadoop (HDFS) and MapReduce, TABID Time-Aware Big Data (this demo), HDFS + Lucene Index]
.
DLL: The Double-Linked List
• 4-way DLL with sideways linking is often used when collisions are non-negligible
[Diagram: Items linked by prev/next pointers, with sideprev/sidenext pointers linking Items sideways]
.
Data Streaming + Bloom + Fast Hashing
• practical data streaming is a complex technology that depends on:
1. efficient Bloom filters
2. fast hashing
[Diagram; labels: Data Streaming, Bloom Filter, Fast Hashing, Other Types of Hashing, Other Uses]
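The dependency on Bloom filters and hashing can be made concrete with a small membership filter. This is a textbook-style sketch, not the implementation behind the slides. Salted BLAKE2b from the standard library stands in for the "fast hashing" component (a real deployment would likely pick a cheaper non-cryptographic hash), and the class name and default sizes are assumptions.

```python
import hashlib


class BloomFilter:
    """Bit array with k hash positions per item: no false negatives, some false positives."""

    def __init__(self, bits=1024, hashes=3):
        self.bits = bits
        self.hashes = hashes
        self.array = bytearray((bits + 7) // 8)

    def _positions(self, item):
        # Each salt value yields an independent-looking hash of the same item.
        data = str(item).encode("utf-8")
        for i in range(self.hashes):
            digest = hashlib.blake2b(
                data, digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.bits

    def add(self, item):
        for pos in self._positions(item):
            self.array[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        return all(
            self.array[pos >> 3] & (1 << (pos & 7)) for pos in self._positions(item)
        )


seen = BloomFilter()
for flow in ["10.0.0.1:80", "10.0.0.2:443", "10.0.0.3:22"]:
    seen.add(flow)
```

Every insert and lookup costs `hashes` hash evaluations, so the speed of the hash function largely determines the throughput of the whole streaming job.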