minimizing latency in fault-tolerant distributed stream processing...
TRANSCRIPT
Minimizing Latency in Fault-Tolerant Distributed Stream Processing Systems
Andrey Brito1, Christof Fetzer1, Pascal Felber2
1 Technische Universität Dresden, Germany2 Université de Neuchâtel, Switzerland
Department of Computer Science Institute for Systems Architecture, Systems Engineering Group
ICDCS'09, June 23rd, 2009
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 2 of 55
Goal
Minimize the cost of logging/checkpointingin event stream processing systems
Contribution: Usage of an speculation framework based on transactional memory to overlap logging and processing
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 3 of 55
Motivation (1)
• Event stream applications– Directed acyclic graph of operators– Some operators don't keep state
• Trivially parallelizable
– Some do keep state• Not trivially parallelizable
– Sometimes they are order sensitive• Need to process events sequentially, maybe even waiting for
the order to be restored
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 4 of 55
Application example
A0B0A2 A1B1B3A4 B2B5B7A5 B6
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 5 of 55
Application example
A0B0A2 A1B1B3A4 B2B5B7A5 B6
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
Events based onnon-deterministic
decisionEvents are out!
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 6 of 55
Application example
B5B7A5 B6 B1B3A4 B2 A0B0A2 A1
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 7 of 55
Application example
Restore checkpoint.
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 8 of 55
Application exampleAsk upstream node to replay missing ones.
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 9 of 55
Application example
Processing some events again.
B5B7A5 B6 B1B2B3 A4
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 10 of 55
Application example
B5B7A5 B6 B1B2B3 A4
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
Incomplete log of non-deterministic decisions no repeatability→
Events reflect different decisions.
What are you talking about?
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 11 of 55
Motivation (2)
• Fault-tolerant event stream applications– Precise recovery– Even if order does not matter, repeatability does– Non-determinism
• Input order from different streams• Non-determinism in processing (multi-threading, time,
random numbers)
– Log or checkpoint before each output
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 12 of 55
Logging is expensive
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 13 of 55
My solution
• Speculate...• … to parallelize stateful components• … to not have to wait for events• … to not have to wait for logging
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 14 of 55
Outline
• How the speculation works
• Logging algorithm
• Experiments
• Final remarks
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 15 of 55
How the speculation works
• Base: TinySTM– Some extra features added– But same basic rule: “it appears to be atomic”
• Goal: track accesses to shared memory– Instrumentation
• Reads and writes are intercepted• Hold back writes, validate reads until all dependencies
satisfied
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 16 of 55
Speculative execution: parallelization
Processor 1
Processor 2
NEXT = 9
912 11 68 7
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 17 of 55
Speculative execution: parallelization
Processor 1
Processor 2
NEXT = 9
11
9
68 71214 13
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 18 of 55
Speculative execution: parallelization
Processor 1
Processor 2
NEXT = 9
11
9
68 71214 13
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 19 of 55
Speculative execution: parallelization
Processor 1
Processor 2
NEXT = 9
11
9
1214 13 68 7
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 20 of 55
Speculative execution: parallelization
Processor 1
Processor 2
NEXT = 10
1179 81214 13
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 21 of 55
Speculative execution: parallelization
Processor 1
Processor 2
NEXT = 10
11
79 81214 13
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 22 of 55
Logging algorithm
• Operator enqueues all events & decisions
• N+1 threads for N disks– One groups requests in a buffers– The others write their buffers to disk
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 23 of 55
Logging algorithm
OperatorE
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 24 of 55
Logging algorithm
Operator
E
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 25 of 55
Logging algorithm
Operator
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 26 of 55
Logging algorithm
Operator
NDDs
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 27 of 55
Logging algorithm
Operator
E
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 28 of 55
Logging algorithm
Operator
E is here waiting.
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 29 of 55
Logging algorithm
Operator
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 30 of 55
Logging algorithm
Operator
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 31 of 55
Logging algorithm
Operator
update(E)
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 32 of 55
Logging algorithm
OperatorE
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 33 of 55
Logging algorithm
A0B0A2 A1B1B3A4 B2B5B7A5 B6
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 34 of 55
Logging algorithm
A0B0A2 A1B1B3A4 B2B5B7A5 B6
STATE
Processor1
Filter nB8
Output Adapter
PublisherB
Filter nA6
PublisherA
STATE
Processor2
Events based onnon-deterministic
decisionEvents are out!
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 35 of 55
Logging algorithm
B5B7A5 B6
STATE
Processor1
Filter nB8
Filter 1A6
Checkpoint /Logging
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 36 of 55
Logging algorithm
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 37 of 55
Logging algorithm
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
1
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 38 of 55
Logging algorithm
2
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
1
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 39 of 55
Logging algorithm
23
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
1
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 40 of 55
Logging algorithm
2 43
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
1
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 41 of 55
Logging algorithm
2 43
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
5
1
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 42 of 55
Logging algorithm
2 43 6
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
5
1
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 43 of 55
Logging algorithm
2 43 6
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
5
17
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 44 of 55
Speculative processing + Logging
• From the original node's viewpoint– Emit outputs as speculative– When logging requests are acknowledged, emit final
• The next downstream node– If speculative event modifies some state, keep track
• Outputs that consider that part of the state are speculative• Speculative status is contagious
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 45 of 55
Speculation + Logging
2 43 6
STATE
Processor1
Filter n
Filter 1
Checkpoint/Logging
5
13'
7
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 46 of 55
Experiments
• Parallelization: benefits & STM's overheads
• Optimism control
• Overlapping processing and logging
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 47 of 55
Speculation costs & speed-ups
Hardware: Sun T1000
Speculation creation-commit-disposal
overheads.
Few shared-memory accesses.
Amdahl's law influence.
Hardware: Sun T1000
Speculation creation-commit-disposal
overheads.
Few shared-memory accesses.
Amdahl's law influence.
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 48 of 55
Processor 1
Processor 2
NEXT = 9
Controlling optimism
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 49 of 55
Controlling optimism
Processor 1
Processor 2
NEXT = 9
Processor 3
Processor 4
Processor 5
Processor 6
Processor 7
Processor 8
Processor 1
Processor 2
NEXT = 9
Processor 3
Processor 4
Processor 5
Processor 6
Processor 7
Processor 8
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 50 of 55
Hardware: SUN T1000
State size varies between 1 and 20.
Size=1: concurrent executions will
conflict.
Size=20: considerable parallelism.
Hardware: SUN T1000
State size varies between 1 and 20.
Size=1: concurrent executions will
conflict.
Size=20: considerable parallelism.
Controlling optimism
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 51 of 55
2 components do logging.
Non-speculative: only stable events
are sent.
Speculative: send events before
logging is finished.
2 components do logging.
Non-speculative: only stable events
are sent.
Speculative: send events before
logging is finished.
Motivation for distributed speculation: logging costs
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS
X axis: number of components logging.
Even in a SAN/WAN, the shapes would look
similar.
For deterministic components: add “fixed” latency.
X axis: number of components logging.
Even in a SAN/WAN, the shapes would look
similar.
For deterministic components: add “fixed” latency.
Accumulated gains
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 53 of 55
Final remarks - Parallelization
• Parallelization through speculation– Easier, less bugs
• Programmer does not need to fight with locks• Keeps sequential semantics
– Waste of resources reduced with optimism control
• Overhead can be much lower with hardware support for TM (e.g., ASF)
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 54 of 55
Final remarks - Logging
• Overlap logging with processing– Independent of available parallelism– Distributed speculation possible due to less aborts– But do not let speculative results get out of the
system
• In combination with speculative parallelization may even reduce logging
ICDCS'09, 23.06.09 Minimizing latency in fault-tolerant DSMS Slide 55 of 55
Thank you!
http://streammine.inf.tu-dresden.dehttp://wwwse.inf.tu-dresden.de