large-scale stream processing in the hadoop ecosystem · large-scale stream processing in the...

43
Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra [email protected] Márton Balassi [email protected]

Upload: others

Post on 20-May-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Large-Scale Stream Processing in the Hadoop Ecosystem

Gyula Fó[email protected]

Márton [email protected]

Page 2: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

This talk

§ Stream processing by example

§Open source stream processors

§ Runtime architecture and programming model

§Counting words…

§ Fault tolerance and stateful processing

§Closing

2Apache:  Big  Data  Europe2015-­‐‑09-­‐‑28

Page 3: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Stream processing by example

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 3

Page 4: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Streaming applications

ETL style operations• Filter incoming data,

Log analysis• High throughput, connectors,

at-least-once processing

Window aggregations• Trending tweets,

User sessions, Stream joins• Window abstractions

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 4

Input

Input

InputInput

Process/Enrich

Page 5: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Streaming applications

Machine learning• Fitting trends to the evolving

stream, Stream clustering• Model state, cyclic flows

Pattern recognition• Fraud detection, Triggering

signals based on activity• Exactly-once processing

5Apache:  Big  Data  Europe2015-­‐‑09-­‐‑28

Page 6: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Open source stream processors

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 6

Page 7: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Apache Streaming landscape

72015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 8: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Apache Storm

§ Started in 2010, development driven by BackType, then Twitter

§ Pioneer in large scale stream processing§Distributed dataflow abstraction (spouts &

bolts)

82015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 9: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Apache Flink§ Started in 2008 as a research project

(Stratosphere) at European universities§Unique combination of low latency streaming

and high throughput batch analysis§ Flexible operator states and windowing

9

Batch  data

Kafka,  RabbitMQ,  ...

HDFS,  JDBC,  ...

Stream  Data

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 10: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Apache Spark§ Started in 2009 at UC Berkley, Apache since 2013§ Very strong community, wide adoption§ Unified batch and stream processing over a

batch runtime§ Good integration with batch programs

102015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 11: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Apache Samza

§Developed at LinkedIn, open sourced in 2013§ Builds heavily on Kafka’s log based philosophy§ Pluggable messaging system and execution

backend

112015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 12: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

System comparison

12

Streamingmodel Native Micro-batching Native Native

API Compositional Declarative Compositional Declarative

Fault tolerance Record ACKs RDD-based Log-based Checkpoints

Guarantee At-least-once Exactly-once At-least-once Exactly-once

State Only in Trident State as DStream

Statefuloperators

Statefuloperators

Windowing Not built-in Time based Not built-in Policy based

Latency Very-Low Medium Low LowThroughput Medium High High High

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 13: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Runtime and programming model

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 13

Page 14: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Native Streaming

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 14

Page 15: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Distributed dataflow runtime

§ Storm, Samza and Flink§General properties

• Long standing operators• Pipelined execution• Usually possible to create

cyclic flows

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 15

Pros• Full expressivity• Low-latency execution• Stateful operators

Cons• Fault-tolerance is hard• Throughput may suffer• Load balancing is an

issue

Page 16: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Distributed dataflow runtime

§ Storm• Dynamic typing + Kryo• Dynamic topology rebalancing

§ Samza• Almost every component pluggable• Full task isolation, no backpressure (buffering

handled by the messaging layer)§ Flink• Strongly typed streams + custom serializers• Flow control mechanism• Memory management

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 16

Page 17: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Micro-batching

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 17

Page 18: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Micro-batch runtime§ Implemented by Apache Spark§General properties• Computation broken down

to time intervals• Load aware scheduling• Easy interaction with batch

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 18

Pros• Easy to reason about• High-throughput• FT comes for “free”• Dynamic load balancing

Cons• Latency depends on

batch size• Limited expressivity• Stateless by nature

Page 19: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Programming model

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 19

Declarative

§ Expose a high-level API§ Operators are higher order

functions on abstract data stream types

§ Advanced behavior such as windowing is supported

§ Query optimization

Compositional

§ Offer basic building blocks for composing custom operators and topologies

§ Advanced behavior such as windowing is often missing

§ Topology needs to be hand-optimized

Page 20: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Programming model

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 20

DStream, DataStream§ Transformations abstract

operator details§ Suitable for engineers and data

analysts

Spout, Consumer, Bolt, Task, Topology

§ Direct access to the execution graph / topology

• Suitable for engineers

Page 21: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Counting words…

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 21

Page 22: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

WordCount

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 22

storm  budapest  flinkapache  storm  sparkstreaming  samza stormflink  apache  flinkbigdata  stormflink  streaming

(storm,  4)(budapest,  1)(flink,  4)(apache,  2)(spark,  1)(streaming,  2)(samza,  1)(bigdata,  1)

Page 23: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

StormAssembling the topology

232015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

TopologyBuilder builder = new TopologyBuilder();

builder.setSpout("spout", new SentenceSpout(), 5);builder.setBolt("split", new Splitter(), 8).shuffleGrouping("spout");builder.setBolt("count", new Counter(), 12)

.fieldsGrouping("split", new Fields("word"));

public class Counter extends BaseBasicBolt {Map<String, Integer> counts = new HashMap<String, Integer>();

public void execute(Tuple tuple, BasicOutputCollector collector) {String word = tuple.getString(0);Integer count = counts.containsKey(word) ? counts.get(word) + 1 : 1;counts.put(word, count);collector.emit(new Values(word, count));

}}

Rolling word count bolt

Page 24: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Samza

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 24

public class WordCountTask implements StreamTask {private KeyValueStore<String, Integer> store;

public void process( IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {

String word = envelope.getMessage();Integer count = store.get(word);if(count == null){count = 0;}store.put(word, count + 1);collector.send(new OutgoingMessageEnvelope(new

SystemStream("kafka", ”wc"), Tuple2.of(word, count))); }

}

Rolling word count task

Page 25: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Flink

val lines: DataStream[String] = env.fromSocketStream(...)

lines.flatMap {line => line.split(" ").map(word => Word(word,1))}

.groupBy("word").sum("frequency")

.print()

case class Word (word: String, frequency: Int)

val lines: DataStream[String] = env.fromSocketStream(...)

lines.flatMap {line => line.split(" ").map(word => Word(word,1))}

.window(Time.of(5,SECONDS)).every(Time.of(1,SECONDS))

.groupBy("word").sum("frequency")

.print()

Rolling word count

Window word count

252015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 26: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

SparkWindow word count

Rolling word count (kind of)

262015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 27: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Fault tolerance and stateful processing

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 27

Page 28: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Fault tolerance intro

§ Fault-tolerance in streaming systems is inherently harder than in batch• Can’t just restart computation• State is a problem• Fast recovery is crucial• Streaming topologies run 24/7 for a long period

§ Fault-tolerance is a complex issue• No single point of failure is allowed• Guaranteeing input processing• Consistent operator state• Fast recovery • At-least-once vs Exactly-once semantics

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 28

Page 29: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Storm record acknowledgements

§ Track the lineage of tuples as they are processed (anchors and acks)

§ Special “acker” bolts track each lineage DAG (efficient xor based algorithm)

§ Replay the root of failed (or timed out) tuples

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 29

Page 30: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Samza offset tracking§ Exploits the properties of a durable, offset

based messaging layer§ Each task maintains its current offset, which

moves forward as it processes elements§ The offset is checkpointed and restored on

failure (some messages might be repeated)

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 30

Page 31: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Flink checkpointing

§ Based on consistent global snapshots§Algorithm designed for stateful dataflows

(minimal runtime overhead) § Exactly-once semantics

31Apache:  Big  Data  Europe2015-­‐‑09-­‐‑28

Page 32: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Spark RDD recomputation

§ Immutable data model with repeatable computation

§ Failed RDDs are recomputed using their lineage

§ Checkpoint RDDs to reduce lineage length

§ Parallel recovery of failedRDDs

§ Exactly-once semantics

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 32

Page 33: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

State in streaming programs

§ Almost all non-trivial streaming programs are stateful

§ Stateful operators (in essence):𝒇:   𝒊𝒏, 𝒔𝒕𝒂𝒕𝒆 ⟶ 𝒐𝒖𝒕, 𝒔𝒕𝒂𝒕𝒆.

§ State hangs around and can be read and modified as the stream evolves

§ Goal: Get as close as possible while maintaining scalability and fault-tolerance

33Apache:  Big  Data  Europe2015-­‐‑09-­‐‑28

Page 34: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

§ States available only in Trident API§Dedicated operators for state updates and

queries§ State access methods• stateQuery(…)• partitionPersist(…)• persistentAggregate(…)

§ It’s very difficult toimplement transactionalstates

Exactly-­‐‑once  guarantee

34Apache:  Big  Data  Europe2015-­‐‑09-­‐‑28

Page 35: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

§ Stateless runtime by design• No continuous operators• UDFs are assumed to be stateless

§ State can be generated as a separate stream of RDDs: updateStateByKey(…)

𝒇:   𝑺𝒆𝒒[𝒊𝒏𝒌], 𝒔𝒕𝒂𝒕𝒆𝒌 ⟶ 𝒔𝒕𝒂𝒕𝒆.𝒌§ 𝒇 is scoped to a specific key

§ Exactly-once semantics

35Apache:  Big  Data  Europe2015-­‐‑09-­‐‑28

Page 36: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

§ Stateful dataflow operators(Any task can hold state)

§ State changes are storedas a log by Kafka

§Custom storage engines canbe plugged in to the log

§ 𝒇 is scoped to a specific task§At-least-once processing

semantics

36Apache:  Big  Data  Europe2015-­‐‑09-­‐‑28

Page 37: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

§ Stateful dataflow operators (conceptually similar to Samza)

§ Two state access patterns• Local (Task) state• Partitioned (Key) state

§ Proper API integration• Java: OperatorState interface• Scala: mapWithState, flatMapWithState…

§ Exactly-once semantics by checkpointing

37Apache:  Big  Data  Europe2015-­‐‑09-­‐‑28

Page 38: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Performance

§ Throughput/Latency• A cost of a network hop is 25+ msecs• 1 million records/sec/core is nice

§ Size of Network Buffers/Batching§ Buffer Timeout§Cost of Fault Tolerance§Operator chaining/Stages§ Serialization/Types

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 38

Page 39: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Closing

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 39

Page 40: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Comparison revisited

40

Streamingmodel Native Micro-batching Native Native

API Compositional Declarative Compositional Declarative

Fault tolerance Record ACKs RDD-based Log-based Checkpoints

Guarantee At-least-once Exactly-once At-least-once Exactly-once

State Only in Trident State as DStream

Statefuloperators

Statefuloperators

Windowing Not built-in Time based Not built-in Policy based

Latency Very-Low Medium Low LowThroughput Medium High High High

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe

Page 41: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Summary

§ Streaming applications and stream processors are very diverse

§ 2 main runtime designs• Dataflow based (Storm, Samza, Flink)• Micro-batch based (Spark)

§ The best framework varies based on application specific needs

§ But high-level APIs are nice J

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 41

Page 42: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

Thank you!

Page 43: Large-Scale Stream Processing in the Hadoop Ecosystem · Large-Scale Stream Processing in the Hadoop Ecosystem Gyula Fóra gyfora@apache.org Márton Balassi mbalassi@apache.org

List of Figures (in order of usage)

§ https://upload.wikimedia.org/wikipedia/commons/thumb/2/2a/CPT-FSM-abcd.svg/326px-CPT-FSM-abcd.svg.png

§ https://storm.apache.org/images/topology.png

§ https://databricks.com/wp-content/uploads/2015/07/image11-1024x655.png

§ https://databricks.com/wp-content/uploads/2015/07/image21-1024x734.png

§ https://people.csail.mit.edu/matei/papers/2012/hotcloud_spark_streaming.pdf, page 2.

§ http://www.slideshare.net/ptgoetz/storm-hadoop-summit2014, page 69-71.

§ http://samza.apache.org/img/0.9/learn/documentation/container/checkpointing.svg

§ https://databricks.com/wp-content/uploads/2015/07/image41-1024x602.png

§ https://storm.apache.org/documentation/images/spout-vs-state.png

§ http://samza.apache.org/img/0.9/learn/documentation/container/stateful_job.png

2015-­‐‑09-­‐‑28 Apache:  Big  Data  Europe 43