apache flink overview at stockholm hadoop user group

41
Apache Flink Next-gen data analysis Stephan Ewen [email protected] @StephanEwen

Upload: stephan-ewen

Post on 26-May-2015

4.471 views

Category:

Data & Analytics


2 download

DESCRIPTION

Overview of the Apache Flink project at the Stockholm Hadoop User Group on October 8th, 2014 at Spotify

TRANSCRIPT

Page 1: Apache Flink Overview at Stockholm Hadoop User Group

Apache FlinkNext-gen data analysis

Stephan [email protected]

@StephanEwen

Page 2: Apache Flink Overview at Stockholm Hadoop User Group

2

What is Apache Flink?• Project undergoing incubation in the Apache Software

Foundation

• Originating from the Stratosphere research project started at TU Berlin in 2009

• http://flink.incubator.apache.org

• 59 contributors (doubled in ~ 4 months)

• Has a cool squirrel for a logo

Page 3: Apache Flink Overview at Stockholm Hadoop User Group

3

What is Apache Flink?

Master

Worker

Worker

Flink Cluster

Analytical Program

Flink Client &Optimizer

Page 4: Apache Flink Overview at Stockholm Hadoop User Group

4

This talk

• Introduction to Flink

• Flink from a user perspective

• Tour of Flink internals

• Flink roadmap and closing

Page 5: Apache Flink Overview at Stockholm Hadoop User Group

5

Open Source Data Processing Landscape

5

MapReduce

Hive

Flink

Spark Storm

Yarn Mesos

HDFS

Mahout

Cascading

Tez

Pig

Data processing engines

App and resource management

Applications

Storage, streams KafkaHBase

Crunch

Page 6: Apache Flink Overview at Stockholm Hadoop User Group

6

Common API

StorageStreams

Hybrid Batch/Streaming Runtime

HDFS Files S3

ClusterManager

YARN EC2 Native

Flink Optimizer

Scala API(batch)

Graph API („Spargel“)

JDBC Redis RabbitMQKafkaAzure …

JavaCollections

Streams Builder

Apache Tez

Python API

Java API(streaming)

Apache MRQL

Batc

h

Stre

amin

g

Java API(batch)

Local Execution

Page 7: Apache Flink Overview at Stockholm Hadoop User Group

7

Flink APIs

Page 8: Apache Flink Overview at Stockholm Hadoop User Group

8

Parallel Collections / Distributed Data Sets

DataSetA

DataSetB

DataSetC

A (1)

A (2)

B (1)

B (2)

C (1)

C (2)

X

X

Y

Y

Program

Parallel Execution

X Y

Operator X Operator Y

Page 9: Apache Flink Overview at Stockholm Hadoop User Group

9

Flexible Pipelines

Reduce

Join

Map

Reduce

Map

Iterate

Source

Sink

Source

Map, FlatMap, MapPartition, Filter, Project, Reduce, ReduceGroup, Aggregate, Distinct, Join, CoGoup, Cross, Iterate, Iterate Delta, Iterate-Vertex-Centric

Page 10: Apache Flink Overview at Stockholm Hadoop User Group

10

DataSet<String> text = env.readTextFile(input);

DataSet<Tuple2<String, Integer>> result = text .flatMap((str, out) -> { for (String token : value.split("\\W")) { out.collect(new Tuple2<>(token, 1)); }) .groupBy(0) .aggregate(SUM, 1);

Word Count, Java API

Page 11: Apache Flink Overview at Stockholm Hadoop User Group

11

Word Count, Scala API

val input = env.readTextFile(input);val words = input flatMap { line => line.split("\\W+") }val counts = words groupBy { word => word } count()

Page 12: Apache Flink Overview at Stockholm Hadoop User Group

12

Page 13: Apache Flink Overview at Stockholm Hadoop User Group

13

Beyond Key/Value Pairs

DataSet<Page> pages = ...;DataSet<Impression> impressions = ...;

DataSet<Impression> aggregated = impressions .groupBy("url") .sum("count");

pages.join(impressions).where("url").equalTo("url") .print()// outputs pairs of matching pages and impressions

class Impression { public String url; public long count;}

class Page { public String url; public String topic;}

// outputs pairs of pages and impressions

Page 14: Apache Flink Overview at Stockholm Hadoop User Group

14

Distributed architecture

val paths = edges.iterate (maxIterations) { prevPaths: DataSet[(Long, Long)] => val nextPaths = prevPaths .join(edges) .where(1).equalTo(0) { (left, right) => (left._1,right._2) } .union(prevPaths) .groupBy(0, 1) .reduce((l, r) => l) nextPaths}

Client

Optimization andtranslation to data flow

Job Manager

Scheduling, resource negotiation, …

Task Manager

Data node

mem

ory

heap

Task Manager

Data node

mem

ory

heap

Task Manager

Data node

mem

ory

heap

Page 15: Apache Flink Overview at Stockholm Hadoop User Group

15

What’s new in Flink

Page 16: Apache Flink Overview at Stockholm Hadoop User Group

16

DependabilityJV

M H

eap

Flink ManagedHeap

Network Buffers

UnmanagedHeap

(next version unifies network buffersand managed heap)

User Code

Hashing/Sorting/Caching

• Flink manages its own memory

• Caching and data processing happens in a dedicated memory fraction

• System never breaks theJVM heap, gracefully spills

Shuffles/Broadcasts

Page 17: Apache Flink Overview at Stockholm Hadoop User Group

17

Operating onSerialized Data

• serializes data every time Highly robust, never gives up on you

• works on objects, RDDs may be stored serialized Serialization considered slow, only when needed

• makes serialization really cheap: partial deserialization, operates on serialized form Efficient and robust!

Page 18: Apache Flink Overview at Stockholm Hadoop User Group

18

Operating onSerialized Data

Microbenchmark• Sorting 1GB worth of (long, double) tuples• 67,108,864 elements• Simple quicksort

Page 19: Apache Flink Overview at Stockholm Hadoop User Group

19

Memory Managementpublic class WC { public String word; public int count;}

emptypage

Pool of Memory Pages

• Works on pages of bytes, maps objects transparently• Full control over memory, out-of-core enabled• Algorithms work on binary representation• Address individual fields (not deserialize whole object)• Move memory between operations

Page 20: Apache Flink Overview at Stockholm Hadoop User Group

20

Beyond Key/Value Pairs

DataSet<Page> pages = ...;DataSet<Impression> impressions = ...;

DataSet<Impression> aggregated = impressions .groupBy("url") .sum("count");

pages.join(impressions).where("url").equalTo("url") .print()// outputs pairs of pages and impressions

class Impression { public String url; public long count;}

class Page { public String url; public String topic;}

// outputs pairs of pages and impressions

Page 21: Apache Flink Overview at Stockholm Hadoop User Group

21

Beyond Key/Value Pairs

Why not key/value pairs

• Programs are much more readable ;-)

• Functions are self-contained, do not need to set key for successor)

• Much higher reusability of data types and functionso Within Flink programs, or from other programs

Page 22: Apache Flink Overview at Stockholm Hadoop User Group

22

Flink programsrun everywhere

Cluster (Batch)Cluster (Streaming)

LocalDebugging

Fink Runtime or Apache Tez

As Java CollectionPrograms

Embedded(e.g., Web Container)

Page 23: Apache Flink Overview at Stockholm Hadoop User Group

23

Upcoming Streaming API

Page 24: Apache Flink Overview at Stockholm Hadoop User Group

24

Streaming Throughput

Page 25: Apache Flink Overview at Stockholm Hadoop User Group

25

Migrate EasilyFlink supports out-of-the-box supports• Hadoop data types (writables)• Hadoop Input/Output Formats• Hadoop functions and object model

Input Map Reduce Output

DataSet DataSet DataSetRed Join

DataSet Map DataSet

OutputS

Input

Page 26: Apache Flink Overview at Stockholm Hadoop User Group

26

Little tuning or configuration required• Requires no memory thresholds to configure

o Flink manages its own memory

• Requires no complicated network configso Pipelining engine requires much less memory for data exchange

• Requires no serializers to be configuredo Flink handles its own type extraction and data representation

• Programs can be adjusted to data automaticallyo Flink’s optimizer can choose execution strategies automatically

Page 27: Apache Flink Overview at Stockholm Hadoop User Group

27

Understanding Programs

Visualizes the operations and the data movement of programs

Analyze after execution

Screenshot from Flink’s plan visualizer

Page 28: Apache Flink Overview at Stockholm Hadoop User Group

28

Understanding Programs

Analyze after execution (times, stragglers, …)

Page 29: Apache Flink Overview at Stockholm Hadoop User Group

29

Understanding Programs

Analyze after execution (times, stragglers, …)

Page 30: Apache Flink Overview at Stockholm Hadoop User Group

30

Iterations in other systems

Step Step Step Step Step

ClientLoop outside the system

Step Step Step Step Step

ClientLoop outside the system

Page 31: Apache Flink Overview at Stockholm Hadoop User Group

Iterations in Flink

31

Streaming dataflowwith feedback

map

join

red.

join

System is iteration-aware, performs automatic optimization

Flink

Page 32: Apache Flink Overview at Stockholm Hadoop User Group

32

Automatic Optimization for Iterative Programs

Caching Loop-invariant DataPushing work„out of the loop“

Maintain state as index

Page 33: Apache Flink Overview at Stockholm Hadoop User Group

33

Flink RoadmapWhat is the community currently working on?

• Flink has a major release every 3 months,with >=1 big-fixing releases in-between

• Finer-grained fault tolerance• Logical (SQL-like) field addressing• Python API• Flink Streaming , Lambda architecture support• Flink on Tez• ML on Flink (e.g., Mahout DSL)• Graph DSL on Flink• … and more

Page 34: Apache Flink Overview at Stockholm Hadoop User Group

34

http://flink.incubator.apache.org

github.com/apache/incubator-flink

@ApacheFlink

Page 35: Apache Flink Overview at Stockholm Hadoop User Group

Engine comparison

35

Paradigm

Optimization

Execution

API

Optimizationin all APIs

Optimizationof SQL queriesnone none

DAG

Transformations on k/v pair collections

Iterative transformations on collections

RDD Cyclicdataflows

MapReduce onk/v pairs

k/v pairReaders/Writers

Batchsorting

Batchsorting andpartitioning

Batch withmemorypinning

Stream without-of-corealgorithms

MapReduce

Page 36: Apache Flink Overview at Stockholm Hadoop User Group

36

DataSet<Order> large = ...DataSet<Lineitem> medium = ...DataSet<Customer> small = ...

DataSet<Tuple...> joined1 = large.join(medium).where(3).equals(1) .with(new JoinFunction() { ... });

DataSet<Tuple...> joined2 = small.join(joined1).where(0).equals(2) .with(new JoinFunction() { ... });

DataSet<Tuple...> result = joined2.groupBy(3).aggregate(MAX, 2);

Example: Joins in Flink

Built-in strategies include partitioned join and replicated join with local sort-merge or hybrid-hash algorithms.

⋈⋈

γ

large medium

small

Page 37: Apache Flink Overview at Stockholm Hadoop User Group

37

DataSet<Tuple...> large = env.readCsv(...);DataSet<Tuple...> medium = env.readCsv(...);DataSet<Tuple...> small = env.readCsv(...);

DataSet<Tuple...> joined1 = large.join(medium).where(3).equals(1) .with(new JoinFunction() { ... });

DataSet<Tuple...> joined2 = small.join(joined1).where(0).equals(2) .with(new JoinFunction() { ... });

DataSet<Tuple...> result = joined2.groupBy(3).aggregate(MAX, 2);

Automatic Optimization

Possible execution 1) Partitioned hash-join

3) Grouping /Aggregation reuses the partitioningfrom step (1) No shuffle!!!

2) Broadcast hash-join

Partitioned ≈ Reduce-sideBroadcast ≈ Map-side

Page 38: Apache Flink Overview at Stockholm Hadoop User Group

38

Running Programs

> bin/flink run prg.jar

Packaged ProgramsRemote EnvironmentLocal Environment

Program JAR file

JVM

master master

RPC &Serialization

RemoteEnvironment.execute()LocalEnvironment.execute()

Spawn embeddedmulti-threaded environment

Page 39: Apache Flink Overview at Stockholm Hadoop User Group

39

Unifies various kinds of Computations

ExecutionEnvironment env = getExecutionEnvironment();

DataSet<Long> vertexIds = ...DataSet<Tuple2<Long, Long>> edges = ...

DataSet<Tuple2<Long, Long>> vertices = vertexIds.map(new IdAssigner());

DataSet<Tuple2<Long, Long>> result = vertices .runOperation( VertexCentricIteration.withPlainEdges( edges, new CCUpdater(), new CCMessager(), 100));

result.print();env.execute("Connected Components");

Pregel/Giraph-style Graph Computation

Page 40: Apache Flink Overview at Stockholm Hadoop User Group

40

Delta Iterations speed up certain problems

by a lot

0

200000

400000

600000

800000

1000000

1200000

1400000

Iteration

Bulk

Delta

Twitter Webbase (20)0

1000

2000

3000

4000

5000

6000

Computations performed in each iteration for connected communities of a social graph

Runtime

Cover typical use cases of Pregel-like systems with comparable performance in a generic platform and developer API.

Page 41: Apache Flink Overview at Stockholm Hadoop User Group

41

What is Automatic Optimization

Run on a sampleon the laptop

Run a month laterafter the data evolved

Hash vs. SortPartition vs. BroadcastCachingReusing partition/sortExecution

Plan A

ExecutionPlan B

Run on large fileson the cluster

ExecutionPlan C