Apache Flink overview at Stockholm Hadoop User Group


DESCRIPTION

Overview of the Apache Flink project at the Stockholm Hadoop User Group on October 8th, 2014 at Spotify

TRANSCRIPT

Apache Flink: Next-gen data analysis

Stephan Ewen
sewen@apache.org

@StephanEwen

2

What is Apache Flink?

• Project undergoing incubation in the Apache Software Foundation

• Originating from the Stratosphere research project started at TU Berlin in 2009

• http://flink.incubator.apache.org

• 59 contributors (doubled in ~ 4 months)

• Has a cool squirrel for a logo

3

What is Apache Flink?

[Diagram: an analytical program is submitted through the Flink client & optimizer to a Flink cluster consisting of a master and multiple workers.]

4

This talk

• Introduction to Flink

• Flink from a user perspective

• Tour of Flink internals

• Flink roadmap and closing

5

Open Source Data Processing Landscape

[Diagram of the ecosystem:]
• Applications: Hive, Pig, Mahout, Cascading, Crunch
• Data processing engines: MapReduce, Tez, Flink, Spark, Storm
• App and resource management: YARN, Mesos
• Storage, streams: HDFS, HBase, Kafka

6

[Diagram of the Flink stack:]
• Batch APIs: Java API (batch), Scala API (batch), Python API
• Streaming APIs: Java API (streaming), Streams Builder
• Libraries / frontends: Graph API („Spargel“), Apache MRQL
• Common API and Flink Optimizer
• Hybrid batch/streaming runtime; local execution and Java collections execution; runs on Apache Tez
• Cluster manager: YARN, EC2, Native
• Storage / streams: HDFS, Files, S3, JDBC, Kafka, RabbitMQ, Redis, Azure, …

7

Flink APIs

8

Parallel Collections / Distributed Data Sets

[Diagram: a program of operators X and Y over DataSets A, B, and C; in the parallel execution, each DataSet is split into partitions (A(1), A(2), B(1), B(2), C(1), C(2)) and each operator runs as several parallel instances.]

9

Flexible Pipelines

[Diagram: a dataflow combining sources, Map, Reduce, Join, and Iterate operators into a pipeline ending in a sink.]

Map, FlatMap, MapPartition, Filter, Project, Reduce, ReduceGroup, Aggregate, Distinct, Join, CoGroup, Cross, Iterate, Iterate Delta, Iterate-Vertex-Centric
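Not from the original slides: a minimal sketch of chaining a few of these operators with the Flink Java batch API, using toy in-memory data (all names and values are made up for illustration).

import org.apache.flink.api.common.functions.FilterFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class PipelineSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<String, Integer>> visits = env.fromElements(
            Tuple2.of("/home", 3), Tuple2.of("/docs", 1), Tuple2.of("/home", 5));
        DataSet<Tuple2<String, String>> pages = env.fromElements(
            Tuple2.of("/home", "landing"), Tuple2.of("/docs", "documentation"));

        visits
            .filter(new FilterFunction<Tuple2<String, Integer>>() {
                @Override
                public boolean filter(Tuple2<String, Integer> v) {
                    return v.f1 > 1;              // keep visits with more than one hit
                }
            })
            .groupBy(0)                           // group by URL
            .sum(1)                               // aggregate hit counts per URL
            .join(pages).where(0).equalTo(0)      // join with page metadata on the URL
            .print();                             // triggers execution and prints pairs
    }
}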

10

Word Count, Java API

DataSet<String> text = env.readTextFile(input);

DataSet<Tuple2<String, Integer>> result = text
    .flatMap((line, out) -> {
        for (String token : line.split("\\W")) {
            out.collect(new Tuple2<>(token, 1));
        }
    })
    .groupBy(0)
    .aggregate(SUM, 1);

11

Word Count, Scala API

val input = env.readTextFile(inputPath)
val words = input flatMap { line => line.split("\\W+") }
val counts = words groupBy { word => word } count()

12

13

Beyond Key/Value Pairs

class Impression {
    public String url;
    public long count;
}

class Page {
    public String url;
    public String topic;
}

DataSet<Page> pages = ...;
DataSet<Impression> impressions = ...;

DataSet<Impression> aggregated = impressions
    .groupBy("url")
    .sum("count");

pages.join(impressions).where("url").equalTo("url")
    .print();
// outputs pairs of matching pages and impressions

14

Distributed architecture

val paths = edges.iterate(maxIterations) { prevPaths: DataSet[(Long, Long)] =>
  val nextPaths = prevPaths
    .join(edges)
    .where(1).equalTo(0) { (left, right) => (left._1, right._2) }
    .union(prevPaths)
    .groupBy(0, 1)
    .reduce((l, r) => l)
  nextPaths
}

[Diagram: the client performs optimization and translation to a data flow and submits it to the job manager, which handles scheduling and resource negotiation; task managers, co-located with the data nodes, execute the tasks and manage their own memory and heap.]

15

What’s new in Flink

16

Dependability

[Diagram of the JVM heap: the Flink managed heap (used for hashing, sorting, caching and for shuffles/broadcasts), the network buffers, and the unmanaged heap used by user code; the next version unifies network buffers and the managed heap.]

• Flink manages its own memory
• Caching and data processing happens in a dedicated memory fraction
• System never breaks the JVM heap, gracefully spills

17

Operating on Serialized Data

• Hadoop MapReduce: serializes data every time; highly robust, never gives up on you
• Spark: works on objects, RDDs may be stored serialized; serialization considered slow, used only when needed
• Flink: makes serialization really cheap with partial deserialization and operates on the serialized form; efficient and robust!
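To make "operates on serialized form" concrete, here is a toy, Flink-independent illustration in plain Java (invented record layout, not Flink's actual sort code): records stay as bytes and are ordered by comparing a fixed-length key prefix, so the sort never materializes objects.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class SerializedSortSketch {
    // Serialize a (word, count) record as [8-byte key prefix][4-byte count].
    static byte[] serialize(String word, int count) {
        byte[] prefix = Arrays.copyOf(word.getBytes(StandardCharsets.UTF_8), 8);
        return ByteBuffer.allocate(12).put(prefix).putInt(count).array();
    }

    public static void main(String[] args) {
        byte[][] records = {
            serialize("flink", 3), serialize("apache", 7), serialize("squirrel", 1)
        };
        // Order records by their serialized key prefix, byte by byte
        // (signed byte comparison; good enough for ASCII keys in this toy example).
        Arrays.sort(records, (a, b) -> {
            for (int i = 0; i < 8; i++) {
                int cmp = Byte.compare(a[i], b[i]);
                if (cmp != 0) return cmp;
            }
            return 0;
        });
        for (byte[] r : records) {
            System.out.println(new String(r, 0, 8, StandardCharsets.UTF_8).trim());
        }
    }
}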

18

Operating on Serialized Data

Microbenchmark:
• Sorting 1 GB worth of (long, double) tuples
• 67,108,864 elements
• Simple quicksort

19

Memory Management

public class WC {
    public String word;
    public int count;
}

[Diagram: objects are serialized onto a pool of fixed-size memory pages, including empty pages.]

• Works on pages of bytes, maps objects transparently
• Full control over memory, out-of-core enabled
• Algorithms work on the binary representation
• Address individual fields (no need to deserialize the whole object)
• Move memory between operations
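In the same spirit, a toy illustration (invented layout, not Flink's actual page format) of addressing an individual field inside a serialized record without deserializing the whole object:

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class BinaryFieldAccessSketch {
    // Lay out a WC record as [4-byte word length][word bytes][4-byte count].
    static ByteBuffer serialize(String word, int count) {
        byte[] w = word.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocate(4 + w.length + 4);
        buf.putInt(w.length).put(w).putInt(count);
        return buf;
    }

    // Read only the count field, using absolute offsets into the binary record.
    static int readCount(ByteBuffer record) {
        int wordLen = record.getInt(0);      // word length sits at offset 0
        return record.getInt(4 + wordLen);   // jump straight to the count field
    }

    public static void main(String[] args) {
        ByteBuffer rec = serialize("flink", 42);
        System.out.println(readCount(rec));  // prints 42; record never deserialized
    }
}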

20

Beyond Key/Value Pairs

class Impression {
    public String url;
    public long count;
}

class Page {
    public String url;
    public String topic;
}

DataSet<Page> pages = ...;
DataSet<Impression> impressions = ...;

DataSet<Impression> aggregated = impressions
    .groupBy("url")
    .sum("count");

pages.join(impressions).where("url").equalTo("url")
    .print();
// outputs pairs of matching pages and impressions

21

Beyond Key/Value Pairs

Why not key/value pairs?

• Programs are much more readable ;-)
• Functions are self-contained, do not need to set the key for the successor
• Much higher reusability of data types and functions
  o Within Flink programs, or from other programs
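Not from the original slides: a small example of the "self-contained functions" point. The parser below knows nothing about keys; the key is declared where the grouping happens, so the same function and POJO type can be reused with different groupings. All names and data are made up for illustration.

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.util.Collector;

public class SelfContainedFunctions {
    // POJO used as a first-class data type; no key/value wrapping needed.
    public static class Impression {
        public String url;
        public long count;
        public Impression() {}
        public Impression(String url, long count) { this.url = url; this.count = count; }
        @Override public String toString() { return url + ": " + count; }
    }

    // The function only transforms records; it does not set any key.
    public static class ParseLine implements FlatMapFunction<String, Impression> {
        @Override
        public void flatMap(String line, Collector<Impression> out) {
            String[] parts = line.split(",");
            if (parts.length == 2) {
                out.collect(new Impression(parts[0], Long.parseLong(parts[1])));
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        DataSet<Impression> impressions = env
            .fromElements("/home,3", "/docs,1", "/home,5")
            .flatMap(new ParseLine());

        // The key is declared at the grouping, not inside the function.
        impressions
            .groupBy("url")
            .reduce((a, b) -> new Impression(a.url, a.count + b.count))
            .print();
    }
}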

22

Flink programs run everywhere

• Cluster (Batch)
• Cluster (Streaming)
• Local Debugging
• Embedded (e.g., Web Container)
• As Java Collection Programs

All on the Flink Runtime or on Apache Tez.

23

Upcoming Streaming API
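Not part of the original slides: a minimal sketch of a streaming word count with the DataStream API. It uses today's keyBy-style API; the 2014 preview API differed in details. Host and port are placeholders.

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class StreamingWordCount {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Read lines from a socket (placeholder host/port).
        DataStream<String> lines = env.socketTextStream("localhost", 9999);

        DataStream<Tuple2<String, Integer>> counts = lines
            .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                @Override
                public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                    for (String token : line.split("\\W+")) {
                        if (!token.isEmpty()) {
                            out.collect(new Tuple2<>(token, 1));
                        }
                    }
                }
            })
            .keyBy(value -> value.f0)   // partition the stream by word
            .sum(1);                    // running count per word

        counts.print();
        env.execute("Streaming Word Count");
    }
}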

24

Streaming Throughput

25

Migrate Easily

Flink supports out of the box:
• Hadoop data types (Writables)
• Hadoop Input/Output Formats
• Hadoop functions and object model

[Diagram: a Hadoop pipeline (Input, Map, Reduce, Output) mapped onto a Flink dataflow of DataSets with Map, Reduce, and Join operators.]
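Not from the slides: a small illustration of the Writable support, assuming Hadoop and Flink's Hadoop compatibility dependencies are on the classpath (in recent Flink versions, Writable type support lives in the flink-hadoop-compatibility module, which also provides wrappers for Hadoop Input/Output Formats and functions).

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class HadoopTypesSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Hadoop Writables used directly as Flink data types.
        DataSet<Tuple2<Text, IntWritable>> data = env.fromElements(
            Tuple2.of(new Text("flink"), new IntWritable(3)),
            Tuple2.of(new Text("hadoop"), new IntWritable(7)));

        data.map(new MapFunction<Tuple2<Text, IntWritable>, String>() {
                @Override
                public String map(Tuple2<Text, IntWritable> value) {
                    return value.f0.toString() + " -> " + value.f1.get();
                }
            })
            .print();
    }
}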

26

Little tuning or configuration required

• Requires no memory thresholds to configure
  o Flink manages its own memory
• Requires no complicated network configs
  o Pipelining engine requires much less memory for data exchange
• Requires no serializers to be configured
  o Flink handles its own type extraction and data representation
• Programs can be adjusted to data automatically
  o Flink's optimizer can choose execution strategies automatically

27

Understanding Programs

Visualizes the operations and the data movement of programs

Analyze after execution

Screenshot from Flink’s plan visualizer
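Not from the slides: the JSON plan that the visualizer renders can be dumped from a program before execution, for example as below (output path is a placeholder).

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class PlanDump {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Long> numbers = env.generateSequence(1, 1000);
        numbers.filter(n -> n % 2 == 0)
               .writeAsText("/tmp/even-numbers");   // placeholder sink path

        // JSON dump of the optimized plan; paste it into Flink's plan visualizer.
        System.out.println(env.getExecutionPlan());
    }
}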

28

Understanding Programs

Analyze after execution (times, stragglers, …)

29

Understanding Programs

Analyze after execution (times, stragglers, …)

30

Iterations in other systems

[Diagram: the client drives the loop outside the system; each iteration runs as a separate step.]

31

Iterations in Flink

Streaming dataflow with feedback

[Diagram: map, join, and reduce operators form a loop inside the dataflow.]

System is iteration-aware, performs automatic optimization.
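Not from the slides: a minimal bulk-iteration sketch with the Java API. The loop is part of the dataflow and is closed with closeWith(); here it runs a few Newton steps for sqrt(2) on a one-element DataSet, purely as a toy example.

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.IterativeDataSet;

public class BulkIterationSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Start value for an iterative refinement.
        DataSet<Double> initial = env.fromElements(1.0);

        // The loop lives inside the dataflow: no driver-side loop, no job per step.
        IterativeDataSet<Double> loop = initial.iterate(10);

        DataSet<Double> refined = loop.map(new MapFunction<Double, Double>() {
            @Override
            public Double map(Double x) {
                return x / 2 + 1 / x;   // one Newton step for sqrt(2)
            }
        });

        // closeWith feeds the step result back as the next iteration's input.
        DataSet<Double> result = loop.closeWith(refined);
        result.print();
    }
}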

32

Automatic Optimization for Iterative Programs

• Caching loop-invariant data
• Pushing work "out of the loop"
• Maintain state as index

33

Flink Roadmap

What is the community currently working on?

• Flink has a major release every 3 months, with >= 1 bug-fixing releases in between
• Finer-grained fault tolerance
• Logical (SQL-like) field addressing
• Python API
• Flink Streaming, Lambda architecture support
• Flink on Tez
• ML on Flink (e.g., Mahout DSL)
• Graph DSL on Flink
• … and more

34

http://flink.incubator.apache.org

github.com/apache/incubator-flink

@ApacheFlink

Engine comparison

35

• API: k/v pair Readers/Writers (MapReduce, Tez); transformations on k/v pair collections (Spark); iterative transformations on collections (Flink)
• Paradigm: MapReduce on k/v pairs (MapReduce); DAG (Tez); RDD (Spark); cyclic dataflows (Flink)
• Optimization: none (MapReduce, Tez); optimization of SQL queries (Spark); optimization in all APIs (Flink)
• Execution: batch sorting (MapReduce); batch sorting and partitioning (Tez); batch with memory pinning (Spark); stream with out-of-core algorithms (Flink)

36

Example: Joins in Flink

DataSet<Order> large = ...
DataSet<Lineitem> medium = ...
DataSet<Customer> small = ...

DataSet<Tuple...> joined1 = large.join(medium).where(3).equalTo(1)
    .with(new JoinFunction() { ... });

DataSet<Tuple...> joined2 = small.join(joined1).where(0).equalTo(2)
    .with(new JoinFunction() { ... });

DataSet<Tuple...> result = joined2.groupBy(3).aggregate(MAX, 2);

Built-in strategies include partitioned join and replicated join with local sort-merge or hybrid-hash algorithms.

[Diagram: the plan joins large ⋈ medium, then ⋈ small, followed by a grouping/aggregation γ.]

37

DataSet<Tuple...> large = env.readCsv(...);
DataSet<Tuple...> medium = env.readCsv(...);
DataSet<Tuple...> small = env.readCsv(...);

DataSet<Tuple...> joined1 = large.join(medium).where(3).equalTo(1)
    .with(new JoinFunction() { ... });

DataSet<Tuple...> joined2 = small.join(joined1).where(0).equalTo(2)
    .with(new JoinFunction() { ... });

DataSet<Tuple...> result = joined2.groupBy(3).aggregate(MAX, 2);

Automatic Optimization

Possible execution:
1) Partitioned hash-join
2) Broadcast hash-join
3) Grouping/aggregation reuses the partitioning from step (1): no shuffle!

Partitioned ≈ reduce-side, Broadcast ≈ map-side
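Not on the slide: by default the optimizer chooses the join strategy, but a join hint can force one. A minimal example with toy data; names and values are made up for illustration.

import org.apache.flink.api.common.operators.base.JoinOperatorBase.JoinHint;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class JoinHintSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<Integer, String>> large = env.fromElements(
            Tuple2.of(1, "a"), Tuple2.of(2, "b"), Tuple2.of(3, "c"));
        DataSet<Tuple2<Integer, String>> small = env.fromElements(
            Tuple2.of(1, "x"), Tuple2.of(2, "y"));

        // Force broadcasting the second (small) input instead of repartitioning;
        // JoinHint.OPTIMIZER_CHOOSES is the default behavior.
        large.join(small, JoinHint.BROADCAST_HASH_SECOND)
            .where(0).equalTo(0)
            .print();
    }
}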

38

Running Programs

> bin/flink run prg.jar

[Diagram: packaged programs, the RemoteEnvironment, and the LocalEnvironment are three ways to run a program. A packaged program JAR or RemoteEnvironment.execute() sends the program to the master via RPC & serialization; LocalEnvironment.execute() spawns an embedded multi-threaded environment in the same JVM.]
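As a sketch of these environments in the Java API (not from the slides; host, port, and JAR path are placeholders, and 6123 was the default JobManager RPC port in older versions):

import org.apache.flink.api.java.ExecutionEnvironment;

public class EnvironmentsSketch {
    public static void main(String[] args) throws Exception {
        // Inside the IDE: spawns an embedded, multi-threaded mini cluster.
        ExecutionEnvironment local = ExecutionEnvironment.createLocalEnvironment();

        // Against a running cluster: ships the program's JAR to the master via RPC.
        ExecutionEnvironment remote = ExecutionEnvironment.createRemoteEnvironment(
            "master-host", 6123, "/path/to/program.jar");

        // Packaged programs submitted with bin/flink pick the right environment here.
        ExecutionEnvironment auto = ExecutionEnvironment.getExecutionEnvironment();

        auto.fromElements(1, 2, 3).print();
    }
}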

39

Unifies various kinds of Computations

ExecutionEnvironment env = getExecutionEnvironment();

DataSet<Long> vertexIds = ...
DataSet<Tuple2<Long, Long>> edges = ...

DataSet<Tuple2<Long, Long>> vertices = vertexIds.map(new IdAssigner());

DataSet<Tuple2<Long, Long>> result = vertices
    .runOperation(VertexCentricIteration.withPlainEdges(
        edges, new CCUpdater(), new CCMessager(), 100));

result.print();
env.execute("Connected Components");

Pregel/Giraph-style Graph Computation

40

Delta Iterations speed up certain problems by a lot

[Charts: elements computed in each iteration (Bulk vs. Delta) for connected communities of a social graph, and the resulting runtimes on the Twitter and Webbase (20) datasets.]

Cover typical use cases of Pregel-like systems with comparable performance in a generic platform and developer API.
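Not from the slides: a compact delta-iteration sketch for connected components with the Java API. Class and variable names are made up, and edges are treated as directed for brevity; the workset shrinks as components converge, which is where the speed-up comes from.

import org.apache.flink.api.common.functions.FlatJoinFunction;
import org.apache.flink.api.common.functions.JoinFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.operators.DeltaIteration;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class DeltaIterationSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // (vertexId, componentId): initially every vertex is its own component.
        DataSet<Tuple2<Long, Long>> initial = env.fromElements(
            Tuple2.of(1L, 1L), Tuple2.of(2L, 2L), Tuple2.of(3L, 3L));
        // (source, target) edges.
        DataSet<Tuple2<Long, Long>> edges = env.fromElements(
            Tuple2.of(1L, 2L), Tuple2.of(2L, 3L));

        // Solution set and initial workset are the initial assignment; key = field 0.
        DeltaIteration<Tuple2<Long, Long>, Tuple2<Long, Long>> iteration =
            initial.iterateDelta(initial, 100, 0);

        // Propagate component ids along edges, keep the smallest candidate per vertex.
        DataSet<Tuple2<Long, Long>> candidates = iteration.getWorkset()
            .join(edges).where(0).equalTo(0)
            .with(new JoinFunction<Tuple2<Long, Long>, Tuple2<Long, Long>, Tuple2<Long, Long>>() {
                @Override
                public Tuple2<Long, Long> join(Tuple2<Long, Long> vertex, Tuple2<Long, Long> edge) {
                    return Tuple2.of(edge.f1, vertex.f1);
                }
            })
            .groupBy(0).min(1);

        // Emit an update only if the candidate improves the current component id.
        DataSet<Tuple2<Long, Long>> updates = candidates
            .join(iteration.getSolutionSet()).where(0).equalTo(0)
            .with(new FlatJoinFunction<Tuple2<Long, Long>, Tuple2<Long, Long>, Tuple2<Long, Long>>() {
                @Override
                public void join(Tuple2<Long, Long> candidate, Tuple2<Long, Long> current,
                                 Collector<Tuple2<Long, Long>> out) {
                    if (candidate.f1 < current.f1) {
                        out.collect(candidate);
                    }
                }
            });

        // Updates are both the delta applied to the solution set and the next workset;
        // the iteration terminates when the workset becomes empty.
        iteration.closeWith(updates, updates).print();
    }
}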

41

What is Automatic Optimization?

The optimizer decides between hash and sort strategies, partitioning and broadcasting, caching, and reusing existing partitioning/sort orders, so the same program gets a different execution plan depending on the input:

• Run on a sample on the laptop: Execution Plan A
• Run on large files on the cluster: Execution Plan B
• Run a month later, after the data evolved: Execution Plan C
