Apache Flink and More @ MesosCon Asia 2017
![Page 1: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/1.jpg)
Apache Flink® and More
Till Rohrmann @stsffap
Jörg Schad @joerg_schad
![Page 2: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/2.jpg)
![Page 3: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/3.jpg)
![Page 4: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/4.jpg)
MapReduce is crunching Data
![Page 5: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/5.jpg)
We need to turn faster!
![Page 6: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/6.jpg)
SMACK Stack
EVENTS: Ubiquitous data streams from connected devices (sensors, devices, clients)
▪ INGEST: Apache Kafka. Ingest millions of events per second
▪ STORE: Apache Cassandra. Distributed & highly scalable database
▪ ANALYZE: Apache Spark. Real-time and batch data processing
▪ ACT: Akka. Visualize data and build data-driven applications
Platform: Mesos / DC/OS
![Page 7: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/7.jpg)
Evolution of Data Analytics
Batch, Micro-Batch, Event Processing
Days Hours Minutes Seconds Microseconds
Solves problems using predictive and prescriptive analytics
Reports what has happened using descriptive analytics
Predictive User Interface
Real-time Pricing and Routing
Real-time Advertising
Billing, Chargeback
Product recommendations
![Page 8: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/8.jpg)
![Page 9: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/9.jpg)
Original creators of Apache Flink®
Providers of the dA Platform, a supported Flink distribution
![Page 10: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/10.jpg)
Apache Flink In a Nutshell
Event-driven applications (event sourcing, CQRS)
Stateful, event-driven, event-time-aware processing
Batch Processing (data sets)
Stream Processing / Analytics (data streams, windows, …)
![Page 11: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/11.jpg)
Apache Flink Stack
DataStream API: Stream Processing
DataSet API: Batch Processing
Runtime: Distributed Streaming Data Flow
Libraries
Streaming and batch as first class citizens.
![Page 12: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/12.jpg)
Programming Model
[Figure: a dataflow graph in which sources feed stateful computations (transformations), which in turn feed sinks.]
![Page 13: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/13.jpg)
API & Execution
// Source
DataStream<String> lines = env.addSource(new FlinkKafkaConsumer010(…));
// Transformation: map()
DataStream<Event> events = lines.map(line -> parse(line));
// Transformation: keyBy() / window() / apply()
DataStream<Statistic> stats = events
    .keyBy("id")
    .timeWindow(Time.seconds(5))
    .apply(new MyAggregationFunction());
// Sink
stats.addSink(new BucketingSink(path));
Streaming Dataflow: Source → map() → keyBy() / window() / apply() → Sink
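To make the keyBy() / window() step more concrete, here is a minimal plain-Java sketch of what a keyed 5-second tumbling window sum computes. It deliberately does not use Flink: the `Event` class, `windowStart`, and `sumPerKeyAndWindow` are hypothetical names chosen for illustration, and timestamps are assumed to be non-negative milliseconds.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these names are Flink API.
final class TumblingWindowSketch {

    // Hypothetical event: a key, an event timestamp in milliseconds, and a value.
    static final class Event {
        final String id;
        final long timestampMs;
        final long value;

        Event(String id, long timestampMs, long value) {
            this.id = id;
            this.timestampMs = timestampMs;
            this.value = value;
        }
    }

    // Start of the tumbling window that contains the timestamp
    // (assumes non-negative timestamps).
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    // Groups events by (id, window start) and sums their values,
    // mimicking keyBy("id") followed by a tumbling window aggregation.
    static Map<String, Long> sumPerKeyAndWindow(List<Event> events, long windowSizeMs) {
        Map<String, Long> sums = new TreeMap<>();
        for (Event e : events) {
            String bucket = e.id + "@" + windowStart(e.timestampMs, windowSizeMs);
            sums.merge(bucket, e.value, Long::sum);
        }
        return sums;
    }

    public static void main(String[] args) {
        List<Event> events = new ArrayList<>();
        events.add(new Event("a", 1_000, 2));
        events.add(new Event("a", 4_999, 3));
        events.add(new Event("a", 5_000, 7));
        events.add(new Event("b", 1_500, 1));
        // Prints {a@0=5, a@5000=7, b@0=1}
        System.out.println(sumPerKeyAndWindow(events, 5_000));
    }
}
```

Unlike this batch-style loop, a streaming runtime keeps the per-key, per-window sums as state and emits each result once its window closes; the grouping logic is the same.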
![Page 14: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/14.jpg)
Distributed Runtime
![Page 15: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/15.jpg)
Levels of Abstraction
▪ Process Function (events, state, time): low-level (stateful stream processing)
▪ DataStream API (streams, windows): stream processing & analytics
▪ Table API (dynamic tables): declarative DSL
▪ Stream SQL: high-level language
![Page 16: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/16.jpg)
What Is Flink Good For?
![Page 17: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/17.jpg)
Detecting fraud in real time
As fraudsters get better, need to update models without downtime
Live 24/7 service
Credit card transactions
Notifications and alerts
Evolving fraud models built by data scientists
@
![Page 18: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/18.jpg)
▪ Athena X
▪ SQL to define metrics
▪ Thresholds and actions to trigger
▪ Blends analytics and actions
[Figure: streams from Hadoop, Kafka, etc. → SQL, thresholds, actions → analytics, alerts, derived streams]
@
![Page 19: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/19.jpg)
▪ Route events to Kafka, ES, Hive
▪ Complex interaction sessions rules
▪ Mix of stateless / small state / large state
▪ Stream Processing as a Service
• Launching, monitoring, scaling, updating
• DSL to define jobs
@
![Page 20: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/20.jpg)
▪ Blink based on Flink
▪ A core system in Alibaba Search
• Machine learning, search, recommendations
• A/B testing of search algorithms
• Online feature updates to boost conversion rate
▪ Alibaba is a major contributor to Flink
▪ Contributing many changes back to open source
@
![Page 21: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/21.jpg)
Complete social network implemented using event sourcing and CQRS (Command Query Responsibility Segregation)
@
![Page 22: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/22.jpg)
Apache Flink & Apache Mesos
![Page 23: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/23.jpg)
Why Apache Mesos?
▪ Mesos offers full functionality to implement fault tolerant and elastic distributed applications
▪ 30% of survey respondents were running Flink on Mesos (prior to proper Mesos support, September 2016)
![Page 24: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/24.jpg)
Flink’s Mesos Integration
▪ Kudos to Eron Wright (EronWright) for this work
[Figure: Flink's Mesos integration. Components: the Mesos Master and the Apache Flink framework, which consists of a Mesos App Master containing the Flink MesosResourceManager and the JobManager, plus Mesos tasks that each run a TaskManager. Arrow labels: allocate resources, launch Mesos tasks, register, execute job.]
![Page 25: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/25.jpg)
Resource Manager Components
Connection Monitor
▪ Monitors connection to Mesos
Launch Coordinator
▪ Resource offer processing and task scheduling
▪ Gathers offers and matches them to tasks using Fenzo
Task Monitor
▪ Monitors Mesos tasks
▪ Triggers reconciliation
▪ Makes sure tasks are properly killed
Reconciliation Coordinator
▪ Reconciles the view of tasks between ResourceManager and Mesos Master
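As a rough illustration of what reconciling the task view can mean, the sketch below compares the set of tasks the ResourceManager expects with the set the Mesos Master reports. This is an assumption about the general idea, not Flink's or Mesos' actual reconciliation code; the class, method names, and task ids are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration; not Flink's or Mesos' actual reconciliation code.
final class ReconciliationSketch {

    // Tasks the ResourceManager expects but the master does not report
    // (candidates for recovery or relaunch).
    static Set<String> missingOnMaster(Set<String> expected, Set<String> reported) {
        Set<String> missing = new TreeSet<>(expected);
        missing.removeAll(reported);
        return missing;
    }

    // Tasks the master reports but the ResourceManager does not know about
    // (candidates for being killed).
    static Set<String> unknownToResourceManager(Set<String> expected, Set<String> reported) {
        Set<String> unknown = new TreeSet<>(reported);
        unknown.removeAll(expected);
        return unknown;
    }

    public static void main(String[] args) {
        Set<String> expected = new TreeSet<>(Arrays.asList("tm-1", "tm-2", "tm-3"));
        Set<String> reported = new TreeSet<>(Arrays.asList("tm-2", "tm-3", "tm-9"));
        System.out.println("missing on master: " + missingOnMaster(expected, reported));
        System.out.println("unknown to RM: " + unknownToResourceManager(expected, reported));
    }
}
```

In this toy run, `tm-1` is missing on the master and `tm-9` is unknown to the ResourceManager; what a real coordinator does with each category depends on the reported task status.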
![Page 26: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/26.jpg)
Component Interplay
[Figure: interplay between the ResourceManager components (Connection Monitor, Launch Coordinator, Task Monitor, Reconciliation Coordinator), the Mesos Master, and Mesos tasks. Arrow labels: resource offers, launch tasks, start TaskManagers, monitor tasks, status messages, trigger reconciliation, reconcile tasks, recover tasks, kill task.]
![Page 27: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/27.jpg)
Fenzo
▪ Developed by Netflix
▪ Generic task scheduler for frameworks
▪ Matching between tasks and resource offers
• Pluggable fitness evaluator
[Figure: Mesos sends periodic resource offers to the Launch Coordinator, which tells Fenzo the offered resources and tasks; Fenzo returns resource-to-task matchings, i.e. the tasks to launch.]
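The idea of matching tasks to resource offers through a pluggable fitness evaluator can be sketched in a few lines of plain Java. This is not Fenzo's API: `Offer`, `Task`, `FitnessEvaluator`, `BIN_PACKING`, and `match` are hypothetical names, only CPUs are considered, and the greedy strategy is an assumption made to keep the example small.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of fitness-based matching; this is not Fenzo's API.
final class OfferMatchingSketch {

    static final class Offer {
        final String host;
        final double totalCpus;
        double freeCpus;

        Offer(String host, double cpus) {
            this.host = host;
            this.totalCpus = cpus;
            this.freeCpus = cpus;
        }
    }

    static final class Task {
        final String name;
        final double cpus;

        Task(String name, double cpus) {
            this.name = name;
            this.cpus = cpus;
        }
    }

    // Pluggable fitness evaluator: a higher score means a better placement.
    interface FitnessEvaluator {
        double fitness(Task task, Offer offer);
    }

    // Bin-packing style fitness: prefer the offer that ends up most utilized.
    static final FitnessEvaluator BIN_PACKING =
        (task, offer) -> (offer.totalCpus - offer.freeCpus + task.cpus) / offer.totalCpus;

    // Greedily assigns each task to the fitting offer with the highest fitness.
    // Tasks that fit nowhere are left out of the result.
    static Map<String, String> match(List<Task> tasks, List<Offer> offers,
                                     FitnessEvaluator evaluator) {
        Map<String, String> assignment = new LinkedHashMap<>();
        for (Task task : tasks) {
            Offer best = null;
            double bestScore = Double.NEGATIVE_INFINITY;
            for (Offer offer : offers) {
                if (offer.freeCpus < task.cpus) {
                    continue; // offer cannot host this task
                }
                double score = evaluator.fitness(task, offer);
                if (score > bestScore) {
                    best = offer;
                    bestScore = score;
                }
            }
            if (best != null) {
                best.freeCpus -= task.cpus;
                assignment.put(task.name, best.host);
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        List<Offer> offers = new ArrayList<>();
        offers.add(new Offer("host-a", 4.0));
        offers.add(new Offer("host-b", 2.0));
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("tm-1", 2.0));
        tasks.add(new Task("tm-2", 2.0));
        tasks.add(new Task("tm-3", 2.0));
        // Prints {tm-1=host-b, tm-2=host-a, tm-3=host-a}
        System.out.println(match(tasks, offers, BIN_PACKING));
    }
}
```

Swapping `BIN_PACKING` for a different evaluator, for example one that prefers the emptiest host, changes the placement policy without touching the matching loop.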
![Page 28: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/28.jpg)
Datacenter
![Page 29: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/29.jpg)
NAIVE APPROACH
Typical Datacenter: siloed, over-provisioned servers, low utilization
Industry average: 12-15% utilization
MySQL
microservice
Cassandra
Flink
Kafka
![Page 30: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/30.jpg)
© 2017 Mesosphere, Inc. All Rights Reserved.
![Page 31: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/31.jpg)
Apache Mesos
Typical Datacenter: siloed, over-provisioned servers, low utilization (industry average: 12-15% utilization)
Mesos: automated schedulers, workload multiplexing onto the same machines
Workloads: MySQL, microservice, Cassandra, Flink, Kafka
![Page 32: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/32.jpg)
Why Mesos?
● 2-level scheduling
● Fault-tolerant, battle-tested
● Scalable to 10,000+ nodes
● Created by Mesosphere founder @ UC Berkeley; used in production by 100+ web-scale companies [1]
[1] http://mesos.apache.org/documentation/latest/powered-by-mesos/
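Two-level scheduling means that the master decides which resources to offer to which framework, and each framework's own scheduler decides what to do with an offer. The toy model below illustrates that split under strong simplifications: it is not the Mesos API, it only tracks CPUs, and the master simply offers all free CPUs to each framework in turn rather than applying a fairness policy. `Framework`, `FixedDemandFramework`, and `offerRound` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of two-level scheduling; not the Mesos API or its allocator.
final class TwoLevelSchedulingSketch {

    // Level 2: a framework scheduler decides how much of an offer to use.
    interface Framework {
        String name();

        // Returns the CPUs the framework accepts from the offer (0 to decline).
        double onOffer(double offeredCpus);
    }

    // Hypothetical framework that wants to launch a fixed number of equal tasks.
    static final class FixedDemandFramework implements Framework {
        private final String name;
        private final double cpusPerTask;
        private int pendingTasks;

        FixedDemandFramework(String name, double cpusPerTask, int pendingTasks) {
            this.name = name;
            this.cpusPerTask = cpusPerTask;
            this.pendingTasks = pendingTasks;
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public double onOffer(double offeredCpus) {
            // Launch as many pending tasks as fit into the offer.
            int launchable = (int) Math.min(pendingTasks, Math.floor(offeredCpus / cpusPerTask));
            pendingTasks -= launchable;
            return launchable * cpusPerTask;
        }
    }

    // Level 1: the master offers the currently free CPUs to each framework in turn
    // and records how much each one accepted.
    static Map<String, Double> offerRound(double freeCpus, List<Framework> frameworks) {
        Map<String, Double> accepted = new LinkedHashMap<>();
        for (Framework framework : frameworks) {
            double used = framework.onOffer(freeCpus);
            freeCpus -= used;
            accepted.put(framework.name(), used);
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<Framework> frameworks = new ArrayList<>();
        frameworks.add(new FixedDemandFramework("flink", 2.0, 3));
        frameworks.add(new FixedDemandFramework("kafka", 1.5, 2));
        // Prints {flink=6.0, kafka=1.5}
        System.out.println(offerRound(8.0, frameworks));
    }
}
```

The master never needs to know what a Flink TaskManager or a Kafka broker is; it only hands out resources, which is what lets many different frameworks share one cluster.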
APACHE MESOS
![Page 33: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/33.jpg)
![Page 34: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/34.jpg)
DC/OS
Modern App Components
▪ Functions & Logic: Microservices (in containers)
▪ Analytics: Big Data + Analytics Engines (Streaming, Batch, Machine Learning)
▪ Databases: Search, Time Series, SQL / NoSQL
Datacenter Operating System (DC/OS)
Distributed Systems Kernel (Mesos)
Any Infrastructure (Physical, Virtual, Cloud)
![Page 35: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/35.jpg)
DEMO
![Page 36: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/36.jpg)
Conclusion
![Page 37: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/37.jpg)
Conclusion
▪ Apache Flink runs on Mesos using Fenzo
▪ DC/OS offers an easy-to-use Flink package
▪ Contributions welcome!
DC/OS Office Hour June 29th
![Page 38: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/38.jpg)
Thank you!
@stsffap @joerg_schad @ApacheFlink @dataArtisans @dcos
![Page 39: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/39.jpg)
![Page 40: Apache Flink and More @ MesosCon Asia 2017](https://reader031.vdocuments.us/reader031/viewer/2022030318/5a6476e47f8b9afc4d8b4685/html5/thumbnails/40.jpg)
We are hiring! data-artisans.com/careers