dublin apache kafka meetup, 30 august 2017 …...© 2017 mesosphere, inc. all rights reserved. 1...

36
© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017 The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*, Kafka* Elizabeth K. Joseph @pleia2 * ASF projects

Upload: others

Post on 24-Jan-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 1

Dublin Apache Kafka Meetup, 30 August 2017

The SMACK Stack:Spark*, Mesos*, Akka, Cassandra*, Kafka*

Elizabeth K. Joseph@pleia2

* ASF projects

Page 2: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 2

❏ Developer Advocate at Mesosphere❏ 15+ years working in open source

communities❏ 10+ years in Linux systems

administration and engineering roles❏ Founder of OpenSourceInfra.org❏ Author of The Official Ubuntu Book

and Common OpenStack Deployments

Elizabeth K. Joseph, Developer Advocate

Page 3: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved.

MODERN APPLICATION -> FAST DATA BUILT-IN

Data Ingestion

Request/Response

Devices

Client

Sensors

MessageQueue/Bus

Microservices Distributed Storage

Analytics(Streaming) Use Cases:

● Anomaly detection

● Personalization

● IoT Applications

● Predictive Analytics

● Machine Learning

Page 4: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved.

The SMACK Stack

Data Ingestion

Request/Response

Devices

Client

Sensors

Use Cases:

● Anomaly detection

● Personalization

● IoT Applications

● Predictive Analytics

● Machine Learning

MessageQueue/Bus

Microservices Distributed Storage

Analytics(Streaming)

Page 5: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 5

SMACK StackPARK

Page 6: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 6

Microbatching● Apache Spark (Streaming)

Native Streaming● Apache Flink● Apache Storm/Heron ● Apache Apex● Apache Samza

STREAMING ANALYTICS

Page 7: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 7

Typical Use: distributed, large-scale data processing; micro-batching

Why Spark Streaming?● Micro-batching creates very low

latency, which can be faster● Well defined role means it fits in well

with other pieces of the pipeline

APACHE SPARK (STREAMING)

Page 8: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 8

SMACK Stack E S O S

Page 9: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 9

Apache Mesos:The datacenter kernel

http://mesos.apache.org/

Page 10: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved.

• A cluster resource negotiator

• A top-level Apache project• Scalable to 10,000s of

nodes• Fault-tolerant, battle-tested• An SDK for distributed apps• Native Docker support

10

Building block of the modern internet

http://mesos.apache.org/documentation/latest/powered-by-mesos/

Page 11: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 11

MULTIPLEXING OF DATA, SERVICES, USERS, ENVIRONMENTS

Typical Datacentersiloed, over-provisioned servers,

low utilization

Apache Mesosautomated schedulers, workload multiplexing onto the

same machines

mySQL

microservice

Cassandra

Spark/Hadoop

Kafka

Page 12: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

12

Page 13: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 13

● Mesos can’t run applications on its own.

● A Mesos framework is a distributed system that has a scheduler.

● Schedulers like Marathon start and keep your applications running. A bit like a distributed init system.

● Mesos mechanics are fair and HA● Learn more at

https://mesosphere.github.io/marathon/

Marathon

Page 14: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

14

Page 15: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 15

● Rapid deployment● Some service isolation● Dependency handling● Container image repositories

Containers

Page 16: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 16

DC/OS Architecture Overview

Security &GovernanceContainer Orchestration Monitoring & Operations User Interface & Command Line

HDFS Jenkins Marathon Cassandra Flink

Spark Docker Kafka MongoDB +30 more...

DC/OS

Services & Containers

ANY INFRASTRUCTURE

Page 17: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 17

DC/OS brings it all together

● Resource management● Task scheduling (Metronome)● Container orchestration● Logging and metrics● Network management● “Universe” catalog of pre-configured apps

(including Apache Kafka, Apache Spark…), browse at http://universe.dcos.io/

● And much more https://dcos.io/

Page 18: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

18

DC/OS is … ● 100% open source (ASL2.0)+ A big, diverse community

● An umbrella for ~30 OSS repos+ Roadmap and designs+ Documentation and tutorials

● Not limited in any way● Familiar, with more features

+ Networking, Security, CLI, UI, Service Discovery, Load Balancing, Packages, ...

Page 19: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 19

Web-based GUI

https://dcos.io/docs/latest/usage/webinterface/

Interact with DC/OS (1/2)

Page 20: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 20

CLI tool

https://dcos.io/docs/latest/usage/cli/

API

https://dcos.io/docs/latest/api/

Interact with DC/OS (2/2)

Page 21: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 21

Catalog of Applications (Universe)

Page 22: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 22

Install an Application

Page 23: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 23

Application JSON

Page 24: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 24

SMACK Stack K K A

Page 25: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 25

Akka-driven applications

Akka is a toolkit for building highly concurrent, distributed, and resilient message-driven applications for Java and Scala.

● Simple● Highly Performant● Elastic● Reactive

Page 26: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 26

SMACK Stack A S S A N D R A

Page 27: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 27

NoSQL● ArangoDB● mongoDB● Apache Cassandra● Apache HBase

SQL● MemSQL

Filesystems● Quobyte● HDFS

DISTRIBUTEDSTORAGE

Time-Series Datastores● InfluxDB● OpenTSDB● KairosDB● Prometheus

see also iot-a.info

Page 28: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 28

Typical Use: No-dependency, time series database

Why Cassandra?● A top level Apache project born at

Facebook and built on Amazon’s Dynamo and Google’s BigTable

● Offers continuous availability, linear scale performance, operational simplicity and easy data distribution

APACHE CASSANDRA

Page 29: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 29

SMACK Stack A F K A

Page 30: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 30

Message Brokers● Apache Kafka● ØMQ, RabbitMQ, Disque

Log-based Queues● fluentd, Logstash, Flume

see also queues.io

MESSAGE QUEUES

Page 31: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 31

Typical Use: A reliable buffer for stream processing

Why Kafka?● High-throughput, distributed,

persistent publish-subscribe messaging system

● Created by LinkedIn; used in production by 100+ web-scale companies [1]

[1] https://cwiki.apache.org/confluence/display/KAFKA/Powered+By

APACHE KAFKA

Page 32: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 32

DELIVERY GUARANTEES

● At most once—Messages may be lost but are never re-delivered

● At least once—Messages are never lost but may be redelivered (Kafka)

● Exactly once—Messages are delivered once and only once (this is what everyone actually wants, but it’s tricky)

Murphy’s Law of Distributed Systems:

Anything that can go wrong, will go wrong … partially!

Page 33: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 33

● Easy Installation● Service discovery within the cluster● Ability to run multiple Kafka clusters● Elastic scaling of brokers● Cluster and broker monitoring● Support in the DC/OS CLI command

Learn more at https://docs.mesosphere.com/service-docs/kafka/

Key Features of Kafka on DC/OS

Page 34: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2016 Mesosphere, Inc. All Rights Reserved. 34

Available for you to try at: https://github.com/dcos/demos/tree/master/fastdata-iot/

SMACK Stack Demo: Los Angeles Metro

Page 35: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

35

Page 36: Dublin Apache Kafka Meetup, 30 August 2017 …...© 2017 Mesosphere, Inc. All Rights Reserved. 1 Dublin Apache Kafka Meetup, 30 August 2017The SMACK Stack: Spark*, Mesos*, Akka, Cassandra*,

© 2017 Mesosphere, Inc. All Rights Reserved. 36

Questions?

Elizabeth K. JosephTwitter: @pleia2

Email: [email protected]

@dcos

[email protected]

/dcos/dcos/examples/dcos/demos

chat.dcos.io