Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK & Apache Kafka

Patrick McFadin (@PatrickMcFadin), Chief Evangelist for Apache Cassandra

Upload: datastax

Post on 06-Jan-2017


Page 1: Webinar - How to Build Data Pipelines for Real-Time Applications with SMACK & Apache Kafka

@PatrickMcFadin

Patrick McFadin

Chief Evangelist for Apache Cassandra

How to Build Data Pipelines for Real-Time Applications with SMACK & Apache Kafka


Page 2

The problem

Your Magical App

Page 3

Sad solutions

Page 4

SMACK

Page 5

Spark

Mesos

Akka

Cassandra

Kafka

Page 6

[Diagram: Kafka, Akka, Spark and Cassandra arranged across three stages labeled Organize, Process and Store. Mesos sits underneath, running multiple instances of Kafka, Spark, Akka and Cassandra.]

Page 7

[Diagram: Kafka, Akka, Spark and Cassandra arranged across three stages labeled Organize, Process and Store.]

Page 8

Managing Weather Data

Windsor, California

67.3 F

Rainfall total: 1.2 cm

Today: High 73.4 F, Low 51.4 F

Yesterday: High 75.2 F, Low 52.3 F

Our Magical App

Reactive and immediate

Batch

Page 9

KillrWeather

Windsor, California

67.3 F

Rainfall total: 1.2 cm

Today: High 73.4 F, Low 51.4 F

Yesterday: High 75.2 F, Low 52.3 F

Page 10

https://github.com/killrweather/killrweather

Page 11

Spark

Mesos

Akka

Cassandra

Kafka

Page 12

Kafka

Page 13

Kafka decouples data pipelines

Page 14

Page 15

Page 16

The problem

Kitchen

Hamburger, please

Meat disk on bread, please

Page 17

The problem

Kitchen

Page 18

The problem

Kitchen

Order Queue

Hamburger, please

Order

Page 19

The problem

Kitchen

Order Queue

Page 20

The problem

Kitchen

Order Queue

Meat disk on bread, please

You mean a hamburger?

Uh yeah. That.

Order

Page 21

Order from chaos

Producer

Topic = Food

Order

Consumer

Page 22

Order from chaos

Producer

Topic = Food: Order 1, Order 2, Order 3, Order 4, Order 5

Consumer

[Pages 22 to 35 are build frames of one diagram: orders are added to the Food topic one at a time and numbered 1 through 5.]
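The producer, topic and consumer roles in this diagram can be sketched as a toy in-memory model. This is not the Kafka client API; the `Topic` and `Consumer` classes and their method names are hypothetical, chosen only to show an append-only sequence of numbered messages that a consumer reads in order while tracking its own position.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Toy stand-in for a topic: an append-only list of messages (not Kafka's API)."""
    name: str
    log: list = field(default_factory=list)

    def append(self, message):
        """Producer side: add a message at the end and return its offset."""
        self.log.append(message)
        return len(self.log) - 1

    def read(self, offset):
        """Consumer side: return the message at offset, or None if nothing is there yet."""
        return self.log[offset] if offset < len(self.log) else None


@dataclass
class Consumer:
    """Reads a topic sequentially and remembers how far it has read."""
    topic: Topic
    offset: int = 0

    def poll(self):
        message = self.topic.read(self.offset)
        if message is not None:
            self.offset += 1
        return message


food = Topic("Food")
for n in range(1, 6):
    food.append(f"Order {n}")

kitchen = Consumer(food)
received = []
while (order := kitchen.poll()) is not None:
    received.append(order)
print(received)
```

Reading does not remove anything from the log in this model, so a second consumer with its own offset could read the same five orders independently.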

Page 36

Scale

Producer

Topic = Hamburgers: Order 1, Order 2, Order 3, Order 4, Order 5

Topic = Pizza: Order 1, Order 2, Order 3, Order 4, Order 5

Consumer

Topic = Food
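Splitting one stream into several topics can be illustrated with a small sketch. The `route` function and the `(kind, order)` tuple shape are assumptions made for this example; the slide only shows that hamburger and pizza orders end up in separate topics.

```python
from collections import defaultdict


def route(orders):
    """Group (kind, order) pairs into one list per kind, preserving arrival order."""
    topics = defaultdict(list)
    for kind, order in orders:
        topics[kind].append(order)
    return dict(topics)


# Hypothetical interleaved stream of incoming orders.
incoming = [
    ("Hamburgers", "Order 1"),
    ("Pizza", "Order 1"),
    ("Hamburgers", "Order 2"),
    ("Pizza", "Order 2"),
    ("Hamburgers", "Order 3"),
]
topics = route(incoming)
print(topics["Hamburgers"])
print(topics["Pizza"])
```

Each topic keeps its own ordering, so a consumer that only cares about pizza never has to skip over hamburger orders.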

Page 37

Kafka

Producer: Collection API

Consumers: Temperature Processor, Precipitation Processor

Broker

Topic = Temperature: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Precipitation: Precip 1, Precip 2, Precip 3, Precip 4, Precip 5

Page 38

Kafka

Producer: Collection API

Consumers: Temperature Processor, Precipitation Processor

Broker

Topic = Temperature, Partition 0: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Precipitation, Partition 0: Precip 1, Precip 2, Precip 3, Precip 4, Precip 5

Page 39

Kafka

Producer: Collection API

Consumers: Temperature Processor, Temperature Processor, Precipitation Processor

Broker

Topic = Temperature, Partition 0: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Temperature, Partition 1: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Precipitation, Partition 0: Precip 1, Precip 2, Precip 3, Precip 4, Precip 5
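How messages might be spread over the two Temperature partitions can be sketched as follows. This is an illustration under stated assumptions: the station keys are hypothetical, and `zlib.crc32` stands in for whatever hash a real partitioner would use. The point is only that hashing a key sends every message with that key to the same partition, where it keeps its arrival order.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition number with a stable hash (stand-in hash, not Kafka's)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 2
partitions = {p: [] for p in range(NUM_PARTITIONS)}

# Hypothetical (station, temperature in F) readings; the station names are invented.
readings = [
    ("station-a", 67.3),
    ("station-b", 51.4),
    ("station-a", 73.4),
    ("station-b", 52.3),
]
for station, temp in readings:
    partitions[partition_for(station, NUM_PARTITIONS)].append((station, temp))

for number, messages in partitions.items():
    print(number, messages)
```

With one Temperature Processor per partition, each processor sees a given station's readings in order without coordinating with the other.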

Page 40

Kafka

Producer: Collection API

Consumers: Temperature Processor, Temperature Processor, Precipitation Processor

Broker

Topic = Temperature, Partition 0: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Temperature, Partition 1: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Precipitation, Partition 0: Precip 1, Precip 2, Precip 3, Precip 4, Precip 5

Broker

Topic = Temperature, Partition 0: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Temperature, Partition 1: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Precipitation, Partition 0: Precip 1, Precip 2, Precip 3, Precip 4, Precip 5

Topic Temperature: Replication Factor = 2

Topic Precipitation: Replication Factor = 2

Page 41

Kafka

Producer: Collection API

Consumers: Temperature Processor, Temperature Processor, Temperature Processor, Temperature Processor, Precipitation Processor, Precipitation Processor

Broker

Topic = Temperature, Partition 0: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Temperature, Partition 1: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Precipitation, Partition 0: Precip 1, Precip 2, Precip 3, Precip 4, Precip 5

Broker

Topic = Temperature, Partition 0: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Temperature, Partition 1: Temp 1, Temp 2, Temp 3, Temp 4, Temp 5

Topic = Precipitation, Partition 0: Precip 1, Precip 2, Precip 3, Precip 4, Precip 5

Topic Temperature: Replication Factor = 2

Topic Precipitation: Replication Factor = 2
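The two-broker layout with a replication factor of 2 can be sketched as a replica assignment. The round-robin rule and the broker names below are assumptions for illustration; a real cluster's placement logic is more involved. The sketch only shows that with two brokers and a replication factor of 2, every partition has a copy on both brokers.

```python
def assign_replicas(num_partitions, brokers, replication_factor):
    """Place each partition on `replication_factor` distinct brokers, round-robin."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        partition: [
            brokers[(partition + i) % len(brokers)]
            for i in range(replication_factor)
        ]
        for partition in range(num_partitions)
    }


brokers = ["broker-1", "broker-2"]  # hypothetical broker names
temperature = assign_replicas(2, brokers, replication_factor=2)
precipitation = assign_replicas(1, brokers, replication_factor=2)
print(temperature)
print(precipitation)
```

If either broker is lost, each partition of both topics still has one surviving copy on the other broker.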

Page 42

Guarantees

Order
• Messages sent by a producer to a topic partition are appended in the order they are sent
• Consumers see the messages in a partition in the order they were written by the producer

Durability
• Messages are delivered at least once
• With a replication factor of N, up to N-1 server failures can be tolerated without losing committed messages
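At-least-once delivery means a consumer can see the same message twice, for example when it fails after processing a message but before recording its position. The sketch below simulates that case in plain Python; `run_consumer`, `crash_after` and the offset bookkeeping are hypothetical names for this illustration, not a Kafka API. It also shows one common way to cope, which is deduplicating by offset, and restates the N-1 arithmetic from the slide.

```python
def run_consumer(log, start, crash_after=None):
    """Process messages from `start`; return (processed, committed_offset).

    If `crash_after` is given, the consumer processes that offset and then
    stops before committing it, simulating a failure.
    """
    processed = []
    committed = start
    for offset in range(start, len(log)):
        processed.append((offset, log[offset]))
        if offset == crash_after:
            return processed, committed  # work done, position not saved
        committed = offset + 1
    return processed, committed


def tolerated_failures(replication_factor):
    """Server failures survivable without losing committed messages (N - 1)."""
    return replication_factor - 1


log = ["Temp 1", "Temp 2", "Temp 3"]

# First run fails after processing offset 1 but before committing it.
first, committed = run_consumer(log, 0, crash_after=1)
# The restart resumes from the last committed offset, so offset 1 is redelivered.
second, committed = run_consumer(log, committed)

deliveries = first + second
seen = {}
for offset, message in deliveries:
    seen.setdefault(offset, message)  # keep the first copy of each offset
unique = [seen[offset] for offset in sorted(seen)]

print(len(deliveries), unique, tolerated_failures(2))
```

Nothing is lost across the restart, but "Temp 2" arrives twice, which is why consumers in an at-least-once system are usually written to be idempotent.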

Page 43

[Diagram: Kafka, Akka, Spark and Cassandra arranged across three stages labeled Organize, Process and Store, with Mesos underneath.]

Page 44

Coming soon!

• May 4: How to Achieve High Throughput for Real-Time Applications with SMACK, Apache Kafka and Spark Streaming

• May 18: How to Build Data Pipelines with SMACK: Storage Strategy using Cassandra and DSE

• June 1: How to Build Data Pipelines with SMACK: Analyzing Data with Spark

• For the latest schedule of webinars, check out our Webinars page: http://www.datastax.com/resources/webinars

Page 45

Go get your SMACK on

Thank you!

Follow me on Twitter: @PatrickMcFadin