stream processing at scale with spring xd and kafka

48
Unless otherwise indicated, these slides are © 2013-2015 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/ SPRINGONE2GX WASHINGTON, DC Stream Processing at Scale with Spring XD and Kafka By Marius Bogoevici @mariusbogoevici (and more)

Upload: spring-by-pivotal

Post on 15-Apr-2017

3.942 views

Category:

Technology


2 download

TRANSCRIPT

Page 1: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

SPRINGONE2GXWASHINGTON, DC

Stream Processing at Scale with Spring XD and Kafka

By Marius Bogoevici @mariusbogoevici

(and more)

Page 2: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

About me

• Staff Engineer at Pivotal

• Works on Spring XD, Spring Cloud Stream, Spring Cloud Data Flow, Spring Integration, Spring Integration

• Formerly: InfinityQuick, Red Hat (JBoss), SpringSource

2

Page 3: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Safe Harbor Statement

The following is intended to outline the general direction of Pivotal's offerings. It is intended for information purposes only and may not be incorporated into any contract. Any information regarding pre-release of Pivotal offerings, future updates or other planned modifications is subject to ongoing evaluation by Pivotal and is subject to change. This information is provided without warranty or any kind, express or implied, and is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions regarding Pivotal's offerings. These purchasing decisions should only be based on features currently available. The development, release, and timing of any features or functionality described for Pivotal's offerings in this presentation remain at the sole discretion of Pivotal. Pivotal has no obligation to update forward looking information in this presentation.

3

Page 4: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 4

Stream Processing in the Big Data

landscape

Page 5: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

The domain of stream processing

• Now have the ability to cheaply store and analyze huge quantities of data

• Traditionally the realm of batch processing but also demand for real-time

• aka ‘Stream Processing’

• Some characteristics of Stream Processing

• High volume

• Low latency

• Often data is grouped and ordered

5

Page 6: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Streaming and real-time analysis examples

• “Internet of Things”

• Fraud Detection

• Measuring Quality of Service

• Predictive Maintenance

• Log aggregation

6

Page 7: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 7

Our Toolkit

Page 8: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 8

Page 9: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Apache Kafka message

• Initially developed at LinkedIn, open-sourced in 2011

• High-throughput distributed messaging system

• Publish-subscribe messaging rethought as a message log

• Decoupled data pipelines between producers and consumers

9

Page 10: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Apache Kafka: Topics and logs

• Each topic, divided in many partitions read/written independently

• Producers can target partitions and topics directly

• Distributed, replicated

• Linear reads, writes

10

Page 11: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Producers and consumers

• Producers, consumers completely decoupled from each other

• Partitions are distributed and replicated

• Reads/writes from the partition leader

11

Page 12: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Why Kafka for Stream Processing

• High throughput, low latency due to linear reads/writes

• High performance on commodity hardware

• Replayability (consuming from existing offsets)

• Can be used as a master dataset

• Consumer groups are completely independent of each other

• Guaranteed ordered delivery within a partition

• Competing consumer scenarios are difficult to implement

• Generally, not an issue with stream processing

12

Page 13: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 13

+

Page 14: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Integration Kafka

• Started in 2011

• Goal: adapting to the abstractions Spring Messaging and Spring Integration

• Easy access to the unique features of Kafka;

• Namespace, Java DSL support

• To migrate to 0.8.3 once available

• Defaults focused towards performance (disable ID generation, timestamp)

14

Page 15: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Integration Kafka: Channel Adapters

15

Kafka Inbound Channel Adapter Kafka Outbound Channel AdapterSpring Messaging Message Channel

Spring Messaging Message

Spring Messaging Message

DI Configuration DI Configuration

Page 16: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Integration Kafka Producer Configuration

• Default producer configuration

• Distinct per-topic producer configurations

• Destination target or partition controlled via SPEL expressions or headers

16

<int-­‐kafka:producer-­‐context  id="kafkaProducerContext">                  <int-­‐kafka:producer-­‐configurations>                          <int-­‐kafka:producer-­‐configuration  broker-­‐list="localhost:9092"                                                key-­‐class-­‐type="java.lang.String"                                                value-­‐class-­‐type="java.lang.String"                                                topic="test1"                                                value-­‐serializer="kafkaSerializer"                                                key-­‐serializer="kafkaSerializer"                                                compression-­‐type="none"/>                          <int-­‐kafka:producer-­‐configuration  broker-­‐list="localhost:9092"                                                  topic="regextopic.*"                                                  compression-­‐type="gzip"/>                  </int-­‐kafka:producer-­‐configurations>          </int-­‐kafka:producer-­‐context>

Page 17: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Integration Kafka Consumer• Own client based on Simple Consumer API

• Listen to specific partitions!

• Offset control - when to be written and where (no Zookeeper);

• Programmer-controlled acknowledgment;

• SI defaults focused towards performance

• No ID, timestamp generation (can be turned on, optionally)

• Concurrent message processing (preserving per-partition ordering) via Reactor

• Basic operations via KafkaTemplate

• Kafka specific headers

17

Page 18: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Integration Kafka Consumer Example

18

<int-­‐kafka:zookeeper-­‐connect  id="zkConnect"  zk-­‐connect=“localhost:2181"/>  

<bean  id="kafkaConfiguration"  class="org.springframework.integration.kafka.core.ZookeeperConfiguration">     <constructor-­‐arg  ref="zkConnect"/>  </bean>  

<int-­‐kafka:message-­‐driven-­‐channel-­‐adapter         id="adapter"         channel="output"         connection-­‐factory="connectionFactory"         key-­‐decoder="decoder"         payload-­‐decoder="decoder"         max-­‐fetch="100"         topics="${kafka.test.topic}"/>  

Page 19: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Integration Kafka Message Listener

• Auto-acknowledging

• With manual acknowledgment

19

public  interface  MessageListener  {     void  onMessage(KafkaMessage  message);  }  

public  interface  AcknowledgingMessageListener  {     void  onMessage(KafkaMessage  message,  Acknowledgment  acknowledgment);  }  

Page 20: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Integration Kafka: Message Headers

20

public abstract class KafkaHeaders {private static final String PREFIX = "kafka_";public static final String TOPIC = PREFIX + "topic";public static final String MESSAGE_KEY = PREFIX + "messageKey";public static final String PARTITION_ID = PREFIX + "partitionId";public static final String OFFSET = PREFIX + "offset";public static final String NEXT_OFFSET = PREFIX + "nextOffset";public static final String ACKNOWLEDGMENT = PREFIX +

"acknowledgment";}

Page 21: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Integration Kafka: Offset Management

• Injectable strategy

• Allows customizing the starting offsets

• Implementations: SI MetadataStore-backed (e.g. Redis, Gemfire), Kafka compacted topic-backed (pre-0.8.2), Kafka 0.8.2 native

• Messages can be auto acknowledged (by the adapter) or manually acknowledged (by the user)

• Manual acknowledgment useful when messages are processed asynchronously

• Acknowledgment passed as message header or as argument

21

Page 22: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

DEMO: Spring Integration Kafka at Work

22

Page 23: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 23

++

Page 24: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring XD and Stream processing

• Higher abstractions are required

• Integrating seamlessly and transparently with the middleware

• Building on top of Spring Integration and Spring Batch

• Pre-built modules using the entire power of the Spring ecosystem

24

Page 25: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Streams in Spring XD

25

HTTP$JMS$Ka*a$

RabbitMQ$JMS$

Gemfire$File$SFTP$Mail$JDBC$Twi;er$Syslog$TCP$UDP$MQTT$Trigger$

Filter$Transformer$

Spli;er$Aggregator$HTTP$Client$

JPMML$Evaluator$Shell$Python$Groovy$Java$RxJava$

Spark$Streaming$

File$HDFS$HAWQ$Ka*a$

RabbitMQ$Redis$Splunk$Mongo$Redis$JDBC$TCP$Log$Mail$

Gemfire$MQTT$

Dynamic$Router$Counters$

Note: Named channels allow for a directed graph of data flow

channel

Page 26: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring XD: Stream DSL

26

Page 27: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring XD - Message Bus abstraction

• Binds module inputs and outputs to a transport

27

Binds module inputs and outputs to a transport Performs Serialization (Kryo) Local, Rabbit, Redis, and Kafka

Page 28: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring XD Distributed Runtime

28

Page 29: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring XD and Kafka - the message bus

• Each pipe between modules is a topic;

• Spring XD creates topics automatically;

• Topics are pre-partitioned based on module count and concurrency;

• Overpartitioning is available as an option;

• Multiple consumer modules ‘divide’ the partition set of a topic using a deterministic algorithm;

29

Page 30: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring XD and Kafka: Partitioning

• Required in distributed stateful processing: related data must be processed on the same node;

• Partitioning logic configured in Spring XD via deployment manifest

• partitionKeyExpression=payload.sensorId

• When using Kafka as a bus, partition key logic maps directly to Kafka transport partitioning natively

30

Partition 0

Partition 1

HTTP

HTTP

HTTP

Average Processor

Average Processor

Topic

http | avg-temperatures

Page 31: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring XD and Kafka:

31

Combine the high performance of Kafka with the powerful abstractions of Spring XD

Page 32: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Performance metrics of Spring XD 1.2

32

Page 33: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Performance metrics of Spring XD 1.2

33

Page 34: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Demo: Stream processing at scale with Spring XD

34

High%volume%

Low%Latency%

Programming%Model%

Distributed%Run;me%

Ka=a%%(moving%data)%

RxJava%(processing)%

Spring%XD%(distribu;ng)%

Page 35: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 35

Cloud Native

Page 36: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Stream Processing at Scale: Message-driven Microservices

36

Page 37: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Scaling with streams

• Parallelism is key for scaling

• A single machine’s resources can be exhausted really fast

• CPU: serialization

• Memory: keeping state, caching lookup data

• Disk: especially for brokers

• Network: saturating the interfaces

• Vertical scaling == exponential cost

• And failover redundancy is still a requirement

• Cluster sizes can get really big

• tens, hundreds of module instances

• which leads to ….

37

Page 38: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 38

Page 39: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring Cloud Stream/Spring Cloud Data Flow

• Successors to Spring XD 1.x

• New goals:

• Cloud native;

• Micro service architecture for modules;

• Most valuable features should be preserved;

• Abstracting out the connection to the transport

• Same abstractions, backwards compatible DSL;

39

Page 40: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 40Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Spring XD Modules

•  Execute inside of a Spring XD Container

•  Consist of beans in an application context

•  Multiple modules may run in a container, each with their own class loader

5

Page 41: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 41Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 7

•  OOTB modules •  Executable Boot Apps

•  Boot for Spring Integration and Batch •  Auto-Configuration, Bindings

XD

Container

Modules

Admin

Spring Cloud Data Flow

Spring Cloud Stream

Modules

Spring Cloud Task Modules

Spring Cloud Stream

Spring Cloud Task

•  REST API, Shell, UI •  Module Deployer SPI:

•  Singlenode •  Lattice •  YARN •  Cloud Foundry

From Modules to Microservices

Page 42: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 42Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 10

Container Container

Spring XD 1.x

Host

Spring Cloud Stream

•  Modules run inside containers •  Container is a Boot app •  Modules are ApplicationContexts

•  Modules are executable Boot apps •  Easier to use Spring Cloud features •  Easier to test, consistent dev experience •  Portable and “cloud-native”

M

Host

M

M M

M M M

M M

M

ZooKeeper Spring Cloud Data Flow (optional) SPI Implementation

Page 43: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Transparent development model

43

public  interface  Source  {     String  OUTPUT  =  "output";     @Output(Source.OUTPUT)     MessageChannel  output();  }  

@EnableBinding(Source.class)  @EnableConfigurationProperties(TimeSourceProperties.class)  @Import(PeriodicTriggerConfiguration.class)  public  class  TimeSource  {     @Autowired     private  TimeSourceProperties  properties;     @InboundChannelAdapter(value  =  Source.OUTPUT,  poller  =  @Poller(         trigger  =  PeriodicTriggerConfiguration.TRIGGER_BEAN_NAME,                                      maxMessagesPerPoll  =  "1"))     public  String  publishTime()  {       return  new  SimpleDateFormat(this.properties.getFormat()).format(new  Date());     }  }  

Page 44: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Where is Kafka ?

• Out of sight, but not out of mind

• On the class path ( just like all the other binders)

• Bindings/Channel Adapters are created transparently

• Channel names are topic names (overridable by Boot properties)

• Reusing code of the Spring XD 1.x Kafka message bus

• Partitioning/multiple instance support, with a microservice twist

• Spring Boot properties to compute the target partition set (instanceIndex, instanceCount)

44

Page 45: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 45

INSTANCE_INDEX=0

HTTP

INSTANCE_INDEX=0

INSTANCE_INDEX=1

INSTANCE_INDEX=6

LOG

Kafka Service

http.0 (0) http.0 (1)

http.0 (2)

http.0 (6)

Broker 0 Broker 1 Broker 4

stream create logger --definition "http | log"

stream deploy logger --properties module.log.partitioned=true, module.log.count=7

Page 46: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 46

++++

= massive scale!

Page 47: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/

Demo: Scaled, Partitioned Streams on PCF and Lattice

47

Page 48: Stream Processing at Scale with Spring XD and Kafka

Unless otherwise indicated, these s l ides are © 2013-2015 Pivotal Software, Inc. and l icensed under a Creat ive Commons Attr ibut ion-NonCommercial l icense: ht tp: / /creat ivecommons.org/ l icenses/by-nc/3.0/ 48

Tell your friends!

Developing Real-Time Data Pipelines with Apache Kafka - Joe Stein, Thursday, 10:30

Learn More. Stay Connected.

@springcentral Spring.io/video