microservices in the apache kafka ecosystem

Download Microservices in the Apache Kafka Ecosystem

If you can't read please download the document

Upload: confluent

Post on 16-Apr-2017

6.408 views

Category:

Software


2 download

TRANSCRIPT

Microservices in a Streaming World

Ben Stopford, Engineer, Confluent

Event Driven Microservices The toolset: Kafka, KStreams, Connect10 Principals for Streaming ServicesWhat well cover

Microservices in the Kafka Ecosystem

Today were going to talk about two fields which sit somewhat apart. Stream Processing & Business Systems

Services encourage us to share responsibilities. To rely on others. To collaborate and evolve a businesss view of the world over time. A world that is inherently decentralised. Inherently spread across many interconnected systems. Spreading many disparate subsets of our businesss state.

Yet our approach rarely reflects this. We tend to think in centralised ways, we accumulate data. We protect it. We control it. We hoard it.But it does not need to be this way.

GUIUIServiceOrdersServiceReturnsServiceFulfilmentServicePaymentServiceStockServiceMicroservices

So whyServices?

Today were going to talk about two fields which sit somewhat apart. Stream Processing & Business Systems

Services encourage us to share responsibilities. To rely on others. To collaborate and evolve a businesss view of the world over time. A world that is inherently decentralised. Inherently spread across many interconnected systems. Spreading many disparate subsets of our businesss state.

Yet our approach rarely reflects this. We tend to think in centralised ways, we accumulate data. We protect it. We control it. We hoard it.But it does not need to be this way.

The MonolithCan we do reuse, encapsulation?

The MonolithWhat happens when we grow?

Companies are inevitably a collection of applicationsThey must work together to some degree

Inverse Conway Maneuver

The 'Inverse Conway Maneuver' recommends evolving your team and organizational structure to promote your desired architecture. Org StructureSoftware Architecture

Eschew shared, mutable state

Service based approaches separate state

But state must inevitably be shared between services

Use a toolkit that embraces decentralisation

My name is Ben Stopford. I work as an engineer on Apache Kafka. In a previous life I did the most centralised thing you could possibly do. I built a single, central database for a large financial institution. This talk is about exploring the exact opposite approach to this problem. Its about thinking of the world in terms of the evolution and provisioning of state across many systems. Its about embracing decentralisation. Its about thinking in streams.

Some simple patterns of distributed systems

Request / Response

When do we need Request Response?Looking things up

What is in my shopping basket?

Event DrivenAsync / Fire and Forget / Brokered

Businesses are often modeled as a sequence events.

Add to cart

Buy

Make Payment

UpdateStock

EmailConf

PackageItem

Dispatch

Fraud

When do we need Event Driven?

SOA / Microservices

Message Broker

Event BasedRequest/Response

HybridsEvent- BasedRequest/Response

Request ResponseBuyItemsOrderServiceStock ServiceFul-filmentServiceFraudDet-ection

MmmNew IPadREST etcUIService

Event DrivenBuyItemsOrderServiceStock ServiceFul-filmentServiceFraudDet-ectionMessage Broker

MmmNew IPadUIService

HybridBuyItemsOrderServiceStock ServiceFul-filmentServiceFraudDet-ectionMessage Broker

MmmNew IPadREST

UIService

As software engineers we are inevitably affected by the tools we surround ourselves with.

Languages, frameworks, even processes all act to shape the software we build.

GUIUIServiceOrdersServiceReturnsServiceFulfilmentServicePaymentServiceStockServiceThe tools we choose have a big effect on our architecture

GUIUIServiceOrdersServiceReturnsServiceFulfilmentServicePaymentServiceStockServiceEvent DrivenRequest Response

Kafka is well suited to Event Driven Architectures

It will lead you down that path

The Tool Set

a Distributed Log

What is a Distributed Log?

Shard on the way in

ProducingServicesKafkaConsumingServices

Each shard is a queue

ProducingServicesKafkaConsumingServices

Consumers share loadProducingServicesKafkaConsumingServices

Reduces to a globally ordered queue

Load Balanced Services

Fault Tolerant Services

Build Always On ServicesRely on Fault Tolerant Broker

Services can Rewind & Replay the logRewind & Replay

Compacted Log(retains only latest version)

Version 3Version 2Version 1Version 2Version 1Version 5Version 4Version 3Version 2Version 1

A database engine for data-in-flight

Max(price)From orderswhere ccy=GBPover 1 day windowemitting every secondContinuously Running Queries

What is stream processing engine?DataIndexQueryEngineQueryEnginevsDatabaseFinite sourceStream ProcessorInfinite source

WindowingFor unordered or unpredictable streamsSlidingFixed(tumbling)

Features: similar to database query engineJoinFilterAggr-egateViewWindow

Stateful Stream ProcessingstreamCompactedstreamJoinStream DataStream-TabularDataWindowedStream

Locally Cached Table(disk resident)KafkaKafka Streams

Useful for EnrichmentstreamCompactedstreamJoinOrders

CustomersKafkaKafka Streams

Scales Out

Embeddable

Orders ServiceKafka

Orders Service

Orders Service

Kafka ConnectView Replication

Sometimes you need to physically move data

Replicate it, so both copies are identical

Iterate via regeneration

Kafka Connect

KafkaConnectKafkaConnectKafka

So

Service BackboneScalable, Fault Tolerant, Concurrent, Strongly Ordered, Stateful

Embeddable tool for data manipulation

Adapt data streams(transformation, stream maintenance, analytic function)

Replicate Data Sources Exactly

Create Regenerable, Streaming Views

How do we actually do this?10 (opinionated) principals for Streaming Services

1. Dont use Kafka for shopping carts!(OK, you can, but use sparingly)

ShoppingCartUIServiceBroker/durability/broadcast add little to request response

Do use Kafka for your core business processing.

Think business events. An order was created, a payment was received, a trade was booked etc. Pub/Sub.

GUIUIServiceOrdersServiceReturnsServiceFulfilmentServicePaymentServiceStockService

2. Pick Topics with Business Significance

OrdersPaymentsReturnsInvoices

Give your messages meaningful IDs and version themOrdersService1-Order-1234-v2Should relate tothe real worldShould be Versioned (if mutable)

Include the servicename

Note the key used for sharding in Kafka may not be this key

3. Decouple publishers from subscribers

GUIUIServiceOrdersServiceReturnsServiceFulfilmentServicePaymentServiceStockService

Add Request/Response only where needed

GUIUIServiceOrdersServiceReturnsServiceFulfilmentServicePaymentServiceStockService

REST

4. Use the log to regenerate stateAvoid journaling incoming events

Event Source side effectsUse offsets to tie these back to the streamStore in:KafkaKstreams state storeOther DB

5. Apply the Single Writer Principal

Change at source (by calling that service)Let the change propagate backKeep local copies read only.

GUIUIServiceOrdersServiceReturnsServiceFulfilmentServicePaymentServiceStockService

(1) Change Orders at source

(2) Let the change propagate through

6. Leverage keeping datasets inside the broker

customercountrysupplierex-rates

Leverage keeping only the latest version (table view)

Version 3Version 2Version 1Version 2Version 1Version 5Version 4Version 3Version 2Version 1

Join & Process on the flystreamCompactedstreamJoinOrders

CustomersKafkaKafka Streams

7. Prefer stream processing over maintaining historic views

UIServiceOrdersServiceReturnsServiceFulfilmentServicePaymentServiceStockService

Orders

Historic copiesdiverge

Join & Process on the flystreamCompactedstreamJoinOrders

CustomersKafkaKafka Streams

8. Sometimes you need historic views. => Replicate & Keep Read Only

Replicate

KafkaConnectKafkaConnectKafkaKeep me Read Only

Iterate

Polyglotic Persistence

9. Use Schemas(especially if data is retained)

Schemaless data doesnt age well

Messages

Confluent Schema Registry can help

OrdersService

SchemaRegistry

EmailService

Avro

10. Consider Stream Management Services

StreamManagementKafkaRetaining data => Admin tasksSimilar to the role of a DBA Data MigrationRepartitioningLatest/versionedEnvironment ManagementCQRS

KStreams is a good toolset for this

Stream ManagementKStreamsLatestStreamVersionedStream

So1. Dont use Kafka for shopping carts!2. Pick Topics with Business Significance3. Decouple publishers from subscribers4. Use the log to regenerate state5. Apply the Single Writer Principal6. Leverage keeping datasets inside the broker7. Prefer stream processing over maintaining historic views8. Sometimes you need historic views. => Replicate Read Only 9. Use Schemas10. Consider Stream Management Services

Microservices push us away from shared, mutable state

But state needs to be communicated

In an increasingly data-heavy world we need tools to do this efficiently

and in real time.

We need a data-centric toolset to do thisAll Your DataImmutableStreams

Event Driven ServicesStreamManagementServicesLegacy CDCView ReplicationPolyglotic PersistencetablesstreamsStreamingServices

Keep it simple, Keep it moving

@benstopfordThanks!