dive into streams with brooklin · issues with kmm: didn’t scale well, difficult to operate and...

67
Dive into Streams with Brooklin Celia Kung LinkedIn

Upload: others

Post on 22-Aug-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Dive into Streams with Brooklin

Celia KungLinkedIn

Page 2: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Outline

Background

Scenarios

Application Use Cases

Architecture

Current and Future

Page 3: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Background

Page 4: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Nearline Applications

• Require near real-time response

• Thousands of applications at LinkedIn

○ E.g. Live search indices, Notifications

Page 5: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Nearline Applications

• Require continuous, low-latency access to data

○ Data could be spread across multiple database systems

• Need an easy way to move data to applications

○ App devs should focus on event processing and not on data access

Page 6: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Heterogeneous Data Systems

Espresso (LinkedIn’s

document store)

MicrosoftEventHubs

Page 7: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Building the Right Infrastructure

• Build separate, specialized solutions to stream data from and to each different system?

○ Slows down development

○ Hard to manage!

MicrosoftEventHubs

...

Streaming System C

Streaming System D

Streaming System A

Streaming System B

...

Nearline Applications

Page 8: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Need a centralized, managed, and extensible service to continuously deliver data in near real-time

Page 9: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Brooklin

Page 10: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Brooklin

• Streams are dynamically provisioned and individually configured

• Extensible: Plug-in support for additional sources/destinations

• Streaming data pipeline service

• Propagates data from many source types to many destination types

• Multitenant: Can run several thousand streams simultaneously

Page 11: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Kinesis

EventHubs

Kafka

Des

tinat

ions

App

licat

ions

Kinesis

EventHubs

Kafka

Databases

Messaging SystemsSo

urce

sEspresso

Pluggable Sources & Destinations

Page 12: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Scenarios

Page 13: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Change Data Capture

Scenario 1:

Page 14: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Capturing Live Updates

1. Member updates her profile to reflect her recent job change

Page 15: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Capturing Live Updates

2. LinkedIn wants to inform her colleagues of this change

Page 16: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Capturing Live Updates

Member DB

Updates

News Feed Service

query

Page 17: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Capturing Live Updates

Member DB

Updates

News Feed Service

Search Indices Service

query

query

Page 18: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Capturing Live Updates

Member DB

UpdatesNews Feed

Service

queryNotifications

Service

Standardization Service

...

query

query

query

Search Indices Service

Page 19: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Capturing Live Updates

Member DB

UpdatesNews Feed

Service

queryNotifications

Service

Standardization Service

...

query

query

query

Search Indices Service

Page 20: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Capturing Live Updates

Member DB

UpdatesNews Feed

Service

queryNotifications

Service

Standardization Service

...

query

query

query

Search Indices Service

Page 21: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

• Isolation: Applications are decoupled from the sources and don’t compete for resources with online queries

• Applications can be at different points in change timelines

• Brooklin can stream database updates to a change stream

• Data processing applications consume from change streams

Change Data Capture (CDC)

Page 22: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Change Data Capture (CDC)

Messaging System

News Feed Service

Notifications Service

Standardization Service

Search Indices Service

Member DB

Updates

Page 23: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Streaming Bridge

Scenario 2:

Page 24: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

● Across…

○ cloud services

○ clusters

○ data centers

Stream Data from X to Y

Page 25: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Streaming Bridge

• Data pipe to move data between different environments

• Enforce policy: Encryption, Obfuscation, Data formats

Page 26: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

● Aggregating data from all data centers into a centralized place

● Moving data between LinkedIn and external cloud services (e.g. Azure)

● Brooklin has replaced Kafka MirrorMaker (KMM) at LinkedIn

○ Issues with KMM: didn’t scale well, difficult to operate and manage, poor

failure isolation

Mirroring Kafka Data

Page 27: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Use Brooklin to Mirror Kafka Data

DestinationsSources

Messaging systems

MicrosoftEventHubs

Messaging systems

MicrosoftEventHubs

Databases Databases

Page 28: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Kafka MirrorMaker

Topology

Datacenter B

aggregate tracking

tracking

Datacenter A

aggregate tracking

tracking

KMM

aggregate metrics

metrics

aggregate metrics

metrics

Datacenter C

aggregate tracking

tracking

aggregate metrics

metrics

...

KMM KMM

KMM KMM KMM

KMM KMM KMM KMM KMM KMM

KMM KMM KMM KMM KMM KMM

... ... ...

Page 29: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Brooklin Kafka Mirroring Topology

Datacenter A

aggregate tracking

tracking

Brooklin

metrics

aggregate metrics

Datacenter B

aggregate tracking

tracking metrics

aggregate metrics

Datacenter C

aggregate tracking

tracking metrics

aggregate metrics

...Brooklin Brooklin

Page 30: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Brooklin Kafka Mirroring

● Optimized for stability and operability

● Manually pause and resume mirroring at every level

○ Entire pipeline, topic, topic-partition

● Can auto-pause partitions facing mirroring issues

○ Auto-resumes the partitions after a configurable duration

● Flow of messages from other partitions is unaffected

Page 31: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Application Use Cases

Page 32: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Security

Cache

Application Use Cases

Page 33: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Security

Cache

Search Indices

Application Use Cases

Page 34: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Security

Cache

Search Indices

ETL or Data warehouse

Application Use Cases

Page 35: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Security

Cache

Search Indices

ETL or Data warehouse

Materialized Views or Replication

Application Use Cases

Page 36: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Security

Cache

Search Indices

ETL or Data warehouse

Materialized Views or Replication

Repartitioning

Application Use Cases

Page 37: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Adjunct Data

Application Use Cases

Page 38: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Bridge

Adjunct Data

Application Use Cases

Page 39: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Bridge

Serde, Encryption, Policy

Adjunct Data

Application Use Cases

Page 40: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Bridge

Serde, Encryption, Policy

Standardization, Notifications …

Adjunct Data

Application Use Cases

Page 41: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Architecture

Page 42: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Stream updates made to Member Profile

Example:

Page 43: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Capturing Live Updates

Member DB

Updates

News Feed Service

Page 44: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

● Scenario: Stream Espresso Member Profile updates into Kafka

○ Source Database: Espresso (Member DB, Profile table)

○ Destination: Kafka

○ Application: News Feed service

Example

Page 45: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

• Describes the data pipeline

• Mapping between source and destination

• Holds the configuration for the pipeline

Datastream

Name: MemberProfileChangeStream

Source: MemberDB/ProfileTableType: EspressoPartitions: 8

Destination: ProfileTopicType: KafkaPartitions: 8

Metadata: Application: News Feed service Owner: [email protected]

Page 46: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

1. Client makes REST call to create datastream

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DB

create POST /datastreamNews Feed

service

Page 47: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

2. Create request goes to any Brooklin instance

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 48: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

3. Datastream is written to ZooKeeper

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 49: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

4. Leader coordinator is notified of new datastream

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 50: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

5. Leader coordinator calculates work distribution

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 51: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

6. Leader coordinator writes the assignments to ZK

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 52: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

7. ZooKeeper is used to communicate the assignments

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 53: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

8. Coordinators hand task assignments to consumers

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 54: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

9. Consumers start streaming data from the source

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 55: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

10. Consumers propagate data to producers

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 56: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

11. Producers write data to the destination

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 57: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

12. App consumes from Kafka

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

Page 58: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

13. Destinations can be shared by apps

Brooklin Client

Load Balancer

ZooKeeper

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator (Leader)

Kafka ProducerBrooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Brooklin Instance

Datastream Management Service (DMS) Espresso Consumer

Coordinator

Kafka Producer

Member DBNews Feed

service

News Feed service

News Feed service

Page 59: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Brooklin Architecture

ZooKeeper

Brooklin Instance

Brooklin Instance

Coordinator

Datastream Management

Service (DMS)

Cons

umer

A

Cons

umer

B

Prod

ucer

X

Prod

ucer

Y

Prod

ucer

Z

Brooklin Instance

Datastream Management

Service (DMS)

Cons

umer

A

Prod

ucer

X

Cons

umer

B

Prod

ucer

Y

Prod

ucer

Z

Coordinator (Leader)

Brooklin Instance

Coordinator

Datastream Management

Service (DMS)

Cons

umer

A

Cons

umer

B

Prod

ucer

X

Prod

ucer

Y

Prod

ucer

Z

Brooklin Instance

Page 60: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Current & Future

Page 61: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Current

• Consumers: ○ Espresso○ Oracle○ Kafka○ EventHubs○ Kinesis

• Producers: ○ Kafka○ EventHubs

• APIs are standardized to support additional sources and destinations

• Multitenant: Can power thousands of datastreams across several source and destination types

• Guarantees: At-least-once delivery, order is maintained at partition level

• Kafka mirroring improvements: finer control of pipelines (pause/auto-pause partitions), improved latency with flushless-produce mode

Sources & Destinations Features

Page 62: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

38Bmessages/day

2K+datastreams

1K+unique sources

200+applications

Brooklin in Production

Brooklin streams with Espresso, Oracle, or EventHubs as the source

Page 63: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

2T+messages/day

200+datastreams

10K+topics

Brooklin in Production

Brooklin streams mirroring Kafka data

Page 64: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Future

• Consumers: ○ MySQL○ Cosmos DB○ Azure SQL

• Producers: ○ Azure Blob storage○ Kinesis○ Cosmos DB○ Azure SQL○ Couchbase

Sources & Destinations Open Source

• Plan to open source Brooklin in 2019 (soon!)

Optimizations

• Brooklin auto-scaling

• Passthrough compression

• Read optimizations: Read once, Write multiple

Page 65: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Thank you

Page 66: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka

Questions?

Celia Kung

: [email protected] : /in/celiakkung/

Page 67: Dive into Streams with Brooklin · Issues with KMM: didn’t scale well, difficult to operate and manage, poor failure isolation Mirroring Kafka Data. Use Brooklin to Mirror Kafka