Challenges of a Multi-Tenant Kafka Service

Post on 16-Mar-2018

Category:

Data & Analytics

Thomas Alex

Principal Program Manager

Microsoft

Introduction

Goals

Solution

Tenant model

Deployment architecture

Open Discussion

Siphon: Enterprise Data Bus

Near real-time

Compliant

No data dead-ends

Hyper scale

Reliable

Network effects

8 million events per second peak ingress

800 TB (10 GB per sec) ingress per day

1,800 production Kafka brokers

450 topics

15 sec 99th percentile latency
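
The headline numbers can be sanity-checked with a little arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes) and spreads the daily ingress evenly over the day and over all brokers; it ignores replication, which the slide does not quantify. The variable names are illustrative only.

```python
# Back-of-the-envelope check of the slide's figures.
TB = 10**12   # assumption: decimal terabytes; the slide does not say TB vs TiB
GB = 10**9
MB = 10**6

ingress_per_day_bytes = 800 * TB
seconds_per_day = 24 * 60 * 60
brokers = 1_800

# Average rate if the daily volume arrived evenly over 24 hours.
avg_bytes_per_sec = ingress_per_day_bytes / seconds_per_day
avg_gb_per_sec = avg_bytes_per_sec / GB

# Average ingress share per broker, before any replication traffic.
avg_mb_per_sec_per_broker = avg_bytes_per_sec / brokers / MB

print(f"average ingress: {avg_gb_per_sec:.2f} GB/s")                # 9.26 GB/s
print(f"average per broker: {avg_mb_per_sec_per_broker:.1f} MB/s")  # 5.1 MB/s
```

The average of roughly 9.3 GB/s is consistent with the slide's rounded "10 GB per sec".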

[Architecture diagram: SDK, Collector, Siphon, Connector, API, Management UI, Metadata DB]

Customer: Major Car Manufacturer

Scenario: Connected Car Telematics

Data producers

Millions of cars

Routed via cloud gateway to Siphon endpoint

Data consumers

Spark streaming applications

Siphon compute forwards data to blob storage

UI

Backend

Source systems

Destination systems

Data producers

• Send data reliably

Customers

• Manage capacity

• Manage tenant/topic/subscription

• Pay for the service

Data consumers

• Consume data in NRT

Service owners

• Manage service with SLA

Managed service

Availability

Reliability

Isolation

Low cost

Self-service

Regulatory Compliance

Data sharing

Multiple instances, single tenant per instance

[Diagram: Customers A, B and C, each on a dedicated instance]

Single instance, multiple tenants per instance

[Diagram: Customers A, B and C sharing one instance]

Multiple instances, multiple tenants per instance

[Diagram: Customers A, B and C spread across two shared instances]

Siphon Deployment Unit

• Ingress service (Collector)

• Kafka cluster

• Connector (HLC)

• Monitoring

Management Service

• Metadata

• Self-serve API

• Self-serve UI

[Diagram: Collector, HLC, API, Metadata DB]

Tenant

Principals (administrators, users)

Resources

Endpoint

Topics

Subscriptions

Quota

Storage capacity

Throughput

Threshold for auto-approval

Default limits

Topic capacity

Retention

Partitions
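
The tenant model above maps naturally onto a small set of record types. The following is a minimal sketch, not Siphon's actual schema: every class name, field name and default value is hypothetical, and the auto-approval rule (requests at or below the threshold are granted without review) is an assumed reading of "Threshold for auto-approval". It needs Python 3.9 or later.

```python
from dataclasses import dataclass, field


@dataclass
class Quota:
    storage_gb: int                # storage capacity
    throughput_mb_per_sec: int     # currently granted throughput
    auto_approve_mb_per_sec: int   # threshold for auto-approval (assumed semantics)


@dataclass
class Topic:
    name: str
    capacity_gb: int = 100         # hypothetical default limit
    retention_hours: int = 24      # hypothetical default limit
    partitions: int = 8            # hypothetical default limit


@dataclass
class Tenant:
    name: str
    administrators: set[str]
    users: set[str]
    endpoint: str
    quota: Quota
    topics: dict[str, Topic] = field(default_factory=dict)
    # Maps a subscription name to the topic it reads from.
    subscriptions: dict[str, str] = field(default_factory=dict)


def review_throughput_request(tenant: Tenant, requested_mb_per_sec: int) -> str:
    """Grant requests at or below the tenant's threshold; escalate the rest."""
    if requested_mb_per_sec <= tenant.quota.auto_approve_mb_per_sec:
        tenant.quota.throughput_mb_per_sec = requested_mb_per_sec
        return "auto-approved"
    return "needs-review"


tenant = Tenant(
    name="telematics",
    administrators={"alice"},
    users={"bob"},
    endpoint="telematics.siphon.example",  # placeholder hostname
    quota=Quota(storage_gb=500, throughput_mb_per_sec=10, auto_approve_mb_per_sec=50),
)
tenant.topics["vehicle-events"] = Topic("vehicle-events")
tenant.subscriptions["spark-streaming"] = "vehicle-events"

print(review_throughput_request(tenant, 40))   # auto-approved
print(review_throughput_request(tenant, 200))  # needs-review
```

Keeping the threshold on the tenant's quota lets small capacity changes stay self-service while larger ones go to the service owners.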

Tenant 1, Traffic Manager 1

Tenant 2, Traffic Manager 2

Tenant 3, Traffic Manager 3

Siphon DU 1: Collector, HLC

Siphon DU 2: Collector, HLC

Siphon DU 3: Collector, HLC
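
The deployment diagram pairs each tenant with its own Traffic Manager in front of the Siphon deployment units (DUs). How traffic is actually distributed is not stated in the slides; the sketch below assumes simple priority failover, where a tenant's traffic goes to the first healthy DU in its list. All class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DeploymentUnit:
    name: str
    healthy: bool = True


@dataclass
class TrafficManager:
    tenant: str
    units: list[DeploymentUnit]  # ordered by priority (assumed policy)

    def route(self) -> DeploymentUnit:
        """Return the highest-priority healthy DU for this tenant."""
        for unit in self.units:
            if unit.healthy:
                return unit
        raise RuntimeError(f"no healthy deployment unit for tenant {self.tenant}")


du1, du2, du3 = (DeploymentUnit(f"Siphon DU {i}") for i in (1, 2, 3))

# Each tenant gets its own manager and its own priority order over shared DUs.
tm1 = TrafficManager("Tenant 1", [du1, du2])
tm2 = TrafficManager("Tenant 2", [du2, du3])

print(tm1.route().name)  # Siphon DU 1
du1.healthy = False      # simulate an outage of DU 1
print(tm1.route().name)  # Siphon DU 2
print(tm2.route().name)  # Siphon DU 2
```

A per-tenant routing layer like this lets a tenant be moved or failed over between DUs without changing the endpoint that producers use.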

Scalability

Underlying infra is IaaS

Isolation

Availability and Latency SLA

Regulatory compliance guarantees

Enterprise cloud depends on data security & privacy

Regulatory framework for certifications, e.g. SOC, FedRAMP, HIPAA

Data sharing

Manageability

Provisioning

Monitoring

Maintainability

Comments / Feedback

https://www.linkedin.com/in/tomalex/

tomalex@microsoft.com

Compliance regions: North America, South America, Europe, Asia Pacific

Go Local: Australia, Canada, India, Japan, United Kingdom

Sovereign: Germany, China, Government

Self-service: Tenant creation & management; topic creation & management; topic health & data preview; subscription creation & management

AuthN: Azure AD based for self-service API & UI; cert based for data producers and consumers

AuthZ: Siphon Metadata used to authorize provisioning & management (tenants, topics, etc.); Kafka ACLs for topic-level access control

Throttling: EventServer throttles based on quota limit

Monitoring: Operational metrics in a single system (MDM) for monitoring and alerting

Data quality: Audit Trail system for e2e latency and completeness monitoring
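
Quota-based throttling of the kind listed above is commonly built on a token bucket. The slides do not say how EventServer implements it, so the following is only an illustrative sketch: `TokenBucket`, its parameters and the injectable clock are all assumptions made for the example.

```python
import time


class TokenBucket:
    """Admit events at up to rate_per_sec, with bursts of up to `burst` events."""

    def __init__(self, rate_per_sec: float, burst: float, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = burst
        self.tokens = burst       # start with a full bucket
        self.clock = clock        # injectable so the example is deterministic
        self.last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False              # over quota: the caller should reject or delay


# One bucket per tenant, keyed by the tenant's quota (values are made up).
fake_now = [0.0]
bucket = TokenBucket(rate_per_sec=10, burst=10, clock=lambda: fake_now[0])

accepted = sum(bucket.allow() for _ in range(15))
print(accepted)          # 10: the burst is used up, the other 5 are throttled

fake_now[0] = 0.5        # half a second later, 5 tokens have been refilled
accepted_later = sum(bucket.allow() for _ in range(15))
print(accepted_later)    # 5
```

Passing a byte count as `cost` would turn the same bucket into a bandwidth limiter rather than an event-rate limiter.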
