CREATING A CENTRAL DATA BACKBONE AT PAYPAL: COUCHBASE TO KAFKA TO HADOOP AND BACK
Shibi Sudhakaran | PayPal
Justin Michaels | Couchbase

Post on 26-Jul-2015


©2015 Couchbase Inc.

Agenda

• Define Problem Domain: Justin Michaels | Solution Architect, Couchbase

• Use case at PayPal & Demo: Shibi Sudhakaran | Engineer, PayPal

• Q&A

Couchbase at PayPal

Footprint Overview
• Seven use cases (more going live at a later date)
• 10 to 20 nodes per cluster
• Three data center locations per use case

Global Cookie Service
• Three clusters (two handle traffic, one for DR)
• Bi-directional replication
• Billions of documents
• TB of data (maximum of 10 over time)

Challenge
• Data analytics

How do you analyze Couchbase data?

• Couchbase Views
• Sqoop
• ElasticSearch
• Stream Data

Couchbase at PayPal

Couchbase Solution
• Couchbase Server deployed to capture and serve global cookies
• Integrates with Hadoop to pass data for additional offline analytics via Kafka

Results
• Consistent low latency: SLA 10 ms application, SLA 1 ms Couchbase
• High availability enabled by distributed cache and data center replication
• Kafka integration for analytics within Hadoop cluster

Couchbase < 3.0.3

[Diagram: Couchbase Cluster (node1 to node8) with Query Service, View (Incremental Map Reduce), and Data Service]

Homogeneous Scaling
– Each node gets a slice of the workload
– Simple to do…

But...
– Workloads compete and interfere with each other
– Can't fine-tune each workload

- Core data operations are partitionable, so they get better with wider fan-out
- Indexing and query are not always partitionable, so they get worse with wider fan-out
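
The fan-out contrast above can be made concrete with a toy load model. This is only a sketch under stated assumptions: each key lookup is served by exactly one node, while a view-style query has to be answered by every node (scatter-gather). The names `node_for_key` and `per_node_load` are illustrative, not Couchbase APIs, and the placement function is not Couchbase's real vBucket map.

```python
import zlib


def node_for_key(key: str, num_nodes: int) -> int:
    """Toy placement (assumption): hash the key and pick exactly one node."""
    return zlib.crc32(key.encode("utf-8")) % num_nodes


def per_node_load(lookups: int, queries: int, num_nodes: int) -> float:
    """Operations each node handles in the toy model.

    Key lookups are partitionable, so they are spread across the nodes.
    Scatter-gather queries are not, so every node sees every query.
    """
    return lookups / num_nodes + queries


if __name__ == "__main__":
    for nodes in (8, 16):
        print(nodes, "nodes:", per_node_load(8000, 100, nodes), "ops per node")
```

In this model, doubling the node count halves the lookup share per node but leaves the query share unchanged, which is the motivation the slides give for scaling services independently.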

Couchbase 4.0

[Diagram: Couchbase Cluster (node1 to node8) with Index Service (Global Secondary Indexes), Query Service, and Data Service (Views and Geo Views)]

Multi-Dimensional Scalability
• Independent services for Query, Index and Data
• Independent scalability for capacity per service
• Data access provided by distributed cache

Couchbase 4.0

[Diagram: Couchbase Cluster (node1, node8, node9) with Data Service, Index Service, and Query Service]

• Heavier indexing (index more fields): add compute to index service nodes
• Increased query load: linearly scale query service
• More data: linearly scale data service

Innovative leader in Payment
• 165 Million active customers
• > 100 payment currencies
• 203 available markets
• 57 countries

2014 was a year of significant growth.
• $235 Billion Net Total Payment Volume, up 26% YoY
• $8 Billion Revenue, up 19% YoY
• 19 Million New Active Digital Wallets
• $168 Billion Merchant Services Payment Volume, up 34% YoY

Cookie
• Size limitations
• Device centric
• Legacy applications
• Consumers & merchants
• Overuse
• Plain/encrypted/session/persistent

© 2015 PayPal Inc. All rights reserved. Confidential and proprietary.

Cluster aware Cookie

The Fix

Requirements for Database

Data volume / Scalability
• Online system; >1B documents
• 4-10k size; 5-10TB total storage
• Linearly scalable

Availability
• Multi data center, DR
• Availability requirement of 99.99%

Data Structure
• Flexible & schemaless; document based

Performance
• 50% read / 50% write
• Low latency < 5-10 msec
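
As a quick sanity check on these requirements, the arithmetic behind the storage and availability figures can be sketched as follows. The sketch assumes that "4-10k" means 4 to 10 KB per document, takes 1 billion documents as the lower bound of ">1B", and ignores replicas and metadata overhead; the function names are illustrative.

```python
KB = 1024
TB = 1024 ** 4
MINUTES_PER_YEAR = 365 * 24 * 60


def total_storage_tb(num_docs: int, doc_size_kb: float) -> float:
    """Raw storage in TB for num_docs documents of doc_size_kb each (no replicas)."""
    return num_docs * doc_size_kb * KB / TB


def downtime_minutes_per_year(availability: float) -> float:
    """Yearly downtime budget implied by an availability target such as 0.9999."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    docs = 1_000_000_000  # assumed lower bound of ">1B documents"
    print(f"4 KB docs:  {total_storage_tb(docs, 4):.2f} TB")
    print(f"10 KB docs: {total_storage_tb(docs, 10):.2f} TB")
    print(f"99.99% availability: {downtime_minutes_per_year(0.9999):.2f} min/year")
```

Under these assumptions, one billion documents occupy roughly 3.7 to 9.3 TB, which is in line with the stated 5-10TB once the count exceeds one billion, and 99.99% availability leaves a budget of about 52.6 minutes of downtime per year.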

Couchbase Core Principles

Functional View

[Diagram: Customers; Front Tier; Mid Tier; Data Tier; Applications (C++, Node, Java) with Cookie Libraries; Cookie Application; Couchbase Client; CB Kafka Adapter]

Deployment Model

[Diagram: three CookieApp instances with Write and Read paths; XDCR links labeled Bi-directional and Uni-directional; clusters labeled Active, Active, and Passive]

Cluster Overview

Analyze Cookie data

Couchbase TAP

• Snapshot entire database

• Export future mutations

• TAP observes data changes in the memcached server

• Kafka: a high-throughput distributed messaging system.

Couchbase Kafka Adapter
Based on Couchbase TAP & Kafka Producer

Kafka Producer
• Fast
• Scalable
• Durable
• Distributed

https://github.com/paypal/couchbasekafka

Stream data out of database
https://github.com/paypal/couchbasekafka

[Diagram, steps [1] to [7]: TAP Stream; Couchbase Kafka Adapter {TAP Client + Kafka Producer}; Camus, MR Jobs]
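
The shape of the adapter (a TAP client feeding a Kafka producer) can be sketched with in-memory stand-ins. This is not the code from the couchbasekafka repository: `tap_stream`, `InMemoryProducer`, and `run_adapter` are hypothetical names, and a real deployment would replace them with an actual TAP client and Kafka producer. The sketch only illustrates the two TAP behaviours listed earlier (snapshot the existing data, then export future mutations) and the forwarding loop.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple

Mutation = Dict[str, object]


def tap_stream(snapshot: Iterable[Mutation],
               future_mutations: Iterable[Mutation]) -> Iterator[Mutation]:
    """Stand-in for a TAP stream: existing documents first, then later changes."""
    yield from snapshot
    yield from future_mutations


class InMemoryProducer:
    """Stand-in for a Kafka producer; records messages instead of sending them."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, bytes, bytes]] = []

    def send(self, topic: str, key: bytes, value: bytes) -> None:
        self.sent.append((topic, key, value))


def run_adapter(stream: Iterable[Mutation], producer: InMemoryProducer,
                topic: str) -> int:
    """Forward every mutation from the stream to the producer, in arrival order."""
    forwarded = 0
    for mutation in stream:
        key = str(mutation["key"]).encode("utf-8")
        value = json.dumps(mutation, sort_keys=True).encode("utf-8")
        producer.send(topic, key, value)
        forwarded += 1
    return forwarded


if __name__ == "__main__":
    existing = [{"key": "cookie::1", "value": "a"}]
    changes = [{"key": "cookie::2", "value": "b"}, {"key": "cookie::1", "value": "c"}]
    producer = InMemoryProducer()
    count = run_adapter(tap_stream(existing, changes), producer, "cookies")
    print(count, "messages forwarded")
```

Downstream of this loop, the slides place Camus and MapReduce jobs, which would read the topic into Hadoop; that stage is outside the sketch.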

Data Partitions

Map Couchbase Partitions to Kafka Partitions for Total Ordering
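
One way to picture this mapping is a fixed function from Couchbase partition (vBucket) to Kafka partition, so that every mutation from a given vBucket lands in the same Kafka partition and keeps its relative order there. The sketch below is an assumption-laden illustration, not the adapter's actual code: it assumes 1024 vBuckets and a simple modulo mapping, and `kafka_partition_for_vbucket` and `route` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

NUM_VBUCKETS = 1024  # assumed vBucket count

# (vbucket_id, sequence_number, document_key)
Mutation = Tuple[int, int, str]


def kafka_partition_for_vbucket(vbucket_id: int, num_kafka_partitions: int) -> int:
    """Map a vBucket to one Kafka partition (modulo mapping is an assumption)."""
    if not 0 <= vbucket_id < NUM_VBUCKETS:
        raise ValueError(f"vbucket_id out of range: {vbucket_id}")
    return vbucket_id % num_kafka_partitions


def route(mutations: Iterable[Mutation],
          num_kafka_partitions: int) -> Dict[int, List[Mutation]]:
    """Append each mutation to its Kafka partition, preserving arrival order."""
    partitions: Dict[int, List[Mutation]] = defaultdict(list)
    for mutation in mutations:
        target = kafka_partition_for_vbucket(mutation[0], num_kafka_partitions)
        partitions[target].append(mutation)
    return dict(partitions)


if __name__ == "__main__":
    stream = [(5, 1, "a"), (6, 1, "b"), (5, 2, "a"), (13, 1, "c"), (5, 3, "d")]
    for partition, items in sorted(route(stream, 8).items()):
        print(partition, items)
```

Because a Kafka partition is an ordered log, a consumer reading one partition sees each vBucket's mutations in the order they were produced; ordering across different Kafka partitions is not implied by this sketch.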

Demo … We will supply a link separately

Monitoring

THANK YOU

Shibi Sudhakaran
Twitter: @s007
https://linkedin.com/in/shibisudhakaran

Justin Michaels
Twitter: @justindmichaels
https://linkedin.com/in/justindmichaels