CREATING A CENTRAL DATA BACKBONE AT PAYPAL: COUCHBASE TO KAFKA TO HADOOP AND BACK
Shibi Sudhakaran | PayPal
Justin Michaels | Couchbase
©2015 Couchbase Inc.
Agenda
• Define Problem Domain: Justin Michaels | Solution Architect, Couchbase
• Use case at PayPal & Demo: Shibi Sudhakaran | Engineer, PayPal
• Q&A
Couchbase at PayPal
Footprint Overview
• Seven use cases (more going live at a later date)
• 10 to 20 nodes per cluster
• Three data center locations per use case
Global Cookie Service
• Three clusters (two handle traffic, one for DR)
• Bi-directional replication
• Billions of documents
• Terabytes of data (up to 10 TB over time)
Challenge
• Data analytics
How do you analyze Couchbase data?
• Couchbase Views
• Sqoop
• ElasticSearch
• Stream Data
Couchbase at PayPal
Couchbase Solution
• Couchbase Server deployed to capture and serve global cookies
• Integrates with Hadoop to pass data for additional offline analytics via Kafka
Results
• Consistent low latency (SLA: 10 ms application, 1 ms Couchbase)
• High availability enabled by distributed cache and data center replication
• Kafka integration for analytics within Hadoop cluster
Couchbase < 3.0.3
[Diagram: Couchbase Cluster, node1 through node8; labels: Query Service, View (Incremental Map Reduce), Data Service]
Homogeneous Scaling
– Each node gets a slice of the workload
– Simple to do…
But...
– Workloads compete and interfere with each other
– Can't fine-tune each workload
– Core data operations are partitionable, so they do well with wider fan-out
– Indexing and query are not always partitionable, so they do worse with wider fan-out
Couchbase 4.0
[Diagram: Couchbase Cluster, node1 through node8; labels: Index Service (Global Secondary Indexes), Query Service, Data Service (Views and Geo Views)]
Multi-Dimensional Scalability
• Independent services for Query, Index and Data
• Independent scalability for capacity per service
• Data access provided by distributed cache
Couchbase 4.0
[Diagram: Couchbase Cluster with node1, node8, node9; labels: Data Service, Index Service, Query Service]
• Heavier indexing (index more fields): add compute to index service nodes
• Increased query load: linearly scale query service
• More data: linearly scale data service
Innovative leader in Payment
• 165 Million active customers
• > 100 payment currencies
• 203 available markets
• 57 countries
2014 was a year of significant growth.
• $235 Billion net total payment volume (up 26% YoY)
• $8 Billion revenue (up 19% YoY)
• 19 Million new active digital wallets
• $168 Billion merchant services payment volume (up 34% YoY)
Cookie
• Size limitations
• Device centric
• Legacy applications
• Consumers & merchants
• Overuse
• Plain/encrypted/session/persistent
© 2015 PayPal Inc. All rights reserved. Confidential and proprietary.
The Fix
• Cluster-aware cookie
Requirements for Database
Data volume / Scalability
• Online system; > 1B documents
• 4-10 KB document size; 5-10 TB total storage
• Linearly scalable
Availability
• Multi data center (DR)
• Availability requirement of 99.99%
Data Structure
• Flexible & schemaless; document based
Performance
• 50% read / 50% write
• Low latency: < 5-10 msec
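A quick back-of-envelope check can show how these requirements fit together. The sketch below uses only the figures from the slide; it assumes the "4-10k" document size means kilobytes and ignores replicas, metadata, and indexes, so the real footprint would be larger.

```python
# Back-of-envelope check of the stated database requirements.
DOCS = 1_000_000_000      # "> 1B documents" (lower bound from the slide)
DOC_SIZE_KB = (4, 10)     # "4-10k size", read here as kilobytes (assumption)
AVAILABILITY = 0.9999     # "99.99%"


def total_storage_tb(docs: int, size_kb: float) -> float:
    """Raw data size in decimal terabytes, ignoring replicas and metadata."""
    return docs * size_kb * 1_000 / 1e12


def downtime_minutes_per_year(availability: float) -> float:
    """Maximum downtime allowed in a 365-day year."""
    return (1 - availability) * 365 * 24 * 60


low, high = (total_storage_tb(DOCS, kb) for kb in DOC_SIZE_KB)
print(f"raw data: {low:.0f}-{high:.0f} TB")
print(f"allowed downtime: {downtime_minutes_per_year(AVAILABILITY):.1f} min/year")
```

One billion documents at 4 to 10 KB each comes to roughly 4 to 10 TB of raw data, which is in line with the 5-10 TB stated, and 99.99% availability leaves a bit under an hour of downtime per year.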
Couchbase Core Principles
Functional View
[Diagram labels: Customers; Front Tier; Mid Tier; Data Tier; Cookie Application; Applications (C++, Node, Java); Cookie Libraries; Couchbase Client; CB Kafka Adapter]
Deployment Model
[Diagram labels: three CookieApp instances (Write, Read); XDCR; Active (bi-directional); Passive (uni-directional)]
Couchbase Kafka Adapter
Based on Couchbase TAP & Kafka Producer
Couchbase TAP
• Snapshot entire database
• Export future mutations
• TAP observes data changes in the memcached server
Kafka Producer
• Kafka: a high-throughput distributed messaging system
• Fast
• Scalable
• Durable
• Distributed
https://github.com/paypal/couchbasekafka
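The adapter idea (a TAP client feeding a Kafka producer) can be illustrated with a minimal, self-contained sketch. This is not the code from the repository above: `Mutation`, `FakeTapSource`, `InMemoryProducer`, and `run_adapter` are hypothetical stand-ins invented for illustration, and no real Couchbase or Kafka client is used.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Mutation:
    """One document change. Hypothetical shape, not the TAP wire format."""
    vbucket: int
    key: str
    value: bytes


class FakeTapSource:
    """Stand-in for a TAP stream: existing data first, then later changes."""

    def __init__(self, existing: Iterable[Mutation], future: Iterable[Mutation]):
        self._existing = list(existing)
        self._future = list(future)

    def stream(self) -> Iterator[Mutation]:
        yield from self._existing  # snapshot of the entire database
        yield from self._future    # mutations made after the snapshot


class InMemoryProducer:
    """Stand-in for a Kafka producer; records messages per topic."""

    def __init__(self) -> None:
        self.sent: Dict[str, List[Tuple[str, bytes]]] = {}

    def send(self, topic: str, key: str, value: bytes) -> None:
        self.sent.setdefault(topic, []).append((key, value))


def run_adapter(source: FakeTapSource, producer: InMemoryProducer, topic: str) -> int:
    """Forward every mutation from the source to the producer; return the count."""
    forwarded = 0
    for mutation in source.stream():
        producer.send(topic, mutation.key, mutation.value)
        forwarded += 1
    return forwarded


source = FakeTapSource(
    existing=[Mutation(0, "cookie::1", b"a"), Mutation(1, "cookie::2", b"b")],
    future=[Mutation(0, "cookie::1", b"a2")],
)
producer = InMemoryProducer()
count = run_adapter(source, producer, topic="cookies")
print(count, producer.sent["cookies"])
```

The point of the sketch is the shape of the loop: the snapshot and the later mutations arrive through one stream, so the consumer downstream sees existing documents first and then every subsequent change.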
Stream data out of database
https://github.com/paypal/couchbasekafka
[Diagram labels: TAP Stream; Couchbase Kafka Adapter {TAP Client + Kafka Producer}; Camus, MR Jobs; numbered steps [1] through [7]]
Data Partitions
Map Couchbase Partitions to Kafka Partitions for Total Ordering
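Kafka preserves message order only within a single partition, so one way to keep ordering is to send every mutation from a given Couchbase partition (vBucket) to the same Kafka partition. The sketch below shows that idea under stated assumptions: 1024 vBuckets, a hypothetical 16-partition topic, and a simple modulo mapping. The CRC32 hash approximates how Couchbase clients place keys and is included only to make the demo self-contained; a real adapter would read the vBucket ID from each incoming message. The slide does not say which mapping the PayPal adapter uses.

```python
import zlib

NUM_VBUCKETS = 1024        # assumed Couchbase vBucket count for this sketch
NUM_KAFKA_PARTITIONS = 16  # hypothetical topic size


def vbucket_for_key(key: str, num_vbuckets: int = NUM_VBUCKETS) -> int:
    """Approximate CRC32-based key-to-vBucket hashing (illustrative only)."""
    crc = zlib.crc32(key.encode("utf-8")) & 0xFFFFFFFF
    return ((crc >> 16) & 0x7FFF) % num_vbuckets


def kafka_partition_for_vbucket(vbucket: int,
                                num_partitions: int = NUM_KAFKA_PARTITIONS) -> int:
    """Pin each vBucket to exactly one Kafka partition."""
    return vbucket % num_partitions


# Every mutation of one key lands in one partition, in arrival order.
partitions = {p: [] for p in range(NUM_KAFKA_PARTITIONS)}
for version in ("v1", "v2", "v3"):
    vb = vbucket_for_key("cookie::1")
    partitions[kafka_partition_for_vbucket(vb)].append(("cookie::1", version))

used = [p for p, msgs in partitions.items() if msgs]
print(used, partitions[used[0]])
```

Because the mapping is a pure function of the vBucket ID, changing the number of Kafka partitions later would move vBuckets between partitions, so the partition count is best fixed up front in a design like this.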
Demo … We will supply a link separately