Introducing Couchbase Server 2.0
Dipti BorkarDirector, Product Management
2.0
NoSQL Document Database
Couchbase Server
• Leading NoSQL database project focused on distributed database technology and surrounding ecosystem
• Supports both key-value and document-oriented use cases
• All components are available under the Apache 2.0 Public License
• Obtained as packaged software in both enterprise and community editions.
Couchbase Open Source Project
Couchbase Open Source Project
Easy Scalabili
ty
Consistent High
Performance
Always On
24x365
Grow cluster without application changes, without downtime with a single click
Consistent sub-millisecond read and write response times
with consistent high throughput
No downtime for software upgrades, hardware maintenance, etc.
JSONJSONJSON
JSONJSON
PERFORMANCE
Flexible Data Model
JSON document model with no fixed schema.
Couchbase Server
Flexible Data Model
• No need to worry about the database when changing your application
• Records can have different structures, there is no fixed schema
• Allows painless data model changes for rapid application development
{ “ID”: 1, “FIRST”: “Dipti”, “LAST”: “Borkar”, “ZIP”: “94040”, “CITY”: “MV”, “STATE”: “CA”}
JSONJSON
JSON JSON
New in 2.0
JSON support Indexing and Querying
Cross data center replicationIncremental Map Reduce
JSONJSONJSON
JSONJSON
Additional Couchbase Server Features
Built-in clustering – All nodes equal
Data replication with auto-failover
Zero-downtime maintenance
Built-in managed cached
Append-only storage layer
Online compaction
Monitoring and admin API & UI
SDK for a variety of languages
Couchbase Server 2.0 Architecture
Hea
rtbe
at
Proc
ess
mon
itor
Glo
bal s
ingl
eton
sup
ervi
sor
Confi
gura
tion
man
ager
on each node
Reba
lanc
e or
ches
trat
or
Nod
e he
alth
mon
itor
one per cluster
vBuc
ket s
tate
and
repl
icati
on m
anag
er
http
REST
man
agem
ent A
PI/W
eb U
I
HTTP8091
Erlang port mapper4369
Distributed Erlang21100 - 21199
Erlang/OTP
storage interface
Couchbase EP Engine
11210Memcapable 2.0
Moxi
11211Memcapable 1.0
Memcached
New Persistence Layer
8092Query API
Que
ry E
ngin
e
Data Manager Cluster Manager
Couchbase Server 2.0 Architecture
Hea
rtbe
at
Proc
ess
mon
itor
Glo
bal s
ingl
eton
sup
ervi
sor
Confi
gura
tion
man
ager
on each node
Reba
lanc
e or
ches
trat
or
Nod
e he
alth
mon
itor
one per cluster
vBuc
ket s
tate
and
repl
icati
on m
anag
er
http
REST
man
agem
ent A
PI/W
eb U
I
HTTP8091
Erlang port mapper4369
Distributed Erlang21100 - 21199
Erlang/OTP
storage interface
Couchbase EP Engine
11210Memcapable 2.0
Moxi
11211Memcapable 1.0
Memcached
New Persistence Layer
8092Query API
Que
ry E
ngin
e
COUCHBASE OPERATIONS
33 2
Single node - Couchbase Write Operation
Managed Cache
Dis
k Q
ueue
Disk
Replication Queue
App Server
Couchbase Server Node
Doc 1Doc 1
Doc 1
To other node
33 2
Single node - Couchbase Update Operation
Managed Cache
Dis
k Q
ueue
Replication Queue
App Server
Doc 1’
Doc 1
Doc 1’Doc 1
Doc 1’
Disk
To other node
Couchbase Server Node
GET
Doc
1
33 2
Single node - Couchbase Read Operation
Dis
k Q
ueue
Replication Queue
App Server
Doc 1
Doc 1Doc 1
Managed Cache
Disk
To other node
Couchbase Server Node
COUCHBASE SERVER CLUSTER
Basic Operation
• Docs distributed evenly across servers
• Each server stores both active and replica docsOnly one server active at a time
• Client library provides app with simple interface to database
• Cluster map provides map to which server doc is onApp never needs to know
• App reads, writes, updates docs
• Multiple app servers can access same document at same time
User Configured Replica Count = 1
READ/WRITE/UPDATE
ACTIVE
Doc 5
Doc 2
Doc
Doc
Doc
SERVER 1
ACTIVE
Doc 4
Doc 7
Doc
Doc
Doc
SERVER 2
Doc 8
ACTIVE
Doc 1
Doc 2
Doc
Doc
Doc
REPLICA
Doc 4
Doc 1
Doc 8
Doc
Doc
Doc
REPLICA
Doc 6
Doc 3
Doc 2
Doc
Doc
Doc
REPLICA
Doc 7
Doc 9
Doc 5
Doc
Doc
Doc
SERVER 3
Doc 6
APP SERVER 1
COUCHBASE Client Library
CLUSTER MAP
COUCHBASE Client Library
CLUSTER MAP
APP SERVER 2
Doc 9
Add Nodes to Cluster
• Two servers addedOne-click operation
• Docs automatically rebalanced across clusterEven distribution of docsMinimum doc movement
• Cluster map updated
• App database calls now distributed over larger number of servers
REPLICA
ACTIVE
Doc 5
Doc 2
Doc
Doc
Doc 4
Doc 1
Doc
Doc
SERVER 1
REPLICA
ACTIVE
Doc 4
Doc 7
Doc
Doc
Doc 6
Doc 3
Doc
Doc
SERVER 2
REPLICA
ACTIVE
Doc 1
Doc 2
Doc
Doc
Doc 7
Doc 9
Doc
Doc
SERVER 3 SERVER 4 SERVER 5
REPLICA
ACTIVE
REPLICA
ACTIVE
Doc
Doc 8 Doc
Doc 9 Doc
Doc 2 Doc
Doc 8 Doc
Doc 5 Doc
Doc 6
READ/WRITE/UPDATE READ/WRITE/UPDATE
APP SERVER 1
COUCHBASE Client Library
CLUSTER MAP
COUCHBASE Client Library
CLUSTER MAP
APP SERVER 2
COUCHBASE SERVER CLUSTER
User Configured Replica Count = 1
Fail Over Node
REPLICA
ACTIVE
Doc 5
Doc 2
Doc
Doc
Doc 4
Doc 1
Doc
Doc
SERVER 1
REPLICA
ACTIVE
Doc 4
Doc 7
Doc
Doc
Doc 6
Doc 3
Doc
Doc
SERVER 2
REPLICA
ACTIVE
Doc 1
Doc 2
Doc
Doc
Doc 7
Doc 9
Doc
Doc
SERVER 3 SERVER 4 SERVER 5
REPLICA
ACTIVE
REPLICA
ACTIVE
Doc 9
Doc 8
Doc Doc 6 Doc
Doc
Doc 5 Doc
Doc 2
Doc 8 Doc
Doc
• App servers accessing docs
• Requests to Server 3 fail
• Cluster detects server failedPromotes replicas of docs to activeUpdates cluster map
• Requests for docs now go to appropriate server
• Typically rebalance would follow
Doc
Doc 1 Doc 3
APP SERVER 1
COUCHBASE Client Library
CLUSTER MAP
COUCHBASE Client Library
CLUSTER MAP
APP SERVER 2
User Configured Replica Count = 1
COUCHBASE SERVER CLUSTER
DEMO TIME
• Define materialized views on JSON documents and then query across the data set
• Using views you can define• Primary indexes • Simple secondary indexes (most common use case)• Complex secondary, tertiary and composite indexes• Aggregations (reduction)
• Indexes are eventually indexed
• Queries are eventually consistent
• Built using Map/Reduce technology • Map and Reduce functions are written in Javascript
Indexing and Querying – The basics
COUCHBASE SERVER CLUSTER
Indexing and Querying
User Configured Replica Count = 1
ACTIVE
Doc 5
Doc 2
Doc
Doc
Doc
SERVER 1
REPLICA
Doc 4
Doc 1
Doc 8
Doc
Doc
Doc
APP SERVER 1
COUCHBASE Client Library
CLUSTER MAP
COUCHBASE Client Library
CLUSTER MAP
APP SERVER 2
Doc 9
• Indexing work is distributed amongst nodes
• Large data set possible
• Parallelize the effort
• Each node has index for data stored on it
• Queries combine the results from required nodes
ACTIVE
Doc 5
Doc 2
Doc
Doc
Doc
SERVER 2
REPLICA
Doc 4
Doc 1
Doc 8
Doc
Doc
Doc
Doc 9
ACTIVE
Doc 5
Doc 2
Doc
Doc
Doc
SERVER 3
REPLICA
Doc 4
Doc 1
Doc 8
Doc
Doc
Doc
Doc 9
Query
• Replicate your Couchbase data across clusters
• Clusters may be spread across geos
• Configured on a per-bucket (per-database) basis
• Supports unidirectional and bidirectional operation
• Application can read and write from both clusters – Active – Active replication
• Replication throughput scales out linearly
• Different from intra-cluster replication
Cross Data Center Replication – The basics
33 2
Cross data center replication – Data flow
2
Managed Cache
Dis
k Q
ueue
Disk
Replication Queue
App Server
Couchbase Server Node
Doc 1Doc 1
Doc 1
To other node
XDCR Queue
Doc 1
To other cluster
Cross Data Center Replication (XDCR)COUCHBASE SERVER CLUSTER
NY DATA CENTERACTIVE
Doc
Doc 2
SERVER 1
Doc 9
SERVER 2 SERVER 3
RAM
Doc Doc Doc
ACTIVE
Doc
Doc
Doc RAM
ACTIVE
Doc
Doc
DocRAM
DISK
Doc Doc Doc
DISK
Doc Doc Doc
DISK
COUCHBASE SERVER CLUSTERSF DATA CENTER
ACTIVE
Doc
Doc 2
SERVER 1
Doc 9
SERVER 2 SERVER 3
RAM
Doc Doc Doc
ACTIVE
Doc
Doc
Doc RAM
ACTIVE
Doc
Doc
DocRAM
DISK
Doc Doc Doc
DISK
Doc Doc Doc
DISK
Couchbase SDKs
Java SDK
.Net SDK
PHP SDK
Ruby SDK
…and many more
Java client API
User Code
Couchbase Server
CouchbaseClient cb = new CouchbaseClient(listURIs,"aBucket", "letmein");
cb.set("hello", 0, "world");cb.get("hello");
http://www.couchbase.com/develop
Couchbase Java Library (spymemcached)
DEMO TIME
Demo: The next big social game
3 Objects (documents) within game:• Players•Monsters• Items
Gameplay:• Players fight monsters•Monsters drop items• Players own items
Player Document
{"jsonType": "player","uuid": "35767d02-a958-4b83-8179-616816692de1","name": "Keith4540","hitpoints": 75,"experience": 663,"level": 4,"loggedIn": false
}
Player ID
Item Document
{"jsonType": "item","name": "Katana_e5890c94-11c6-65746ce6c560","uuid": "e5890c94-11c6-4856-a7a6-65746ce6c560","ownerId": "Dale9887"
}
Item ID
Player ID
Monster Document
{"jsonType": "monster","name": "Bauchan9932","uuid": "d10dfc1b-0412-4140-b4ec-affdbf2aa5ec","hitpoints": 370,"experienceWhenKilled": 52,"itemProbability": 0.5050581341872865
}
Monster ID
GAME ON
www.couchbase.com/download
MCGRAW HILL EDUCATION LABS LEARNING PORTAL
Use Case: Content and metadata store
Building a self-adapting, interactive learning portal with Couchbase
As learning move online in great numbers
Growing need to build interactive learning environments that
Scale!
Scale to millions of learners
Serve MHE as well as third-party content
Including open content
Support learning apps
010100100111010101010101001010101010
Self-adapt via usage data
The Problem
• Allow for elastic scaling under spike periods
• Ability to catalog & deliver content from many sources
• Consistent low-latency for metadata and stats access
• Require full-text search support for content discovery
• Offer tunable content ranking & recommendation functions
Backend is an Interactive Content Delivery Cloud that must:
XML Databases
SQL/MR Engines
In-memory Data Grids
Enterprise Search Servers
Experimented with a combination of:
Hmmm...this looks kinda like:+ Content Caching (Scale)+ Social Gaming (Stats) + Ad Targeting (Smarts)
The Challenge
The Technologies
The Learning Portal
• Designed and built as a collaboration between MHE Labs and Couchbase
• Serves as proof-of-concept and testing harness for Couchbase + ElasticSearch integration
• Available for download and further development as open source code
• Document Modeling
• Metadata & Content Storage
• View Querying to support Content Browsing
• Elastic Search Integration (Full Text Search)
-Content Updated in near Real-Time
-Search Content Summaries
-Relevancy boosted based on User Preferences
• Real-Time Content Updates
• Event Logging for offline analysis
Techniques Used
Couchbase 2.0 + Elasticsearch
Store full-text articles as well as document metadata for image, video and text content in Couchbase
Combine user preferences statistics with custom relevancy scoring to provide personalized search results
Logs user behavior to calculate user preference statistics (e.g. video > text)
1
2 4
Continuously accept updates from Couchbase with new content & stats
3
Data Model
Content Metadata Bucket
User ProfilesBucket
Content StatsBucket
• Stores content metadata for media objects and content for articles
• Includes tags, contributors, type information
• Includes pointer to the media
• Stores user view details per type
• Updated every time a user views a doc with running count
• To be used for customizing ES search results per user preference• Stores content view details
• Updated for every time a document is viewed
• To be used for boosting ES search results based on popularity
Via XDCR
Couchbase ES Transport
Data
Couchbase Server Cluster
MR Views MR Views MR Views MR Views Elastic Search Cluster
RefsTS Query
External Media Store
View Query
App Server
Couchbase Ruby SDK
ES queries over HTTP
Architecture
Market Adoption – CustomersInternet Companies Enterprises
Q & A