NoSQL 101: Couchbase Connect 2014

NoSQL 101 Dipti Borkar | Sr. Director | Solutions Engineering Couchbase

Upload: couchbase

Post on 20-Aug-2015


NoSQL 101

Dipti Borkar | Sr. Director | Solutions Engineering | Couchbase

Why NoSQL?

©2014 Couchbase, Inc. 3

NoSQL

Macro Trends Driving NoSQL Technology

More Data + More Users + Interactive Apps


Cloud-Based, Data-Centric Apps are Creating a Disruption

Client/Server:
Apps run on premises and support thousands of simultaneous users
Centralized architecture on high-end, expensive servers
Manage a relatively small amount of mostly structured data

Cloud:
Apps run in the cloud and support millions of simultaneous users
Distributed, web-scale architecture on low-cost, commodity servers
Data-centric apps that must handle large amounts of unstructured data

Right Database for Cloud-Based, Data-Centric Apps

Agile Development, Scalability, Performance, Availability

JSON Data Model Fits Today’s Developer Needs Better

Agile Development

Relational databases:
Hundreds or thousands of interrelated tables
Handle structured data well, unstructured data poorly
Rigid schema requires migrations that can take weeks or months
Impedance mismatch with the objects developers work with

JSON document databases:
Aggregate and denormalize data into a single document
Handle structured and unstructured data equally well
Inferred schema requires no migration
JSON is rapidly being adopted

Example: hotel descriptions, reviews, and user profiles stored as JSON documents. Hotels point to reviews; reviews point to users.

{ "ID": 1, "NAME": "Fairmont San Francisco", … }

{ "REVIEW_ID": 1, "REVIEW": "Loved Hotel…", … }

{ "REVIEW_ID": 2, "REVIEW": "Nice, but …", … }

{ "USER_ID": 1, "DISPLAY": "Ted's Trip…", … }

{ "USER_ID": 2, "DISPLAY": "WhatWhat …", … }
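To make "aggregates and denormalizes data into a single document" concrete, here is a minimal sketch that folds rows from three relational-style tables into one JSON document. The row layouts, the sample review text, and the `build_hotel_document` helper are assumptions made for illustration; they are not taken from the slides.

```python
import json

# Hypothetical normalized rows, as they might sit in three relational tables.
hotels = [{"ID": 1, "NAME": "Fairmont San Francisco"}]
reviews = [
    {"REVIEW_ID": 1, "HOTEL_ID": 1, "USER_ID": 1, "REVIEW": "Loved the hotel"},
    {"REVIEW_ID": 2, "HOTEL_ID": 1, "USER_ID": 2, "REVIEW": "Nice overall"},
]
users = {1: {"DISPLAY": "Ted"}, 2: {"DISPLAY": "WhatWhat"}}


def build_hotel_document(hotel_id):
    """Join the rows for one hotel into a single denormalized document."""
    hotel = next(h for h in hotels if h["ID"] == hotel_id)
    doc = dict(hotel)
    doc["REVIEWS"] = [
        {"REVIEW": r["REVIEW"], "REVIEWER": users[r["USER_ID"]]["DISPLAY"]}
        for r in reviews
        if r["HOTEL_ID"] == hotel_id
    ]
    return doc


document = build_hotel_document(1)
print(json.dumps(document, indent=2))
```

Embedding the reviews is only one modeling choice; related data can also stay in separate documents linked by document IDs.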

Must Dynamically Scale Apps to Support Millions of Users

Scalability

Relational:
Centralized, scale-up architecture with big, expensive servers
Manual sharding at the app level struggles to support "web scale"
High software costs and TCO

NoSQL:
Distributed, scale-out architecture with a cluster of low-cost, commodity servers
Auto-sharding at the database level to support Big Data, Big Users
Open source and lower TCO

[Figure: system cost and application performance plotted against users. The RDBMS scales up (get a bigger, more complex server) and won't scale beyond a certain point; the application tier scales out (just add more commodity web servers).]


Consumers & Employees Demand Highly Responsive Apps

Performance

Relational:
Architecture based on "speed of disk"
Requires joins across hundreds or thousands of tables
High throughput requires very expensive hardware

NoSQL:
Architecture based on "speed of memory"
Faster access to aggregated, denormalized objects
High throughput at low TCO with a cluster of commodity servers

[Figure: an application layer backed by an RDBMS plus a cache, compared with an application layer backed by Couchbase]


Apps Must Now Stay Online 24 x 365

Availability

Relational:
Relational systems use clustering as an afterthought
Must take the database down for "maintenance windows"
Struggle to support cross-datacenter replication (XDCR) across many data centers

NoSQL:
Clustered systems with intra-cluster replication for availability
Designed for online software upgrades and maintenance
Native master-master XDCR for higher availability

[Figure: a site that stays online 24/7 next to an error page reading "Well, this is embarrassing. We are having some difficulties and we apologize for the inconvenience."]

Flavors of NoSQL


NoSQL catalog

Categories: Key-Value, Data Structure, Document, Column, Graph

Storage tiers: Cache (memory only), Database (memory/disk)

Products shown: memcached, redis, mongoDB, couchbase, cassandra, Neo4j


The Key-Value Store – the foundation of NoSQL

[Figure: a key mapped to an opaque binary value]

memcached – the NoSQL precursor

In-memory only
Limited set of operations
Blob storage: Set, Add, Replace, CAS
Retrieval: Get
Structured data: Append, Increment
Simple and fast
Challenges: cold cache, disruptive elasticity
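The operation names listed for memcached can be sketched with a small in-memory class. This is an illustrative model of the semantics (set, add, replace, CAS, get, append, increment), not memcached's implementation or wire protocol; the `TinyCache` class and its method signatures are assumptions for the example.

```python
class TinyCache:
    """In-memory key-value cache loosely modeled on memcached's operations."""

    def __init__(self):
        self._items = {}      # key -> (value bytes, cas token)
        self._next_cas = 1

    def _store(self, key, value):
        # Every successful write gets a fresh token so stale CAS calls fail.
        self._items[key] = (value, self._next_cas)
        self._next_cas += 1
        return True

    def set(self, key, value):
        return self._store(key, value)

    def add(self, key, value):
        # Succeeds only when the key is absent.
        return key not in self._items and self._store(key, value)

    def replace(self, key, value):
        # Succeeds only when the key is already present.
        return key in self._items and self._store(key, value)

    def get(self, key):
        item = self._items.get(key)
        return item[0] if item else None

    def gets(self, key):
        # Returns (value, cas token), or None when the key is missing.
        return self._items.get(key)

    def cas(self, key, value, token):
        # Check-and-set: write only if nobody changed the item since gets().
        item = self._items.get(key)
        if item is None or item[1] != token:
            return False
        return self._store(key, value)

    def append(self, key, suffix):
        item = self._items.get(key)
        return item is not None and self._store(key, item[0] + suffix)

    def incr(self, key, delta=1):
        item = self._items.get(key)
        if item is None:
            return None
        new_value = int(item[0]) + delta
        self._store(key, str(new_value).encode())
        return new_value
```

Because everything lives in a process-local dictionary, a restart loses all data, which is the "cold cache" challenge named on the slide.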

Couchbase – document-oriented database

Key → JSON object ("document"):

{ "string" : "string", "string" : value, "string" : { "string" : "string", "string" : value }, "string" : [ array ] }

Auto-sharding
Disk-based with built-in memcached cache
Elastic scalability
Highly available (data replication)
When values are JSON objects ("documents"): create indices and views, and query against the views

Couchbase Architecture

[Figure: two app servers, each with a Couchbase client library holding a cluster map, send read/write/update requests to three Couchbase servers. Each server holds active shards and replica shards. As drawn, the active shards are 5, 2, 9 on Server 1; 4, 7, 8 on Server 2; and 1, 3, 6 on Server 3. The replica groups are 4, 1, 8; 6, 3, 2; and 7, 9, 5.]
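The role of the cluster map in the client library can be sketched as a two-step lookup: hash the key to a shard, then look up which server holds that shard. The placement below copies the figure, assuming its shard groups are listed in server order. The CRC32-modulo hash, the server names, and the `failed` parameter are simplifying assumptions, not Couchbase's actual algorithm or failover procedure.

```python
import zlib

NUM_SHARDS = 9  # the figure draws nine numbered shards

# Hypothetical cluster map: shard number -> (active server, replica server).
CLUSTER_MAP = {
    5: ("server1", "server3"), 2: ("server1", "server2"), 9: ("server1", "server3"),
    4: ("server2", "server1"), 7: ("server2", "server3"), 8: ("server2", "server1"),
    1: ("server3", "server1"), 3: ("server3", "server2"), 6: ("server3", "server2"),
}


def shard_for(key):
    """Hash the key to a shard number in 1..NUM_SHARDS."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS + 1


def server_for(key, failed=()):
    """Return the server a client would contact for this key.

    If the active server is listed in `failed`, fall back to the replica.
    """
    active, replica = CLUSTER_MAP[shard_for(key)]
    return replica if active in failed else active
```

Since every client computes the same hash and holds the same map, requests go straight to the right server without a central router.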

MongoDB – document-oriented database

Disk-based with OS caching
BSON ("binary JSON") format and wire protocol
Master-slave replication
Auto-sharding
Values are BSON objects
Supports ad hoc queries (best when indexed)

Key → JSON object ("document"):

{ "string" : "string", "string" : value, "string" : { "string" : "string", "string" : value }, "string" : [ array ] }

mongoDB Architecture


Cassandra – Column-family database

More disk-based system
Key includes a row, column family and column name
Stores versioned blobs in one large table
Queries can be done on rows, column families and column names
Row and column designs are critical
Clustered
External caching required for low-latency reads

[Figure: a key mapped to opaque binary values in Column 1 and Column 2; Column 3 is not present]
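The column-family model (a key made of row, column family and column name, with versioned values and absent columns) can be sketched with a dictionary keyed by that triple. This is a toy model of the data layout only. The `ColumnFamilyTable` class and its methods are hypothetical and unrelated to Cassandra's real storage engine or query language.

```python
from collections import defaultdict


class ColumnFamilyTable:
    """Toy sparse table addressed by (row key, column family, column name)."""

    def __init__(self):
        # (row, family, column) -> list of (timestamp, value), oldest first
        self._cells = defaultdict(list)

    def put(self, row, family, column, value, timestamp):
        versions = self._cells[(row, family, column)]
        versions.append((timestamp, value))
        versions.sort(key=lambda v: v[0])

    def get(self, row, family, column):
        """Return the newest value, or None when the column is not present."""
        versions = self._cells.get((row, family, column))
        return versions[-1][1] if versions else None

    def columns(self, row, family):
        """List the column names actually stored for one row and family."""
        return sorted(c for (r, f, c) in self._cells if r == row and f == family)


table = ColumnFamilyTable()
table.put("row1", "profile", "Column 1", b"old", timestamp=1)
table.put("row1", "profile", "Column 1", b"new", timestamp=2)
table.put("row1", "profile", "Column 2", b"x", timestamp=1)
```

Only cells that were written take up space, which is why a row can simply lack "Column 3".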

Cassandra Architecture


Neo4j – Graph database

Disk-based system
External caching required for low-latency reads
Nodes, relationships and paths
Properties on nodes
Delete, Insert, Traverse, etc.

[Figure: several key/opaque-binary-value nodes connected to one another as a graph]
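The graph vocabulary above (nodes with properties, relationships, insert, delete, traverse) can be illustrated with a small adjacency-list structure. This is a generic sketch with a hypothetical `TinyGraph` class; it does not use or imitate Neo4j's API.

```python
from collections import deque


class TinyGraph:
    """Minimal property graph: nodes carry properties, edges carry a type."""

    def __init__(self):
        self.nodes = {}   # node id -> properties dict
        self.edges = {}   # node id -> list of (relationship type, target id)

    def insert_node(self, node_id, **properties):
        self.nodes[node_id] = properties
        self.edges.setdefault(node_id, [])

    def relate(self, source, rel_type, target):
        self.edges[source].append((rel_type, target))

    def delete_node(self, node_id):
        # Remove the node and every relationship that points at it.
        self.nodes.pop(node_id, None)
        self.edges.pop(node_id, None)
        for source in self.edges:
            self.edges[source] = [
                (t, n) for t, n in self.edges[source] if n != node_id
            ]

    def traverse(self, start, rel_type):
        """Breadth-first walk along one relationship type; returns ids in visit order."""
        seen, order, queue = {start}, [], deque([start])
        while queue:
            current = queue.popleft()
            order.append(current)
            for t, target in self.edges.get(current, []):
                if t == rel_type and target not in seen:
                    seen.add(target)
                    queue.append(target)
        return order
```

Traversal follows stored relationships directly, which is the access pattern graph databases are built around.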

NoSQL Considerations

Accessing data
– No standards exist yet
– Typically via SDKs or over HTTP
– Check if the programming language of your choice is supported

Consistency
– Consistent only at the document level
– Most document stores currently don't support multi-document transactions
– Analyze your application needs

Availability
– Each node stores active and replica data (Couchbase)
– Each node is either a master or slave (MongoDB)

NoSQL considerations

Operations
– Monitoring the system
– Backup and restore the system
– Upgrades and maintenance
– Support

Ease of Scaling
– Ease of adding and reducing capacity
– Single node type
– App availability on topology changes

Indexing and Querying
– Secondary indexes
– Aggregates, grouping
– Basic querying / ad hoc querying

Where is NoSQL a good fit?

Application Characteristics - Data driven

3rd-party or user-defined structure (Twitter feeds)
Support for unlimited data growth (viral apps)
Data with non-homogeneous structure
Need to change data structure quickly and often
Variable-length documents
Sparse data records
Hierarchical data

Application Characteristics - Performance driven

Low latency critical (e.g. 1 millisecond)
High throughput (e.g. 200,000 ops/sec)
Large number of users
Unknown demand with sudden growth of users/data
Predominantly direct document access
Read / mixed / write-heavy workloads

NoSQL Use Cases

Use Case 1: High-Availability Caching

[Figure: user requests reach the application layer, which sends read-write requests to the Couchbase distributed cache; cache misses and write requests go to the RDBMS]

Data cached in Couchbase:
Application objects
Popular search query results
Session information
Heavily accessed web landing pages

Application characteristics:
Speed up the RDBMS
Consistently low response times for document / key lookups
High availability, 24x7x365
Replacement for the entire caching tier

[Figure: mock search results page listing popular queries with their % of clicks]
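The request flow in this use case (reads served from the cache, misses and writes going to the RDBMS) is commonly called cache-aside. Below is a minimal sketch of that flow, with plain dictionaries standing in for the distributed cache and the relational database; the `CacheAside` class is hypothetical and no real client library is involved.

```python
class CacheAside:
    """Read through a cache; fall back to the database on a miss."""

    def __init__(self, database, cache=None):
        self.database = database                      # stand-in for the RDBMS
        self.cache = {} if cache is None else cache   # stand-in for the cache tier
        self.misses = 0

    def read(self, key):
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fetch from the database and populate the cache.
        self.misses += 1
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Writes go to the database and refresh the cached copy.
        self.database[key] = value
        self.cache[key] = value


db = {"landing_page": "<html>…</html>"}
app = CacheAside(db)
app.read("landing_page")   # first read misses and fills the cache
app.read("landing_page")   # second read is served from the cache
```

Only the first read of a key touches the database, which is how the cache tier takes load off the RDBMS.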

Use Case 2: Session Store

Data stored in Couchbase:
Session values or cookies (stored as key-value pairs)
Examples include items in a shopping cart, flights selected, search results, etc.

Application characteristics:
Extremely fast access to session data using a unique session ID
Easy scalability to handle a fast-growing number of users and user-generated data
Always-on functionality for a global user base
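A session store keyed by a unique session ID can be sketched as follows. The expiry behavior (a sliding time-to-live), the `SessionStore` class, and the injectable `clock` parameter are assumptions added for the example; the slides only state that session values are stored as key-value pairs.

```python
import time
import uuid


class SessionStore:
    """Key-value session store with a sliding time-to-live per session."""

    def __init__(self, ttl_seconds=1800, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock       # injectable so expiry can be tested
        self._sessions = {}       # session id -> (expiry time, data dict)

    def create(self, data=None):
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = (self._clock() + self._ttl, dict(data or {}))
        return session_id

    def get(self, session_id):
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        expiry, data = entry
        if self._clock() >= expiry:
            del self._sessions[session_id]   # expired: drop it
            return None
        # Each access pushes the expiry forward (sliding window).
        self._sessions[session_id] = (self._clock() + self._ttl, data)
        return data


store = SessionStore()
sid = store.create({"cart": ["flight SFO-JFK"]})
store.get(sid)["cart"].append("hotel")
```

Every lookup is a single key access, the "predominantly direct document access" pattern mentioned earlier.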

Use Case 3: Globally Distributed User Profile Store

[Figure: mock website with a login form, greeting a returning user: "Welcome back Laura! You have 3 items in your shopping cart waiting for you."]

Data stored in Couchbase:
User profile with unique ID
User settings / preferences
User's network
User application state

Application characteristics:
Extremely fast access to individual profiles
Always-online system, as multiple applications access user profiles
Flexibility to add and update user attributes
Easy scalability to handle a fast-growing number of users

Use Case 4: Data Aggregation

Data stored in Couchbase:
Social media feeds: Twitter, Facebook, LinkedIn
Blogs, news, press articles
Data service feeds: Hoovers, Reuters
Data from other systems

Application characteristics:
Flexibility to store any kind of content
Flexibility to handle schema changes
Full-text search across the data set
High-speed data ingestion
Scales horizontally as more content gets added to the system

Use Case 5: Content and Metadata Store

Example image metadata (keywords): Nature, Field, Summer, Farm, Sky, Environment, Landscaped, Grass, Green, Blue, Oilseed, Rape, Agriculture, Scenics, Land, Spring, Non-Urban Scene, Environmental, Conservation, Sun, Meadow, Horizon, Season, Cloud, Landscapes, Travel Locations, Pasture, Cultivated Land, Stratosphere, cloudy day, Oilseed Rape, Rural Scene, Vibrant Color, No People, Beauty In Nature, Gold, Color Image, Beauty, Idyllic, Multicolored, Yellow, Colors, Cloudscape, Outdoors, Plant, Sunlight, Horizon Over Land

Data stored in Couchbase:
Content metadata
Content: articles, text
Landing pages for websites
Digital content: eBooks, magazines, research material

Application characteristics:
Flexibility to store any kind of content
Fast access to content metadata (most accessed objects) and content
Full-text search across the data set
Scales horizontally as more content gets added to the system

Document Modeling Example


Document Databases Easily Accommodate Unstructured Data

Hotels

{
  "ID": 1,
  "NAME": "Fairmont San Francisco",
  "DESCRIPTION": "Historic grandeur…",
  "AVG_REVIEWER_SCORE": "4.3",
  "AMENITY": [
    {"TYPE": "gym", "DESCRIPTION": "fitness center"},
    {"TYPE": "wifi", "DESCRIPTION": "free wifi"}
  ],
  "RATE_TYPE": "nightly",
  "PRICE": "$199",
  "REVIEWS": ["review_1", "review_2"],
  "ATTRACTIONS": "Chinatown"
}

{
  "ID": 2,
  "NAME": "W San Francisco",
  "DESCRIPTION": "Chic, hip accommodations..",
  "AVG_REVIEWER_SCORE": "4.0",
  "AMENITY": [
    {"TYPE": "spa", "DESCRIPTION": "Bliss Spa"},
    {"TYPE": "wifi", "DESCRIPTION": "free wifi"},
    {"TYPE": "dining", "DESCRIPTION": "bar/lounge"}
  ],
  "RATE_TYPE": "nightly",
  "PRICE": "$194",
  "REVIEWS": ["review_1", "review_2"]
}

Hotels

{ "ID": 1, "NAME": "Fairmont San Francisco", … }

Reviews

{
  "REVIEW_ID": 1,
  "REVIEW": "Loved Hotel & Location",
  "WOULD RECOMMEND": "yes",
  "AVG_REVIEWER_SCORE": "5",
  "REVIEW_DATE": "May 29, 2013",
  "USER_PROFILE_ID": "271"
}

{
  "REVIEW_ID": 2,
  "REVIEW": "Nice, but a few kinks",
  "WOULD RECOMMEND": "yes",
  "AVG_REVIEWER_SCORE": "4",
  "REVIEW_DATE": "May 22, 2013",
  "USER_PROFILE_ID": "923"
}

Hotel Descriptions

{ "ID": 1, "NAME": "Fairmont San Francisco", … }

Reviews

{ "REVIEW_ID": 1, "REVIEW": "Loved Hotel…", … }

{ "REVIEW_ID": 2, "REVIEW": "Nice, but …", … }

User Profiles

{
  "USER_ID": 1,
  "DISPLAY_NAME": "Ted's Trip Experience",
  "CITY": "Saratoga",
  "STATE": "California",
  "NUM_OF_REVIEWS": "8"
}

{
  "USER_ID": 2,
  "DISPLAY_NAME": "WhatWhat567",
  "CITY": "Kansas City",
  "STATE": "MO",
  "NUM_OF_REVIEWS": "3"
}

Document IDs associate related objects

Hotels point to reviews

Reviews point to users

Hotel Descriptions

{ "ID": 1, "NAME": "Fairmont San Francisco", … }

Reviews

{ "REVIEW_ID": 1, "REVIEW": "Loved Hotel…", … }

{ "REVIEW_ID": 2, "REVIEW": "Nice, but …", … }

User Profiles

{ "USER_ID": 1, "DISPLAY": "Ted's Trip…", … }

{ "USER_ID": 2, "DISPLAY": "WhatWhat …", … }
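The way document IDs tie hotels to reviews and reviews to users can be sketched as two key lookups in a row. The store below is a plain dictionary. The document keys (`hotel_1`, `user_1`, and so on) and the `reviews_with_authors` helper are hypothetical, and the review-to-user links are simplified rather than copied from the slides.

```python
# In-memory stand-in for a document store: document key -> document.
store = {
    "hotel_1": {
        "ID": 1,
        "NAME": "Fairmont San Francisco",
        "REVIEWS": ["review_1", "review_2"],
    },
    "review_1": {"REVIEW_ID": 1, "REVIEW": "Loved Hotel & Location",
                 "USER_PROFILE_ID": "user_1"},
    "review_2": {"REVIEW_ID": 2, "REVIEW": "Nice, but a few kinks",
                 "USER_PROFILE_ID": "user_2"},
    "user_1": {"USER_ID": 1, "DISPLAY_NAME": "Ted's Trip Experience"},
    "user_2": {"USER_ID": 2, "DISPLAY_NAME": "WhatWhat567"},
}


def reviews_with_authors(hotel_key):
    """Follow hotel -> review -> user references and pair text with author."""
    hotel = store[hotel_key]
    result = []
    for review_key in hotel["REVIEWS"]:
        review = store[review_key]
        user = store[review["USER_PROFILE_ID"]]
        result.append((review["REVIEW"], user["DISPLAY_NAME"]))
    return result


pairs = reviews_with_authors("hotel_1")
```

Each hop is a direct key lookup, so the application does the "join" itself instead of asking the database to do it.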

Indexing with Document Databases

Index on AVG_REVIEWER_SCORE

Index:
…
4.0, doc_id
4.0, doc_id
4.1, doc_id
4.3, doc_id
5.0, doc_id
…
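An index like the one pictured is essentially a sorted list of (value, document ID) pairs. The sketch below builds one over AVG_REVIEWER_SCORE. The document keys, the third and fourth sample documents, and the `build_index` helper are made up for illustration, and real document databases maintain such indexes incrementally rather than rebuilding them.

```python
documents = {
    "hotel_1": {"NAME": "Fairmont San Francisco", "AVG_REVIEWER_SCORE": "4.3"},
    "hotel_2": {"NAME": "W San Francisco", "AVG_REVIEWER_SCORE": "4.0"},
    "hotel_3": {"NAME": "Hypothetical Inn", "AVG_REVIEWER_SCORE": "5.0"},
    "hotel_4": {"NAME": "No Score Yet"},   # lacks the field, so it is not indexed
}


def build_index(docs, field):
    """Return (numeric value, doc id) pairs sorted by value."""
    entries = [
        (float(doc[field]), doc_id)
        for doc_id, doc in docs.items()
        if field in doc
    ]
    entries.sort()
    return entries


index = build_index(documents, "AVG_REVIEWER_SCORE")
```

Documents that lack the field are skipped, so a flexible schema does not break the index.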

Querying with Document Databases

Query on AVG_REVIEWER_SCORE

Index:
…
3.4, doc_id
3.4, doc_id
3.5, doc_id
3.6, doc_id
3.7, doc_id
3.8, doc_id
4.0, doc_id
4.1, doc_id
4.3, doc_id
4.5, doc_id
4.7, doc_id
4.9, doc_id
5.0, doc_id
…
5.0, doc_id

[Figure: a query is run against the index and returns the matching results]
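Because the index is sorted, a range query can find its matching entries by binary search instead of scanning every document. Here is a minimal sketch using Python's standard `bisect` module; the index contents, document IDs, and the `query_range` helper are illustrative assumptions.

```python
import bisect

# A sorted index of (score, doc id) pairs with made-up document IDs.
index = [
    (3.4, "doc_a"), (3.8, "doc_b"), (4.0, "doc_c"), (4.0, "doc_d"),
    (4.3, "doc_e"), (4.7, "doc_f"), (5.0, "doc_g"),
]


def query_range(index, low, high):
    """Return doc ids whose indexed value v satisfies low <= v <= high."""
    keys = [value for value, _ in index]
    start = bisect.bisect_left(keys, low)    # first entry with value >= low
    end = bisect.bisect_right(keys, high)    # one past the last entry <= high
    return [doc_id for _, doc_id in index[start:end]]


matches = query_range(index, 4.0, 4.5)
print(matches)  # ['doc_c', 'doc_d', 'doc_e']
```

The query returns document IDs; fetching the full documents is a separate key lookup per ID.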

Q & A

[email protected] | @dborkar