developing*in*nosql*with* couchbase*files.meetup.com/1775479/chicagobigdata.pdf · •architect and...

73
Developing in NoSQL with Couchbase Raghavan “Rags” Srinivas Developer Advocate Simple. Fast. Elastic.

Upload: others

Post on 21-May-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Developing  in  NoSQL  with  Couchbase  

Raghavan  “Rags”  Srinivas  Developer Advocate

Simple. Fast. Elastic.

Page 2: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

•  Architect and Evangelist working with developers •  Speaker at JavaOne, RSA conferences, Sun Tech Days, JUGs and other

developer conferences •  Taught undergrad and grad courses •  Technology Evangelist at Sun Microsystems for 10+ years •  Still trying to understand and work more effectively on Java and distributed

systems •  Couchbase Developer Advocate working with Java and Ruby developers •  Philosophy: “Better to have an unanswered question than a unquestioned

answer”

Speaker Introduction

Page 3: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

>  Introduction and Getting Started with the APIs >  Demonstration: Topology changes and xDCR >  Hadoop Sqoop connector and use cases >  Secondary Indexing with Query and Views >  Roadmap and conclusions

Agenda

Page 4: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

A BIT OF NOSQL

Intro. Demos. Hadoop Connector

Views Summary

Page 5: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

NoSQL (from a Couchbase perspective)

Short for Not Only SQL Not single class of database (i.e., NoSQL is not analogous to the RDMBS

classification) Generally refers to databases that neither expose a SQL interface nor use the

relational data model >  Perhaps better thought of as “non-relational databases” Typically don’t offer common RDBMS features for performance gains >  Often schema-less >  No key constraints >  No multi-step transactions >  No concept of a “join” Well suited for cloud deployments

Page 6: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

The CAP theorem states that it is impossible for a distributed computer system to simultaneously provide all three of the following guarantees:

CAP Theorem/Brewer’s Conjecture

•  Consistency means that each node/client always has the same view of the data,

•  Availability means that all clients can always read and write,

•  Partition tolerance means that the system works well across physical network partitions.

Page 7: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

CAP Systems

CA: Consistency and Availability >  Consider a system that uses a two phase commit for distributed transactions >  The commit is impeded if a network segment is unavailable CP: Consistency and Partition Tolerance >  Consider a system that uses pessimistic locking for concurrency control >  Node failures hinder availability AP: Availability and Partition Tolerance >  Consider a system that always responds (e.g., DNS) >  Reads might be dirty as data awaits propagation

Page 8: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

NoSQL Taxonomies

NoSQL Databases Come in Many Flavors* >  XML (myXMLDB, Tamino, Sedna) >  Tabular (Hbase, Big Table) >  Key/Value (Couchbase, Redis, Riak, Cassandra) >  Object (db4o, JADE) >  Graph (Trinity, neo4j, InfoGrid) >  Document store (Couchbase Server 2.0, CouchDB, MongoDB)

* These are loose taxonomies

Page 9: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Web Application Architecture and Performance

Application Scales Out  Just add more commodity web servers

Database Scales Up  Get a bigger, more complex server

www.facebook.com/animalparty  

Web  Servers  

RelaConal    Database  

Load  Balancer  

-­‐  Expensive  and  disrupCve  sharding  -­‐  Doesn’t  perform  at  Web  Scale  

Page 10: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Couchbase data layer scales like application logic tierData layer now scales with linear cost and constant performance.

Application Scales Out  Just add more commodity web servers

Database Scales Out  Just add more commodity data servers

Scaling out flattens the cost and performance curves.

Couchbase    Servers  

www.facebook.com/animalparty  

Web  Servers  Load  Balancer  

Horizontally  scalable,  schema-­‐less,  auto-­‐sharding,  high-­‐performance  at  Web  Scale  

Page 11: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

WHAT IS COUCHBASE?

Intro. Demos. Hadoop Connector

Views Summary

Page 12: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Couchbase Server Overview

Open source, Apache 2.0 licensed Database Commercial support provided by Couchbase Free community edition for developers Enterprise edition free for 2 nodes, support license required for larger clusters Administered via web console or RESTful API Available for Windows, Linux and Mac (development only) Written in C/C++ with some components in Erlang Both a distributed key/value store and a document oriented database >  Implements Memcached API (version 1.8) >  Implements most of CouchDB RESTful view API (version 2.0)

Page 13: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Membase

CouchDB

Couchbase memcached

Page 14: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

1,000s of community deployments More than 150 paying customers Over 3,500 nodes deployed in production by customers

Rapidly Growing Community

Page 15: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Customers with Production Deployments…

Page 16: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Pictionary Like Game

OMGPOP’s DrawSomething Went Viral

#1 Paid & Free App in 84 Countries 35m Downloads

Page 17: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Couchbase Server 1.8 Architecture

Heartbeat  

Process  m

onito

r  

Glob

al  singleton  supe

rviso

r  

Confi

guraCo

n  manager  

on  each  node  

Rebalance  orchestrator  

Nod

e  he

alth  m

onito

r  

one  per  cluster  

vBucket  state  and

 replicaC

on  m

anager  

hOp  RE

ST  m

anagem

ent  A

PI/W

eb  UI  

HTTP  8091  

Erlang  port  mapper  4369  

Distributed  Erlang  21100  -­‐  21199  

Erlang/OTP  

Cluster  Manager  

Persistence Layer

storage  interface  

Memcached

Couchbase EP Engine

11210  Memcapable    2.0  

Moxi

11211  Memcapable    1.0  

Data  Manager  

Page 18: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Component Architecture

Hea

rtbea

t

Pro

cess

mon

itor

Con

figur

atio

n M

anag

er

Glob

al  singleton  supe

rviso

r  

on  each  node  

Rebalance  orchestrator  

Nod

e  he

alth  m

onito

r  

one  per  cluster  

vBucket  state  and

 replicaC

on  m

anager  

http

RES

T M

anag

emen

t /W

ebU

I

HTTP  8091  

Erlang  port  mapper  4369  

Distributed  Erlang  21100  -­‐  21199  

Erlang/OTP

Persistence Layer

storage  interface  

Memcached

Couchbase EP Engine

Moxi

11210  Memcapable    2.0  

11211  Memcapable    1.0  

Page 19: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

A Distributed Hash Table Document Store

Application

set(key, json) get(key) returns json

DData

Cluster

Database

{ !“id": ”brewery_Legacy”,!“type” : “brewery”,!"name" : "Legacy Brewing”,!“address": "525 Canal Street Reading", "updated": "2010-07-22 20:00:20", }!

Page 20: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Couchbase Distributed Key/Value Store Features

All documents/items are stored and retrieved using standard hashtable operations (i.e, get, add, etc.)

Keys are distributed evenly across the nodes of a cluster by way of vBuckets >  A vBucket is a level of indirection between a key and value, and the node on

which it is stored Constant time access to items via keys Single dimension hashtable with no secondary keys

Page 21: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Couchbase Document Store Features

Items can be stored as JSON documents Non-JSON items are stored as binary attachments to CouchDB JSON documents Secondary, B-tree indexes are created by way of views >  Views are accessed via RESTful API >  Indexes are incrementally updated after access – not insert >  Caller specifies whether to accept stale data Views may be queried by key or range Indexes are stored across the cluster and accumulated from all nodes when

requested

Page 22: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Workflow (including replication)

User action results in the need to change the VALUE of KEY

Application updates key’s VALUE, performs SET operation

Couchbase driver hashes KEY, identifies KEY’s master server SET request sent over

network to master server

Couchbase replicates KEY-VALUE pair, caches it in memory and stores it to disk

1

2

3 4

5

Page 23: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

GETTING STARTED

Intro. Demos. Hadoop Connector

Views Summary

Page 24: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

•  Operating systems supported

•  RedHat – RPM

•  Ubuntu – dpkg

•  Windows

•  Mac OS

•  GUI Installer

•  Unattended installation

•  Quickstart once installed (invokes the Admin Console) •  http://<hostname>:8091

Installation of the Server for development

Page 25: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

COUCHBASE CLIENT INTERFACE

Intro. Demos. Hadoop Connector

Views Summary

Page 26: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Client Setup: Getting Cluster Configuration

26

Server Node

Couchbase Client

http://myserver:8091/

{ … "bucketCapabilities": [ "touch", "sync", "couchapi" ], "bucketCapabilitiesVer": "sync-1.0", "bucketType": ”couchbase", "name": "default", "nodeLocator": "vbucket", "nodes": [ ….

Cluster Configuration over REST

Server Node

Server Node

Server Node

Page 27: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Client at Runtime: Adding a node

27

Couchbase Client

Server Node

Server Node

Server Node

Server Node

Server Node

Cluster ���Topology���Update

New node coming online

Page 28: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Connect to the Cluster URI of any cluster node

Opening a Connection (Java)

List<URI> uris = new LinkedList<URI>(); uris.add(URI.create("http://127.0.0.1:8091/pools")); try { client = new CouchbaseClient(uris, "default", ""); } catch (Exception e) { System.err.println("Error connecting to Couchbase: " + e.getMessage()); System.exit(0); }

Page 29: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

PROTOCOL OVERVIEW

Intro. Demos Hadoop Connector

Views Summary

Page 30: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Store Operations

Operation Description add() Adds new value if key does not exist or returns error. replace() Replaces existing value if specified key already exists. set() Sets new value for specified key.

TTL allow for ‘expiry’ time specified in seconds:

Expiry Value Description Values < 30*24*60*60 Time in seconds to expiry Values > 30*24*60*60 Absolute time from epoch for expiry

Page 31: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Retrieve Operations

Operation Description get() Get a value. getAndTouch() Get a value and update the expiry time. getBulk() Get multiple values simultaneously, more efficient. gets() Get a value and the CAS value.

Page 32: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

ASYNCHRONOUS OPERATIONS

Intro. Advanced opertations

Doc. Design Views Summary

Page 33: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

•  Synchronous commands –  Force wait on application until return from server

•  Asynchronous commands –  Allow background operation until response available –  Operations return Future object

Couchbase Java Client Asynchronous Functions

Page 34: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Set Operation is asynchronous // Do an asynchronous set!OperationFuture<Boolean> setOp = ! client.set(KEY, EXP_TIME, VALUE);!!// Do something not related to set …!!// Check to see if our set succeeded!// We block when we call the get()!if (setOp.get().booleanValue()) {! System.out.println("Set Succeeded");!} else {! System.err.println("Set failed: " + ! setOp.getStatus().getMessage());!}!

It’s Asynchronous

Page 35: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Compare and Swap Operations >  Often referred to as “CAS” >  Optimistic concurrency control >  Available with many mutation operations, depending on client.

Get with Lock >  Often referred to as “GETL” >  Pessimistic concurrency control >  Locks have a short TTL >  Locks released with CAS operations >  Useful when working with object graphs

Distributed System Design: Concurrency Controls

35

Actor 1 Actor 2

Couchbase Server

CAS mismatch Success

A

B

F

C D

E

Page 36: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

CAS Example CASValue<Object> casv = client.gets(KEY);! Thread.sleep(random.nextInt(1000)); // a random workload! ! // Wake up and do a set based on the previous CAS value! Future<CASResponse> setOp =! client.asyncCAS(KEY, casv.getCas(), random.nextLong());!! // Wait for the CAS Response! try {! if (setOp.get().equals(CASResponse.OK)) {! System.out.println("Set Succeeded");! } else {! System.err.println("Set failed: ");! }! }! …!

CAS Operation

Page 37: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

CONCURRENCY

Intro. Demos Hadoop Connector

Views Summary

Page 38: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

KEYS AND RELATIONSHIPS

Intro. Demos Hadoop Connector

Views Summary

Page 39: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Couchbase >  Distributed Key, Value pairs >  De-normalized Data >  Fast access Session store Hadoop >  Unstructured/semi-structured data >  Analytics and frequent data analysis >  Leverage the Hadoop Ecosystem (Pig, Hive, Hbase, etc.)

Utilizing the strengths of the Platform

Page 40: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Ad Targeting >  Decisions have to be made in milliseconds based on historical trends, location,

user preferences, etc. Virtual Reality >  Sub-second response time based on data available from number of different

sources Gaming applications >  Responsive applications that integrate with data in the back end

Use cases

Page 41: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

SQOOP

Page 42: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

•  Easily Import/Export Data into Hadoop

•  Generate Datatypes for use in MapReduce Applications

•  Integrate with Pig, Hive and Hbase

•  Easily export Data from Hadoop

Sqoop HDFS Hive HBase

Hadoop Pig

Introducing Sqoop

Page 43: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

COUCHBASE PLUGIN

Page 44: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

•  Based on the Couchbase Tap Interface •  Allows importing and exporting of entire database key mutations

Couchbase HDFS

1.  Data imported via Tap mechanism 2. Hadoop

Processing

3. Data exported back to Couchbase

Couchbase Plugin

Page 45: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

$ sqoop import –-connect http://localhost:8091/pools --table DUMP

$ sqoop import –-connect http://localhost:8091/pools --table BACKFILL_5

$ sqoop export --connect http://localhost:8091/pools

--table DUMP –export-dir DUMP

For Imports, table must be: >  DUMP: All keys currently in Couchbase >  BACKFILL_n: All key mutations for n minutes Specified –username maps to bucket >  By default set to “default” bucket

Couchbase Import and Export

Page 46: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Couchbase >  System of Record >  Provides fast access to users and being able to record events Hadoop >  Analytics >  Generates a count of the # of clicks

Ad Targeting Application

Couchbase HDFS

1.  User/Event Data is imported to HDFS

3. Consolidated data Is used to target ads.

2. Event data is consolidated with Map/Reduce

Page 47: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Importing to Hadoop

Page 48: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Counting the clicks

Page 49: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Analyze using Pig

Page 50: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Counting the ads

Page 51: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Analyze and Consolidate

Page 52: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

THE VIEW AND QUERY API

Intro. Demos Hadoop Connector

Views Summary

Page 53: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Identical feature set to Couchbase Server 1.8 >  Cache-layer >  High-performance >  Cluster >  Elastic >  Core interface Adds >  Document based storage (JSON) >  Views with materialized indexes >  Querying

Couchbase Server 2.0

Page 54: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

JSON (JavaScript Object Notation) >  Lightweight Data-interchange format >  Easy for humans to read and manipulate >  Easy for machines to parse and generate (many parsers are available) >  JSON is language independent (although it uses similar constructs)

Why use JSON?

Page 55: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

•  Views create perspectives on a collection of documents •  Views can cover a few different use cases

–  Simple secondary indexes (the most common) –  Aggregation functions

•  Example: count the number of North American Ales

–  Organizing related data •  Use Map/Reduce

–  Map defines the relationship between fields in documents and output table –  Reduce provides method for collating/summarizing

•  Views materialized indexes –  Views are not create until accessed –  Data writes are fast (no index) –  Index updates all changes since last update

What Are Views

Page 56: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

•  Extract fields from JSON documents and produce an index of the selected information

View Processing

Page 57: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

map function creates a mapping between input data and output reduce function provides a summary (such as count, sum, etc.)

View Creation using Incremental Map/Reduce

Page 58: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

•  Map  outputs  one  or  more  rows  of  data  •  Map  outputs:  

>  Document  ID  >  View  Key  (user  configurable)  >  View  Value  

•  View  Key  Controls  >  SorCng  >  Querying  

•  Key  +  Value  stored  in  the  Index  •  Maps  provide  query  base

Map  Func:ons  

Page 59: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

A map function to get breweries by province function (doc) { if (doc.type == "brewery" && doc.province) { emit(doc.province, doc.name); } }

Map/Reduce – Map Function

Page 60: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

The conceptual output of the map function is

[ {"key":"Connecticut", "value":"Cottrell Brewing"}, {"key":"Connecticut","value":"Thomas Hooker Brewing Co."}, {"key":"Massachusetts","value":"Harpoon Brewing Company"}, {"key":"New York","value":"Brewery Ommegang"}, {"key":"Vermont","value":"Long Trail Brewing Company"} ]

Map/Reduce – Map Results

Page 61: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

A reduce function to count breweries by province >  Or simply use the build in _count function function (keys, values) { return values.length; } The conceptual input to the reduce function is: [ {"keys":["Connecticut", "Connecticut"], "values": ["Cottrell Brewing", "Thomas Hooker Brewing Co."]}, {"keys":["Massachusetts"],"values":["Harpoon Brewing Co."]}, {"keys":["New York"],"values":["Brewery Ommegang"]}, {"keys":["Vermont"],"values":["Long Trail Brewing Company“]} ]

Map/Reduce – Reduce Function

Page 62: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

The conceptual result of the reduce is {"key":"Connecticut","value":2}, {"key":"Massachusetts","value":1}, {"key":"New York","value":1}, {"key":"Vermont","value":1}

Map/Reduce – Reduce Results

Page 63: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

THE QUERY API

Intro. Demos Hadoop Connector

Views Summary

Page 64: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Query APIs

// map function!function (doc) {! if (doc.type == "beer") {! emit(doc._id, null);! }!}!!// Java code!Query query = new Query();!!query.setReduce(false);!query.setIncludeDocs(true);!query.setStale(Stale.FALSE);!!

Page 65: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

THE VIEW API

Intro. Demos Hadoop Connector

Views Summary

Page 66: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

View APIs with Java

// map function!function (doc) {! if (doc.type == "beer") {! emit(doc._id, null);! }!}!!// Java code!View view = client.getView("beers", "beers");!!ViewResponse result = client.query(view, query);!!Iterator<ViewRow> itr = result.iterator();! !while (itr.hasNext()) {! row = itr.next();! doc = (String) row.getDocument();! // do something!}!!!

Page 67: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

USING THE QUERY API FOR CALCULATION

Intro. Advanced opertations

Doc. Design Views Summary

Page 68: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Querying with Java – Custom Reduce

// map function!function(doc) {! if (doc.type == "beer") {! if(doc.abv) {! emit(doc.abv, 1);! }}}!!//reduce function!_count!!// Java code!View view = client.getView("beers", "beers_count_abv");!query.setGroup(true);!!while (itr.hasNext()) {! row = itr.next();! System.out.println(String.format("%s: %s”, ! row.getKey(), row.getValue()));!}!

Page 69: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

View Calculation Result

10: 2!9.6: 5!9.1: 1!9: 3!8.7: 2!8.5: 1!8: 2!7.5: 4!7: 4!6.7: 1!6.6: 1!6.2: 1!6: 2!5.9: 1!5.6: 1!5.2: 1!5: 1!4.8: 2!4.5: 1!4: 2!

Page 70: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

RESOURCES AND SUMMARY

Intro. Demos Hadoop Connector

Views Summary

Page 71: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Server 1.8 Compatible >  1.0.1/2.8.0 >  1.0.2/2.8.0 (Memcache Node imbalance) >  1.0.x/2.8.0 (Replica Read) Server 2.0 Compatible >  1.1-dp/2.8.1 (Views) >  1.1-dp2/2.8.x (Observe) >  1.1/2.8.x (Final) Other Features >  Spring Integration >  Your feedback?

!

Java Client Library Roadmap

Page 72: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

Couchbase Server Downloads >  http://www.couchbase.com/downloads-all >  http://www.couchbase.com/couchbase-server/overview Views >  http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-views.html Developing with Client libraries >  http://www.couchbase.com/develop/java/current Couchbase Java Client Library wiki – tips and tricks >  http://www.couchbase.com/wiki/display/couchbase/Couchbase+Java+Client

+Library Open Beer Database >  http://openbeerdb.com Couchbase Internals >  http://horicky.blogspot.com

Resources and Call For Action

Page 73: Developing*in*NoSQL*with* Couchbase*files.meetup.com/1775479/ChicagoBigData.pdf · •Architect and Evangelist working with developers • Speaker at JavaOne, RSA conferences, Sun

THANKS  -­‐  Q&A  

73  

Raghavan “Rags” Srinivas, Couchbase Inc.  ([email protected], @ragss)