seminar.2010.nosql

July 11th, 2010

Apples, Oranges and NOSQL

Roi Aldaag Architect & ConsultantNadav Wiener Architect & Consultant

3

» What is NoSQL?

» What’s “wrong” with RDBMS?

» Why now?

Introduction

Agenda

4

» Scaling

» CAP Theorem

» ACID vs. BASE

RDBMS vs. NoSQL

Agenda

5

» Key / Value

» Column

» Document

» Graph

NoSQL Taxonomy

Agenda

6

» Comparing Apples to Oranges

» Polyglot Persistence

How to choose?

Agenda

Introduction

8

Introduction

Question: What do they all have in common?

9

Before we answer – some facts:

Introduction

10

Before we answer – some facts:

Introduction

Daily Page Views

Daily Visitors

Data size

7.8x109

620x106

Petabytes

7.1x109

500x106

Petabytes

550x106

56x106

Petabytes

350x106

37x106

Terabytes

82x106

12x106

Terabytes

July, 2010: http://www.alexa.com

11

Introduction

Answer: They use NoSQL data stores

12

Why!?

Introduction

13

» ACID doesn’t scale well horizontally

Sharding breaks relations

Joins are inefficient

» Transactions overhead

» Schema is not flexible

Predfined

Hard to evolve

Relational DBs Have Scaling Limitations

Introduction

14

» NO SQL / Not Only SQL

» A collective description of Open Source, Non-relational, data stores

Highly distributed

Highly scalable

Not ACID and... doesn’t use SQL

» Term coined in a convention in 2009 called “NoSQL” (Eric Evans)

» Started a movement that is gaining momentum

What is NoSQL?

Introduction

15

Introduction

16

» NoSQL data stores predate RDBMS (1970)

But remained a niche

» RDBMS – most popular and generic option

» Web 2.0 introduced new requirements:

Exponential increase in data

Information connectivity

Semi-structured data

» NoSQL data stores had answers

When time was right

When RDBMSs didn’t

Why now?

Introduction

17

It’s theory time:

Introduction

18

Scaling

19

» Adding resources to a single node in a system

» Add more CPUs or memory

» Move system to a larger machine

» Pros:

Quick and Simple

» Cons:

Outgrowing the capacity of largestsystem available (More’s law)

Expensive

Creates vendor lock-in

Scaling Up

Scaling

20

» Add more nodes to a system

» Functional Scaling (vertical)

Grouping data by function and spreading functional groups across databases

» Sharding (horizontal)

Splitting same functional data across multiple databases

» Pros: More flexible

» Cons: More complex

Scaling Out

Scaling

Distributed

Databases

22

Distributed Databases

» Many nodes

» Same databaseNode 1 Node 2

Node 3

23

» Consistency

All clients can see the same data

» Availability

All clients can always access data

» Partition tolerance

The ability to continue working when the network topology is broken

The ability to recover once the network is healed

What are the requirements from distributed databases?


24

» You can fully satisfy at most 2 out of 3

Compromise on 3rd

» Not “all or nothing”

Choose various levels of consistency, availability or partition tolerance

» Recognize which of the CAP rules your business needs for the task

CAP Theorem (E. Brewer, N. Lynch)


25

» Partition Tolerance is compromised

» Single site clusters (easier to ensure all nodes are always in contact)

» When a network partition occurs, the system blocks

» e.g. Two Phase Commit (2PC)

CA: Consistency & Availability


Partition

Tolerance

26

» Availability is compromised

» Access to some data may be temporarily limited

» The rest is still consistent/accurate

» e.g. Sharded database

» TBD sample

CP: Consistency & Partitioning


Partition

Tolerance

27

» Consistency is compromised

» System is still available under partitioning

» Some data returned may be temporarily not up-to-date

» Requires conflict resolution strategy

» e.g. DNS, caches, Master/Slave replication

» TBD sample

AP: Availability & Partitioning


Partition

Tolerance

ACID vs. BASE

29

» Atomicity

When a part of the transaction fails -> the entire transaction fails; Database state is left unchanged

» Consistency

A transaction takes database from one consistent state to another

» Isolation

A transaction can't see dirty state from other transactions

» Durability

Commit means commit.

ACID – a quick recap

ACID vs. BASE

30

» The CAP compliment of ACID

Just had to be called BASE

Backronym:

» Basically Available

» Soft State

» Eventually Consistent

BASE

ACID vs. BASE

31

» RDBMSs strive to provide ACID guarantees

ACID forces consistency

» NoSQL solutions often scale through BASE

BASE accepts that conflicts will happen

RDBMS & ACID / NoSQL & BASE

ACID vs. BASE

Taxonomy

33

Taxonomy

Key / Value Column

Graph

TXT

XML

BIN

Document

34

Taxonomy

Key / Value Databases

35

» Simple Key / Value lookups (DHT)

» Value is opaque

» Focus on scaling to huge amounts of data

» Designed to handle massive load

» E.g.

Riak

Project Voldemort

Redis

Key/Value Stores

Taxonomy

Based on Amazon’s

Dynamo paper

36

Key/Value e.g.: Riak

Taxonomy

» No single point of failure

» No machines are special or central

» MapReduce queries (Erlang / Javascript)

» HTTP/JSON API

» Ring cluster with automatic replication

» Elastic / partition rebalancing» Written in: Erlang, C, Javascript

» Developed by: Basho Technologies

» Java client: (jonjlee / riak-java-client)

37

Data Model


» Key / Value pairs are stored in a Bucket

» A Bucket ~ a namespace

» Each update is tracked by a Vector Clock

An algorithm for determining ordering and detecting conflicts

» When in conflict

Last wins / manual resolution

Versioning

38

» Read an object

» Store a new object

» Store an object with existing key (update)


GET /riak/bucket/key

POST /riak/bucket

PUT /riak/bucket/key

Example: REST API

39

» A framework supporting distributed computing on large data sets on clusters of machines

» Leverage parallel processing power

» Introduced by Google

» Inspired by map / reduce functions in functional programming

» Map step

» Reduce step


MapReduce

40

» Map

» Parse each document

» Emit a sequence of <word, doc_id> pairs


MapReduce example: Inverted Index

<doc_id, doc_text>

<100, >,

<200, >,

<300, >

Node

1

Node

2

Node

3

TXT1

TXT2

TXT3

<word ,doc_id>

< word1 ,100>,< word2 ,100>,

< word2 ,200>,

< word2 ,300>

41

» Reduce

» Accept all pairs for a given word

» Sort the corresponding document IDs

» Emit a <word, list(document ID)> pair


MapReduce example: Inverted Index

<word, list(document_id)>

< word1 ,(100) >,

< word2 ,(100,200)>, < word3 ,(300) >

42

Taxonomy

BigTable and

Column Oriented Databases

44

» RDBMS:

Create a central table with common attributes

Create a table per product with unique attributes

Use a join query

Alternatively create a table that holds meta data on products

» NoSQL:

Column oriented database

Use arbitrarily columns

Use Case: Manage products with diverse attributes

Taxonomy

45

» Data model: Google’s BigTable

» Infrastructure: Amazon Dynamo

» Incremental scalability

» Flexible schema

» No single point of failure (Distributed P2P)

» Optimistic replication (Gossip protocol)

» Written in: Java

» Developed by: Facebook

» Java client: e.g. Hector / Thrift

Column Store e.g.: Cassandra

Taxonomy

46

» Column

Smallest increment of data: tuple of name, value, timestamp

Data Model

Column e.g.: Cassandra

{

name: "emailAddress",

value: “[email protected]",

timestamp: 123456789

}

47

» SuperColumn

A sorted, associative, unbounded array of columns


{ // this is a SuperColumn

name: "homeAddress",

// with an unbounded array of Columns

value: {

// the keys is the name of the Column

street: {name: "street", value: "s", timestamp:...},

city: {name: "city", value: "c", timestamp:...},

zip: {name: "zip", value: "z", timestamp:...}

}

}

48

» ColumnFamily

A container (~Table) for columns sorted by their names

Column Families are referenced and sorted by row keys


Users = { // ColumnFamily

john: { // key to row in CF

"role" : "admin",

"status" : "offline",

"nick" : "dude1934"

}, // end row

fred: { // another row

"nick" : “freddy",

"email" :"[email protected]",

"age" : "25",

"gender" : "male",…

},… // more rows

} Column Family

49

» Keyspace

The outer most grouping of data (~DB Schema)

Contains ColumnFamily’s

There is no imposed relationship between ColumsFamily’s


50

» Example


Tweets CF

Timeline CFKeyspace

51

Taxonomy

Document Oriented DatabasesTXT

XML

BIN

52

» Store semi-structured documents (think JSON)

» Document versioning

» Map/Reduce based queries, sorting, aggregation, etc.

» DB is aware of internal structure

» E.g. MongoDB

CouchDB

JackRabbit (JCR JSR 170)

Document Store

Taxonomy

53

» RDBMS:

Table for each: posts, comments, tags

Foreign relations

» NoSQL:

Document storage

Store post + tags + comments as a document

Use Case: Blog with tagged posts and comments

Taxonomy

54

» MongoDB (from "humongous")

» Manages collections of JSON-like documents (BSON)

» Queries can return specific fields of documents

» Supports secondary indexes

» Atomic operations on single documents

» Developed by: 10gen

» Written in: C++

» Clients: Java, Scala and more

Document Store e.g: MongoDB

Taxonomy

55

» Suppose you host a blog, where each post is tagged:

» Notice how posts have an array of tags

Example: Blog posts

Docment e.g.: MongoDB

db.posts.save({

_id : 3,

author:"john",

title : “Apples, Oranges and NOSQL",

text : “This article will…",

tags : [ “database", “nosql" ]

});

56

» MongoDB supports secondary indexes and a query optimizer

Compound indexes are also supported


db.posts.ensureIndex({ tags: 1 });

db.posts.ensureIndex({ author: 1});

db.posts.find({ author: "john", tags: "nosql" });

// Result:

{

"_id" : 3,

"author" : "john",

"title" : "Apples, Oranges and NOSQL",

"text" : "This article will…",

"tags" : ["database", "nosql", "mongodb" ]

}

57

» Let's update our posts to include some comments:


db.posts.update({ _id: 3 }, {

$inc: { comments_count: 4},

$pushAll : {

comments: [

{ text: “Comment 1" },

{ text: “Comment 2", author: "Mr. T" },

{ text: “Comment 3" },

{ text: “Comment 4" }

]

}

});

58

Taxonomy

Graph Databases

59

» Inspired by mathematical graph theory G=(E,V)

» Models the structure of data

» Navigational data model

» Scalability / data complexity

» Data model: Key-Value pairs on Edges / Nodes

» Relationships: Edges between Nodes

» E.g. Neo4j

Pregel (Google’s PageRank)

AllegroGraph

Graph databases

Taxonomy

60

» RDBMS

Complex recursive algorithm

Multiple Self joins

Round trips to DB / bulk read and resolve in RAM

» NoSQL:

Graph Storage

Network traversal

Use Case: Connected data - deep relationship links between users in a social network

Taxonomy

61

» High-performance graph engine

» Embedded / disk based

» Work with OO model: nodes, relationships, properties

» ACID Transactions

JTA support – participate in 2PC with your RDBMS

» Developed by: Neo Technologies

» Written in: Java

» Clients: Java, client libraries in other platforms

Graph e.g.: Neo4J

Taxonomy

62

Graph e.g.: Neo4j

http://neo4j.org/

Comparing Apples to Oranges

64

» RDBMS

Databases contains tables, columns and rows

All rows the same structure

Inherent ORM mismatch

» NoSQL

Choose your data structure

Data is stored in natural structure (e.g. Documents, Graphs, Objects)

Comparing Data Structures


65

» RDBMS

Strict schema, difficult to evolve

Maintains relations and forces data integrity

» NoSQL

Structure of data can be changed dynamically

• e.g. Column stores – Cassandra

Data can sometimes be completely opaque

• e.g Key/Value – Project Voldemort

Comparing Schema Flexibility


66

» RDBMS

The data model is normalized to remove data duplication

Normalization establishes table relations

» NoSQL

Denormalization is not a dirty word

Relations are not explicitly defined

Related data is usually grouped and stored as one unit

• E.g. document, column

Comparing Normalization & Relations


67

» RDBMS

CRUD operations using SQL

Access data from multiple tables using SQL joins

Generic API such as JDBC

» NoSQL

Proprietary API and DSLs (e.g. Pig / Hive / Gremlin)

MapReduce, graph traversals

REST APIs, portable serialization formats• BSON, JSON, Apache Thrift, Memcached

Comparing Data Acces


68

» RDBMS

Slice and Dice data, then reassemble any way you like

» NoSQL

Hard to repurpose data for ad-hoc usage

• Plan ahead

Think in advance

• How and what you store

• Data access patterns

Comparing Reporting Capabilities


Summary

70

» ACID ruled exclusively in the last 40 years

doesn’t compromise on consistency

» Database industry neglected distributed DBs w/ availability

» Vacuum was filled with “NoSQL” BASE architectures

Strict A and P, minimize C compromise

» Relational databases are now trying to catch up

Why NOSQL / BASE

Summary

71

» Missing some query capabilities

joins / composite transaction

» Eventual consistency -- not for every problem

» Not a drop in replacement for RDBMS “on ACID”

» No standardization -> product lock-in

» Relatively immature (support, bugs, community)

NoSQL Limitations

Summary

72

» Relational databases and NoSQL databases are designed to meet different needs

» RDBMS-only should not be a default

» NOSQL databases outperform RDBMS’s in their particular niche

» No one size fits all / Silver bullet

...but you don’t have to choose one

Choose the right tool for the job

Summary

73

» Poly: many Glot: language

» Meshing up persistence mechanisms to best meet requirements

» Good integration stories:

E.g. Neo4j + JDBC using JTA

Polyglot Persistence

Summary

seminar.2010.nosql

Documents

data availability

nosql data stores11

data partition tolerance

nosql scaling cap theorem

base base

huge amounts of data

cap compliment of acid

distributed databasescp