Cloud Computing Workshop 2013, ITU
12: NoSQL in Action
Zubair Nabi
April 20, 2013
Zubair Nabi 12: NoSQL in Action April 20, 2013 1 / 33
Outline
1 Amazon’s Dynamo
2 MongoDB
3 Google BigTable
4 Cassandra
1 Amazon’s Dynamo

Introduction
Dynamo was at the forefront of the NoSQL movement and has influenced the design of many subsequent systems
Design considerations are two-fold: 1) Infrastructure and 2) Business

Infrastructure Considerations
Tens of thousands of servers and network elements distributed across the globe
Commodity off-the-shelf hardware
  - Failure is normal
Hundreds of services, all decentralized and loosely coupled

Business Considerations
Strict, internal SLAs regarding performance, reliability, and efficiency
Reliability is of paramount importance because an outage means loss in revenue and customer trust
The platform needs to be highly scalable, to support continuous growth
Most services only store and retrieve data by primary key, such as bestseller lists, shopping carts, etc.
  - No need for the complex querying and management afforded by an RDBMS

Design
1 Implemented as a partitioned system with replication and consistency windows
2 Targets applications that can tolerate weaker consistency
3 Gives high availability
4 Write operations remain possible even in the presence of partitioning amongst replicas
5 Always writeable, so conflict resolution needs to happen during reads

Conflict Resolution
A datastore can only perform simple conflict resolution
Dynamo therefore passes the buck to the application
The application is aware of the data schema and hence better suited to choose a conflict resolution mechanism
If the application does not want to implement conflict resolution, simple mechanisms, such as “last write wins”, are provided by the framework

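The two options on this slide can be contrasted with a small sketch. It is an illustration only, not Dynamo's code: the cart representation (a timestamp paired with a set of item names) and both function names are assumptions made for the example.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical representation: one cart version is (write_timestamp, items).
CartVersion = Tuple[float, FrozenSet[str]]


def last_write_wins(versions: Iterable[CartVersion]) -> FrozenSet[str]:
    """Generic policy a datastore can apply: keep the newest version only."""
    return max(versions, key=lambda version: version[0])[1]


def merge_carts(versions: Iterable[CartVersion]) -> FrozenSet[str]:
    """Schema-aware policy an application can apply: union all carts."""
    merged = set()
    for _, items in versions:
        merged |= items
    return frozenset(merged)


# Two replicas accepted different writes while partitioned.
conflicting = [(1.0, frozenset({"book"})), (2.0, frozenset({"lamp"}))]
newest_only = last_write_wins(conflicting)   # the "book" write is dropped
merged_cart = merge_carts(conflicting)       # both added items survive
```

The merge keeps every added item because the application knows a cart is a set; the datastore, which only sees opaque bytes, cannot make that choice.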
Interface
1 Simple key/value interface storing values as BLOBs
2 Operations limited to one key/value pair at a time
3 No support for hierarchical namespaces (like those in filesystems)

Node Assignment
Completely decentralized so all nodes have equal responsibilities
As nodes can be heterogeneous, work is distributed proportional to the capabilities of a node

Operations
Provides two operations:
1 get(key), returns a list of objects and a context
2 put(key, context, object)
get can return more than one object if there is more than one conflicting version
The context contains system metadata such as the object version
Keys and values are stored as an array of bytes, and only interpreted by the application

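A toy, single-process sketch of the get/put contract may help. It is not Dynamo's implementation: the class name, the integer version ids, and the use of a set of ids as the context are all assumptions chosen to keep the example short.

```python
import itertools


class ToyStore:
    """In-memory stand-in for the get/put interface described above."""

    def __init__(self):
        self._data = {}                  # key -> {version_id: value_bytes}
        self._ids = itertools.count(1)   # hypothetical version numbering

    def get(self, key):
        """Return (list of stored versions, context naming those versions)."""
        versions = self._data.get(key, {})
        return list(versions.values()), frozenset(versions)

    def put(self, key, context, obj):
        """Store obj, superseding only the versions named in the context."""
        versions = self._data.setdefault(key, {})
        for version_id in context:
            versions.pop(version_id, None)
        versions[next(self._ids)] = obj


store = ToyStore()
_, empty_context = store.get("cart")
store.put("cart", empty_context, b"book")   # two writers that both read
store.put("cart", empty_context, b"lamp")   # nothing produce a conflict
objects, context = store.get("cart")        # both versions come back
store.put("cart", context, b"book,lamp")    # the reconciled write replaces them
```

Passing the context back on put is what lets the store tell a reconciling write apart from yet another concurrent one.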
Partitioning
MD5 hash of keys determines their storage nodes
Consistent hashing to provide incremental scalability
Partitioning done across virtual nodes instead of physical ones to take hardware heterogeneity into account

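The three points above can be sketched as a small consistent-hash ring. This is a simplified illustration under stated assumptions: the class and method names, the `node#i` labelling of virtual nodes, and the vnode counts are invented for the example, and replication is left out.

```python
import bisect
import hashlib


def md5_position(text: str) -> int:
    """Map a string to a point on the ring using its 128-bit MD5 digest."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")


class HashRing:
    """Toy consistent-hash ring with virtual nodes."""

    def __init__(self):
        self._ring = []  # sorted list of (position, physical_node)

    def add_node(self, node: str, vnodes: int) -> None:
        # A more capable machine gets more virtual nodes, hence more keys.
        for i in range(vnodes):
            bisect.insort(self._ring, (md5_position(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node at or after the key's position, wrapping around.
        idx = bisect.bisect_left(self._ring, (md5_position(key),))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing()
ring.add_node("small-box", vnodes=8)
ring.add_node("big-box", vnodes=32)
owner = ring.node_for("cart:42")
```

Adding a node only inserts new points on the ring, so the only keys that change owner are the ones the new node takes over; that is the incremental scalability the slide refers to.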
2 MongoDB

Introduction
MongoDB is a schemaless document database written in C++
Used by a large number of organizations including SourceForge.net, foursquare, the New York Times, bit.ly, Craigslist, SAP, MTV, EA Sports, GitHub, etc.
Databases are distributed over multiple servers

Databases and Collections
Databases contain collections (“named groupings”) of documents
Documents within a collection might be heterogeneous
But a good strategy is to create a database collection for each object type
A collection is created automatically when the first document is inserted into it

Hierarchical Namespaces
Collections can be organized into a hierarchical structure using a dot-notation
  - For instance, the collections wiki.articles, wiki.categories and wiki.authors exist within the namespace wiki
The collection namespace itself is flat; the hierarchical structure exists only for the user

Documents
Unit of data storage
Conceptually similar to an XML document, JSON document, etc.
Documents are persisted in Binary JSON (BSON)
Easy to convert between BSON and JSON and between BSON and other programming language structures
Possible to insert (insert), search (find), and update a document (save)

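The three operations named above can be mimicked with a toy in-memory collection. This is not the MongoDB driver or its API: `ToyCollection`, its integer ids, and the equality-only matching in `find` are assumptions made so the sketch runs without a server.

```python
import itertools


class ToyCollection:
    """In-memory imitation of insert / find / save on schemaless documents."""

    def __init__(self):
        self._docs = {}                  # _id -> document (a plain dict)
        self._ids = itertools.count(1)   # stand-in for generated object IDs

    def insert(self, doc):
        doc = dict(doc)
        doc.setdefault("_id", next(self._ids))
        self._docs[doc["_id"]] = doc
        return doc["_id"]

    def find(self, query=None):
        """Return copies of documents whose fields equal those in query."""
        query = query or {}
        return [dict(d) for d in self._docs.values()
                if all(d.get(k) == v for k, v in query.items())]

    def save(self, doc):
        """Replace the stored document with the same _id, else insert it."""
        if doc.get("_id") in self._docs:
            self._docs[doc["_id"]] = dict(doc)
            return doc["_id"]
        return self.insert(doc)


articles = ToyCollection()
articles.insert({"title": "NoSQL", "tags": ["db"]})
articles.insert({"name": "a differently shaped document"})  # heterogeneous
doc = articles.find({"title": "NoSQL"})[0]
doc["title"] = "NoSQL in Action"
articles.save(doc)
```

Documents are ordinary dictionaries here, which mirrors how BSON documents map onto native structures in a client language.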
Datatypes
Scalar: boolean, integer, double
Character sequence: string, code, etc.
BSON-objects: object
Object ID: To identify documents within a collection
Misc: null, array, date

References
No mechanism for foreign keys
References between documents need to be resolved by client applications

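Resolving a reference on the client amounts to a second lookup. The sketch below uses plain dictionaries in place of two collections; the field name `author_id`, the sample documents, and the helper name are all hypothetical.

```python
# Two "collections" held as dicts keyed by document id (illustrative data).
authors = {"a1": {"_id": "a1", "name": "Alice"}}
articles = {"p1": {"_id": "p1", "title": "Hello", "author_id": "a1"}}


def load_article_with_author(article_id):
    """Fetch an article, then fetch the author it refers to by id."""
    article = dict(articles[article_id])       # first lookup
    author_id = article.pop("author_id")
    article["author"] = authors.get(author_id)  # second lookup, done by client
    return article


resolved = load_article_with_author("p1")
```

Nothing in the store guarantees that `author_id` points at an existing document, so the client also has to decide what a missing author means; here it simply becomes `None`.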
Transaction Properties
Atomicity is provided only for update and delete operations
Allows code to be executed locally on database nodes (server-side code execution)
Three different strategies for server-side execution:
1 Execution of arbitrary code on a single node via the eval operator
2 Aggregation via count, group, and distinct
3 MapReduce code execution on multiple nodes

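The third strategy can be illustrated with a single-process simulation. This is not MongoDB's own mapReduce interface: the function names, the `tags` field, and the use of nested lists to stand in for nodes are assumptions made for the example.

```python
from collections import defaultdict


def map_doc(doc):
    """Map step: emit (tag, 1) for every tag found in a document."""
    for tag in doc.get("tags", []):
        yield tag, 1


def reduce_counts(key, values):
    """Reduce step: combine all values emitted for one key."""
    return sum(values)


def map_reduce(shards, mapper, reducer):
    """Run mapper over every shard, group by key, then reduce each group."""
    grouped = defaultdict(list)
    for shard in shards:              # each inner list stands in for one node
        for doc in shard:
            for key, value in mapper(doc):
                grouped[key].append(value)
    return {key: reducer(key, values) for key, values in grouped.items()}


shards = [
    [{"tags": ["nosql", "cloud"]}],
    [{"tags": ["nosql"]}, {"title": "untagged"}],
]
tag_counts = map_reduce(shards, map_doc, reduce_counts)
```

In a real deployment the map step would run next to the data on each node and only the emitted pairs would travel over the network.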
3 Google BigTable

Introduction
BigTable supports a relaxed relational model that is dynamically controlled by the clients
Clients can reason about the locality properties of the data
Data indexing can be row-wise as well as column-wise
Data can be delivered either out of memory or from disk
Used internally by Google for more than 60 projects including Google Earth, Google Analytics, Orkut, and Google Docs

Data Model
Values stored as arrays of bytes which need to be interpreted by the clients
Values are addressed by a 3-tuple (row-key, column-key, timestamp)
Row keys are strings of up to 64 KB
Rows are maintained in lexicographic order and are dynamically partitioned into tablets
  - The unit of distribution and load balancing
Reads can be made efficient (only having to access a small number of servers) by wisely choosing row keys
  - Row ranges with small lexicographic distances are partitioned into fewer tablets
  - For instance, storing URLs with their hostname components reversed: com.cnn.blogs, com.cnn.www, etc.

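The effect of the reversed-hostname row key can be seen in a few lines. The helper names and the fixed rows-per-tablet split are assumptions for illustration; a real store would split tablets by data size rather than row count.

```python
def row_key(hostname: str) -> str:
    """Reverse the dot-separated components: www.cnn.com -> com.cnn.www."""
    return ".".join(reversed(hostname.split(".")))


def split_into_tablets(sorted_rows, rows_per_tablet):
    """Cut a lexicographically sorted row list into contiguous ranges."""
    return [sorted_rows[i:i + rows_per_tablet]
            for i in range(0, len(sorted_rows), rows_per_tablet)]


hosts = ["www.cnn.com", "www.bbc.co.uk", "blogs.cnn.com", "maps.google.com"]
sorted_rows = sorted(row_key(h) for h in hosts)
tablets = split_into_tablets(sorted_rows, rows_per_tablet=2)
```

With these four hosts the two cnn.com rows sort next to each other and fall into the same range, so a scan over one domain touches a single tablet.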
Columns
No limit on the number of columns per table
Columns grouped into sets called column families based on their key prefix
  - Basic unit of access control
  - Expected to store the same or similar type of data so that it can be compressed
  - Need to be created before data can be stored in a column

Timestamps
64-bit integers that represent different versions of a cell value
Value assigned by either the datastore or the client
Cells ordered in decreasing order of their timestamp
Automatic garbage collection can be used to remove revisions

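Column families and timestamped versions can be combined in one toy table. This is a sketch under assumptions, not BigTable's API: the class and method names, the `family:qualifier` column-key format, and the keep-the-newest-N garbage-collection policy are choices made for the example.

```python
class ToyTable:
    """Cells addressed by (row key, column key, timestamp), newest first."""

    def __init__(self, max_versions=3):
        self._families = set()
        self._cells = {}   # (row, column) -> [(timestamp, value), ...]
        self._max_versions = max_versions

    def create_family(self, name):
        self._families.add(name)

    def put(self, row, column, timestamp, value):
        family = column.split(":", 1)[0]       # the key prefix names the family
        if family not in self._families:
            raise KeyError(f"column family {family!r} does not exist")
        versions = self._cells.setdefault((row, column), [])
        versions.append((timestamp, value))
        versions.sort(key=lambda tv: tv[0], reverse=True)  # decreasing order
        del versions[self._max_versions:]      # garbage-collect old revisions

    def get(self, row, column, timestamp=None):
        """Newest value, or the newest one at or before the given timestamp."""
        for ts, value in self._cells.get((row, column), []):
            if timestamp is None or ts <= timestamp:
                return value
        return None


table = ToyTable(max_versions=2)
table.create_family("contents")
for ts, body in [(1, b"v1"), (2, b"v2"), (3, b"v3")]:
    table.put("com.cnn.www", "contents:html", ts, body)
latest = table.get("com.cnn.www", "contents:html")
```

Writing to a column whose family was never created raises an error, which mirrors the rule that families must exist before data is stored in them.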
API
Read operations for lookup, selection, etc.
Write operations for creation, update, and deletion of values
Operations to create and delete tables and column families
Administrative operations to modify store configuration and metadata
Hooks that let MapReduce jobs use the store as an input source and an output target
Transactions are atomic at the single-row level
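Single-row atomicity can be illustrated with a toy table that takes one lock per row. This is only a sketch under that assumption: `ToyTable`, `mutate_row`, and the locking scheme are hypothetical and say nothing about how BigTable implements it internally.

```python
import threading


class ToyTable:
    """Toy store where every mutation of a single row is atomic."""

    def __init__(self):
        self._rows = {}   # row key -> {column key: value}
        self._locks = {}  # row key -> lock guarding that row
        self._guard = threading.Lock()  # protects the two dicts above

    def _entry(self, row_key):
        # Look up (or create) the row and its lock under the global guard.
        with self._guard:
            row = self._rows.setdefault(row_key, {})
            lock = self._locks.setdefault(row_key, threading.Lock())
            return row, lock

    def read(self, row_key):
        row, lock = self._entry(row_key)
        with lock:
            return dict(row)  # copy, so callers never see partial updates

    def mutate_row(self, row_key, mutate):
        """Apply mutate(row) atomically with respect to the same row."""
        row, lock = self._entry(row_key)
        with lock:
            mutate(row)
```

A read-modify-write such as incrementing a counter stays consistent when several threads call `mutate_row` on the same row key. Nothing here coordinates updates that span two different rows, which mirrors the single-row limit stated above.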
Architecture
Implemented atop GFS
Multiple tablet servers and a single master
HBase
Open source clone of BigTable, written in Java
Implemented atop HDFS
HBase can be the source and/or the sink of Hadoop jobs
Facebook Messages is implemented using HBase
Cassandra
Introduction
Borrows concepts from both Dynamo and BigTable
Originally developed by Facebook but now an Apache open source project
Designed for Facebook's Inbox Search, to efficiently store, index, and search messages
Design Goals
Processing of a large amount of data
Highly scalable
Reliability at a massive scale
High throughput writes without sacrificing read efficiency
Data Model
A table is a distributed multidimensional map indexed by a key
Rows are identified by a string key, and operations on a row are atomic per replica regardless of the number of columns involved
Column families encapsulate columns and super columns
Columns have a name and store a number of values per row, each with a timestamp
Super columns are columns with sub columns
Only three operations: get, insert, and delete
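The data model above can be pictured as nested maps. The sketch below is a single-node toy, with `ToyCassandraTable` and its method signatures invented for illustration; it is not Cassandra's client API, and super columns (one further level of nesting) are left out for brevity.

```python
class ToyCassandraTable:
    """Toy map: row key -> column family -> column -> [(timestamp, value)]."""

    def __init__(self):
        self.rows = {}

    def insert(self, key, family, column, value, timestamp):
        # Each column keeps a list of timestamped values for the row.
        columns = self.rows.setdefault(key, {}).setdefault(family, {})
        columns.setdefault(column, []).append((timestamp, value))

    def get(self, key, family, column):
        # Return the value carrying the highest timestamp, if any.
        versions = self.rows.get(key, {}).get(family, {}).get(column)
        if not versions:
            return None
        return max(versions, key=lambda tv: tv[0])[1]

    def delete(self, key, family, column):
        # Remove the column from the row; missing entries are ignored.
        self.rows.get(key, {}).get(family, {}).pop(column, None)
```

For example, inserting two values under the same row key, family, and column and then calling `get` returns the one with the later timestamp.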
References
1 NoSQL Databases: https://oak.cs.ucla.edu/cs144/handouts/nosqldbs.pdf