nosql brownbag
© 2011 Fidelity National Information Services, Inc. and its subsidiaries.
NoSQL – An Introduction
Sandeep Kumar
Not Only SQL
Agenda
•Introduction to NoSQL
•Motivation behind NoSQL
•Where NoSQL Fits
•Theory of NoSQL – CAP (NoSQL principles)
•Types of NoSQL Databases
•Example with MongoDB
•Match with your favorite (RDBMS)
•Question & Answer
Introduction to NoSQL
• NoSQL is a whole new way of thinking about a database.
• It is sometimes referred to as 'not only SQL'.
• Schema-less
– It is not built on tables and does not employ SQL to manipulate data.
• It may not provide full ACID guarantees, but it still has a distributed and fault-tolerant architecture.
• There is more than one storage mechanism that can be used, depending on the need.
• Mostly open source
• May not be the best solution for all situations
• Many of the big names in e-commerce and social networking, such as Flipkart, Amazon, Facebook and LinkedIn, have adopted this concept.
Motivation behind NoSQL
• Scale-out approach (sharding)
The group of servers together forms a clustered storage array and provides LUNs (Logical Unit Numbers) or file shares over a network.
• NoSQL (non-relational) database technology
• NoSQL refers to progressive data management engines that go beyond legacy relational databases in satisfying the needs of today’s modern business applications.
• A very flexible data model, horizontal scalability, distributed architectures, and the use of languages and interfaces that are “not only” SQL typically characterize NoSQL technology.
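The scale-out (sharding) idea above can be illustrated with a minimal sketch. It assumes hash-based placement of keys; the shard names and the FNV-1a-style hash are hypothetical choices for illustration, not something the slides prescribe.

```javascript
// Hypothetical shard names; a real cluster would read these from its configuration.
const SHARDS = ["shard-0", "shard-1", "shard-2"];

// FNV-1a-style hash over the key's code points (illustrative, not cryptographic).
function hashKey(key) {
  let h = 0x811c9dc5; // FNV offset basis
  for (const ch of String(key)) {
    h ^= ch.codePointAt(0);
    h = Math.imul(h, 0x01000193) >>> 0; // multiply by the FNV prime, keep unsigned 32-bit
  }
  return h;
}

// Pick the shard that owns a key: the same key always maps to the same shard.
function shardFor(key, shards = SHARDS) {
  return shards[hashKey(key) % shards.length];
}
```

Adding a server to `SHARDS` spreads keys over more machines, which is the essence of scaling out. Note that simple modulo placement moves most keys when the shard count changes, so real systems usually use more elaborate schemes.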
How it changed the way of handling data for…
Where NoSQL Fits
• The first reason to use NoSQL is that you have big data projects to tackle. A big data project is normally typified by:
– Data velocity – lots of data coming in very quickly, possibly from different locations
– Data variety – storage of data that is structured, semi-structured, and unstructured
– Data volume – data that involves many terabytes or petabytes in size
– Data complexity – data that is stored and managed in different locales, data centers, or cloud geo-zones
• Scale-out approach.
• Application-driven schema.
• You need a better architecture.
• You need continuous availability for an application.
• You need location independence for a system.
• You need modern transaction support.
• You need a more flexible data model.
Theory of NoSQL - CAP
Consistency
• All nodes see the same data at the same time
Availability
• A guarantee that every request receives a response about whether it succeeded or failed
Partition Tolerance
• Multiple entry points
• System continues to operate despite arbitrary partitioning due to network failures
CAP Theorem: satisfying all three at the same time is impossible. A system can provide at most two of them: CA, AP, or CP.
Rule of thumb: pre-join the data.
Types of NoSQL Databases
Key-Value
Data Model:
• Global key-value mapping
• Big scalable HashMap
• Highly fault tolerant (typically)
Pros:
• Simple data model
• Scalable
Cons:
• Create your own “foreign keys”
• Poor for complex data
Key Value
"FIS" : "C-5, Sector-126, Noida, India – 301306"
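As a rough sketch of the key-value model above, a plain JavaScript `Map` can stand in for the "big scalable HashMap". The key naming convention (`FIS:employee:1`, `FIS:employees`) and the employee record are hypothetical; they illustrate what "create your own foreign keys" means in practice.

```javascript
// One global key -> value mapping; values are opaque to the store.
const store = new Map();
store.set("FIS", "C-5, Sector-126, Noida, India – 301306");

// The store knows nothing about relationships, so the application
// encodes them in key names (a hypothetical convention).
store.set("FIS:employee:1", JSON.stringify({ first: "John", last: "Doe" }));
store.set("FIS:employees", JSON.stringify(["FIS:employee:1"]));

// Follow the hand-made "foreign keys" to load a company's employees.
function employeesOf(company) {
  const keys = JSON.parse(store.get(`${company}:employees`) || "[]");
  return keys.map((k) => JSON.parse(store.get(k)));
}
```

Every lookup is a single `get` by key, which is why the model scales well but becomes awkward for complex data.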
Types of NoSQL Databases (Cont…)
Document Oriented
Data Model:
• A collection of documents
• A document is a key value collection
• Index-centric, lots of map-reduce
Pros:
• Simple, Powerful data model
• Scalable
Cons:
• Poor for interconnected data
• Query model limited to keys and indexes
• Map reduce for larger queries
Document
{company: "FIS", OfficeAddress: "B-25, Sector-58, Noida, India – 201301"}
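The "index-centric" nature of a document store can be sketched in plain JavaScript. This is only an in-memory illustration, not how any particular database implements indexes; `buildIndex`, `findByCompany` and the second document are hypothetical.

```javascript
// A collection is a list of documents; each document is a set of key-value pairs.
const collection = [
  { _id: 1, company: "FIS", OfficeAddress: "B-25, Sector-58, Noida, India – 201301" },
  { _id: 2, company: "Example Corp", OfficeAddress: "1 Example Street" }, // hypothetical document
];

// Build a secondary index: field value -> list of matching documents.
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    const key = doc[field];
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc);
  }
  return index;
}

// Queries on an indexed field become a single map lookup.
const byCompany = buildIndex(collection, "company");
const findByCompany = (name) => byCompany.get(name) || [];
```

Queries on fields without an index have to scan every document, which is the limitation listed under the cons above.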
Types of NoSQL Databases (Cont…)
Column Family
Data Model:
• A big table, with column families
• Map Reduce for querying/processing
Pros:
• Supports Semi-Structured Data
• Naturally Indexed (columns)
• Scalable
Cons:
• Poor for interconnected data
Column Families
{FIS: {Address: {city: Noida, pincode: 201301}, details: {strength: 250, projects: 20}}}
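The column-family layout above can be pictured as nested maps: row key, then column family, then column. The sketch below is an in-memory illustration under that assumption; `put` and `getFamily` are hypothetical helper names, not the API of any specific database.

```javascript
// row key -> column family -> column -> value
const table = new Map();

// Write one cell, creating the row and family on demand.
function put(rowKey, family, column, value) {
  if (!table.has(rowKey)) table.set(rowKey, new Map());
  const row = table.get(rowKey);
  if (!row.has(family)) row.set(family, new Map());
  row.get(family).set(column, value);
}

// Read all columns of one family for a row; empty object if absent.
function getFamily(rowKey, family) {
  const row = table.get(rowKey);
  const cols = row && row.get(family);
  return cols ? Object.fromEntries(cols) : {};
}

put("FIS", "Address", "city", "Noida");
put("FIS", "Address", "pincode", 201301);
put("FIS", "details", "strength", 250);
put("FIS", "details", "projects", 20);
```

Rows do not need to share the same columns, which is why this model suits semi-structured data.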
Types of NoSQL Databases (Cont…)
Graph Family
Data Model:
• Nodes and Relationships
Pros:
• Powerful data model, as general as RDBMS
• Connected data locally indexed
• Easy to query
Cons:
• Sharding (Still under optimization)
• Scales UP reasonably well
• Requires rewiring your brain
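A minimal sketch of the nodes-and-relationships model, assuming a tiny in-memory graph. The node names and the `KNOWS` relationship type are hypothetical, and the traversal is a plain breadth-first search rather than any graph database's query language.

```javascript
const nodes = new Map(); // node id -> properties
const edges = [];        // { from, type, to }

function addNode(id, props = {}) {
  nodes.set(id, props);
}

function relate(from, type, to) {
  edges.push({ from, type, to });
}

// Direct neighbours reached over one relationship of the given type.
function neighbours(id, type) {
  return edges.filter((e) => e.from === id && e.type === type).map((e) => e.to);
}

// Everything reachable from `start` by following relationships of one type.
function reachable(start, type) {
  const seen = new Set([start]);
  const queue = [start];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const next of neighbours(current, type)) {
      if (!seen.has(next)) {
        seen.add(next);
        queue.push(next);
      }
    }
  }
  seen.delete(start);
  return [...seen];
}

addNode("alice"); addNode("bob"); addNode("carol");
relate("alice", "KNOWS", "bob");
relate("bob", "KNOWS", "carol");
```

A "friends of friends" question becomes a traversal instead of a chain of joins, which is where the model pays off for connected data.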
Example with MongoDB
Document Store
RDBMS          MongoDB
Database       Database
Table, View    Collection
Row            Document (JSON, BSON)
Column         Field
Index          Index
Join           Embedded Document
Foreign Key    Reference
Partition      Shard
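The last rows of the mapping (Join to Embedded Document, Foreign Key to Reference) can be sketched with plain JavaScript objects. The field names `address` and `address_id` and the sample values are hypothetical; this only shows the two modeling choices, not MongoDB itself.

```javascript
// Embedded (pre-joined): the address lives inside the user document,
// so one read returns everything and no join is needed.
const userEmbedded = {
  _id: 1,
  first: "John",
  last: "Doe",
  address: { city: "Noida", pincode: 201301 },
};

// Referenced: the user stores only the _id of an address document,
// and the application resolves it with a second lookup.
const addresses = new Map([[10, { _id: 10, city: "Noida", pincode: 201301 }]]);
const userReferenced = { _id: 1, first: "John", last: "Doe", address_id: 10 };

function resolveAddress(user) {
  return addresses.get(user.address_id) || null;
}
```

Embedding favors read speed, while referencing avoids duplicating data that many documents share.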
> db.user.findOne({age: 34})
{
  "_id" : ObjectId("5114e0bd42…"),
  "first" : "Sandeep",
  "last" : "Kumar",
  "age" : 34,
  "interests" : ["Reading", "Mountain Biking"],
  "favorites" : { "color" : "Blue" }
}
CRUD Example
> db.user.insert({first: "John", last: "Doe", age: 39})
> db.user.find()
{
  "_id" : ObjectId("51…"),
  "first" : "John", "last" : "Doe", "age" : 39
}
> db.user.update({"_id" : ObjectId("51…")}, {$set: {age: 40, salary: 7000}})
> db.user.remove({"first": /^J/})
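To make the semantics of the four shell commands above concrete, here is an in-memory imitation in plain JavaScript. It is not the MongoDB driver or shell; `insert`, `find`, `updateSet` and `remove` are hypothetical helpers, and the numeric `_id` is a stand-in for an ObjectId.

```javascript
let users = [];
let nextId = 1;

// Create: store a copy of the document with a generated _id.
function insert(doc) {
  const stored = { _id: nextId++, ...doc };
  users.push(stored);
  return stored;
}

// Read: return every document matching the predicate (all by default).
function find(predicate = () => true) {
  return users.filter(predicate);
}

// Update: like $set, overwrite or add only the given fields.
function updateSet(id, fields) {
  const doc = users.find((u) => u._id === id);
  if (doc) Object.assign(doc, fields);
  return doc || null;
}

// Delete: drop every matching document and report how many were removed.
function remove(predicate) {
  const before = users.length;
  users = users.filter((u) => !predicate(u));
  return before - users.length;
}

const john = insert({ first: "John", last: "Doe", age: 39 });
updateSet(john._id, { age: 40, salary: 7000 }); // adds salary, changes age
const removed = remove((u) => /^J/.test(u.first)); // first name starts with "J"
```

As with `$set` in the shell example, the update adds the new `salary` field without any schema change.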
Match with your favorite (RDBMS)
Let’s conclude…
RDBMS                                         NoSQL
Scale up                                      Scale out
Offers a big feature set and data integrity   May not need all features, cost and complexity
Data modeling: schema                         Schema-less
Relational data model                         Pre-joined document data model
Fits structured data                          Best suited to unstructured data
Doesn’t fit modern Agile development          Best fit for Agile development
Requires great precision                      Faster data processing
Easy querying                                 Overhead for complex queries
Offers a high degree of reliability           Does not offer the same degree of reliability
Consistency                                   Compromises on consistency
Customer support                              No customer support
References :
http://www.leavcom.com/pdf/NoSQL.pdf
http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/Couchbase_Whitepaper_Transitioning_Relational_to_NoSQL.pdf
https://en.wikipedia.org/wiki/NoSQL
Thanks for participating!