nosql brownbag

NoSQL – An Introduction

Not Only SQL

Sandeep Kumar

© 2011 Fidelity National Information Services, Inc. and its subsidiaries.

Upload: sandeep-kumar

Post on 22-Jan-2018



Agenda

• Introduction to NoSQL

• Motivation behind NoSQL

• Where NoSQL Fits

• Theory of NoSQL – CAP (NoSQL principles)

• Types of NoSQL Databases

• Example with MongoDB

• Match with your favorite (RDBMS)

• Question & Answer

Introduction to NoSQL

• NoSQL is a whole new way of thinking about a database.

• It is sometimes referred to as 'Not Only SQL'.

• Schema-less

– It is not built on tables and does not use SQL to manipulate data.

• It may not provide full ACID guarantees, but it still has a distributed, fault-tolerant architecture.

• More than one storage mechanism is available; which one to use depends on the application's needs.

• Mostly open source

• May not be the best solution for every situation

• Many large e-commerce and social networking companies, such as Flipkart, Amazon, Facebook and LinkedIn, have adopted this concept.

Motivation behind NoSQL

• Scale-out approach (sharding): a group of servers together forms a clustered storage array and provides LUNs (Logical Unit Numbers) or file shares over a network.

• NoSQL (non-relational) database technology

• NoSQL refers to data management engines that go beyond legacy relational databases in meeting the needs of modern business applications.

• NoSQL technology is typically characterized by a very flexible data model, horizontal scalability, distributed architectures, and the use of languages and interfaces that are "not only" SQL.
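The scale-out (sharding) idea can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: the shard names, the `hashKey` function and the modulo routing are hypothetical and are not how any particular NoSQL product places data.

```javascript
// Hypothetical shard names; a real cluster would read these from its configuration.
const SHARDS = ["shard-0", "shard-1", "shard-2"];

// Simple deterministic string hash (illustrative, not any database's algorithm).
function hashKey(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep the value in unsigned 32 bits
  }
  return h;
}

// Route a key to one shard: hash modulo the number of shards.
function shardFor(key, shards = SHARDS) {
  return shards[hashKey(key) % shards.length];
}

// Group a list of keys by the shard each one is routed to.
function distribute(keys, shards = SHARDS) {
  const placement = new Map(shards.map((s) => [s, []]));
  for (const key of keys) {
    placement.get(shardFor(key, shards)).push(key);
  }
  return placement;
}

const placement = distribute(["user:1", "user:2", "user:3", "user:4", "user:5"]);
for (const [shard, keys] of placement) {
  console.log(shard, keys);
}
```

The same key always lands on the same shard, so reads know where to look. A plain modulo scheme moves most keys when a shard is added or removed, which is one reason production systems tend to use range-based or consistent-hashing schemes instead.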

How it changed the way of handling data for…

Where NoSQL Fits

• The first reason to use NoSQL is that you have big data projects to tackle. A big data project is normally typified by:

– Data velocity – lots of data coming in very quickly, possibly from different locations

– Data variety – storage of data that is structured, semi-structured, and unstructured

– Data volume – data that involves many terabytes or petabytes in size

– Data complexity – data that is stored and managed in different locales, data centers, or cloud geo-zones

• Scale-out approach.

• Application-driven schema.

• You need a better architecture.

• You need continuous availability for an application.

• You need location independence for a system.

• You need modern transaction support.

• You need a more flexible data model.

Theory of NoSQL - CAP

Consistency

• All nodes see the same data at the same time

Availability

• A guarantee that every request receives a response about whether it succeeded or failed

Partition Tolerance

• Multiple entry points

• The system continues to operate despite arbitrary partitioning due to network failures

CAP theorem: satisfying all three at the same time is impossible.

[Diagram: C, A and P, with the pairwise combinations CA, AP and CP]

Rule of thumb: pre-join the data.
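As a rough illustration of the CAP trade-off, the sketch below simulates two replicas and a network partition. Everything here is an assumption made for the example: the `Cluster` class, the "CP" and "AP" mode labels and the single replicated value are hypothetical, and real systems use quorums, timeouts and conflict resolution rather than a boolean flag.

```javascript
// Two in-memory replicas of one value; `partitioned` simulates a network split.
class Cluster {
  constructor(mode) {
    this.mode = mode; // "CP" or "AP" (labels chosen for this sketch)
    this.partitioned = false;
    this.replicas = { a: null, b: null };
  }

  // Write through one replica; copy to the other only while the link is up.
  write(replica, value) {
    if (this.partitioned && this.mode === "CP") {
      // Consistency first: refuse the write rather than let replicas diverge.
      return { ok: false, reason: "unavailable during partition" };
    }
    this.replicas[replica] = value;
    if (!this.partitioned) {
      const other = replica === "a" ? "b" : "a";
      this.replicas[other] = value;
    }
    return { ok: true };
  }

  read(replica) {
    return this.replicas[replica];
  }
}

function demo(mode) {
  const cluster = new Cluster(mode);
  cluster.write("a", "v1");   // replicated to both nodes
  cluster.partitioned = true; // network failure between a and b
  const result = cluster.write("a", "v2");
  return { result, a: cluster.read("a"), b: cluster.read("b") };
}

console.log("CP:", demo("CP")); // write rejected, both replicas still hold v1
console.log("AP:", demo("AP")); // write accepted, replicas now disagree
```

In "CP" mode the system stays consistent but turns the write away; in "AP" mode it keeps answering but the two replicas return different values until the partition heals.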

Types of NoSQL Databases

Key-Value

Data Model:

• Global key-value mapping

• Big scalable HashMap

• Highly fault tolerant (typically)

Pros:

• Simple data model

• Scalable

Cons:

• Create your own “foreign keys”

• Poor for complex data

Key: "FIS"

Value: "C-5, Sector-126, Noida, India – 301306"
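A minimal sketch of the key-value model, including the "create your own foreign keys" point from the cons list. A JavaScript `Map` stands in for the distributed store, and the key names (`company:FIS`, `employee:1`) and the employee record are hypothetical.

```javascript
// A Map stands in for the distributed key-value store.
const store = new Map();

store.set("company:FIS", "C-5, Sector-126, Noida, India – 301306");

// "Foreign key" by convention only: the value carries the key of another entry.
store.set(
  "employee:1",
  JSON.stringify({ name: "John Doe", companyKey: "company:FIS" })
);

// The store cannot join, so the application follows the reference itself.
function companyAddressOf(employeeKey) {
  const raw = store.get(employeeKey);
  if (raw === undefined) return undefined;
  const employee = JSON.parse(raw);
  return store.get(employee.companyKey);
}

console.log(companyAddressOf("employee:1"));
```

The store only sees opaque values, so nothing stops `companyKey` from pointing at a key that does not exist; keeping such references valid is the application's job.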

Types of NoSQL Databases (Cont…)

Document Oriented

Data Model:

• A collection of documents

• A document is a key value collection

• Index-centric, lots of map-reduce

Pros:

• Simple, Powerful data model

• Scalable

Cons:

• Poor for interconnected data

• Query model limited to keys and indexes

• Map reduce for larger queries

Document

{company: "FIS", OfficeAddress: {street: "B-25, Sector-58", city: "Noida", country: "India", pincode: 201301}}
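The document model, with key/index-style queries and map-reduce for larger questions, can be sketched as follows. The collection is a plain array, `find` and `mapReduce` are hypothetical helpers written for this example (not a database API), and the "Acme" and "Globex" documents are made-up sample data.

```javascript
// A collection modelled as an array of documents (plain objects).
const companies = [
  { company: "FIS", OfficeAddress: { city: "Noida", pincode: 201301 }, projects: 20 },
  { company: "Acme", OfficeAddress: { city: "Pune", pincode: 411001 }, projects: 5 },
  { company: "Globex", OfficeAddress: { city: "Noida", pincode: 201301 }, projects: 8 },
];

// Read a nested field using a dotted path such as "OfficeAddress.city".
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Return the documents whose fields equal every value in the query object.
function find(collection, query) {
  return collection.filter((doc) =>
    Object.entries(query).every(([path, expected]) => getPath(doc, path) === expected)
  );
}

// Map each document to a [key, value] pair, then reduce the values per key.
function mapReduce(collection, mapFn, reduceFn) {
  const groups = new Map();
  for (const doc of collection) {
    const [key, value] = mapFn(doc);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  }
  const result = {};
  for (const [key, values] of groups) result[key] = reduceFn(key, values);
  return result;
}

const inNoida = find(companies, { "OfficeAddress.city": "Noida" });
const projectsByCity = mapReduce(
  companies,
  (doc) => [doc.OfficeAddress.city, doc.projects],
  (_city, counts) => counts.reduce((sum, n) => sum + n, 0)
);

console.log(inNoida.map((d) => d.company)); // FIS and Globex
console.log(projectsByCity);                // Noida: 28, Pune: 5
```

Simple lookups by field are cheap, while anything that aggregates across documents goes through the map and reduce steps.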

Types of NoSQL Databases (Cont…)

Column Family

Data Model:

• A big table, with column families

• Map Reduce for querying/processing

Pros:

• Supports semi-structured data

• Naturally Indexed (columns)

• Scalable

Cons:

• Poor for interconnected data

Column Families

{FIS: {Address: {city: "Noida", pincode: 201301}, details: {strength: 250, projects: 20}}}
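The row key → column family → column layout shown above can be sketched with nested Maps. `ColumnFamilyTable` is a hypothetical class for illustration only; it assumes families are declared up front while columns inside a family may vary from row to row.

```javascript
// Row key -> column family -> column -> value, using nested Maps.
class ColumnFamilyTable {
  constructor(families) {
    this.families = new Set(families); // families are declared up front in this sketch
    this.rows = new Map();
  }

  put(rowKey, family, column, value) {
    if (!this.families.has(family)) {
      throw new Error(`Unknown column family: ${family}`);
    }
    if (!this.rows.has(rowKey)) this.rows.set(rowKey, new Map());
    const row = this.rows.get(rowKey);
    if (!row.has(family)) row.set(family, new Map());
    row.get(family).set(column, value); // columns may differ from row to row
  }

  get(rowKey, family, column) {
    return this.rows.get(rowKey)?.get(family)?.get(column);
  }

  // Read every column of one family for a row as a plain object.
  getFamily(rowKey, family) {
    return Object.fromEntries(this.rows.get(rowKey)?.get(family) ?? []);
  }
}

const table = new ColumnFamilyTable(["Address", "details"]);
table.put("FIS", "Address", "city", "Noida");
table.put("FIS", "Address", "pincode", 201301);
table.put("FIS", "details", "strength", 250);
table.put("FIS", "details", "projects", 20);

console.log(table.getFamily("FIS", "Address"));
console.log(table.get("FIS", "details", "strength"));
```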

Types of NoSQL Databases (Cont…)

Graph Family

Data Model:

• Nodes and Relationships

Pros:

• Powerful data model, as general as RDBMS

• Connected data locally indexed

• Easy to query

Cons:

• Sharding (still being optimized)

• Scales up reasonably well

• Requires rewiring your brain
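The nodes-and-relationships model can be sketched with an adjacency list and a breadth-first traversal. The `Graph` class, the node names and the `FOLLOWS` relationship type are hypothetical; graph databases expose this through their own query languages.

```javascript
class Graph {
  constructor() {
    this.nodes = new Map(); // id -> properties
    this.edges = new Map(); // id -> array of { type, to }
  }

  addNode(id, props = {}) {
    this.nodes.set(id, props);
    if (!this.edges.has(id)) this.edges.set(id, []);
  }

  addRelationship(from, type, to) {
    if (!this.nodes.has(from) || !this.nodes.has(to)) {
      throw new Error("Both nodes must exist before relating them");
    }
    this.edges.get(from).push({ type, to });
  }

  // Direct neighbours reached through relationships of the given type.
  neighbours(id, type) {
    return (this.edges.get(id) ?? [])
      .filter((e) => e.type === type)
      .map((e) => e.to);
  }

  // Breadth-first traversal: nodes reachable via `type` within `maxDepth` hops.
  reachable(start, type, maxDepth) {
    const seen = new Set([start]);
    let frontier = [start];
    for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
      const next = [];
      for (const id of frontier) {
        for (const n of this.neighbours(id, type)) {
          if (!seen.has(n)) {
            seen.add(n);
            next.push(n);
          }
        }
      }
      frontier = next;
    }
    seen.delete(start);
    return [...seen];
  }
}

const g = new Graph();
["alice", "bob", "carol", "dave"].forEach((id) => g.addNode(id));
g.addRelationship("alice", "FOLLOWS", "bob");
g.addRelationship("bob", "FOLLOWS", "carol");
g.addRelationship("carol", "FOLLOWS", "dave");

console.log(g.reachable("alice", "FOLLOWS", 2)); // bob and carol
```

Each hop only touches the edges stored with the current node, which is the sense in which connected data is "locally indexed". It also hints at why sharding is hard: a traversal may cross shard boundaries at every hop.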

Example with MongoDB

Document Store

RDBMS | MongoDB
Database | Database
Table, View | Collection
Row | Document (JSON, BSON)
Column | Field
Index | Index
Join | Embedded Document
Foreign Key | Reference
Partition | Shard

> db.user.findOne({age: 34})
{
    "_id" : ObjectId("5114e0bd42…"),
    "first" : "Sandeep",
    "last" : "Kumar",
    "age" : 34,
    "interests" : ["Reading", "Mountain Biking"],
    "favorites" : { "color" : "Blue" }
}
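The "Join → Embedded Document" mapping and the pre-join rule of thumb can be sketched in plain JavaScript. The normalized `users`, `favorites` and `interests` arrays are a hypothetical relational layout invented for this example; the point is that the join happens once, at write time, and the resulting document is stored whole.

```javascript
// Normalized, RDBMS-style rows: related data lives in separate "tables".
const users = [{ id: 1, first: "Sandeep", last: "Kumar", age: 34 }];
const favorites = [{ userId: 1, color: "Blue" }];
const interests = [
  { userId: 1, name: "Reading" },
  { userId: 1, name: "Mountain Biking" },
];

// Pre-join once at write time, producing the document that gets stored as-is.
function toDocument(user) {
  const fav = favorites.find((f) => f.userId === user.id);
  return {
    first: user.first,
    last: user.last,
    age: user.age,
    interests: interests
      .filter((i) => i.userId === user.id)
      .map((i) => i.name),
    favorites: fav ? { color: fav.color } : {},
  };
}

const userDoc = toDocument(users[0]);
console.log(JSON.stringify(userDoc, null, 2));
```

Reading the user then needs no join at all. The cost is that embedded data is duplicated if several documents share it, so data that changes often may be better kept as a reference.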

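CRUD Example

// Create: insert a document
> db.user.insert({first: "John", last: "Doe", age: 39})

// Read: list the documents in the collection
> db.user.find()
{
    "_id" : ObjectId("51…"),
    "first" : "John",
    "last" : "Doe",
    "age" : 39
}

// Update: change age and add a salary field on the matching document
> db.user.update({"_id" : ObjectId("51…")}, {$set: {age: 40, salary: 7000}})

// Delete: remove every user whose first name starts with "J"
> db.user.remove({"first": /^J/})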
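To make the semantics of the four calls above concrete without a running server, here is an in-memory stand-in. The `Collection` class is hypothetical and emulates only what the slide uses: equality and regular-expression matching, the `$set` operator on the first matching document, and removal of all matching documents. It is not the MongoDB driver.

```javascript
// In-memory stand-in for a collection; not the MongoDB driver.
class Collection {
  constructor() {
    this.docs = [];
    this.nextId = 1; // simple counter instead of a real ObjectId
  }

  insert(doc) {
    const stored = { _id: this.nextId++, ...doc };
    this.docs.push(stored);
    return stored._id;
  }

  // A query value matches by regular expression (strings only) or strict equality.
  static matches(doc, query) {
    return Object.entries(query).every(([field, cond]) =>
      cond instanceof RegExp
        ? typeof doc[field] === "string" && cond.test(doc[field])
        : doc[field] === cond
    );
  }

  find(query = {}) {
    return this.docs.filter((d) => Collection.matches(d, query));
  }

  // Only $set is emulated; it merges fields into the first matching document.
  update(query, change) {
    const doc = this.docs.find((d) => Collection.matches(d, query));
    if (doc && change.$set) Object.assign(doc, change.$set);
    return doc ? 1 : 0;
  }

  // Remove every matching document and report how many were removed.
  remove(query) {
    const before = this.docs.length;
    this.docs = this.docs.filter((d) => !Collection.matches(d, query));
    return before - this.docs.length;
  }
}

const user = new Collection();
const id = user.insert({ first: "John", last: "Doe", age: 39 });
user.update({ _id: id }, { $set: { age: 40, salary: 7000 } });
console.log(user.find());                  // one document, now with age 40 and salary 7000
console.log(user.remove({ first: /^J/ })); // 1
console.log(user.find().length);           // 0
```

Note that `$set` adds the `salary` field even though it was never declared, which is the schema-less behaviour in action. In current MongoDB shells, `insert`, `update` and `remove` are legacy helpers; `insertOne`, `updateOne` and `deleteMany` are the usual equivalents.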

Match with your favorite (RDBMS)

Let's conclude…

RDBMS | NoSQL
Scale up | Scale out
Offers a big feature set and data integrity | Avoids the cost and complexity of features you may not need
Data modeling: schema | Schema-less
Relational data model | Pre-joined document data model
Fits structured data | Best suited to unstructured data
Doesn't fit well with modern Agile development | Best fit for Agile development
Requires great precision | Faster data processing
Easy querying | Overhead for complex queries
Offers a degree of reliability | Doesn't offer the same degree of reliability
Consistency | Compromises on consistency
Customer support | No customer support

References:

http://www.leavcom.com/pdf/NoSQL.pdf

http://www.couchbase.com/sites/default/files/uploads/all/whitepapers/Couchbase_Whitepaper_Transitioning_Relational_to_NoSQL.pdf

https://en.wikipedia.org/wiki/NoSQL

Thanks for participating!