nosql

Post on 02-Dec-2014

Category:

Technology

An introduction to NoSQL

Radu Potop

NoSQL

● umbrella term
● non-relational data storage
● no fixed table schemas
● a fresh take on database technology

Relational databases have trouble handling large volumes of data

Some companies and their databases:
● Digg.com - 3 TB for green badges
● Facebook - 50 TB for inbox search
● eBay - 2 PB (petabytes) in total

Issues

● horizontal scalability
● server performance
● rigid schemas
● distribution across servers

Characteristics of NoSQL

● no ACID guarantees (Atomicity, Consistency, Isolation, Durability)
● highly distributed
● scalable
● better performance - they don't have to handle relations
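To make "highly distributed" and "scalable" more concrete, here is a minimal sketch of how keys might be spread across several servers by hashing. The server names and the `shard_for` helper are hypothetical, and simple modulo partitioning is an assumption chosen for brevity, not how any particular NoSQL database does it.

```python
import hashlib

# Hypothetical server names, for illustration only.
SERVERS = ["db-node-1", "db-node-2", "db-node-3"]


def shard_for(key: str, servers: list[str]) -> str:
    """Pick a server for a key by hashing the key (simple modulo partitioning)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then wrap around
    # the number of servers so every key maps to exactly one of them.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    for key in ["user:1", "user:2", "user:3", "user:4"]:
        print(key, "->", shard_for(key, SERVERS))
```

The same key always lands on the same server, so reads and writes for that key need no cross-server coordination. One known weakness of plain modulo partitioning is that adding or removing a server remaps most keys.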

Examples of NoSQL databases:

● Google Bigtable (used intensively by almost everything made by Google)
● Amazon Dynamo (used by Amazon S3)
● Facebook Cassandra
● Apache HBase
● LinkedIn Voldemort

Some types of databases:

● Document-oriented databases
    ● JSON format, XML databases
    ● examples: CouchDB, BaseX

● Key-value pair databases
    ● values can be more than strings (e.g. a set of strings)
    ● examples: Redis, Cassandra

CouchDB

● an Apache Software Foundation project
● written in Erlang
● open source
● document-oriented database
● stores data as a collection of JSON documents
● queried via a REST API
● JavaScript is the default language (for views)
● also supported: PHP, Ruby, Python and Erlang
● built-in replication features
● used by Ubuntu One

JSON document

{
  "_id" : "fc5e038d38a570",
  "_rev" : "D546012",
  "to" : "email@example",
  "subject" : "helloWorld",
  "body" : "some text"
}
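As a rough illustration of "no fixed table schemas", the sketch below parses the document from this slide and places it next to a second document with different fields. The second document and the `fields` helper are made up for this example. Nothing here talks to CouchDB; it only uses Python's standard `json` module.

```python
import json

# The document from the slide, as raw JSON text.
raw = """
{
  "_id": "fc5e038d38a570",
  "_rev": "D546012",
  "to": "email@example",
  "subject": "helloWorld",
  "body": "some text"
}
"""

doc = json.loads(raw)

# A second, hypothetical document with a different set of fields.
# A document store does not require both to share one schema.
other = {"_id": "a1b2c3", "to": "email@example", "attachments": ["notes.txt"]}

collection = [doc, other]


def fields(documents):
    """Return every field name that appears in at least one document."""
    return sorted({name for d in documents for name in d})


if __name__ == "__main__":
    print(doc["subject"])
    print(fields(collection))
```

In a relational table, adding `attachments` would mean altering the schema for every row. Here the field simply exists on the documents that need it.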

Operations with these documents

● HTTP requests: GET (select), POST (create), PUT (update), DELETE (delete)
● HTTP AUTH
● Applications: curl, Futon
● JavaScript
● any application that can send HTTP requests
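The mapping of HTTP verbs to document operations might look like the sketch below, written with Python's standard `urllib`. It assumes a CouchDB server on the default port 5984 and a database named `mail`. The database name and the `build_request` and `send` helpers are assumptions for illustration. The requests are only constructed here; calling `send` would need a running server.

```python
import json
import urllib.request

# Assumption: a local CouchDB server and a database called "mail".
BASE_URL = "http://localhost:5984/mail"


def build_request(method, doc_id="", body=None, rev=None):
    """Build (but do not send) an HTTP request for a document operation."""
    url = BASE_URL + (f"/{doc_id}" if doc_id else "")
    if rev is not None:
        url += f"?rev={rev}"
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# POST to the database creates a document; the server picks the _id.
create = build_request("POST", body={"to": "email@example", "subject": "helloWorld"})
# GET reads a document by its _id.
read = build_request("GET", "fc5e038d38a570")
# PUT updates a document; the current _rev is sent along with the new body.
update = build_request(
    "PUT",
    "fc5e038d38a570",
    body={"_rev": "D546012", "to": "email@example", "subject": "updated"},
)
# DELETE removes a document; the revision goes in the query string.
delete = build_request("DELETE", "fc5e038d38a570", rev="D546012")


def send(request):
    """Send a request and decode the JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Because the interface is plain HTTP, the same operations can be issued from curl or any other HTTP client.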

Futon interface

Redis

● key-value database
● written in C
● open source
● networked
● in-memory
● persistent database
● similar to memcached
● data is non-volatile
● atomic operations
● very high performance: ~100,000 operations/second with 50 parallel clients
● all data is kept in memory - blazing fast
● periodic synchronization to the hard drive
● powerful replication
● bindings for many languages: PHP, Ruby, Python, C, Java, etc.

SET foo bar
GET foo => bar

SET - insert
GET - select
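The idea of keeping all data in memory and periodically writing it to disk can be sketched with a toy store. This is not Redis and does not use a Redis client. `TinyKV`, its methods, and the JSON snapshot file are all hypothetical, chosen only to show SET/GET semantics plus a snapshot that survives a restart.

```python
import json
import os
import tempfile


class TinyKV:
    """Toy in-memory key-value store with snapshots (an illustration, not Redis)."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        # On startup, reload the last snapshot if one exists.
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.data = json.load(f)

    def set(self, key, value):
        """SET: insert or overwrite a value, in memory only."""
        self.data[key] = value

    def get(self, key):
        """GET: return the value for a key, or None if it is missing."""
        return self.data.get(key)

    def save(self):
        """Write a snapshot; a real server would do this periodically."""
        tmp = self.path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
        # Swap the file in one step so a crash does not leave a half-written snapshot.
        os.replace(tmp, self.path)


snapshot = os.path.join(tempfile.mkdtemp(), "dump.json")

store = TinyKV(snapshot)
store.set("foo", "bar")
store.save()

# A fresh instance simulates a restart: the data comes back from disk.
restored = TinyKV(snapshot)
```

Reads and writes touch only the in-memory dictionary, which is where the speed comes from. Anything written after the last snapshot would be lost in a crash, which is the trade-off behind periodic synchronization.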

Key-value databases have become very popular lately

Other key-value databases:
● Facebook's Cassandra (now also used by Digg)
● GT.M
● MemcacheDB (a persistence-enabled variant of memcached)
● LinkedIn Voldemort

Conclusion

● relational databases are not the holy grail of data storage
● scalability issues drove large corporations to look for other solutions
● don't believe the FUD and give them a try

Thank you
