Conference: NoSQL databases and scalability, a new paradigm

:: Conference :: NoSQL / Scalability: State of the Art. Samuel BERTHE, 10 March 2014, Epitech Nantes

Upload: samuel-berthe

Post on 27-Jan-2015


DESCRIPTION

Conference at Epitech Nantes (10/03/2014). Target audience: tek1-3 (the first three years at Epitech).

TRANSCRIPT

1. :: Conference :: NoSQL / Scalability: State of the Art. Samuel BERTHE, 10 March 2014, Epitech Nantes

2. What is it? Not Only SQL
3. What is it? WhatsApp: 5 years, 450 million users. Ouch!
4. Scalability: scale up vs. scale out
5. Scalability: scale up vs. scale out
   Scale up:
   - Hard to maintain
   - Single Point Of Failure (SPOF)
   - A fat application
   - 1 TB of RAM on a server doesn't exist
   Scale out:
   - A server is broken? Doesn't matter!
   - Easy to grow
   - Easy to maintain
   - More flexible (cloud)
6. Scalability: capacity, cost
7. Scalability, file system: don't try to scale your FS, you can't!
   - Hard to maintain
   - SPOF
   Riak CS or AWS S3 is a good choice.
8. Scalability, stateless: your memory isn't a database.
   - Don't use global variables -> use a datastore
   - Consistent request -> 1 variable = 1 request
9. Current technologies
10. Current technologies
11. Use case, web agency: one server with:
   - httpd (sometimes with load balancing)
   - RDBMS (sometimes on a separate machine)
   SQL only.
12. Use case, web agency: How do you scale up? How do you protect your data? What is your fault tolerance?
13. Use case, web agency: WordPress / SQL == Single Point Of Failure + Fu****g backup management + "I'm poor, so I can't use the cloud to scale up"
14. Use case, worldwide chain store: so much data
15. Use case, worldwide chain store: operational database
16. Use case, worldwide chain store: data warehouse
17. SQL transactions. Example: stock management system at Amazon.com -> two customers buy a Samsung Galaxy S5 at the same time.
   Customer 1:
   - GET nbr of Samsung Galaxy S5 -> answer = 42
   - Customer buys 2 phones -> nbr -= 2 (== 40)
   - UPDATE value in DB: 40
   Customer 2:
   - GET nbr of Samsung Galaxy S5 -> answer = 42
   - Customer buys 1 phone -> nbr -= 1 (== 41)
   - UPDATE value in DB: 41
   Whichever UPDATE lands last wins: the stock ends at 40 or 41 instead of 39.
18. SQL, ACID transactions: Atomicity, Consistency, Isolation, Durability
19. SQL: but that was before!
20. NoSQL, CAP theorem: Availability, Consistency, Partition tolerance. Pick two.
21. NoSQL, key/value-oriented DB. Use case:
   - Session storage
   - Cache
22. NoSQL, document-oriented DB. Use case:
   - Natural data modeling
   - Fast to develop
   - Versatile
23. NoSQL, column-oriented DB. Use case:
   - Large datasets
   - Logs
   - Write flooding
   - BigData
24. NoSQL, graph-oriented DB. Use case:
   - Social relations
   - Graph architecture
25. NoSQL, CAP theorem: Availability, Consistency, Partition tolerance. Pick two.
   - MySQL, PostgreSQL
   - CouchDB, Cassandra, Riak
   - Couchbase, MongoDB, HBase
26. NoSQL replication, partitioning: A-H | I-P | Q-Z
27. NoSQL replication, sharding: A-Z | A-Z | A-Z
28. NoSQL replication, partitioning + sharding: A-H + I-P | I-P + Q-Z | Q-Z + A-H
29. NoSQL replication: Cross Datacenter Replication (XDCR)
30. NoSQL replication: tunable consistency
31. More about MongoDB:
   - Document-oriented database
   - Collections
   - Big community
   - Extensive documentation
   - Shell client
   - Supported in several languages
   - Transactional operators
   - Aggregation
32. More about MongoDB:
   - Easy to index
   - Easy to query
   - Fast to learn
   - Replica set
   - Master-slave replication
   - Fu****g shard key
   - Hard to maintain
33. More about Couchbase:
   - Document-oriented database
   - Buckets
   - TTL
   - Shell client
   - Browser interface
   - Statistics
   - Asynchronous writes
   - Master-master replication
   - Auto-rebalancing
34. More about Couchbase:
   - Memcached integration
   - Index replication
   - Map/Reduce: views, stale
   - Supported in fewer languages than MongoDB
   - Harder to query
   - Small community
   - Needs a lot of memory (at least 4 GB)
35. ElasticSearch: scalable indexing engine
36. ElasticSearch:
   - Rivers
   - JSON requests
   - Real-time GET
   - Segments + shards
   - With leader election
   - CP but can be AP
37. NoSQL is used for Big Data.
   Big new challenges:
   - Capturing data
   - Storage
   - Data exploration
   Usages:
   - Marketing
   - Customer relations
   - Research
   - Merchandising
   - Spying (NSA) ;-)
   3V: Variety, Volume, Velocity
38. BigData: Map/Reduce
39. BigData, Map/Reduce: let's try to make a YouTube video view counter.

    // Map: emit (video_id, 1) for each view document.
    function map(doc) {
      if (doc.video_id) emit(doc.video_id, 1);
    }

   OUT: (a, 1) (b, 1) (a, 1) (a, 1) (b, 1)

    // Reduce: count the documents grouped under one video id.
    function reduce(id, docs) {
      var res = 0;
      for (var i = 0; i < docs.length; ++i) {
        res++;
      }
      emit(id, res);
    }

   OUT: (a, 3) (b, 2)
40. BigData, Hadoop framework: HBase, HDFS, Map/Reduce, JobTracker, Hive
41. Learn more. Advice:
   - Use many different databases, one for each usage, in the same project
   - But start with one database
42. Learn more, training:
   - MOOCs: 10gen Education (MongoDB learning), DataStax Academy (Cassandra learning)
   - Online test databases
   - MongoDB and CouchDB: pretty easy (Node.js / Python)
   - You mean BigData? Then I say Java!
   - Download datasets, consume an API or write a crawler
43. Enjoy! Samuel BERTHE --- [email protected] @SamuelBerthe www.samuel-berthe.fr
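
The view counter on slide 39 can be tried without any database. Below is a minimal sketch, assuming a hypothetical in-memory runner: `runMapReduce`, the explicit `emit` parameter and the sample `views` array are illustrative names, not the API of MongoDB, CouchDB or Hadoop. The reduce step here sums the emitted values instead of counting documents, which gives the same result for this example.

```javascript
// Map step from the slide: emit (video_id, 1) for every view document.
// `emit` is passed in explicitly here (an assumption of this sketch).
function map(doc, emit) {
  if (doc.video_id) emit(doc.video_id, 1);
}

// Reduce step: sum the values that were emitted for one key.
function reduce(id, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// Hypothetical single-process runner: map, group by key, then reduce.
function runMapReduce(docs, mapFn, reduceFn) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  docs.forEach((doc) => mapFn(doc, emit));

  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduceFn(key, values);
  }
  return result;
}

// Sample data matching the slide: three views of "a", two of "b".
const views = [
  { video_id: "a" }, { video_id: "b" }, { video_id: "a" },
  { video_id: "a" }, { video_id: "b" }, { comment: "not a view" },
];

const counts = runMapReduce(views, map, reduce);
console.log(counts); // { a: 3, b: 2 }
```

In a real cluster the grouping step is done by the framework across many machines; the sketch only shows how the two functions fit together.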