Mongo and Redis
DESCRIPTION
Presentation for Bucharest BigData Meetup: a short overview of MongoDB and Redis
TRANSCRIPT
![Page 1: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/1.jpg)
NoSQL
MongoDB and Redis as alternatives to traditional RDBMS
![Page 2: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/2.jpg)
Then...
![Page 3: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/3.jpg)
...and now
*This thing weighs less than 50g
![Page 4: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/4.jpg)
Meaning of NoSQL
1970 = We have no SQL
1980 = Know SQL
2000 = No SQL!
2005 = Not only SQL
2014 = No, SQL
(slide adapted from @markmadsen)
![Page 5: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/5.jpg)
MongoDB
![Page 6: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/6.jpg)
MongoDB
● It is the “new MySQL”
● Project started in 2007 by 10gen (now MongoDB Inc)
● Cross-platform, open-source
● 5th most used DBMS & most used Document Store* (next DS CouchDB - 21st)
* According to db-engines.com as of Oct 2014
![Page 7: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/7.jpg)
Characteristics
● “It's really a hybrid database with features from a few different places.” (Gaetan Voyer-Perrault on Quora)
● Document Oriented but NO SCHEMA!
● Documents grouped in Collections
● Binary JSON (BSON) format
● Load Balancing (automated sharding, sharding key can be user defined)
● Replication (Replica Sets)
● Automated failover
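To make the "sharding key can be user defined" point concrete, here is a minimal sketch of hash-based routing by a shard key. It is an illustration only: MongoDB's own routing works on chunk ranges managed by the cluster, and the `shard_for` helper, the `user_id` field and the shard count of 4 are all assumptions made for this example.

```python
import hashlib


def shard_for(shard_key_value: str, shard_count: int) -> int:
    """Map a shard-key value to a shard index by hashing it (illustrative only)."""
    digest = hashlib.md5(shard_key_value.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical documents; "user_id" plays the role of the user-defined shard key.
docs = [{"user_id": f"user-{i}", "visits": i} for i in range(1000)]

placement = {}
for doc in docs:
    placement.setdefault(shard_for(doc["user_id"], 4), []).append(doc)

for shard, shard_docs in sorted(placement.items()):
    print(f"shard {shard}: {len(shard_docs)} documents")
```

The same key always lands on the same shard, which is why the choice of key decides how evenly the load is spread.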
![Page 8: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/8.jpg)
Characteristics - continued
● Primary and Secondary Indexes
● JavaScript for UDF
● MapReduce
● Capped Collections
● Aggregation Framework since 2.2
● Ad-hoc Query Support
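The capped-collection behaviour listed above can be pictured with a toy model: a fixed-size container that keeps insertion order and drops the oldest entry once it is full. This is a sketch in plain Python, not MongoDB's implementation; the `CappedCollection` class is hypothetical and caps by document count only, whereas a real capped collection is primarily sized in bytes.

```python
from collections import deque


class CappedCollection:
    """Toy stand-in for a capped collection limited by document count."""

    def __init__(self, max_docs: int):
        self._docs = deque(maxlen=max_docs)

    def insert(self, doc: dict) -> None:
        # A full deque silently discards its oldest element on append.
        self._docs.append(doc)

    def find(self) -> list:
        # Documents come back in insertion order.
        return list(self._docs)


log = CappedCollection(max_docs=3)
for n in range(5):
    log.insert({"seq": n})

print(log.find())  # [{'seq': 2}, {'seq': 3}, {'seq': 4}]
```

This is the property that makes capped collections a natural fit for the "Logs" use case: the newest entries are always kept and space never grows.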
![Page 9: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/9.jpg)
Caveats
![Page 10: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/10.jpg)
Generic performance tips
● Use a 64-bit OS
● Lots of RAM, fast disks (was anyone expecting something else?)
● Ensure that at least indexes + working set fit in RAM (db.stats(), db.<coll>.stats()); if not, you might want to try TokuMX
● Design for de-normalized data models
![Page 11: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/11.jpg)
Generic performance tips
● Write-Concerns
● Shard early
● Fixed (or at least bounded) record size => better write performance
● Use short attribute names (reduces index & data size, OFC!)
● EXT4 or XFS
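Why short attribute names help: field names are stored inside every document, so each extra character is paid once per document. The sketch below measures that effect with compact JSON as a stand-in for BSON (an assumption; BSON's exact byte counts differ, but the per-document cost of names is the same idea). The field names and values are made up for the example.

```python
import json


def encoded_size(doc: dict) -> int:
    """Size in bytes of a document serialized as compact JSON."""
    return len(json.dumps(doc, separators=(",", ":")).encode("utf-8"))


# Same values, different attribute names (hypothetical fields).
long_doc = {
    "customer_first_name": "Ana",
    "customer_last_name": "Pop",
    "registration_timestamp": 1413000000,
}
short_doc = {"fn": "Ana", "ln": "Pop", "ts": 1413000000}

saved = encoded_size(long_doc) - encoded_size(short_doc)
print(f"long: {encoded_size(long_doc)} B, short: {encoded_size(short_doc)} B")
print(f"saved per document: {saved} B")
```

Multiplied over a collection with many millions of documents, the per-document saving becomes a noticeable share of data and index size.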
![Page 12: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/12.jpg)
IRL
● virtualized server 8G RAM, 4 vCPU - no sharding, no replica sets
● 100 inserts/s, 130M doc collection WITH secondary index (avg doc size 0.6k)
● 20 inserts/s, 3M doc collection WITH 18 secondary indexes (avg doc size 10k)
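A quick back-of-the-envelope check puts these numbers next to the "fit in RAM" tip. The sketch assumes that "0.6k" and "10k" mean kilobytes and uses decimal units; index sizes are not given on the slide, so they are left out.

```python
def collection_size_gb(doc_count: int, avg_doc_kb: float) -> float:
    """Approximate raw data size in GB (decimal units, indexes excluded)."""
    return doc_count * avg_doc_kb / 1_000_000


ram_gb = 8  # from the slide: virtualized server with 8G RAM

small_docs_gb = collection_size_gb(130_000_000, 0.6)  # about 78 GB
large_docs_gb = collection_size_gb(3_000_000, 10)     # about 30 GB

for name, size in [("130M x 0.6k", small_docs_gb), ("3M x 10k", large_docs_gb)]:
    print(f"{name}: ~{size:.0f} GB of data, ~{size / ram_gb:.1f}x the RAM")
```

Under these assumptions the raw data alone is several times larger than memory, so the full data set cannot be resident; whether the working set fits is not something the slide states.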
![Page 13: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/13.jpg)
Use Cases
● Logs
● Location Data (Mongo has built in Geospatial ops)
● Account and User Profiles
● Messaging
● (complex) Config Data
● http://www.mongodb.com/who-uses-mongodb (hint: Expedia, Business Insider, The Weather Channel, Foursquare, eBay)
![Page 14: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/14.jpg)
Redis
![Page 15: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/15.jpg)
Redis
● Salvatore Sanfilippo (@antirez)
● Started in 2009
● Key-Value Store
● 11th most used DBMS & most used KV Store* (next KVS memcached - 19th)
● Sponsored by Pivotal (spinoff EMC/VMware)
* According to db-engines.com as of Oct 2014
![Page 16: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/16.jpg)
Characteristics
● Holds all data in memory, persists on disk
● Data Models
  ○ Strings/Blobs/Bit-Maps (not really Bitmaps)
  ○ Hashtables
  ○ Linked Lists
  ○ Sets
  ○ Sorted Sets
● HyperLogLog (+2.8.9 - trade accuracy for memory)
● Master Slave Replication
● High Availability (through Sentinel)
![Page 17: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/17.jpg)
Characteristics - continued
● Redis Cluster in the works (not production ready yet) - sharding
  ○ asynchronous replication
  ○ does not guarantee strong consistency (may ‘forget’ writes)
● AOF sync - default 2s
● Does not support secondary indexes
● Pub/Sub mode since 2.0
● Key expiry
● Server scripting with Lua
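Key expiry is what makes Redis convenient as a cache: a key can be stored with a time-to-live and simply disappears afterwards. The sketch below models only the lazy half of that idea (a key is checked and dropped when it is read). It is plain Python, not Redis code; the `ExpiringStore` class and the injectable clock are assumptions made so the example runs without a server.

```python
import time


class ExpiringStore:
    """Toy key-value store with per-key time-to-live, expired lazily on read."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else self._clock() + ttl_seconds
        self._data[key] = (value, expires_at)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and self._clock() >= expires_at:
            del self._data[key]  # drop the expired key on access
            return None
        return value


# A fake clock keeps the example deterministic.
now = [0.0]
store = ExpiringStore(clock=lambda: now[0])
store.set("session:42", "abc", ttl_seconds=10)
print(store.get("session:42"))  # abc
now[0] = 10.0
print(store.get("session:42"))  # None
```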
![Page 18: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/18.jpg)
IRL
● virtualized server 4G RAM, 1 vCPU
● +50k get/set per second (redis-benchmark)
● only 128 queries out of 1165550375 over 10ms (0.00001%)
  ○ uptime_in_days:439
  ○ used_memory_human:424.09M
  ○ used_memory_peak_human:834.94M
  ○ total_connections_received:1352935
  ○ db0:keys=610884,expires=355397
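The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch uses only numbers quoted above; the variable names are made up for the example.

```python
slow_queries = 128
total_queries = 1_165_550_375

# Share of queries that took longer than 10 ms, as a percentage.
slow_pct = slow_queries / total_queries * 100
print(f"slow queries: {slow_pct:.6f}%")  # about 0.00001%

# Share of keys in db0 that carry an expiry, from keys=610884,expires=355397.
keys, expires = 610_884, 355_397
expiring_share = expires / keys
print(f"keys with expiry: {expiring_share:.0%}")
```

More than half of the keys carry an expiry, which fits the cache-style usage described in the use cases.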
![Page 19: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/19.jpg)
Generic performance tips
● Use short key names (reduces data size, OFC!)
● You can create secondary indexes (but you have to maintain them, e.g. using SET)
● You can have ad-hoc queries (actually, just one query): using SORT
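"You have to maintain them" means the application updates the index itself on every write. The sketch below shows that bookkeeping with an in-memory model, reading the slide's "SET" as the Redis Set data type (an assumption). The `TinyKV` class, the `idx:city:<name>` key layout and the sample users are hypothetical; against a real server the set operations would correspond to SADD, SREM and SMEMBERS.

```python
class TinyKV:
    """In-memory model of records plus a hand-maintained index built from sets."""

    def __init__(self):
        self.records = {}  # primary data: user_id -> fields
        self.sets = {}     # index sets: index key -> set of user_ids

    @staticmethod
    def _index_key(city):
        return f"idx:city:{city}"

    def save_user(self, user_id, city):
        previous = self.records.get(user_id)
        if previous is not None:
            # Remove the stale index entry first (SREM on a real server).
            self.sets[self._index_key(previous["city"])].discard(user_id)
        self.records[user_id] = {"city": city}
        # Add the new index entry (SADD on a real server).
        self.sets.setdefault(self._index_key(city), set()).add(user_id)

    def users_in(self, city):
        # Read the index set (SMEMBERS on a real server).
        return sorted(self.sets.get(self._index_key(city), set()))


db = TinyKV()
db.save_user("u1", "Bucharest")
db.save_user("u2", "Cluj")
db.save_user("u1", "Cluj")  # u1 moves, so the index must follow

print(db.users_in("Cluj"))       # ['u1', 'u2']
print(db.users_in("Bucharest"))  # []
```

Forgetting the removal step is the typical bug here: the record changes but the old index set still points at it.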
![Page 20: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/20.jpg)
Use Cases
● Cache
● IPSS/IPC
● Queue mechanisms (see e.g. Resque)
● Log/Task buffers
● Statistics and aggregation datastore
● (anywhere you use memcached)
● http://redis.io/topics/whos-using-redis (hint: Twitter, GitHub, Snapchat, StackOverflow a.o.)
![Page 21: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/21.jpg)
Recap
One size does NOT fit all!
![Page 22: Mongo and Redis](https://reader034.vdocuments.us/reader034/viewer/2022052304/5594547e1a28ab98118b4589/html5/thumbnails/22.jpg)
Further reading
● Must read: http://blog.andreamostosi.name/big-data/ (almost exhaustive list of all things NoSQL and BigData)