A Morning with MongoDB Barcelona: Use Cases and Roadmap
http://www.10gen.com/events/MongoDB-Morning-Barcelona
Use Cases and Roadmap
Norberto Leite
Senior Solutions Architect, [email protected]
@nleite
Sunday, 21 October 12
Agenda
• Use Cases
• Roadmap
• Future
Use Cases
Big Data = MongoDB = Solved
• User Data Management
• High Volume Data Feeds
• Content Management
• Operational Intelligence
• E-Commerce
• Mobile
Location Based Service
• Problem:
  – Location-based social networking service needs to scale to a high number of users and check-ins
• Solution:
  – Used MongoDB deployed on EC2
  – 8 clusters, 40 machines, 15k QPS, 2.3 billion records
  – Auto-sharding and geospatial indexing are key
• Results:
  – To date, has scaled to 9m users, 3m check-ins per day, 750m total check-ins, 20m places, 400k merchants
• Problem:
  – Business needed a modern data store for rapid development and scale
• Solution:
  – Used PHP and MongoDB
• Results:
  – Real-time statistics
  – All data, images, etc. stored together
  – No need for complex migrations
  – Enables very rapid development and growth
• Problem:
  – Deal with massive data volume across all customers
• Solution:
  – Used MongoDB to replace Google Analytics / Omniture
• Results:
  – Less than one week to build a prototype and proof of concept (POC)
  – Rapid deployment of new features
• Problem:
  – Lots of friction with the RDBMS for archive storage
  – Needed a more scalable archive storage database
• Solution:
  – Keep MySQL for active data (100 million)
  – MongoDB for archive (2 billion)
• Results:
  – No more ALTER TABLE statements taking over 2 months
  – Sharding fixed the vertical scaling problem
  – Very happily looking for other ways to use MongoDB
How Telefónica uses MongoDB
• London:
  – O2 UK: Priority Moments location-based offers service
  – O2 UK: eCommerce Product Catalog
• Madrid:
  – M2M (machine-to-machine) event acquisition platform
  – Personalization Server (Oracle migration)
How Telefónica uses MongoDB: M2M Event Acquisition
[Architecture diagram. Labels: Operator Network (MNO1, MNO2, ... MNOn); Event acquisition; Event notification; BOSS; Event Gateway; Event Storage; Event Notifier; Portal; Mng Platform; Mng Storage; API; Core; Apps; Mng]
How Telefónica uses MongoDB: Product Catalog
And many others ...
Roadmap
The Evolution of MongoDB
• 1.8 (March ‘11)
  – Journaling
  – Sharding and replica set enhancements
  – Spherical geo search
• 2.0 (Sept ‘11)
  – Index enhancements to improve size and performance
  – Authentication with sharded clusters
  – Replica set enhancements
  – Concurrency improvements
• 2.2 (Aug ‘12)
  – Aggregation Framework
  – Multi-data center deployments
  – Improved performance and concurrency
• 2.4 (winter ‘12)
2.2 Release August 2012
• Concurrency: yielding + DB-level locking
• New aggregation framework
• TTL Collections
• Improved free list implementation
• Tag aware sharding
• Read Preferences
• http://docs.mongodb.org/manual/release-notes/2.2/
Yielding + DB Locking
• Improved yielding on page fault
• Breaking down the global lock
  – Lock per database in 2.2
  – Lock per collection post 2.2
Aggregation Framework
• Pipeline model (a bit like Unix pipes)
• Like a "group by"
  – Operators: $project, $group, $match, $limit, $skip, $unwind, $sort
  – Expressions
    – Logical expressions: $and, $not, $or, $cmp ...
    – Math expressions: $add, $divide, $mod ...
    – String expressions: $strcasecmp, $substr, $toLower ...
    – Date/time expressions: $dayOfMonth, $hour ...
    – Multi-expressions: $ifNull, $cond
• Use cases: real-time / inline analytics
Example - For each "tag", list the authors
{ title : "my tech blog", author : "bob", tags : [ "fun", "good", "tech" ] }
{ title : "cool tech", author : "jim", tags : [ "awesome", "tech" ] }
Aggregate Command
db.blogs.aggregate(
  { $project : { author : 1, tags : 1 } },
  { $unwind : "$tags" },
  { $group : { _id : { tags : "$tags" }, authors : { $addToSet : "$author" } } }
);
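To make the three stages concrete, here is a rough in-memory sketch in plain JavaScript of what the pipeline computes on the two example blog posts. It is not MongoDB code: the helper name `authorsByTag` is hypothetical, and the order of the groups returned by a real `$group` stage is not guaranteed to match.

```javascript
// Plain-JavaScript sketch of what the pipeline computes; this is NOT MongoDB code.
// The two documents are the example blog posts from the slide.
const blogs = [
  { title: "my tech blog", author: "bob", tags: ["fun", "good", "tech"] },
  { title: "cool tech", author: "jim", tags: ["awesome", "tech"] },
];

// Hypothetical helper: mimics $project + $unwind + $group/$addToSet in memory.
function authorsByTag(docs) {
  const groups = new Map(); // tag -> Set of authors
  for (const { author, tags } of docs) { // $project: keep only author and tags
    for (const tag of tags) { // $unwind: one record per tag
      if (!groups.has(tag)) groups.set(tag, new Set());
      groups.get(tag).add(author); // $addToSet: no duplicate authors
    }
  }
  // Shape the output like the $group stage: { _id: { tags }, authors: [...] }
  return [...groups].map(([tag, authors]) => ({
    _id: { tags: tag },
    authors: [...authors],
  }));
}

const result = authorsByTag(blogs);
console.log(JSON.stringify(result, null, 2));
```

The "tech" group ends up with both authors, while every other tag lists a single author.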
Time To Live (TTL) Collections
• Auto-expire data out of a collection
• Indexed field must be a date datatype
• Single value is evaluated
• Use cases: data retention, cache expiration
db.events.ensureIndex( { "timestamp": 1 }, { expireAfterSeconds: 3600 } )
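The expiry rule behind this index can be sketched in plain JavaScript. This is an illustration under assumptions, not MongoDB's implementation: `isExpired` and the sample events are hypothetical, and the server removes expired documents from a periodic background task, so deletion is not instantaneous.

```javascript
// Sketch of the TTL expiry rule (not MongoDB's implementation).
const EXPIRE_AFTER_SECONDS = 3600; // same value as the index on the slide

// Hypothetical helper: true once the document's date is older than the TTL.
function isExpired(doc, now, expireAfterSeconds = EXPIRE_AFTER_SECONDS) {
  const ts = doc.timestamp;
  if (!(ts instanceof Date)) return false; // non-date values never expire
  return now.getTime() - ts.getTime() > expireAfterSeconds * 1000;
}

// Hypothetical sample data, evaluated at a fixed "now" for repeatability.
const now = new Date("2012-10-21T12:00:00Z");
const events = [
  { name: "old", timestamp: new Date("2012-10-21T10:30:00Z") }, // 90 minutes old
  { name: "fresh", timestamp: new Date("2012-10-21T11:30:00Z") }, // 30 minutes old
  { name: "no-date", timestamp: "2012-10-21" }, // a string, not a Date
];

// Names of the documents that would survive a TTL pass at `now`.
const remaining = events.filter((e) => !isExpired(e, now)).map((e) => e.name);
console.log(remaining);
```

The "no-date" event is kept because its field is a string, which is why the indexed field has to hold a real date value.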
Tag aware sharding
• Distribute data based on a tag
• Use cases: locality for data by data center
sh.addShardTag("shard0000", "dc-emea")
sh.addTagRange("mydb.users", { country: "uk"}, { country: "ul"}, "dc-emea");
sh.addTagRange("mydb.users", { country: "by"},{ country: "bz"}, "dc-emea");
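The two ranges look odd until you recall that a tag range includes its lower bound and excludes its upper bound, so `"uk"` to `"ul"` captures the value "uk" without reaching "ul". The plain-JavaScript sketch below illustrates that matching rule; `tagForCountry` is a hypothetical helper, not part of the shell, and it relies on JavaScript string comparison, which agrees with a simple binary comparison for lowercase ASCII keys like these.

```javascript
// Sketch of tag-range matching (not MongoDB's balancer code).
// Ranges mirror the sh.addTagRange calls on the slide:
// lower bound inclusive, upper bound exclusive.
const tagRanges = [
  { min: "uk", max: "ul", tag: "dc-emea" },
  { min: "by", max: "bz", tag: "dc-emea" },
];

// Hypothetical helper: returns the tag whose range contains the country, or null.
function tagForCountry(country, ranges = tagRanges) {
  const hit = ranges.find((r) => country >= r.min && country < r.max);
  return hit ? hit.tag : null;
}

for (const country of ["uk", "by", "us", "ul"]) {
  console.log(country, "->", tagForCountry(country));
}
```

Here "uk" and "by" map to `dc-emea`, while "us" and the excluded upper bound "ul" match no tag and would be free to live on any shard.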
Read Preferences
• Mode
  – PRIMARY, PRIMARY_PREFERRED
  – SECONDARY, SECONDARY_PREFERRED
  – NEAREST
• Tag Sets
  – Uses replica set tags
  – Passed tag is used to find matching members
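How a mode and a tag set combine can be sketched as a small selection function. Everything here is an assumption made for illustration: the member list, the `pingMs` field, and the `selectHosts` helper are hypothetical and not a driver API. Real drivers are more involved; for NEAREST, for example, they choose among members within a latency window rather than always taking the single fastest one.

```javascript
// Sketch of read-preference member selection (not a real driver API).
// Hypothetical replica set: one primary, two tagged secondaries.
const members = [
  { host: "a", state: "PRIMARY", tags: { dc: "emea" }, pingMs: 12 },
  { host: "b", state: "SECONDARY", tags: { dc: "emea" }, pingMs: 3 },
  { host: "c", state: "SECONDARY", tags: { dc: "us" }, pingMs: 80 },
];

// A member matches a tag set when it carries every key/value pair in the set.
const matchesTags = (member, tagSet) =>
  Object.entries(tagSet).every(([k, v]) => member.tags[k] === v);

// Hypothetical helper: returns the eligible hosts for a mode and optional tag set.
function selectHosts(mode, tagSet = {}, all = members) {
  const primary = all.filter((m) => m.state === "PRIMARY");
  const secondaries = all.filter(
    (m) => m.state === "SECONDARY" && matchesTags(m, tagSet)
  );
  let chosen;
  switch (mode) {
    case "PRIMARY":
      chosen = primary;
      break;
    case "PRIMARY_PREFERRED":
      chosen = primary.length ? primary : secondaries;
      break;
    case "SECONDARY":
      chosen = secondaries;
      break;
    case "SECONDARY_PREFERRED":
      chosen = secondaries.length ? secondaries : primary;
      break;
    case "NEAREST": {
      // Simplification: take the single lowest-latency member that matches.
      const candidates = all.filter((m) => matchesTags(m, tagSet));
      candidates.sort((x, y) => x.pingMs - y.pingMs);
      chosen = candidates.slice(0, 1);
      break;
    }
    default:
      throw new Error("unknown mode: " + mode);
  }
  return chosen.map((m) => m.host);
}

console.log(selectHosts("SECONDARY", { dc: "us" }));
console.log(selectHosts("NEAREST"));
```

With these sample members, a SECONDARY read tagged `{ dc: "us" }` goes to host "c", and an untagged NEAREST read goes to host "b".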
2.4 Roadmap
Must
• Kerberos integration
• LDAP/AD integration
Nice To Have
• Hash shard key
• Background index build on secondaries
• V8 for Map/Reduce (replaces SpiderMonkey)
• Geo: intersecting polygons, geo shard key
• Agg: $out, more functions, speed improvements
And beyond
• Full text search
• Collection / extent level locking
• Field level security
• Audit
Future of NoSQL?
Future of the Data Center
• Hardware
  – More cores
  – More memory
  – More IOPS (SSD)
  – More capacity
  – More bandwidth (100GbE)
• "Auto Pilot"
  – Zero human intervention
Future of NoSQL?
• Real-time analytics
  – Can't wait for a batch process / ETL / DW
• Ad-hoc / analytics
  – Map/Reduce = hammer
• Greater scale
  – 100s -> 1,000s of nodes
• Deeper history
  – Petabytes -> exabytes
• Heterogeneous deployment
  – Seamless integration with what you have