mongodb iot city tour eindhoven: sharding in mongodb
DESCRIPTION
An overview of the principals of "sharding" within MongoDBTRANSCRIPT
MongoDB and The Internet of Things
Arthur ViegersSenior Solutions Architect, MongoDB
MongoDB IoT City Tour 2014
Scaling Data
*
Read/Write Throughput Exceeds I/O
*
Working Set Exceeds Physical Memory
*
Vertical Scalability (Scale Up)
*
Horizontal Scalability (Scale Out)
MongoDB’s Approach
*
• User defines a shard key
• Shard key defines a range of data
• Key space is like points on a line
• Range is a segment of that line
Partitioning
*
• Initially 1 chunk
• Default max. chunk size: 64MB
• MongoDB automatically splits & migrates chunks when max reached
Data Distribution
*
• Queries routed to specific shards
• MongoDB balances cluster
• MongoDB migrates data to new nodes
Routing and Balancing
Architecture
*
• A shard is a node of the cluster
• A shard can be a single mongod or a replica set
What is a Shard?
*
Sharding Infrastructure
Shard Key Considerations
*
• Cardinality
• Write distribution
• Query isolation
• Reliability
• Index locality
Shard Key Considerations
*
Rexroth NEXO schema
{ _id: ObjectID("52ecf3d6bf1e623a52000001"), assetId: "NEXO 109", hour: ISODate("2014-07-03T22:00:00.000Z"), status: "Online", type: "Nutrunner", serialNo : "100-210-ABC", ip: "127.0.0.1", positions: { 0: { 0: { x: "10", y:"40", zone: "itc-1", accuracy: "20” }, …, 59: { x: "15", y: "30", zone: "itc-1", accuracy: "25” } }, …, 59: { 0: { x: "22", y: "27", zone: "itc-1", accuracy: "22” }, …, 59: { x: "18", y: "23", zone: "itc-1", accuracy: "24” } } }}
*
Shard Key Selection Rexroth NEXO
Cardinality
Write Distributi
on
Query Isolation Reliability Index
Locality
_id Doc level One shard
Scatter/gather
All users affected Good
hash(_id) Hash level All Shards Scatter/
gatherAll users affected Poor
assetId Many docs All Shards Targeted
Some assets
affectedGood
assetId, hour Doc level All Shards Targeted
Some assets
affectedGood
Summary
*
• MongoDB scales horizontally (sharding)
• Each shard is an independent database, and collectively, the shards make up a single logical database
• MMS makes it easy and reliable to run MongoDB at scale
• Sharding requires minimal effort from the application code: same interface as single mongod
Scaling Data
Thank you!