tag based sharding presentation
TRANSCRIPT
Tag-based shardingDistribute your data as you need
MongoDB User GroupMadrid, October 13th 2015
Juan Antonio Roy Couto
Who am I?
Juan Antonio Roy Couto
Financial Software Developer
Twitter: @juanroycouto
Linkedin: https://www.linkedin.com/in/juanroycouto
Personal blog: http://www.juanroy.es
Contributor at: http://www.mongodbspain.com
Charrosfera member: http://www.charrosfera.com
Email: [email protected]
MongoDB User GroupTag-based sharding
❏ Cluster overview❏ Definitions❏ Steps for balancing❏ Steps to split a chunk❏ Migration steps❏ Normal MongoDB operation❏ Pre-splitting❏ Commands to split a chunk❏ Tag-based sharding overview❏ Tag your shards❏ Tag your chunk ranges
Table of ContentsMongoDB User GroupTag-based sharding
❏ Replica set❏ Shards❏ config servers❏ config database❏ mongos
Cluster overviewMongoDB User GroupTag-based sharding
Cluster overviewReplica Set
● High availability
● Data safety
● Disaster recovery
MongoDB User GroupTag-based sharding
Replica Set
Secondary
Secondary
Primary
Scale out
Even data distribution across all of the
shards based on a shardkey
A shardkey range belongs to only one
shard
More efficient queries
Cluster overviewShards
MongoDB User GroupTag-based sharding
Cluster
Shard 0 Shard 2Shard 1
A-I J-Q R-Z
Cluster overviewMongoDB User GroupTag-based sharding
Cluster overviewConfig servers
MongoDB User GroupTag-based sharding
● config database
● Identical information (consistency check).
● Metadata:
○ Cluster shards list
○ Data per shard (chunk ranges)
○ ...
● Don’t sync from each other.
● Default Config server (All mongos read it)
Cluster overviewconfig database
Collections:
● changelog: splits and migration information● chunks *● collections * (only sharded)● databases *● lockpings● locks● mongos● settings● shards *● system.indexes● tags● version *
* Checked for consistency
MongoDB User GroupTag-based sharding
● Receives client requests and returns results.
● Reads the metadata and sends the query to the necessary
shard/shards
● Does not store data
● Keeps a cache version of the metadata. We can refresh it by:
○ mongos>db.runCommand( { flushRouterConfig : 1 } )
○ or restarting the server
Cluster overviewmongos
MongoDB User GroupTag-based sharding
MongoDB User GroupTag-based shardingDefinitions
● Range: Data division based on the values of the shardkey.
● Chunk: They are not physical data. Chunks are just a logical
grouping of data into ranges (64MB by default).
● Split: Chunk division. No data is moved.
● Migration: Chunk movements between shards in order to get an
even distribution. Only one chunk is moved at a time.
● Balanced system: The same number of chunks per shard.
● Balancer: Checks if a migration is needed and starts it.
MongoDB User GroupTag-based sharding
Split
Migration
Steps for balancing
Steps to split a chunkMongoDB User GroupTag-based sharding
Shard 0Final
Shard 0
Chunk 1 Chunk 1
Chunk 2
Chunk 2
Chunk 3
Config server
mongos
1. Needs chunk 2 to be splitted?
2. Split pointslist
3. Updatemetadata
4. Refreshcache
Migration stepsMongoDB User GroupTag-based sharding
Shard 1 (To)
Shard 0 (From)
Chunk 4
Chunk 1
Chunk 2
Chunk 3
Config server
mongos
1. Is balancer running?2. Is the system imbalance?3. Pick chunk 34. Split chunk 3?5. Begin (1)
6. Deletes finished?(2)
7. Read this chunk(3)
8. Transfer(4)
9. Update metadata(5)
10. Remove chunk 3 from shard 0(6)
11. Refresh mongos cache Chunk 3
1
4
78911
Normal MongoDB operationMongoDB User GroupTag-based sharding
Shard 0 Shard 1 Shard 2 Shard 3
mongosClient
Migrations
Useful for storing data directly in the shards (massive data loads).
Avoid bottlenecks.
MongoDB does not need to split or migrate chunks.
After the split, the migration must be finished before data loading.
Pre-splittingMongoDB User GroupTag-based sharding
Cluster
Shard 0 Shard 2Shard 1
Chunk 1
Chunk 5
Chunk 3
Chunk 4
Chunk 2
Splitting a chunk:
mongos>for (var i=0; i<20, i++) {
sh.splitAt(“testdb.presplit”, { x : 1000*i } );
}
Querying existing chunks:
mongos>use config
mongos>db.chunks.find( { ns : “testdb.presplit” } )
Commands to split a chunkMongoDB User GroupTag-based sharding
Tags are used when you want to pin ranges to a specific shard.
Tag-based sharding overviewMongoDB User GroupTag-based sharding
shard0
EMEA
shard1
APAC
shard2
LATAM
shard3
NORAM
mongos>sh.addShardTag(“shard0”, “EMEA”)
mongos>sh.addShardTag(“shard1”, “APAC”)
mongos>sh.addShardTag(“shard2”, “LATAM”)
mongos>sh.addShardTag(“shard3”, “NORAM”)
Tag your shardsMongoDB User GroupTag-based sharding
mongos>sh.addTagRange( namespace, minimum, maximum, tag )
mongos>sh.addTagRange( “testdb.tagrange”,
{ “x” : 0 },
{ “x” : 1000 },
“EMEA” )
minimum: the minimum value (inclusive) of the shard key range to include in the tag.
maximum: the maximum value (exclusive) of the shard key range to include in the tag.
Tag your chunk rangesMongoDB User GroupTag-based sharding
Questions?
Questions?MongoDB User GroupTag-based sharding
We are looking for writers
mongodbspain.comMongoDB User GroupTag-based sharding
Thank you for your attention!
MongoDB User GroupTag-based shardingMadrid, October 13th 2015
Juan Antonio Roy Couto