strategies for backing up mongo db 10.2012 copy
DESCRIPTION
Presentation by Jeff Yemin @ MongoBoston.
TRANSCRIPT
Engineering Manager, 10gen
Jeff Yemin
#MongoBoston
Strategies for Backing Up MongoDB
File and Directory Layout
• A set of files per database
Insert with write concern of {fsync : true}
Archive the data directory
Restore the data directory
Start mongod on restored data directory
Everything is fine, right?
• No, it's not
• But you can't tell until you look
Try validating the collection
• In the shell, run the validate command
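The validation step could look roughly like the sketch below. It assumes the legacy `mongo` shell is on the PATH; the database name `test` and collection name `things` are hypothetical. The script prints the command by default and only executes it when `DRY` is set to an empty value.

```shell
# Dry run by default: the command is printed, not executed.
# Set DRY= (empty) in the environment to run it for real.
DRY=${DRY-echo}
DB=test        # hypothetical database name
COLL=things    # hypothetical collection name

validate_collection() {
    # validate(true) asks for a full check of the collection's data and indexes.
    $DRY mongo "$DB" --eval "printjson(db.getCollection('$COLL').validate(true))"
}

validate_collection
```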
How can we get a clean backup?
• kill mongod
• fsyncLock / fsyncUnlock
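A minimal sketch of the fsyncLock / fsyncUnlock option: lock, archive the data directory, unlock. The data directory and archive path are assumptions, and the commands are only printed unless `DRY` is set to an empty value.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY= (empty) in the environment to run them for real.
DRY=${DRY-echo}
DBPATH=/data/db                       # assumed mongod data directory
ARCHIVE=/backup/mongodb-data.tar.gz   # hypothetical archive location

locked_archive() {
    # Flush pending writes to disk and block further writes.
    $DRY mongo admin --eval "printjson(db.fsyncLock())"
    # Archive the data files while nothing is changing them.
    $DRY tar -czf "$ARCHIVE" -C "$DBPATH" .
    # Release the lock; this runs even if tar failed.
    $DRY mongo admin --eval "printjson(db.fsyncUnlock())"
}

locked_archive
```

The unlock is deliberately not skipped on a failed archive, since a mongod left locked keeps rejecting writes.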
How can we get a clean backup?
• mongodump
mongodump
• Snapshot of each collection
– Does NOT represent a point in time, even for a single collection
• Can NOT be combined with fsyncLock
– Remember, you can't read…
• You CAN dump directly from data files to get a point in time backup
– mongodump --dbpath
• Can be costlier than archiving at FS level
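The two ways of running mongodump mentioned above might be invoked as in this sketch. The output and data directories are assumptions. `--dbpath` is an option of the 2.x tools (it was dropped from later releases) and needs exclusive access to the data files, so no mongod may be running on them.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY= (empty) in the environment to run them for real.
DRY=${DRY-echo}
OUT=/backup/dump   # hypothetical output directory
DBPATH=/data/db    # assumed data directory

dump_live() {
    # Dump through a running mongod (localhost:27017 by default).
    $DRY mongodump --out "$OUT"
}

dump_from_files() {
    # Read the data files directly, without going through mongod.
    $DRY mongodump --dbpath "$DBPATH" --out "$OUT"
}

dump_live
dump_from_files
```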
Snapshot Query
[Diagram: items numbered 1 to 9]
How can we get a clean backup?
• journaling
Journaling
• Write-ahead log
• Guarantees a consistent view even after a hard crash
• Default behavior as of 2.0
• Journal stored in the journal folder under --dbpath
• --journalCommitInterval* (2ms - 300ms)
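As a sketch of the options above, a mongod could be started with journaling spelled out explicitly. The data directory and the 100 ms interval are arbitrary illustrative choices; the command is printed unless `DRY` is set to an empty value.

```shell
# Dry run by default: the command is printed, not executed.
# Set DRY= (empty) in the environment to run it for real.
DRY=${DRY-echo}
DBPATH=/data/db       # assumed data directory
COMMIT_INTERVAL=100   # milliseconds, picked from the 2ms - 300ms range

start_mongod() {
    # --journal is redundant where journaling is already the default,
    # but makes the intent explicit. Journal files go to $DBPATH/journal.
    $DRY mongod --dbpath "$DBPATH" --journal --journalCommitInterval "$COMMIT_INTERVAL"
}

start_mongod
```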
Journaling implications for backup
• Logical Volume Manager (LVM)
• LVM snapshots to the rescue
– lvcreate --size 100M --snapshot --name mdb-snap01 /dev/vg0/mongodb
• No shutdown or fsyncLock needed
• True point in time backup for a single instance
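A rough sketch of a full snapshot cycle built around the lvcreate command above: create the snapshot, copy it off, drop it. The image path is hypothetical, and the sketch assumes the journal sits on the same logical volume as the data files, so that the snapshot captures both together.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY= (empty) in the environment to run them for real (needs root).
DRY=${DRY-echo}
VG=vg0                          # volume group, as on the slide
LV=mongodb                      # logical volume holding data files and journal
SNAP=mdb-snap01                 # snapshot name, as on the slide
IMAGE=/backup/mdb-snap01.img    # hypothetical destination for the raw copy

snapshot_backup() {
    # 100M is the space reserved for blocks that change while the snapshot exists.
    $DRY lvcreate --size 100M --snapshot --name "$SNAP" "/dev/$VG/$LV"
    # Copy the frozen block device to a file; compress afterwards if needed.
    $DRY dd if="/dev/$VG/$SNAP" of="$IMAGE" bs=1M
    # Remove the snapshot so it stops consuming copy-on-write space.
    $DRY lvremove -f "/dev/$VG/$SNAP"
}

snapshot_backup
```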
Replica Sets
Backing up a replica set
• Back up a (hidden) secondary
– kill mongod
– fsyncLock
– mongodump
– LVM snapshot
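One way to dedicate a secondary to backups is to mark it hidden in the replica set configuration. The sketch below is an illustration only: the primary address and the member index are hypothetical and must be adapted to the actual set.

```shell
# Dry run by default: the command is printed, not executed.
# Set DRY= (empty) in the environment to run it for real.
DRY=${DRY-echo}
PRIMARY=localhost:27017   # hypothetical address of the current primary
MEMBER=2                  # hypothetical index into cfg.members

hide_member() {
    # A hidden member must have priority 0 so it can never become primary.
    $DRY mongo --host "$PRIMARY" --eval "
        var cfg = rs.conf();
        cfg.members[$MEMBER].priority = 0;
        cfg.members[$MEMBER].hidden = true;
        rs.reconfig(cfg);"
}

hide_member
```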
Mongodump for replica sets
• True point in time
– mongodump --oplog
– mongorestore --oplogReplay
• Snapshot query of each collection, then replay the oplog at the end
– Similar to how a new secondary does an initial sync
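The dump and restore pair for a replica set might be run as in this sketch. Both host addresses and the dump directory are hypothetical. `--oplog` applies to a dump of the whole instance, not to a single database.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY= (empty) in the environment to run them for real.
DRY=${DRY-echo}
SOURCE=localhost:27018   # hypothetical secondary to dump from
TARGET=localhost:27017   # hypothetical mongod to restore into
OUT=/backup/dump         # hypothetical dump directory

dump_with_oplog() {
    # Also records oplog entries written while the dump is running.
    $DRY mongodump --host "$SOURCE" --oplog --out "$OUT"
}

restore_with_oplog() {
    # Replays the captured oplog after loading the collections.
    $DRY mongorestore --host "$TARGET" --oplogReplay "$OUT"
}

dump_with_oplog
restore_with_oplog
```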
Sharded clusters
[Diagram: numbered chunks 1 to 48 spread across Shard 1, Shard 2, Shard 3 and Shard 4, with mongos, the balancer and three config servers]
Chunks!
Backing up a sharded cluster
• mongodump through mongos
– (but no --oplog)
• mongorestore through mongos
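Going through mongos, the dump and restore are the plain commands pointed at the router, as in this short sketch. The mongos address and dump directory are hypothetical.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY= (empty) in the environment to run them for real.
DRY=${DRY-echo}
MONGOS=localhost:27017     # hypothetical mongos address
OUT=/backup/cluster-dump   # hypothetical dump directory

dump_cluster() {
    # No --oplog here: it is not available through mongos.
    $DRY mongodump --host "$MONGOS" --out "$OUT"
}

restore_cluster() {
    $DRY mongorestore --host "$MONGOS" "$OUT"
}

dump_cluster
restore_cluster
```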
Backup a Sharded Cluster
1. Stop Balancer, and wait till inactive (state:0)
db.settings.update( { _id: "balancer" }, { $set : { stopped: true } } , true )
2. Stop a config server
Backup Data
– Each shard
– Config server (mongodump --db config)
3. Restart config server
4. Resume balancer
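The steps above could be scripted along these lines. This is a sketch under several assumptions, not a definitive procedure: all host names and paths are hypothetical, each shard is dumped with `mongodump --oplog` (any of the replica set methods would do), the idle check reads the balancer entry in `config.locks`, and stopping or restarting a config server is left as a site-specific placeholder.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY= (empty) in the environment to run them for real.
DRY=${DRY-echo}
MONGOS=localhost:27017                                       # hypothetical mongos
SHARDS="shard1.example.net:27018 shard2.example.net:27018"   # hypothetical, one member per shard
CONFIG=cfg1.example.net:27019                                # hypothetical config server left running
OUT=/backup/cluster                                          # hypothetical output directory

set_balancer_stopped() {   # $1 is true or false
    $DRY mongo "$MONGOS/config" --eval \
        "db.settings.update({_id: 'balancer'}, {\$set: {stopped: $1}}, true)"
}

wait_for_balancer() {
    # Assumption: state 0 on the balancer lock means no balancing round is active.
    # quit(n) exits the shell with status n, so the loop ends once the count is 0.
    until $DRY mongo "$MONGOS/config" --quiet --eval \
        "quit(db.locks.count({_id: 'balancer', state: {\$ne: 0}}))"; do
        sleep 5
    done
}

# Site-specific placeholders: they only print a reminder.
stop_config_server()  { echo "# stop one config server here"; }
start_config_server() { echo "# restart the config server here"; }

backup_cluster() {
    set_balancer_stopped true
    wait_for_balancer
    stop_config_server
    i=0
    for shard in $SHARDS; do
        i=$((i + 1))
        $DRY mongodump --host "$shard" --oplog --out "$OUT/shard$i"
    done
    $DRY mongodump --host "$CONFIG" --db config --out "$OUT/config"
    start_config_server
    set_balancer_stopped false
}

backup_cluster
```

In a real run the wait loop keeps retrying if mongos is unreachable, so a timeout may be worth adding.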
Thank You