advanced replication

Solutions Architect, 10gen Marc Schwering #MongoDBDays - @m4rcsch Advanced Replication

Upload: mongodb

Post on 24-Jun-2015

DESCRIPTION

In this session we will cover wide area replica sets and using tags for backup. Attendees should be well versed in basic replication and familiar with the concepts from the morning's basic replication talk. No beginner topics will be covered in this session.

TRANSCRIPT

Page 1: Advanced Replication

Solutions Architect, 10gen

Marc Schwering

#MongoDBDays - @m4rcsch

Advanced Replication

Page 2: Advanced Replication

Roles & Configuration

Page 3: Advanced Replication

Replica Set Roles

Page 4: Advanced Replication

> conf = {
    _id : "mySet",
    members : [
      {_id : 0, host : "A"},
      {_id : 1, host : "B"},
      {_id : 2, host : "C", arbiterOnly : true}
    ]
  }

> rs.initiate(conf)

Configuration Options

Page 5: Advanced Replication

Simple Setup Demo

Page 6: Advanced Replication

Behind the Curtain

Page 7: Advanced Replication

Implementation details

• Heartbeat every 2 seconds
  – Times out in 10 seconds

• Local DB (not replicated)
  – system.replset
  – oplog.rs
    • Capped collection
    • Idempotent version of operation stored

Page 8: Advanced Replication

Op(erations) Log

Page 9: Advanced Replication

> db.replsettest.insert({_id:1,value:1})

{ "ts" : Timestamp(1350539727000, 1), "h" : NumberLong("6375186941486301201"), "op" : "i", "ns" : "test.replsettest", "o" : { "_id" : 1, "value" : 1 } }

> db.replsettest.update({_id:1},{$inc:{value:10}})

{ "ts" : Timestamp(1350539786000, 1), "h" : NumberLong("5484673652472424968"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "value" : 11 } } }

Op(erations) Log is idempotent
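
Why the oplog stores the $inc above as a $set can be seen in a small sketch. This is plain JavaScript on in-memory objects, not MongoDB code; the helper `applyUpdate` is hypothetical and supports only the two operators shown on the slide.

```javascript
// Hypothetical helper: applies a tiny subset of update operators to a plain object.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    result[field] = (result[field] || 0) + amount;
  }
  for (const [field, value] of Object.entries(update.$set || {})) {
    result[field] = value;
  }
  return result;
}

const original = { _id: 1, value: 1 };

// What the client sent, and what the oplog entry on the slide contains instead.
const clientUpdate = { $inc: { value: 10 } };
const oplogUpdate = { $set: { value: 11 } };

// Replaying the client's $inc twice drifts away from the intended value.
const incTwice = applyUpdate(applyUpdate(original, clientUpdate), clientUpdate);

// Replaying the oplog's $set twice lands on the same document every time.
const setTwice = applyUpdate(applyUpdate(original, oplogUpdate), oplogUpdate);

console.log(incTwice.value); // 21
console.log(setTwice.value); // 11
```

Because replaying the $set form gives the same result each time, a secondary can safely re-apply an entry it may already have applied.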

Page 10: Advanced Replication

oplog and multi-updates

Page 11: Advanced Replication

> db.replsettest.update({}, {$set : {name : "foo"}}, false, true)

{ "ts" : Timestamp(1350540395000, 1), "h" : NumberLong("-4727576249368135876"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 2 }, "o" : { "$set" : { "name" : "foo" } } }

{ "ts" : Timestamp(1350540395000, 2), "h" : NumberLong("-7292949613259260138"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 3 }, "o" : { "$set" : { "name" : "foo" } } }

{ "ts" : Timestamp(1350540395000, 3), "h" : NumberLong("-1888768148831990635"), "op" : "u", "ns" : "test.replsettest", "o2" : { "_id" : 1 }, "o" : { "$set" : { "name" : "foo" } } }

Single operation can have many entries
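
The expansion shown above, one multi-update becoming one entry per matched document, can be sketched as follows. This is an illustration only: `expandMultiUpdate` is a hypothetical name, and the entries carry just the fields needed to show the idea (no "ts" or "h").

```javascript
// Hypothetical sketch: turn one multi-update into one oplog-style entry per matched document.
function expandMultiUpdate(ns, docs, matches, setFields) {
  return docs
    .filter(matches)
    .map((doc) => ({
      op: "u",
      ns: ns,
      o2: { _id: doc._id }, // identifies the single document this entry changes
      o: { $set: { ...setFields } },
    }));
}

const docs = [{ _id: 2 }, { _id: 3 }, { _id: 1 }];
const entries = expandMultiUpdate("test.replsettest", docs, () => true, { name: "foo" });

console.log(entries.length);    // 3
console.log(entries[0].o2._id); // 2
```

Each entry targets exactly one _id, so a secondary never has to re-evaluate the original query.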

Page 12: Advanced Replication

Operations

Page 13: Advanced Replication

Maintenance and Upgrade

• No downtime

• Rolling upgrade/maintenance
  – Start with Secondary
  – Primary last
  – Commands:
    • rs.stepDown(<secs>)
    • db.version()
    • db.serverBuildInfo()
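
The ordering rule above (secondaries first, primary last) can be sketched in a few lines. This is a minimal illustration, assuming member objects shaped like the entries of rs.status().members with their `name` and `stateStr` fields; `upgradeOrder` is a hypothetical helper, not a shell command.

```javascript
// Hypothetical sketch: order members for a rolling upgrade.
// Everything that is not the primary goes first; the primary goes last.
function upgradeOrder(members) {
  const others = members.filter((m) => m.stateStr !== "PRIMARY");
  const primaries = members.filter((m) => m.stateStr === "PRIMARY");
  return [...others, ...primaries].map((m) => m.name);
}

const members = [
  { name: "A", stateStr: "PRIMARY" },
  { name: "B", stateStr: "SECONDARY" },
  { name: "C", stateStr: "SECONDARY" },
];

console.log(upgradeOrder(members)); // [ 'B', 'C', 'A' ]
```

Before the last step, the primary would be asked to step down with rs.stepDown(<secs>) so that an upgraded secondary takes over.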

Page 14: Advanced Replication

Upgrade Demo

Page 15: Advanced Replication

Replica Set – 1 Data Center

• Single datacenter

• Single switch & power

• Points of failure:
  – Power
  – Network
  – Data center
  – Two node failure

• Automatic recovery of single node crash

Page 16: Advanced Replication

Replica Set – 2 Data Centers

• Multi data center

• DR node for safety

• Can’t do a durable multi data center write safely, since there is only 1 node in the distant DC

Page 17: Advanced Replication

Replica Set – 2 Data Centers

• Analytics

• Disaster Recovery

• Batch Jobs

• Options
  – low or zero priority
  – hidden
  – slaveDelay
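
As an illustration of how these options combine, here is a sketch of a member document for a disaster-recovery node. The host name "D" and the one-hour delay are assumptions for the example; `checkMember` is a hypothetical helper that encodes the rule that hidden and delayed members must have priority 0.

```javascript
// Example member document (values are illustrative assumptions).
const drMember = {
  _id: 3,
  host: "D",        // hypothetical host name
  priority: 0,      // never eligible to become primary
  hidden: true,     // not advertised to clients
  slaveDelay: 3600, // stays one hour (in seconds) behind the primary
};

// Hypothetical check: hidden and delayed members need priority 0.
function checkMember(member) {
  const needsZeroPriority = member.hidden === true || (member.slaveDelay || 0) > 0;
  if (needsZeroPriority && member.priority !== 0) {
    return "hidden and delayed members must have priority 0";
  }
  return "ok";
}

console.log(checkMember(drMember));                             // ok
console.log(checkMember({ _id: 4, host: "E", hidden: true }));  // hidden and delayed members must have priority 0
```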

Page 18: Advanced Replication

Replica Set – 3 Data Centers

• Three data centers

• Can survive full data center loss

• Can do w= { dc : 2 } to guarantee write in 2 data centers (with tags)

Page 19: Advanced Replication

Replica Set – 3+ Data Centers

[Diagram: one primary and six secondaries, one of them labeled "delayed"]

Page 20: Advanced Replication

Commands

• Managing
  – rs.conf()
  – rs.initiate(<conf>) & rs.reconfig(<conf>)
  – rs.add("<host>:<port>") & rs.addArb("<host>:<port>")
  – rs.status()
  – rs.stepDown(<secs>)

• Minority reconfig
  – rs.reconfig(cfg, { force : true })

Page 21: Advanced Replication

Options

• Priorities

• Hidden

• Slave Delay

• Disable indexes (on secondaries)

• Default write concerns

Page 22: Advanced Replication

Developing with Replica Sets

Page 23: Advanced Replication

Strong Consistency

Page 24: Advanced Replication

Delayed Consistency

Page 25: Advanced Replication

Write Concern

• Network acknowledgement

• Wait for error

• Wait for journal sync

• Wait for replication
  – number
  – majority
  – Tags

Page 26: Advanced Replication

Write Concern Demo

Page 27: Advanced Replication

Datacenter awareness (Tagging)

• Control where data is written to, and read from

• Each member can have one or more tags
  – tags: {dc: "ny"}
  – tags: {dc: "ny", subnet: "192.168", rack: "row3rk7"}

• Replica set defines rules for write concerns

• Rules can change without changing app code

Page 28: Advanced Replication

{
  _id : "mySet",
  members : [
    {_id : 0, host : "A", tags : {"dc": "ny"}},
    {_id : 1, host : "B", tags : {"dc": "ny"}},
    {_id : 2, host : "C", tags : {"dc": "sf"}},
    {_id : 3, host : "D", tags : {"dc": "sf"}},
    {_id : 4, host : "E", tags : {"dc": "cloud"}}
  ],
  settings : {
    getLastErrorModes : {
      allDCs : {"dc" : 3},
      someDCs : {"dc" : 2}
    }
  }
}

> db.blogs.insert({...})

> db.runCommand({getLastError : 1, w : "someDCs"})

> db.getLastErrorObj("someDCs")

Tagging Example
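
How a rule such as someDCs : {"dc" : 2} is evaluated can be sketched in plain JavaScript. This is an illustration under assumptions, not MongoDB's implementation: the rule is read as "the acknowledging members must cover at least 2 distinct values of the dc tag", and `modeSatisfied` is a hypothetical helper.

```javascript
// Members and modes copied from the configuration on the slide.
const members = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const modes = { allDCs: { dc: 3 }, someDCs: { dc: 2 } };

// Hypothetical helper: a mode is satisfied when, for every tag key in its rule,
// the acknowledging members cover at least the required number of distinct values.
function modeSatisfied(modeName, ackedHosts) {
  const rule = modes[modeName];
  return Object.entries(rule).every(([tagKey, needed]) => {
    const values = new Set(
      members
        .filter((m) => ackedHosts.includes(m.host) && tagKey in m.tags)
        .map((m) => m.tags[tagKey])
    );
    return values.size >= needed;
  });
}

console.log(modeSatisfied("someDCs", ["A", "B"]));     // false: both members are in ny
console.log(modeSatisfied("someDCs", ["A", "C"]));     // true: ny and sf
console.log(modeSatisfied("allDCs", ["A", "C"]));      // false: cloud is missing
console.log(modeSatisfied("allDCs", ["A", "C", "E"])); // true
```

Note that two acknowledgements from the same data center do not count twice, which is the difference between a tag rule and a plain numeric w.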

Page 29: Advanced Replication

Wait for Replication

Page 30: Advanced Replication

  settings : {
    getLastErrorModes : {
      allDCs : {"dc" : 3},
      someDCs : {"dc" : 2}
    }
  }
}

> db.getLastErrorObj("allDCs", 100);

> db.getLastErrorObj("someDCs", 500);

> db.getLastErrorObj(1, 500);

Write Concern with timeout

Page 31: Advanced Replication

Read Preference Modes

• 5 modes (new in 2.2)
  – primary (only) - Default
  – primaryPreferred
  – secondary
  – secondaryPreferred
  – nearest

When more than one node is possible, closest node is used for reads (all modes but primary)
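
The five modes can be sketched as a candidate filter followed by a "closest node" choice. This is a simplified illustration, not driver code: the `state` and `pingMs` fields and the helpers `candidates` and `pick` are assumptions, and picking the single lowest ping time is a simplification of how a driver chooses among nearby members.

```javascript
// Hypothetical sketch: which members each read preference mode may read from.
function candidates(mode, members) {
  const primary = members.filter((m) => m.state === "PRIMARY");
  const secondaries = members.filter((m) => m.state === "SECONDARY");
  switch (mode) {
    case "primary":
      return primary;
    case "primaryPreferred":
      return primary.length > 0 ? primary : secondaries;
    case "secondary":
      return secondaries;
    case "secondaryPreferred":
      return secondaries.length > 0 ? secondaries : primary;
    case "nearest":
      return [...primary, ...secondaries];
    default:
      throw new Error("unknown read preference mode: " + mode);
  }
}

// Among several candidates, choose the one with the lowest ping time (simplified).
function pick(mode, members) {
  const sorted = candidates(mode, members).slice().sort((a, b) => a.pingMs - b.pingMs);
  return sorted.length > 0 ? sorted[0].host : null;
}

const set = [
  { host: "A", state: "PRIMARY", pingMs: 5 },
  { host: "B", state: "SECONDARY", pingMs: 20 },
  { host: "C", state: "SECONDARY", pingMs: 1 },
];

console.log(pick("primary", set));   // A
console.log(pick("secondary", set)); // C
console.log(pick("nearest", set));   // C
```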

Page 32: Advanced Replication

Tagged Read Preference

• Custom read preferences

• Control where you read from by (node) tags
  – E.g. { "disk": "ssd", "use": "reporting" }

• Use in conjunction with standard read preferences
  – Except primary

Page 33: Advanced Replication

{"dc.va": "rack1", disk: "ssd", ssd: "installed"}
{"dc.va": "rack2", disk: "raid"}
{"dc.gto": "rack1", disk: "ssd", ssd: "installed"}
{"dc.gto": "rack2", disk: "raid"}

> conf.settings = { getLastErrorModes: { MultipleDC : { "dc.va": 1, "dc.gto": 1 } } }

> conf.settings = {
    "getLastErrorModes" : {
      "ssd" : {
        "ssd" : 1
      },...

Tags

Page 34: Advanced Replication

{ disk: "ssd" }

JAVA:

ReadPreference tagged_pref =
    ReadPreference.secondaryPreferred(new BasicDBObject("disk", "ssd"));

DBObject result = coll.findOne(query, null, tagged_pref);

Tagged Read Preference

Page 35: Advanced Replication

Tagged Read Preference

• Grouping / Failover

{dc : "LON", loc : "EU"}
{dc : "FRA", loc : "EU"}
{dc : "NY", loc : "US"}

DBObject t1 = new BasicDBObject("dc", "LON");
DBObject t2 = new BasicDBObject("loc", "EU");
ReadPreference pref = ReadPreference.primaryPreferred(t1, t2);
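
The failover behaviour of an ordered list of tag sets can be sketched as follows: the tag sets are tried in order and the first one that matches at least one member wins. This is a plain JavaScript illustration, not driver code; the host names and the helpers `matchesTagSet` and `selectByTagSets` are hypothetical. With primaryPreferred, the tag sets only come into play for secondaries when no primary is available.

```javascript
// Hypothetical helper: a member matches when it carries every tag in the tag set.
function matchesTagSet(member, tagSet) {
  return Object.entries(tagSet).every(([key, value]) => member.tags[key] === value);
}

// Try each tag set in order; the first one matching at least one member wins.
function selectByTagSets(members, tagSets) {
  for (const tagSet of tagSets) {
    const matched = members.filter((m) => matchesTagSet(m, tagSet));
    if (matched.length > 0) {
      return matched.map((m) => m.host);
    }
  }
  return [];
}

// Tags from the slide; host names are made up for the example.
const secondaries = [
  { host: "lon1", tags: { dc: "LON", loc: "EU" } },
  { host: "fra1", tags: { dc: "FRA", loc: "EU" } },
  { host: "ny1", tags: { dc: "NY", loc: "US" } },
];
const tagSets = [{ dc: "LON" }, { loc: "EU" }];

console.log(selectByTagSets(secondaries, tagSets));          // [ 'lon1' ]
console.log(selectByTagSets(secondaries.slice(1), tagSets)); // [ 'fra1' ]
```

In the second call the London member is gone, so the selection falls back to any member in the EU group.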

Page 36: Advanced Replication

Conclusion

Page 37: Advanced Replication

Best practices and tips

• Odd number of set members

• Read from the primary except for
  – Geographic distribution
  – Analytics (separate workload)

• Use logical names not IP Addresses in configs

• Set WriteConcern appropriately for what you are doing

• Monitor secondaries for lag (Alerts in MMS)
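
The reasoning behind the odd-number tip can be checked with a little arithmetic: a primary needs a majority of members, so adding a fourth member to a three-member set raises the majority without raising the number of failures the set can survive. The helper names below are hypothetical.

```javascript
// Smallest number of members that forms a majority of n.
function majority(n) {
  return Math.floor(n / 2) + 1;
}

// How many members can fail while a majority is still reachable.
function tolerableFailures(n) {
  return n - majority(n);
}

for (const n of [3, 4, 5]) {
  console.log(n, majority(n), tolerableFailures(n));
}
// 3 2 1
// 4 3 1
// 5 3 2
```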

Page 38: Advanced Replication

Solutions Architect, 10gen

Marc Schwering

#MongoDBDays - @m4rcsch

Thank You