![Page 1: MongoDB HA, what can go wrong? - Percona · 2019-06-17 · Summary • Replica set with odd number of voting members • Hidden or Delayed member for dedicated functions (reporting,](https://reader034.vdocuments.us/reader034/viewer/2022042319/5f08ef4c7e708231d4246f5f/html5/thumbnails/1.jpg)
May-29-2019
MongoDB HA, what can go wrong?
About Me
{"name": "Igor Donchovski",
"live_in": "Skopje",
"email": "[email protected]",
"current_role": "Lead database consultant",
"education": [{"type": "College", "name": "FEIT", "graduated": "2008", "university": "UKIM"},
{"type": "Master", "name": "FINKI", "graduated": "2013", "university": "UKIM"}],
"work": [{"role": "Web developer", "start": "2007", "end": "2012", "company": "Gord Systems"},
{"role": "DBA", "start": "2012", "end": "2014", "company": "NOVP"},
{"role": "Database consultant", "start": "2014", "end": "2016", "company": "Pythian"},
{"role": "Lead database consultant", "start": "2016", "company": "Pythian"}],
"certificates": [{"name": "C100DBA", "year": "2016", "description": "MongoDB certified DBA"}],
"social": [{"network": "LinkedIn", "url": "https://mk.linkedin.com/in/igorle"},
{"network": "Twitter", "url": "https://twitter.com/igorle"}],
"interests": ["Hiking", "Biking", "Traveling"],
"hobbies": ["Painting", "Photography", "Cooking"],
"proud_of": ["Volunteering", "Helping the Community"]}
© 2019 Pythian. Confidential
Overview
• What is a replica set, how replication works
• Replication concept
• Replica set features, deployment architectures
• Hidden nodes, Arbiter nodes, Priority 0 nodes
• Production failures
• Monitoring replica set
• Q&A
Replication
Replica Set
• Group of mongod processes that maintain the same data set
• Redundancy and high availability
• Increased read capacity (scaling reads)
• Automatic failover
| Members | Nodes required to elect a new Primary | Fault tolerance |
| --- | --- | --- |
| 3 | 2 | 1 |
| 4 | 3 | 1 |
| 5 | 3 | 2 |
| 6 | 4 | 2 |
| 7 | 4 | 3 |
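The numbers in the table follow from simple majority arithmetic. A minimal sketch, assuming every member has one vote; the function name `electionMath` is illustrative and not part of MongoDB:

```javascript
// Majority of voting members needed to elect a Primary, and how many
// voting members can be lost while an election is still possible.
function electionMath(votingMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return { majority, faultTolerance: votingMembers - majority };
}

for (const n of [3, 4, 5, 6, 7]) {
  const { majority, faultTolerance } = electionMath(n);
  console.log(`${n} members: majority ${majority}, fault tolerance ${faultTolerance}`);
}
```

Going from 3 to 4 members (or from 5 to 6) raises the required majority without raising fault tolerance, which is why an odd number of voting members is the usual choice.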
[Diagram: three replica set members, each with priority:1 votes:1]
Replication Concept
1. Write operations go to the Primary node
2. All changes are recorded in the operations log (oplog)
3. Asynchronous replication to the Secondaries
4. Secondaries copy the Primary oplog
5. A Secondary can use another Secondary as its sync source*
*settings.chainingAllowed (true by default)
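If chained replication is not wanted, `settings.chainingAllowed` can be switched off in the replica set configuration. A minimal sketch, assuming the configuration document has the shape returned by `rs.conf()`; the helper name `withChainingDisabled` is hypothetical:

```javascript
// Return a copy of a replica set config with chained replication disabled.
// The input is not modified; other settings are preserved.
function withChainingDisabled(config) {
  return {
    ...config,
    settings: { ...(config.settings || {}), chainingAllowed: false },
  };
}

// In the mongo shell this would typically be applied on the Primary as:
//   rs.reconfig(withChainingDisabled(rs.conf()))
```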
Replica Set Oplog
• Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases
• Idempotent
• Default oplog size (for Unix and Windows systems)

| Storage engine | Default oplog size | Lower bound | Upper bound |
| --- | --- | --- | --- |
| In-memory | 5% of physical memory | 50MB | 50GB |
| WiredTiger | 5% of free disk space | 990MB | 50GB |
| MMAPv1 | 5% of free disk space | 990MB | 50GB |
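The default for the disk-based engines is a clamp: 5% of free disk space, but never below 990MB or above 50GB. A small sketch of that rule as stated in the table; the function name and the use of binary megabytes are assumptions made for illustration:

```javascript
const MB = 1024 * 1024;
const GB = 1024 * MB;

// Default oplog size for WiredTiger/MMAPv1 according to the table:
// 5% of free disk space, clamped to the range [990MB, 50GB].
function defaultOplogBytes(freeDiskBytes) {
  const fivePercent = Math.floor(freeDiskBytes / 20);
  return Math.min(Math.max(fivePercent, 990 * MB), 50 * GB);
}

console.log(defaultOplogBytes(10 * GB) / MB, "MB");   // small disk: lower bound applies
console.log(defaultOplogBytes(200 * GB) / GB, "GB");  // 5% of free space
console.log(defaultOplogBytes(2000 * GB) / GB, "GB"); // large disk: upper bound applies
```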
Configuration
Configuration Options
• Up to 50 members per replica set (at most 7 voting members)
• Arbiter node
• Priority 0 node
• Hidden node
• Delayed node
Arbiter Node
• Does not hold copy of data
• Votes in elections
[Diagram: replica set with a member marked hidden : true and an Arbiter]
Priority 0 Node
Priority: floating point (i.e. decimal) number between 0 and 1000
• Cannot become Primary, cannot trigger an election
• Visible to the application (can serve reads; writes still go to the Primary)
• Votes in elections
[Diagram: Secondary with priority : 0]
Hidden Node
• Not visible to application
• Never becomes Primary, but can vote in elections
• Use cases
○ Reporting
○ Backups
[Diagram: Secondary with hidden : true, priority : 0]
Delayed Node
• Must be priority 0 member
• Should be hidden member (not mandatory)
• Mainly used for backups (historical snapshot of data)
• Recovery in case of human error
[Diagram: Secondary with slaveDelay : 3600, priority : 0, hidden : true]
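The three settings on the diagram can be put together as a member document. A minimal sketch, assuming the `slaveDelay` field name used on the slide (MongoDB 5.0 renamed it to `secondaryDelaySecs`); the helper names and the host are hypothetical:

```javascript
// Build a member document for a delayed Secondary.
function delayedMember(id, host, delaySeconds) {
  return {
    _id: id,
    host: host,
    priority: 0,              // required: a delayed member must not become Primary
    hidden: true,             // recommended so applications never read stale data
    slaveDelay: delaySeconds, // field name as on the slide
  };
}

// Reject a member that is delayed but still electable.
function isValidDelayedMember(member) {
  return !(member.slaveDelay > 0) || member.priority === 0;
}

const member = delayedMember(3, "db4.example.net:27017", 3600);
console.log(member, isValidDelayedMember(member));
// In the mongo shell a document like this could be passed to rs.add(member).
```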
Everyone on the same page?
Failures
Small Oplog Size
1. Primary/Secondary node down
○ Node failure
○ Planned maintenance
2. Automatic Failover
…… (several hours later)
3. New Primary's oplog rolls over and overwrites the entries the failed node still needs
4. Failed node needs a full resync
MongoDB >= 3.6: db.adminCommand({replSetResizeOplog: 1, size: 32000}) (size in megabytes)
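Whether a node survives an outage depends on the oplog window, that is, how many hours of writes the oplog holds. A rough sketch under the assumption of a steady write rate; the function names and the sample rates are made up for illustration, and on a live system `rs.printReplicationInfo()` reports the actual window:

```javascript
// Hours of writes the oplog can hold at a steady oplog write rate.
function oplogWindowHours(oplogSizeMB, oplogWriteMBPerHour) {
  return oplogSizeMB / oplogWriteMBPerHour;
}

// True if a node that is down for `downtimeHours` can still catch up
// from the oplog instead of needing a full resync.
function survivesDowntime(oplogSizeMB, oplogWriteMBPerHour, downtimeHours) {
  return oplogWindowHours(oplogSizeMB, oplogWriteMBPerHour) > downtimeHours;
}

console.log(survivesDowntime(990, 500, 4));   // default lower bound, busy system
console.log(survivesDowntime(32000, 500, 4)); // after resizing to 32000MB
```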
Arbiter Nodes
● Votes in election
● Does not hold copy of data
● If 2 nodes are down, no majority to elect new Primary
● Fault tolerance is still 1 node
● 4 data nodes + 1 Arbiter makes more sense
[Diagram: replica set members and an Arbiter exchanging heartbeats]
Priority 0 Nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Secondaries can serve reads
● Read preference
○ primary (default)
○ primaryPreferred
○ secondary
○ secondaryPreferred
○ nearest
Priority 0 Nodes
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary
• Application cannot send writes
• Database is read only*
*depends on read preference setting
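This failure can be caught before it happens by counting how many members are eligible to become Primary. A minimal sketch, assuming a configuration document shaped like the output of `rs.conf()`; the function names and hosts are hypothetical:

```javascript
// Hosts that are allowed to become Primary: not arbiters, and priority > 0
// (priority defaults to 1 when it is not set).
function electableHosts(config) {
  return config.members
    .filter((m) => !m.arbiterOnly)
    .filter((m) => (m.priority === undefined ? 1 : m.priority) > 0)
    .map((m) => m.host);
}

// Failover only works if another electable member remains once the
// current Primary is gone.
function canFailOver(config) {
  return electableHosts(config).length >= 2;
}

const config = {
  members: [
    { _id: 0, host: "db1.example.net:27017", priority: 1 },
    { _id: 1, host: "db2.example.net:27017", priority: 0 },
    { _id: 2, host: "db3.example.net:27017", priority: 0, hidden: true },
  ],
};
console.log(electableHosts(config), canFailOver(config));
```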
Hidden Nodes
● Application driver sends writes to Primary
● Reads go to Primary by default
● Hidden Secondaries cannot serve application reads
● Read preference
○ primary
Hidden Nodes
• Primary node fails
• Replica set starts election for new Primary
• Zero nodes eligible for Primary (priority:0)
• Application cannot send writes/reads
• Downtime
Hardware
• Primary node fails
• Secondary elected as new Primary
• Working set does not fit in memory
• Performance degradation
• Application stalls
[Diagram: one member with 64GB RAM, 16 CPU; two members with 32GB RAM, 8 CPU each]
Hardware
• Dataset grows
• No disk space on Secondary
• mongod process fails
• Replica set is down to 2 nodes
• Zero tolerance for failures
[Diagram: three members with Disk: 300GB, Disk: 300GB, and Disk: 200GB]
Network
● Heartbeat lost
● Primary step down
● New Primary election
● Application timeout*
● Rollback
Best Practice: Test Primary step down for your application
*Retryable writes since MongoDB 3.6
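Retryable writes are requested by the application through the `retryWrites` connection string option. A small sketch that builds such a connection string; the helper name, host names, and replica set name are placeholders, and whether the option takes effect depends on the driver and server versions in use:

```javascript
// Build a replica set connection string that lists all members and
// asks the driver to retry a failed write once.
function replicaSetUri(hosts, replicaSet) {
  const params = new URLSearchParams({ replicaSet, retryWrites: "true" });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

console.log(
  replicaSetUri(
    ["db1.example.net:27017", "db2.example.net:27017", "db3.example.net:27017"],
    "rs0"
  )
);
```

Listing every member lets the driver find the new Primary after a step down. In the mongo shell, `rs.stepDown()` on the Primary is one way to rehearse that failover against a test application.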
Cloud Deployment
• All replica set members deployed in single Availability Zone
• Availability Zone #1 goes down
• Downtime
[Diagram: Cloud, Region #1, Availability Zone #1]
Cloud Deployment
● Availability Zone #1 goes down
○ New Primary elected from AZ #2
● Availability Zone #2 goes down
○ Database is read only
[Diagram: Cloud, Region #1, AZ#1 and AZ#2]
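The read-only outcome occurs when one zone holds a majority of the voting members by itself. A minimal sketch of that check; the `zone` label is a hypothetical annotation added for this example and is not a replica set configuration field:

```javascript
// Return the zones whose loss would leave the replica set without a
// voting majority, i.e. zones that hold a majority on their own.
function zonesHoldingMajority(members) {
  const voting = members.filter((m) => (m.votes === undefined ? 1 : m.votes) > 0);
  const majority = Math.floor(voting.length / 2) + 1;
  const perZone = {};
  for (const m of voting) {
    perZone[m.zone] = (perZone[m.zone] || 0) + 1;
  }
  return Object.keys(perZone).filter((z) => perZone[z] >= majority);
}

const twoZones = [{ zone: "AZ1" }, { zone: "AZ2" }, { zone: "AZ2" }];
const threeZones = [{ zone: "AZ1" }, { zone: "AZ2" }, { zone: "AZ3" }];
console.log(zonesHoldingMajority(twoZones));
console.log(zonesHoldingMajority(threeZones));
```

With three members split over two zones, one zone always holds two of the three votes; spreading the members over three zones removes that single point of failure.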
Cloud Deployment
• Region #1 goes down
• Downtime
[Diagram: Cloud, Region #1, AZ#1, AZ#2 and AZ#3]
Virtualization
● VM2 goes down
○ Primary node has majority on VM1
● VM1 goes down
○ Database is read only
[Diagram: VMware, VM1 and VM2 on one Physical Server]
Version Upgrades
● Replica set major version upgrade (3.6 to 4.0)
● Driver v3.6 not compatible with DB v4.0
● Compatibility changes
● Application cannot send requests
● Downtime
● Rollback to previous DB version
[Diagram: members labeled MongoDB: 3.6.4]
Version Upgrades
● Replica set major version upgrade
● Promote new version as Primary
● Confirm application works
● Forget to upgrade Secondaries
● Start using new features
● New Primary elected
● Application errors
[Diagram: two members on MongoDB: 3.6, one member on MongoDB: 4.0]
Version Upgrades
● Minor version upgrade
● Promote new version as Primary
● Confirm application works
● Forget to upgrade Secondaries
● Bug fixes in minor release
● New Primary elected
● Application errors
[Diagram: two members on MongoDB: 3.6.4, one member on MongoDB: 3.6.8]
Version Upgrades
[Diagram: deployment in which twelve nodes run MongoDB: 3.6.8 and three nodes still run MongoDB: 3.6.3]
DDL Operation
● Adding index on a collection
● Connect to the Primary node
○ db.people.createIndex( { zipcode: 1 }, { background: true } )
DDL Operation
● Stop one Secondary
● Restart on different port
[Diagram: Secondary restarted with --port=27777]
DDL Operation
● Add the Index
● Rejoin to replica set
● Promote Secondary as Primary
● Forget the other nodes
[Diagram: Secondary on --port=27777 running db.people.createIndex({zipcode:1})]
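After a rolling index build it is worth confirming that every member ended up with the index. A minimal sketch, assuming the index names per member have already been collected (for example from `db.people.getIndexes()` run on each node); the function name and hosts are hypothetical, and `zipcode_1` is the default name MongoDB gives an index on `{ zipcode: 1 }`:

```javascript
// Return the hosts that do not have the given index.
function membersMissingIndex(indexNamesByHost, indexName) {
  return Object.keys(indexNamesByHost).filter(
    (host) => !indexNamesByHost[host].includes(indexName)
  );
}

const indexNamesByHost = {
  "db1.example.net:27017": ["_id_"],
  "db2.example.net:27017": ["_id_", "zipcode_1"],
  "db3.example.net:27017": ["_id_"],
};
console.log(membersMissingIndex(indexNamesByHost, "zipcode_1"));
```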
Backups
● Pick one Secondary
● db.fsyncLock()
● Take snapshot
● db.fsyncUnlock()
● Unlock fails
● Secondary starts lagging
● Primary overwrites oplog
● Secondary needs initial sync
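One way to reduce the risk of a Secondary staying locked is to make the unlock unconditional and to fail loudly when it does not succeed. A sketch only: `db` is assumed to expose `fsyncLock()` and `fsyncUnlock()` as the mongo shell does, `takeSnapshot` is a hypothetical callback, and the check on `ok` assumes the result document carries that field:

```javascript
// Lock the member, run the snapshot, and always attempt the unlock,
// even if the snapshot throws. A failed unlock raises an error so that
// it can be alerted on instead of silently leaving the member locked.
function snapshotWithLock(db, takeSnapshot) {
  db.fsyncLock();
  try {
    return takeSnapshot();
  } finally {
    const res = db.fsyncUnlock();
    if (!res || res.ok !== 1) {
      throw new Error("fsyncUnlock failed: member is still locked");
    }
  }
}
```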
Sharded Clusters
Monitoring Replica Set
• Replica set has no Primary
• Number of unhealthy members is above threshold
• Replication lag is above threshold
• Replica set elected new Primary
• Host of any type has restarted
• Host of type Secondary is recovering
• Host of any type is down
• Host of any type has experienced Rollback
• Network issues between members of the replica set or cluster
• Monitoring backup status
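Two of these checks, a missing Primary and replication lag, can be derived from a status document. A minimal sketch, assuming input shaped like the output of `rs.status()` (members with `name`, `stateStr`, and `optimeDate`); the function name and sample hosts are hypothetical:

```javascript
// Lag of each Secondary behind the Primary, in seconds.
// Returns null when there is no Primary, which deserves its own alert.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null;
  const lag = {};
  for (const m of status.members) {
    if (m.stateStr === "SECONDARY") {
      lag[m.name] = (primary.optimeDate - m.optimeDate) / 1000;
    }
  }
  return lag;
}

const status = {
  members: [
    { name: "db1.example.net:27017", stateStr: "PRIMARY", optimeDate: new Date("2019-05-29T10:00:30Z") },
    { name: "db2.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2019-05-29T10:00:28Z") },
    { name: "db3.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2019-05-29T09:58:30Z") },
  ],
};
console.log(replicationLagSeconds(status));
```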
Summary
• Replica set with odd number of voting members
• Hidden or Delayed member for dedicated functions (reporting, backups …)
• Have more than one eligible Primary in the replica set
• Use multi-AZ for Cloud deployments
• Don’t deploy more than one mongod process per node/host
• Run replica set members with same hardware for all nodes
• Run replica set members with the same MongoDB version
• Monitor your replica set status and nodes
• Monitor replication lag and Oplog size
Questions?
We’re Hiring!
https://www.pythian.com/careers/