
Page 1:

Comparison of Red Hat Clusters with OpenVMS Clusters

Keith Parris

Page 2:

Thank you to our Global, Platinum & Gold Sponsors!

Global:

Platinum:

Gold:

Page 3:

OpenVMS Cluster History

– VAX/VMS Version 1.0, 1978 • Includes Record Management Services (RMS)

– Version 2.0, 1980 • Ethernet networking added

– Version 3.0, 1982 • Lock Manager

• System Communications Services (SCS) and Mass Storage Control Protocol (MSCP)

• 1983: Computer Interconnect (CI) & HSC controller based clusters

– Limited sharing: could mount disks for read-write access on one node and also mount them read-only on other nodes

– Version 4.0, 1984 • VAXclusters: Connection Manager, Distributed Lock Manager, Cluster-wide File System

– Version 4.5, 1986 • Local Area VAXcluster (LAVC) using Ethernet as a cluster interconnect

Page 4:

OpenVMS Cluster History

– Version 5.0, 1988 • Mixed-Interconnect VAXclusters

– Version 5.2, 1989 • Support for 96 nodes • Lock tree remastering, based on new LOCKDIRWT parameter

– Version 5.4, 1989 • MSCP Load Balancing and preferred paths • 5.4-3: Multiple-NIC support for cluster interconnect

– Version 5.5, 1991 • Host-Based Volume Shadowing (HBVS) for host-based mirroring (RAID-1) • Dynamic activity-based lock tree remastering • Tape MSCP Serving

– Version 6.0, 1993 • Cluster-wide Virtual I/O Cache

– Version 6.2, 1995 • SCSI clusters

Page 5:

OpenVMS Cluster History

– Version 7.0, 1995 • 64-bit addressing • Fast I/O and Fast Path for optimized I/O

– Version 7.1, 1996 • Memory Channel cluster interconnect

– Version 7.2, 1999 • Galaxy: Shared Memory cluster interconnect, and migration of CPUs between software partitions

– Version 7.3, 2001 • Improved Multiple NIC performance (simultaneous transmission over multiple paths) • XFC cluster-wide file cache

– Version 8.3, 2006 • Improved activity-based lock tree remastering with new LOCKRMWT parameter

– Version 8.4, 2010 • IP networks as a cluster interconnect • 6-member HBVS shadowsets

Page 6:

Red Hat Clusters History

– Version 0.01 of Linux from Linus Torvalds, 1991

– First version of Red Hat Linux, 1994

– First Red Hat Cluster Suite, circa 1996

– Red Hat acquires Sistina and open-sources GFS file system, 2003-4

– GFS2 and DLM in RHEL 5.3, 2009

– Up through RHEL 5.x: product name is Red Hat Cluster Suite

– RHEL 6, 2010: • RHEL 6 as base plus “Add-Ons”

– High Availability Add-On

– Resilient Storage Add-On

– etc.

Page 7:

Red Hat Cluster-related Products

Add-Ons for Red Hat Enterprise Linux 6

– High Availability Add-On
• Provides failover of services between nodes within a cluster.

– Resilient Storage Add-On
• CLVM (Cluster Logical Volume Manager) and the GFS2 (Global File System 2) cluster-wide file system, for coordinated use of a shared block device using the distributed lock manager (DLM)

– Load Balancer Add-On
• Directs network requests to nodes within a pool of servers providing identical services

– Scalable File System Add-On
• XFS, for support of file systems up to 100 TB in size and/or with multi-threaded parallel I/O workloads

– High Performance Network Add-On
• Provides remote direct memory access over Converged Ethernet (RoCE)

Page 8:

Red Hat Definition of Clusters

• Cluster Types – Storage cluster

– High availability cluster

– Load balancing cluster

– High performance cluster

– Multi-Site cluster

– Stretch cluster

Page 9:

Red Hat Definition of Clusters

• Cluster Types – Storage cluster

• “Storage clusters provide a consistent file system image across servers in a cluster, allowing the servers to simultaneously read and write to a single shared file system. A storage cluster simplifies storage administration by limiting the installation and patching of applications to one file system. Also, with a cluster-wide file system, a storage cluster eliminates the need for redundant copies of application data and simplifies backup and disaster recovery. The High Availability Add-On provides storage clustering in conjunction with Red Hat GFS2 (part of the Resilient Storage Add-On).” – RHEL6 HA Add-On Overview, section 1.1

– High availability cluster

– Load balancing cluster

– High performance cluster

– Multi-Site cluster

– Stretch cluster

Page 10:

Red Hat Definition of Clusters

• Cluster Types – Storage cluster – High availability cluster

• “High availability clusters provide highly available services by eliminating single points of failure and by failing over services from one cluster node to another in case a node becomes inoperative. Typically, services in a high availability cluster read and write data (via read-write mounted file systems). Therefore, a high availability cluster must maintain data integrity as one cluster node takes over control of a service from another cluster node. Node failures in a high availability cluster are not visible from clients outside the cluster. (High availability clusters are sometimes referred to as Failover clusters.) The High Availability Add-On provides high availability clustering through its High Availability Service Management component, rgmanager.” – RHEL6 HA Add-On Overview, section 1.1

– Load balancing cluster – High performance cluster – Multi-Site cluster – Stretch cluster

Page 11:

Red Hat Definition of Clusters

• Cluster Types – Storage cluster – High availability cluster – Load balancing cluster

• “Load-balancing clusters dispatch network service requests to multiple cluster nodes to balance the request load among the cluster nodes. Load balancing provides cost-effective scalability because you can match the number of nodes according to load requirements. If a node in a load-balancing cluster becomes inoperative, the load-balancing software detects the failure and redirects requests to other cluster nodes. Node failures in a load-balancing cluster are not visible from clients outside the cluster. Load balancing is available with the Load Balancer Add-On.” – RHEL6 HA Add-On Overview, section 1.1

– High performance cluster – Multi-Site cluster – Stretch cluster

Page 12:

Red Hat Definition of Clusters

• Cluster Types – Storage cluster – High availability cluster – Load balancing cluster – High performance cluster

• “High-performance clusters use cluster nodes to perform concurrent calculations. A high-performance cluster allows applications to work in parallel, therefore enhancing the performance of the applications. (High performance clusters are also referred to as computational clusters or grid computing.) … The Red Hat Enterprise Linux High Availability Add-On contains support for configuring and managing high availability servers only. It does not support high-performance clusters.” – RHEL6 HA Add-On Overview, section 1.1

– Multi-Site cluster – Stretch cluster

Page 13:

Red Hat Definition of Clusters

• Cluster Types – Storage cluster

– High availability cluster

– Load balancing cluster

– High performance cluster

– Multi-Site cluster
• "Multi-site or disaster-tolerant clusters are separate clusters that run at different physical sites, typically using SAN-based storage replication to replicate data. Multi-site clusters are usually used in an active/passive manner for disaster recovery with manual failover of the active cluster to the passive cluster." -- Red Hat Enterprise Linux Cluster, High Availability, and GFS Deployment Best Practices, Updated 22 Feb 2013, https://access.redhat.com/knowledge/articles/40051

– Stretch cluster

Page 14:

Red Hat Definition of Clusters

• Cluster Types – Storage cluster

– High availability cluster

– Load balancing cluster

– High performance cluster

– Multi-Site cluster

– Stretch cluster

• “Stretch clusters are single-cluster configurations that span multiple physical sites. Additional details on the supportability of multi-site and stretch clusters can be found in Support for Red Hat Enterprise Linux Cluster and High Availability Stretch Architectures and please note that all stretch clusters require an architecture review.” -- Red Hat Enterprise Linux Cluster, High Availability, and GFS Deployment Best Practices, Updated 22 Feb 2013, https://access.redhat.com/knowledge/articles/40051

Page 15:

Red Hat Definition of Clusters

• Cluster Types

• Red Hat Clusters focus on the capabilities of these types:

1. High availability cluster

2. Load balancing cluster

Page 16:

Red Hat High Availability Add-On

• Overview

Source: RedHat.com

Page 17:

Red Hat High Availability Add-On

• Components

– Conga: User interface for configuration & management
• luci: Runs on the management server and provides a web-based GUI
• ricci: Agent which runs on each cluster node

– CCS: Cluster Configuration System
• Provides a CLI interface for cluster management

– corosync: Cluster executive
• Implements the Totem Single Ring Ordering and Membership Protocol
• Came from the OpenAIS open cluster infrastructure project of the Service Availability Forum

– rgmanager: Resource Group Manager for failover/relocation of services

– CMAN: Cluster Manager -- handles quorum

– fenced: Fencing system

– CLVM: Cluster Logical Volume Manager (clustered version of LVM2)

– GFS2: Shared-disk cluster file system

– DLM: Distributed Lock Manager

– Load Balancer based on Piranha (Linux Virtual Server, LVS)

Page 18:

OpenVMS Cluster

• Limitations

• Supported Limits – Number of nodes supported: 96

– Data sharing: As many as all cluster nodes at once

– Number of geographical sites supported in a cluster: Unlimited

• Synchronously-replicated identical copies of data in up to 6 sites at once with Volume Shadowing and OpenVMS Version 8.4

– Distance between sites:

• 150 miles out-of-the-box

• 500 miles with DTCS (Disaster Tolerant Cluster Services)

• >500 miles supported with Product Manager approval

Page 19:

Red Hat High Availability Add-On

• Limitations • Supported Limits

– Number of nodes supported: 16

– Data sharing: 1 node at a time

– Number of geographical sites supported within a cluster: 1 or 2 (3rd site is allowed for Quorum Disk)

– Distance between sites: 200 km (2 ms round-trip latency)

Page 20:

Red Hat High Availability Add-On

• Limitations • Supported Limits

– Number of nodes supported: 16

• "The maximum number of cluster nodes supported by the High Availability Add-On is 16…. However, the majority of Red Hat's clustering customers use node counts much lower than the maximum. In general, if your cluster requires node counts higher than eight, it is advisable to verify your cluster architecture with Red Hat before deployment to confirm that it is supportable. Red Hat's clustering solution is primarily designed to provide high availability and cold application failover, and it is not meant to be used for either high-performance or load-sharing clusters." – Red Hat Enterprise Linux Cluster, High Availability, and GFS Deployment Best Practices, at https://access.redhat.com/knowledge/articles/40051

• See "Architecture Review Process for Red Hat Enterprise Linux High Availability, Clustering, and GFS/GFS2", Updated 24 Jan 2013, at https://access.redhat.com/kb/docs/DOC-53348

– Data sharing: 1 node at a time

– Number of geographical sites supported in a cluster: 1 or 2 (3rd site is OK for Quorum Disk)

Page 21:

Red Hat High Availability Add-On

• Limitations • Supported Limits

– Number of nodes supported: 16

– Data sharing: 1 node at a time

• "To ensure data integrity, only one node can run a cluster service and access cluster-service data at a time. ... This prevents two nodes from simultaneously accessing the same data and corrupting it." – Red Hat RHEL6 Cluster Administration Manual, section 2.1

– Number of geographical sites supported in a cluster: 1 or 2 (3rd site is OK for Quorum Disk)

Page 22:

Red Hat High Availability Add-On

• Limitations • Supported Limits

– Number of nodes supported: 16

– Data sharing: 1 node at a time

– Number of geographical sites supported in a cluster: 1 or 2 (3rd site is OK for Quorum Disk)

• As of RHEL6: “Only single site clusters are fully supported at this time. Clusters spread across multiple physical locations are not formally supported. For more details and to discuss multi-site clusters, please speak to your Red Hat sales or support representative.” – Red Hat RHEL6 Cluster Administration Manual, section 2.1

Page 23:

Red Hat High Availability Add-On

• Limitations • Supported Limits

– Number of nodes supported: 16

– Data sharing: 1 node at a time

– Number of geographical sites supported in a cluster: 1 or 2 (3rd site is OK for Quorum Disk)

• At present: "Only certain configurations of stretch clusters can be supported at this time. All stretch clusters require obtaining a formal architecture review from Red Hat Support to ensure that the deployed cluster meets established guidelines." – "Support for Red Hat Enterprise Linux Cluster and High Availability Stretch Architectures", Updated 24 Jan 2013, at https://access.redhat.com/knowledge/articles/27136

• See also "Red Hat Enterprise Linux Cluster, High Availability, and GFS Deployment Best Practices", Updated 22 Feb 2013, at https://access.redhat.com/knowledge/articles/40051

Page 24:

Red Hat High Availability Add-On

Stretch Cluster requirements:

• Maximum inter-site latency: 2 milliseconds round-trip

• Maximum of 2 physical sites, not counting any Quorum Disk at a 3rd site

• Each site must have an equal number of cluster nodes

• Minimum of 2 nodes; maximum of 16 nodes; even numbers of nodes only

• Both sites must be on the same logical network, and routing between sites is not supported. One of 4 supported communication methods must be used (e.g. Broadcast, Multicast, etc.)

• A quorum disk is required for all stretch clusters of 4 or more cluster nodes. Using a quorum server as the tie-breaker is not supported on a stretch cluster.

• GFS, GFS2, clvmd, and cmirror are not supported on stretch clusters.

From "Support for Red Hat Enterprise Linux Cluster and High Availability Stretch Architectures", Updated 24 Jan 2013, at https://access.redhat.com/knowledge/articles/27136

Page 25:

How is Split Brain avoided in OpenVMS Clusters?

• Quorum Scheme

– Each node is assigned a number of votes (0-127)
• A reboot of a node is required to change its VOTES parameter

– An optional Quorum Disk may be assigned a number of votes (1-127)

– Expected Votes is the total of all votes in the cluster
• The cluster-wide Expected Votes value ratchets upward as new nodes with votes are added to the cluster
• Expected Votes may be adjusted downward manually, but never below the actual number of votes in the cluster at a given point in time
• A node leaving the cluster may have its votes removed from Expected Votes with the REMOVE_NODE option during shutdown

– Quorum is calculated as just over half of the Expected Votes

– If a set of nodes has more than half of Expected Votes, it has quorum

Page 26:

How is interference from minority nodes avoided?

• OpenVMS Cluster: Voluntary "self-quarantine"

• OpenVMS Cluster nodes which find they cannot communicate, or can only communicate with nodes totaling a minority of the expected votes:

– Put all disks into Mount Verification to prevent any I/Os from being issued

– Remove the Quorum capability bit from all CPUs
• Since by default all processes require the Quorum capability bit to run, this prevents any processes from being scheduled to run

– Retain all context, awaiting further developments:
• If communication is re-established with nodes holding a majority of votes within RECNXINTERVAL seconds, activity continues unaffected
• If communication is re-established but this node finds it has been removed from the cluster, it does a CLUEXIT bugcheck (throwing away all held locks) and reboots to try to rejoin the cluster
• If the minority node(s) are the only remaining system(s), a Quorum Adjustment can be made via IPC at the console or via Availability Manager, and node(s) formerly in the minority can continue right where they left off, without losing context

Page 27:

How is Split Brain avoided in Red Hat Clusters?

• Quorum Scheme

– Each node is assigned a number of votes (default = 1)
• The number of votes assigned to a node can be modified without a reboot

– Total votes is the sum of all cluster node votes

– An optional Quorum Disk may be assigned a number of votes

– Expected Votes is initially the total of all votes in the cluster, but is modifiable via cman_tool expected -e <votes>

– Quorum is calculated as just over half of the Expected Votes

– If a set of nodes has more than half of the votes, it has quorum
• The cluster and applications only operate if the cluster has quorum

– The two-node case is the special exception:
• The two_node="1" and expected_votes="1" settings in the cluster.conf file allow Expected Votes of 1 for 2-node clusters (see the sketch below)
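To illustrate the two-node special case, here is a minimal cluster.conf sketch; the cluster name, node names and vote counts are illustrative placeholders rather than anything from the slides, and the fencing configuration is omitted:

```xml
<?xml version="1.0"?>
<!-- Minimal 2-node sketch; all names are placeholders. Fencing omitted for brevity. -->
<cluster name="example" config_version="1">
  <!-- two_node="1" with expected_votes="1" lets either node retain quorum on
       its own; fencing then decides which node survives a communications loss. -->
  <cman two_node="1" expected_votes="1"/>
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1" votes="1"/>
    <clusternode name="node2.example.com" nodeid="2" votes="1"/>
  </clusternodes>
</cluster>
```

For larger clusters, Expected Votes can be lowered at run time with the cman_tool expected -e <votes> command mentioned above.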

Page 28:

How is interference from minority nodes avoided?

• RHEL6 with High Availability Add-On: Fencing

• Nodes which lose communications with the majority of the cluster:

– Are involuntarily "fenced" off to prevent any access by them to shared resources

– The fencing action is not initiated by the node itself; instead:
• The action is initiated by the cluster infrastructure through the fence daemon, fenced
• No warning is given to the node

– This technique was first described in gruesome fashion as Shoot The Other Machine In The Head, or STOMITH for short.

– Someone misremembered this as Shoot The Other Node In The Head, or STONITH, which became an even more memorable acronym

Page 29:

Fencing Methods

• Power fencing
– Remove power from the node:
• Network-connected power switch from a vendor such as APC, WTI or Baytech

• Storage fencing
– Disable the SAN switch port that the node's HBA is attached to
– Disable the Ethernet switch port that the node is attached to, to prevent access to iSCSI storage
– Use SCSI-3 Persistent Reservations to revoke access from a node
– Disable access to an iSCSI server from a particular node

• Management processor fencing
– HP integrated Lights Out (iLO), Dell Remote Access Controller (DRAC), IBM Remote Supervisor Adapter (RSA), Intelligent Platform Management Interface (IPMI), etc.

• VM guest fencing
– Stop the Virtual Machine that a cluster node is executing within

Page 30:

Power Fencing

Source: RedHat.com

Page 31:

Storage Fencing

Source: RedHat.com


Page 32:

Power Fencing with Redundant Power Supplies

Source: RedHat.com

Page 33:

Storage Fencing with Dual SAN Connections

Source: RedHat.com

Page 34:

Fencing

• Since fencing is used to ensure data integrity (prevent data corruption):
– The cluster waits to start recovery for (or make any access to) shared resources until fencing is successful

• Since fencing methods may not always succeed:
– Multiple fencing methods may be configured for redundancy (see the cluster.conf sketch below):
• The cluster invokes each fencing method in turn until one returns successful status
• The cluster node chosen to actually perform a fencing action on the to-be-fenced node is arbitrary
• If the end of the list of fencing methods is reached, it starts over with the first method
• Fencing attempts continue until the node is successfully fenced
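As a sketch of the redundant-method idea, a node's fence section in cluster.conf can list more than one method; fenced tries them in the order listed and wraps around until one succeeds. The device names, addresses and credentials below are placeholders, not taken from the slides:

```xml
<!-- Illustrative fragment: a primary power-fencing method and a backup
     management-processor (iLO) method for one node. All values are placeholders. -->
<clusternode name="node1.example.com" nodeid="1">
  <fence>
    <method name="primary">
      <!-- Power fencing via a network-attached APC power switch -->
      <device name="apc-pdu" port="1"/>
    </method>
    <method name="backup">
      <!-- Management-processor fencing via HP iLO -->
      <device name="node1-ilo"/>
    </method>
  </fence>
</clusternode>

<fencedevices>
  <fencedevice name="apc-pdu" agent="fence_apc" ipaddr="10.0.0.50" login="apc" passwd="secret"/>
  <fencedevice name="node1-ilo" agent="fence_ilo" ipaddr="10.0.0.61" login="admin" passwd="secret"/>
</fencedevices>
```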

Page 35:

Fencing

• In a 2-node cluster, loss of communications could result in a race condition where the two nodes race to fence each other off, and both might theoretically “shoot” each other simultaneously

Page 36:

Source: “STONITH Deathmatch Explained”, http://ourobengr.com/ha/

Page 37:

Fencing

• In a 2-node cluster, loss of communications could result in a race condition where the two nodes race to fence each other off, and both might theoretically “shoot” each other simultaneously – Quorum Disk is used to avoid this possibility

Page 38:

Quorum Disk in an OpenVMS Cluster

• A disk can be designated as the Quorum Disk for the cluster

• The Quorum Disk is given a number of votes, in the range of 1-127

• Nodes with direct access to the Quorum Disk (and the DISK_QUORUM SYSGEN parameter specified) become Quorum Disk Watchers
– They read (and update if needed) the QUORUM.DAT file every QDSKINTERVAL seconds
– Four successful accesses in a row (with no conflicts detected) allow the Quorum Disk to become trusted and its votes to be counted toward quorum

• Nodes which can't be Quorum Disk Watchers take the word of one as to whether its votes can be trusted

• When a node leaves the cluster without the cluster receiving a “Last Gasp” datagram from it, the Quorum Disk’s votes are removed until it can be re-scanned and become trusted again

Page 39:

Quorum Disk in a Red Hat Cluster

• A quorumd block in the cluster.conf file contains its parameters (see the sketch below):

– Each cluster node is assigned a status block on the quorum disk
• This requires that nodes have sequential Node IDs in cluster.conf

– Each node updates its status block every interval seconds

– Other nodes examine the updates to determine if a node is hung

– After tko failed updates, a node is declared down

– After tko_up successful status updates, a node is declared online
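A minimal quorumd sketch showing those timing parameters; the label and values are illustrative, not from the slides:

```xml
<!-- Illustrative quorumd fragment: interval x tko is the detection window.
     With interval="1" and tko="10", a node that misses 10 consecutive
     1-second status updates is declared down. -->
<quorumd label="example-qdisk" interval="1" tko="10" votes="1"/>
```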

Page 40:

Quorum Disk in a Red Hat Cluster

• A quorum disk may contribute votes toward achieving quorum

• 1 to 10 arbitrary heuristics (tests) are used to determine whether the votes are contributed or not (see the sketch below)

– Tests may check such things as:
• Does this node have network access to the outside world (so it can provide service to external users)? This might be done by checking for a successful ping of a router.
• Does this node have access to file systems on shared storage or other resources it needs to provide service?

• Each heuristic has a score worth a number of points

• If more than half of the points are scored (or a configured min_score value is reached), then the quorum disk's votes are counted on a node
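For example, a router-ping heuristic might be attached to the quorumd block like this; it is only a sketch, and the address, scores and label are placeholders:

```xml
<!-- Illustrative fragment: the quorum disk's vote counts on a node only while
     the node scores at least min_score heuristic points. -->
<quorumd label="example-qdisk" votes="1" min_score="1">
  <!-- Score 1 point if this node can still ping its default router -->
  <heuristic program="ping -c1 -w1 10.0.0.1" score="1" interval="2" tko="3"/>
</quorumd>
```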

Page 41:

Quorum Disk in a Red Hat Cluster

Some more rules:

• When using a Quorum Disk:
– the disk unit must be at least 10 MB in size,
– it should preferably be on controller-based RAID for performance,
– each cluster node must have exactly one vote, and
– Power Fencing should be used

• The CMAN membership timeout value (default of 10 seconds) should be at least two times the qdiskd membership timeout value (see the sketch below)
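A sketch of that 2x relationship, assuming the CMAN membership timeout is set through the totem token value in cluster.conf (an assumption; the figures are illustrative): a qdiskd timeout of interval x tko = 10 seconds paired with a token timeout of at least 20 seconds, here 21000 ms:

```xml
<!-- Illustrative fragment (values are placeholders):
     qdiskd timeout = interval x tko = 1 x 10 = 10 s
     CMAN membership (totem token) timeout = 21000 ms >= 2 x 10 s -->
<totem token="21000"/>
<quorumd label="example-qdisk" interval="1" tko="10" votes="1"/>
```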

Page 42:

How are failures handled in an OpenVMS Cluster?

• Restart mechanisms:
– Submit a batch job with the /RESTART qualifier
– The application runs on multiple nodes, but only one copy is active at a time, and it uses a "dead-man" lock to detect failure of the active process and allow a process on another node to take over

• Or, of course:
– Run an application on all nodes at once, so that if a node fails, the application continues to run, unaffected, on all the surviving nodes

• Another option to consider:
– HP OpenVMS ServiceControl (OSC) is available for free:
• "HP OpenVMS ServiceControl monitors applications running on an OpenVMS cluster and all their required resources. If the cluster member on which an application is running fails, or if a particular required resource fails, HP OpenVMS ServiceControl relocates or restarts the application depending on the type of failure and the failover policy applied."
• http://www.openservicecontrol.org/

Page 43:

How are failures handled in a Red Hat Cluster?

• The Resource Group Manager (rgmanager) can automate the action taken on a failure:
• Restart
• Relocate
• Disable

• A failover domain is a named list of nodes to which a service may be bound (see the cluster.conf sketch below)
• The Cluster Manager can relocate a failed node's services to one of these nodes

• A service may be:
• Restricted: May only run on nodes in the failover domain; otherwise it stops
• Unrestricted: May run on any cluster node, but prefers its domain and will migrate there if a preferred node becomes available
• Exclusive: Specifies that the service will only start on a node if that node has no other services running

• Failover Domains may be:
• Prioritized (Ordered), where each node is assigned a priority of 1-100 (1 = highest), or
• Non-prioritized (Unordered)
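As a sketch of how these concepts appear in cluster.conf, here is an rgmanager fragment with an ordered, restricted failover domain and a web service whose recovery policy is relocate; every name, address and path is an illustrative placeholder, not something from the slides:

```xml
<!-- Illustrative rgmanager fragment: prioritized (ordered), restricted failover
     domain plus a service that rgmanager relocates on failure. -->
<rm>
  <failoverdomains>
    <failoverdomain name="web-domain" ordered="1" restricted="1" nofailback="0">
      <failoverdomainnode name="node1.example.com" priority="1"/>
      <failoverdomainnode name="node2.example.com" priority="2"/>
    </failoverdomain>
  </failoverdomains>
  <service name="webserver" domain="web-domain" autostart="1" recovery="relocate">
    <!-- Floating IP address, shared file system, and Apache instance that
         fail over together as one service -->
    <ip address="10.0.0.100" monitor_link="1"/>
    <fs name="web-data" device="/dev/vg_cluster/lv_web" mountpoint="/var/www" fstype="ext4"/>
    <apache name="httpd" server_root="/etc/httpd" config_file="conf/httpd.conf"/>
  </service>
</rm>
```

The recovery attribute corresponds to the Restart / Relocate / Disable actions listed above.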

Page 44:

Web Server Cluster Service Example

Source: RedHat.com

Page 45:

HA Services

• High Availability Add-On supports HA services such as these: – Apache

– NFS

– Samba

– MySQL

– Open LDAP

– PostgreSQL 8

– SAP

– Tomcat 6

– Application (Script)

Page 46:

Distributed Lock Manager (DLM)

• The Distributed Lock Manager (DLM) supports:

– Communications between nodes to manage lock traffic, using TCP/IP
• Network redundancy is provided by the bonding driver, or by using SCTP, which allows multiple IP addresses per node (see the sketch below)
• The IP port number used is 21064

– Symbolic resource names used (by convention) to coordinate access to physical resources

– Multiple locks per resource

– New lock requests, or promotion or demotion through lock conversions

– Synchronous or asynchronous lock requests

– Passing global data through lock value blocks
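The transport is typically selected with a one-line dlm element in cluster.conf; treat the attribute below as an assumption to be checked against the dlm_controld documentation for your release:

```xml
<!-- Assumed syntax: choose SCTP so DLM lock traffic can span multiple IP
     addresses per node; "tcp" is the usual default. Verify before use. -->
<dlm protocol="sctp"/>
```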

Page 47:

Distributed Lock Manager (DLM)

• The Distributed Lock Manager (DLM) supports:

– Six locking modes: NL, CR, CW, PR, PW, EX. The compatibility matrix is:

Mode  NL   CR   CW   PR   PW   EX
NL    Yes  Yes  Yes  Yes  Yes  Yes
CR    Yes  Yes  Yes  Yes  Yes  No
CW    Yes  Yes  Yes  No   No   No
PR    Yes  Yes  No   Yes  No   No
PW    Yes  Yes  No   No   No   No
EX    Yes  No   No   No   No   No

– Lock states:
• Granted
• Converting
• Blocked

Source: http://en.wikipedia.org/wiki/Distributed_lock_manager

Page 48:

Distributed Lock Manager (DLM)

More DLM details:

– One node is the “master” for a given resource

• Other nodes ask that node for locks on that resource

• First node to take out lock on resource becomes the master for that resource

– Resource directory tracks which node is master for a given resource

• Directory is distributed across nodes

• Directory must be rebuilt when a node leaves the cluster

• Nodes have lock directory weighting

– Locks are re-mastered after a node failure

Page 49:

Distributed Lock Manager (DLM)

The DLM is used by:

– CLVM to synchronize updates to LVM volumes and volume groups

– GFS2 to synchronize access to file system metadata

– rgmanager to synchronize service states

Page 50:

OpenVMS Cluster

• Shared storage choices – Storage subsystems presenting disk units on Fibre Channel SANs

– “SCSI Clusters”

• Shared parallel SCSI of up to 4 nodes on Alpha

• On Integrity, 2-node shared SCSI using MSA30MI shelf, or up to 4-node shared SAS using MSA60/70 shelf connected with the Smart Array P700 controller

– Older shared-storage technologies such as CI, DSSI, dual-ported DSA disks, dual-ported Massbus, etc. have been supported over the years

Page 51:

OpenVMS Cluster

• MSCP Server: Mass Storage Control Protocol (MSCP) Server

– Allows an OpenVMS Cluster node to provide remote access to its storage for nodes without direct access to that storage

Some OpenVMS Cluster nodes may not have access (and may not need access) to some of the storage. They are not required to have access to all storage.

Page 52:

RHEL with High Availability Add-On

• Shared storage access

– All nodes in the cluster must have direct access to the shared storage

• “When you configure a GFS2 file system as a cluster file system, you must ensure that all nodes in the cluster have access to the shared file system. Asymmetric cluster configurations in which some nodes have access to the file system and others do not are not supported. This does not require that all nodes actually mount the GFS2 file system itself.” -- Red Hat RHEL6 Cluster Administration Manual, section 2.1

– iSCSI Server on a node outside this cluster might be set up to provide direct access to storage for all cluster members

Page 53:

RHEL with High Availability Add-On

• Shared storage choices – Storage subsystems presenting disk units on Fibre Channel SANs

– iSCSI served storage

• “Certain low-cost alternatives, such as host RAID controllers, software RAID without cluster support, and multi-initiator parallel SCSI configurations are not compatible or appropriate for use as shared cluster storage.” – Red Hat RHEL6 Cluster Administration Manual, section 2.1

Page 54:

Questions?

Page 55:

Speaker contact info:

• Keith Parris

• E-mail: [email protected] or [email protected]

• Website: http://www2.openvms.org/kparris/