Microsoft Cluster Server Basics


Upload: samee-chougule

Post on 07-Mar-2015


Page 1: Microsoft Cluster Server Basics

Microsoft Cluster Server

Microsoft Cluster Server ("MSCS") is clustering software that first shipped with Microsoft Windows NT Server - Enterprise Edition.  MSCS 1.0 (codenamed "Wolfpack") was released in 1997.  Since then, MSCS has been upgraded to version 1.1 in Windows 2000 Advanced Server and Datacenter Server and to version 1.2 in Windows Server 2003 Enterprise Edition and Datacenter Edition.

MSCS supports cluster nodes, which are specially linked servers running the cluster service.  MSCS does its main work when one server in a cluster fails or is taken offline: the other server in the cluster takes over the failed server’s operations.  Clients using server resources experience little or no interruption of their work because the resource functions move from one server to the other.  The primary purpose of clustering is to provide failover and reinstantiation of services and resources, which increases the availability of services such as messaging, database, and file and print.

MSCS consists of two main components: the clustering software and the management tools (Cluster Administrator, cluadmin.exe, which is a GUI, and cluster.exe, which is a command-line management tool).  The clustering software enables the two servers of a cluster to exchange specific types of messages that trigger the transfer of resources at the appropriate times.  The clustering software itself has two primary components: the Cluster Service and the Resource Monitor.  The Cluster Service runs on each cluster server.  It controls cluster activity, communication between cluster servers, and failure operations.  The Resource Monitor handles communication between the Cluster Service and the application resources.  The Cluster Administrator is a graphical application that is used to manage a cluster.  It runs on any version of NT (server or workstation) that has Service Pack 3 or later installed, as well as on Windows 2000, Windows XP and Windows 2003.

In MSCS, a cluster is a configuration of two nodes, each of which is an independent computer system.  Together, these independent servers create a "server cluster."  The cluster appears to users as a single server. For MSCS, both nodes must be running NT Server - Enterprise Edition, Windows 2000 Advanced/Datacenter Server or Windows Server 2003 Enterprise/Datacenter Server.  The network applications, data files, and other tools you install on the nodes are the cluster resources, which provide services to network clients.  A resource is hosted on only one node at any time.  The figure below shows the relationship between nodes, groups, and resources.

Picture Source: Microsoft Cluster Server Administrator's Guide
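As a rough illustration of how nodes, groups, and resources relate, here is a minimal Python sketch.  The class names, node names, and group contents are hypothetical and are not part of MSCS.  The model only captures two ideas from the text: a group is hosted on one node at a time, and all of its resources move together when that node fails.  Choosing the first surviving node as the new owner is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """A set of resources that fail over together as one unit."""
    name: str
    resources: list[str]
    owner: str  # name of the single node currently hosting the group


@dataclass
class Cluster:
    nodes: list[str]
    groups: list[Group] = field(default_factory=list)
    online: set[str] = field(init=False, default_factory=set)

    def __post_init__(self) -> None:
        # In this toy model every node starts online.
        self.online = set(self.nodes)

    def fail_node(self, node: str) -> None:
        """Mark a node as failed and move its groups to a surviving node."""
        self.online.discard(node)
        survivors = [n for n in self.nodes if n in self.online]
        if not survivors:
            raise RuntimeError("no surviving node to take over")
        for group in self.groups:
            if group.owner == node:
                # Assumption: the first surviving node takes over the group.
                group.owner = survivors[0]


cluster = Cluster(nodes=["NODE1", "NODE2"])
cluster.groups.append(
    Group("FileShare", ["Disk F:", "IP address", "Network name"], owner="NODE1")
)
cluster.groups.append(Group("Print", ["Disk G:", "Print spooler"], owner="NODE2"))

cluster.fail_node("NODE1")
owners = {g.name: g.owner for g in cluster.groups}
print(owners)
```

After NODE1 fails, both groups are owned by NODE2, which is what clients see as the "single server" continuing to respond.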

 

Windows Cluster Terminology

Clustering introduces several new terms which should be thoroughly understood before clusters of any kind are implemented.

Node.  A server that is a member of a cluster.

Resource.  A hardware or software component that exists in a cluster, such as a disk, an IP address, a network name, or an instance of an Exchange 2000 component.

Group.  A combination of resources that are managed as a unit of failover.  Groups are also known as resource groups or failover groups.

Dependency.  A relationship between two or more resources in the cluster architecture in which one resource relies on another.  You’ll need to understand cluster resource dependencies when installing a cluster.

Failover/failback.  The process of moving resources from one server to another.  Failover can occur when one server experiences a failure of some sort or when you, the administrator, initiate a proactive failover.  Failback moves the resources back to the original server once it returns to service.

Quorum resource.  A special type of cluster resource that provides persistent arbitration by allowing one node to gain control of it and then defending that node’s control.  In addition, it provides physical storage that can be accessed by any node in the cluster (only one node can access the quorum at any given time).  The quorum also maintains access to the most current version of the cluster database, and if a failure occurs, the quorum writes the changes to the cluster database.


Heartbeat.  The network and Remote Procedure Call (RPC) traffic that flows between servers in a cluster.  Windows 2000 and Windows 2003 clusters communicate by using RPC calls on IP sockets with User Datagram Protocol (UDP) packets.  Heartbeats are single UDP packets sent between nodes every 1.2 seconds.  These packets are used to confirm that each node’s network interface is still active.

Membership.  The orderly addition and removal of active nodes to and from the cluster.

Global update.  The propagation of cluster configuration changes to all members.  The cluster registry is maintained through this mechanism.

Cluster registry.  Inside the Windows 2000 registry is the cluster registry, also known as the cluster database.  It maintains configuration information on each member in the cluster, as well as on resources and parameters.  This information is stored on the quorum resource.

Virtual server.  A combination of configuration information and cluster resources, such as an IP address, network name and application resource.

Active/Active.  From a software perspective, this describes applications (or resources) that can exist as multiple instances in a cluster.  This means that both nodes can be active servicing clients.

Active/Passive.  This term describes applications that run as a single instance in a cluster.  It generally also means that one node sits idle until a failover occurs.  However, you can have an Active/Passive implementation of an application in an Active/Active cluster.  An example of this would be a cluster that contained clustered file and print sharing resources and a single Exchange or SQL virtual server.

Shared storage.  The external SCSI or Fibre Channel storage enclosure and the disks contained therein.  Shared storage is a requirement for multi-node clusters.  Although this storage is shared, only one node can access an external storage resource at any given time.
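To make the heartbeat idea concrete, here is a minimal Python sketch of timeout-based failure detection.  The 1.2-second interval comes from the definition above.  The number of missed heartbeats tolerated before a node is declared down (`MISSED_LIMIT`) is a hypothetical value chosen for illustration, and the function name is not part of any Microsoft API.

```python
HEARTBEAT_INTERVAL = 1.2  # seconds between heartbeats, from the text
MISSED_LIMIT = 5          # hypothetical tolerance, not taken from the source


def node_is_down(last_heartbeat: float, now: float,
                 interval: float = HEARTBEAT_INTERVAL,
                 missed_limit: int = MISSED_LIMIT) -> bool:
    """Return True once more than `missed_limit` intervals pass with no heartbeat."""
    silence = now - last_heartbeat
    return silence > interval * missed_limit


# A node last heard from 3 seconds ago is still considered alive;
# one that has been silent for 10 seconds is treated as failed.
print(node_is_down(last_heartbeat=100.0, now=103.0))
print(node_is_down(last_heartbeat=100.0, now=110.0))
```

Tolerating several missed packets rather than one is a common design choice because UDP delivery is not guaranteed, so a single lost packet should not trigger a failover.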

 

Windows 2000 Cluster Services

Windows NT 4.0 Enterprise Edition included Microsoft Cluster Server 1.0.  Windows 2000 Advanced Server and Windows 2000 Datacenter Server include the Windows Cluster Service, or Microsoft Cluster Server 1.1.  Aside from the 4-node capabilities of Windows 2000 Datacenter Server, there aren't a whole lot of differences between the two versions.  Windows 2000 clusters do add the following enhancements, though:

Event Log Replication
More services are supported (e.g., DFS, IIS, etc.)
Improved client network recovery
Support for rolling upgrades

Clusters created using Microsoft Cluster Server or Windows 2000 Cluster Services are known as Server Clusters.

Windows Load Balancing Service ("WLBS")

WLBS is a load balancing feature for Windows NT TCP/IP applications that supports load balancing and clustering for web-based services such as Internet Information Server (web, FTP, etc.), streaming media, virtual private networking (VPN), and Microsoft Proxy Server.  WLBS was formerly a product called Convoy Cluster Software by Beaverton, Oregon company Valence Research, Inc.  Microsoft acquired Valence Research on August 25, 1998.  Both microsoft.com and msn.com used WLBS (now NLB, see below) to manage the high volume of traffic that these sites get.  WLBS is a free download for Windows NT 4.0 Enterprise Edition users.

Network Load Balancing ("NLB")

Formerly known as Windows Load Balancing, Network Load Balancing is the name of the TCP/IP application load balancing software in Windows 2000 Advanced Server and Datacenter Server, and in all editions of Windows 2003.  NLB clusters distribute client connections over multiple servers, providing scalability and high availability for TCP/IP-based services and applications.
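The general idea of spreading client connections over several servers can be sketched in a few lines of Python.  This is not the algorithm NLB itself uses; it is a generic hash-based scheme, assumed here only to show how a client address can be mapped to one host so that the same client keeps landing on the same server while the host list is unchanged.  The function name, host names, and addresses are hypothetical.

```python
import hashlib


def pick_host(client_ip: str, hosts: list[str]) -> str:
    """Map a client address to one host using a stable hash (illustrative only)."""
    if not hosts:
        raise ValueError("at least one host is required")
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    # Use the first 8 bytes of the digest as an integer and wrap it to the host count.
    index = int.from_bytes(digest[:8], "big") % len(hosts)
    return hosts[index]


hosts = ["WEB1", "WEB2", "WEB3"]
for ip in ["10.0.0.7", "10.0.0.8", "10.0.0.9"]:
    print(ip, "->", pick_host(ip, hosts))

# If a host fails, the remaining hosts share the traffic.
print(pick_host("10.0.0.7", ["WEB1", "WEB3"]))
```

A stable hash gives a simple form of client affinity; the trade-off is that removing a host remaps some clients that were not on the failed host.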

Component Load Balancing ("CLB")

CLB is also known as Application Load Balancing, and sometimes as COM+ Load Balancing.  It is a way for developers to code COM+ components for use by multiple application servers.  CLB provides scalable, reliable and load-balanced activation of COM+ objects across application-cluster members in a manner transparent to clients.  This enables virtualization of an application in much the same way that NLB and Server Clusters virtualize servers.  A primary CLB server watches for application server failures and automatically moves the affected objects to another cluster member.

How many servers can I cluster?

MSCS and Windows 2000 Advanced Server support a maximum of two servers (nodes) per server cluster.  Both servers must be running either the Enterprise Edition of NT Server 4.0 or Windows 2000 Advanced Server.  Windows 2000 Datacenter Server supports a maximum of four nodes per cluster.

How many servers can I load balance?

WLBS and NLB support a maximum of 32 nodes.

Windows 2003 Cluster Services

There are a substantial number of improvements in Windows 2003 Server Clusters and NLB:

Windows Server 2003 Enterprise Edition, the upgrade to Windows 2000 Advanced Server, supports up to eight nodes per cluster.
Windows Server 2003 Datacenter Edition also supports up to eight nodes per cluster.
All versions of Windows Server 2003 include Network Load Balancing.
Server Cluster enhancements include 64-bit support, greatly improved setup, and Active Directory integration.
Network Load Balancing enhancements include better manageability, multi-NIC support, and bidirectional affinity support for clustered ISA Servers.
Windows Server 2003 introduces the concept of a Majority Node Set.  This allows server clusters to be built without using the shared disk for the quorum, which enables you to build and configure geographically dispersed clusters.
Support for Dynamic Disks and Encrypting File System
Enhanced resource configuration and management
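The name Majority Node Set suggests how the quorum decision is made without a shared disk.  The sketch below assumes the usual majority rule, in which the cluster keeps running only while more than half of its configured nodes can reach each other.  The function names are hypothetical and the code is an arithmetic illustration, not Microsoft's implementation.

```python
def votes_needed(total_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if total_nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return total_nodes // 2 + 1


def has_quorum(total_nodes: int, reachable_nodes: int) -> bool:
    """True when the reachable nodes form a strict majority of the cluster."""
    return reachable_nodes >= votes_needed(total_nodes)


for total in (2, 3, 4, 8):
    print(total, "nodes need", votes_needed(total), "to stay up")
```

Under this rule an even split (for example 2 of 4 nodes on each side of a failed network link) leaves neither side with a majority, which is why odd node counts are often preferred for geographically dispersed designs.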

Windows 2003 NLB Services

There are also a number of NLB enhancements in Windows 2003, including:

IGMP support (to limit switch flooding)
Bidirectional affinity, which enables load balancing of ISA Server
Support for multiple network interface cards
Virtual clusters for traffic filtering, host preference and separate configurations
New NLB Manager utility

What Microsoft BackOffice applications can I cluster?

You can cluster several BackOffice applications: Exchange Server 5.5 Enterprise Edition, Exchange 2000 Enterprise Server, SQL Server 6.5/7.0/2000 - Enterprise Edition, Internet Security and Acceleration Server 2000, and Internet Information Server 4.0 and 5.0.  You can also cluster file and print services.  You cannot cluster Microsoft SNA Server (or Host Integration Server), Microsoft SMS, Microsoft Proxy Server, or RAS.  Using NLB, you can load-balance Microsoft Proxy Server and ISA Server, Outlook Web Access, intranets, and other IP-based applications.  Using Windows 2000 Advanced Server, Windows Server 2003 Enterprise Edition, or Windows Server 2003 Datacenter Edition, you can also cluster WINS, DHCP, and DFS.  Look here and here for more details.

Requirements of Server Clusters (MSCS and Cluster Service)

There are certain things you need to have before you can cluster servers using MSCS or Windows 2000 Cluster Services.  For example, you need:

Two servers (hardware from the cluster category of Microsoft's HCL is recommended, with identical configurations, e.g., RAM, CPUs, etc.)
PCI cluster interconnect (two are recommended for redundancy)
Two NICs per node (one for the private network; one for the public network)
TCP/IP
External storage cabinet
SCSI or Fibre Channel for shared disks

Best Practices

There are some general things you can do to ensure your cluster configurations are as robust, reliable, available, and scalable as possible.

1. Perform a risk audit.  This involves analyzing the various components of your cluster to determine what single points of failure remain.  For instance, do you need additional protection for network connectivity (e.g., a redundant path) or against power loss (e.g., a UPS)?  You can find a sample risk audit table here.

2. Make sure your application can be clustered.  There are two main application types within the context of clustering:

Cluster-aware applications: these applications can use the Cluster Service through its API.  They provide a DLL file that uses the LooksAlive and IsAlive APIs to manage resources within a cluster.  These applications are designed to be clustered.

Cluster-unaware applications: these applications don't use any cluster APIs, don't have special DLL files and are not managed as cluster resources.  However, that doesn't mean that these applications cannot be installed on a cluster, or that they cannot fail over properly.  Many applications that are not cluster-aware can still function normally in a clustered environment.  Be sure to test your application in a test cluster before going into production.
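The role of the two health checks can be illustrated with a small Python sketch.  It assumes that LooksAlive is a quick, cheap check run often and IsAlive is a more thorough check run less often, and that a failed quick check is confirmed by the thorough one before the resource is treated as failed.  The function name, the tick-based loop, and the 12-to-1 ratio are hypothetical; a real resource DLL is native code called by the Resource Monitor, not Python.

```python
from typing import Callable, Optional


def poll_resource(looks_alive: Callable[[], bool],
                  is_alive: Callable[[], bool],
                  ticks: int,
                  is_alive_every: int = 12) -> Optional[int]:
    """Poll a resource for `ticks` rounds.

    Returns the tick at which a failure was detected, or None if the
    resource stayed healthy for the whole run.
    """
    for tick in range(1, ticks + 1):
        if tick % is_alive_every == 0:
            # Periodic thorough check.
            healthy = is_alive()
        else:
            # Quick check; a failure here is confirmed by the thorough check.
            healthy = looks_alive() or is_alive()
        if not healthy:
            return tick
    return None


# A resource whose process is running (quick check passes) but which no
# longer answers requests (thorough check fails) is caught at tick 12.
print(poll_resource(lambda: True, lambda: False, ticks=24))
```

Splitting the checks this way keeps routine monitoring cheap while still catching failures that a superficial test would miss.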

3. Make sure your hardware is on Microsoft's Hardware Compatibility List for Clustering.  If your hardware is not on the Cluster HCL, don't use it.  It may work, but it is not supported by Microsoft and your results will be unpredictable.  Familiarize yourself with Microsoft's Support Policy for Server Clusters and the HCL. 

4. Don't skimp on other areas of redundancy.  Where possible take advantage of other technologies with redundancy.  Use RAID arrays on internal drives and the external shared disk.  Use redundant network adapters and networking equipment, power supplies, fans and CPUs.

5. Test your cluster in a lab environment.  Test EVERYTHING!  This includes UPS software/hardware, backup software, simulated failures, manual failovers, and the effect of failures that clustering doesn't protect you from (e.g., router failures).  Even go so far as to yank network cables, pull drives that are part of RAID arrays, and power off one of the nodes.  The more testing you do, the better prepared you'll be to manage and support your cluster(s).

6. Research, research, research.  Read the white papers and other documents published by Microsoft.  Check out the Microsoft public newsgroups.  Talk with other cluster operators, application vendors and hardware manufacturers.  You simply cannot digest enough information on clustering.  After all, you want your cluster to be as reliable and available as possible.  You can find some useful links to clustering information here.