-- Linux-HA Release 2 Hands-On LWCE – SF - August, 2005

Linux-HA Release 2 Hands-On Tutorial

Alan Robertson

Project Leader – Linux-HA project

Dave Blaschke

IBM Linux Technology Center

[email protected]

Part I: Linux-HA Release 2 Overview

What is High-Availability (HA) Clustering?

What can HA do for me?

What is the Linux-HA project?

Linux-HA applications

Linux-HA customers

Linux-HA release 1 capabilities

Linux-HA release 2 capabilities

Comparative Architectures

Release 2 Details

Thoughts about cluster security

What Is HA Clustering?

Putting together a group of computers which trust each other to provide a service even when system components fail

When one machine goes down, others take over its work

This involves IP address takeover, service takeover, etc.

New work comes to the “takeover” machine

Not primarily designed for high-performance

What Can HA Clustering Do For You?

It cannot achieve 100% availability – nothing can.

HA Clustering is designed to recover from single faults

It can make your outages very short

From about a second to a few minutes

It is like a Magician's (Illusionist's) trick:

When it goes well, the hand is faster than the eye

When it goes not-so-well, it can be reasonably visible

A good HA clustering system adds a “9” to your base availability

99->99.9, 99.9->99.99, 99.99->99.999, etc.

Complexity is the enemy of reliability!

Lies, Damn Lies, and Statistics

Counting nines – downtime allowed per year

99.9999% – 30 sec

99.999% – 5 min

99.99% – 52 min

99.9% – 9 hr

99% – 3.5 day
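The downtime figures above follow from simple arithmetic on the length of a year. A minimal sketch, assuming a 365-day year; the function name is illustrative and not part of Linux-HA:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: leap years ignored


def downtime_seconds_per_year(availability_percent):
    """Return the downtime budget, in seconds per year, for an availability level."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return SECONDS_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    for percent in (99.0, 99.9, 99.99, 99.999, 99.9999):
        seconds = downtime_seconds_per_year(percent)
        print(f"{percent}% -> {seconds / 60:.1f} minutes per year")
```

Each added "9" divides the allowed downtime by ten, which is why the slide values drop from days to hours to minutes to seconds.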

The Desire for HA systems

Who wants low-availability systems?

Why are so few systems High-Availability?

Why isn't everything HA?

Cost

Complexity

How is HA Clustering Different from Disaster Recovery?

HA:

Failover is cheap

Failover times measured in seconds

Reliable inter-node communication

DR:

Failover is expensive

Failover times often measured in hours

Unreliable inter-node communication assumed

How Does HA work?

Manage redundancy to improve service availability

Like a cluster-wide-super-init

Even complex services are now “respawn”

on node (computer) death

on “impairment” of nodes

on loss of connectivity

for services that aren't working (not necessarily stopped)

managing very complex dependency relationships

Single Points of Failure (SPOFs)

A single point of failure is a component whose failure will cause near-immediate failure of an entire system or service

Good HA design adds redundancy to eliminate single points of failure

Non-Obvious SPOFs

Replication links are rarely single points of failure

The system may fail when another failure happens

Some disk controllers have SPOFs inside them which aren't obvious without schematics

Redundant links buried in the same wire run have a common SPOF

Non-Obvious SPOFs can require deep expertise to spot

The “Three R's” of High-Availability

Redundancy

Redundancy

Redundancy

If this sounds redundant, that's probably appropriate...

Most SPOFs are eliminated by redundancy

HA Clustering is a good way of providing and managing redundancy

Redundant Communications

Intra-cluster communication is critical to HA system operation

Most HA clustering systems provide mechanisms for redundant internal communication for heartbeats, etc.

External communication is usually essential to the provision of service

External communication redundancy is usually accomplished through routing tricks

Having an expert in BGP or OSPF routing is a help

Fencing

Guarantees resource integrity in the case of certain difficult cases (split-brain)

Four Common Methods:

FiberChannel Switch lockouts

SCSI Reserve/Release (difficult to make reliable)

Self-Fencing (like IBM ServeRAID)

STONITH – Shoot The Other Node In The Head

Linux-HA has native support for the last two

Redundant Data Access

Replicated

Copies of data are kept updated on more than one computer in the cluster

Shared

Typically Fiber Channel Disk (SAN)

Sometimes shared SCSI

Back-end Storage (“Somebody Else's Problem”)

NFS, SMB

Back-end database

Data Sharing – Replication

Some applications provide their own replication

DNS, DHCP, LDAP, DB2, etc.

Linux has excellent disk replication methods available

DRBD is my favorite

DRBD-based HA clusters are shockingly cheap

Some environments can live with less “precise” replication methods – rsync, etc.

Generally does not support parallel access

Fencing usually required

EXTREMELY cost effective

Data Sharing – ServeRAID

IBM ServeRAID disk is self-fencing

This helps integrity in failover environments

This makes cluster filesystems, etc. impossible

No Oracle RAC, no GPFS, etc.

ServeRAID failover requires a script to perform volume handover

Linux-HA provides such a script in open source

Linux-HA is ServerProven with ServeRAID

Data Sharing – Shared Disk

The most classic data sharing mechanism – commonly fiber channel

Allows for failover mode

Allows for true parallel access

Oracle RAC, Cluster filesystems, etc.

Fencing always required with Shared Disk

In the hands-on portion, we will use non-fiber-channel shared virtual disks

Data Sharing – Back-End

Network Attached Storage can act as a data sharing method

Existing Back End databases can also act as a data sharing mechanism

Both make reliable and redundant data sharing Somebody Else's Problem (SEP).

If they did a good job, you can benefit from them.

Beware SPOFs in your local network

The Linux-HA Project

Linux-HA is the oldest high-availability project for Linux, with the largest associated community

The core piece of Linux-HA is called “heartbeat” (though it does much more than heartbeat)

Linux-HA has been in production since 1999, and is currently in use on about ten thousand sites

Linux-HA also runs on FreeBSD and Solaris, and is being ported to OpenBSD and others

Linux-HA is shipped with every major Linux distribution except one.

Linux-HA Release 1 Applications

Database Servers (DB2, Oracle, MySQL, others)

Load Balancers

Web Servers

Custom Applications

Firewalls

Retail Point of Sale Solutions

Authentication

File Servers

Proxy Servers

Medical Imaging

Almost any type of server application you can think of – except SAP

Linux-HA customers

MAN Nutzfahrzeuge AG – truck manufacturing division of MAN AG

Karstadt, Circuit City – use Linux-HA and databases each in several hundred stores

FedEx – Truck Location Tracking

BBC – Internet infrastructure

Citysavings Bank in Munich (infrastructure)

Bavarian Radio Station (Munich) – coverage of 2002 Olympics in Salt Lake City

The Weather Channel (weather.com)

Sony (manufacturing)

Emageon – medical imaging services

Incredimail – bases their mail service on Linux-HA on IBM hardware

University of Toledo (US) – 20k student Computer Aided Instruction system

ISO New England – manages power grid using 25 Linux-HA clusters

Linux-HA Release 1 capabilities

Supports 2-node clusters

Can use serial, UDP bcast, mcast, ucast comm.

Fails over on node failure

Fails over on loss of IP connectivity

Capability for failing over on loss of SAN connectivity

Limited command line administrative tools to fail over, query current status, etc.

Active/Active or Active/Passive

Simple resource group dependency model

Requires external tool for resource (service) monitoring

SNMP monitoring

Linux-HA Release 2 capabilities

Built-in resource monitoring

Support for the OCF resource standard

Much Larger clusters supported (>= 8 nodes)

Sophisticated dependency model with rich constraint support (resources, groups, incarnations, master/slave) (needed for SAP)

XML-based resource configuration

Coming in 2.0.1:

Configuration and monitoring GUI

Support for GFS cluster filesystem

Multi-state (master/slave) resource support

Initially - no IP, SAN monitoring

Linux-HA Release 1 Architecture

Linux-HA Release 2 Architecture (add TE and PE)

Resource Objects in Release 2

Release 2 supports “resource objects” which can be any of the following:

Primitive Resources – OCF, heartbeat-style, or LSB resource agent scripts

Resource Clones – need “n” resource objects – somewhere

Resource Groups – a group of primitive resources with implied co-location and linear ordering constraints

Multi-state resources (master/slave) – designed to model master/slave (replication) resources (DRBD, et al)

Basic Dependencies in Release 2

Ordering Dependencies

start before (normally implies stop after)

start after (normally implies stop before)

Mandatory Co-location Dependencies

must be co-located with

cannot be co-located with
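The ordering rule above ("start after" normally implies "stop before") can be illustrated with a topological sort. This is only a sketch of the idea, not how the Release 2 policy engine is implemented; the function name and the resource names in the usage note are hypothetical:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def start_and_stop_order(start_after):
    """Compute start and stop sequences from "start after" dependencies.

    start_after maps each resource to the set of resources that must
    already be running before it is started.
    """
    start_order = list(TopologicalSorter(start_after).static_order())
    # Stopping reverses the start sequence: the last resource started
    # is the first one stopped.
    stop_order = list(reversed(start_order))
    return start_order, stop_order
```

For example, `start_and_stop_order({"apache": {"IPaddr"}, "IPaddr": {"Filesystem"}})` starts Filesystem first and apache last, and stops them in the opposite order.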

Resource Location Constraints

Mandatory Constraints:

Resource Objects can be constrained to run on any selected subset of nodes. Default depends on setting of symmetric_cluster.

Preferential Constraints:

Resource Objects can also be preferentially constrained to run on specified nodes by providing weightings for arbitrary logical conditions

The resource object is run on the node which has the highest weight (score)

Resource Clones

Resource Clones allow one to have a resource which runs multiple (“n”) times on the cluster

This is useful for managing

load balancing clusters where you want “n” of them to be slave servers

Cluster filesystems

Cluster Alias IP addresses

Resource Groups

Resource Groups provide a shorthand for creating ordering and co-location dependencies

Each resource object in the group is declared to have linear start-after ordering relationships

Each resource object in the group is declared to have co-location dependencies on each other

This is an easy way of converting release 1 resource groups to release 2
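To make the "shorthand" concrete, the sketch below expands an ordered group into the explicit constraints it implies. The tuple layout and function name are assumptions made for illustration; they do not correspond to actual CIB syntax:

```python
def expand_group(members):
    """Expand an ordered resource group into explicit constraints.

    Returns two lists of (resource, other) pairs:
      ordering   - resource must start after other
      colocation - resource must run on the same node as other
    """
    ordering = []
    colocation = []
    for previous, current in zip(members, members[1:]):
        ordering.append((current, previous))
        colocation.append((current, previous))
    return ordering, colocation
```

A three-member group therefore stands in for two ordering and two co-location constraints; chaining each member to its predecessor is enough to keep the whole group on one node.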

Multi-State (master/slave) Resources

Normal resources can be in one of two stable states: running or stopped

Multi-state resources can have more than two stable states. For example: stopped, running as slave, or running as master

This is ideal for modeling replication resources like DRBD

Advanced Constraints

Nodes can have arbitrary attributes associated with them in name=value form

Attributes have types: int, string, version

Constraint expressions can use these attributes as well as node names, etc in largely arbitrary ways

Operators:

=, !=, <, >, <=, >=

defined(attrname), undefined(attrname),

colocated(resource id), not colocated(resource id)
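The attribute types matter because the same comparison can give different answers for strings and versions. A minimal sketch of typed expression evaluation, assuming attributes are stored as strings; the function names are hypothetical, and the colocated operators are left out:

```python
import operator

COMPARISONS = {
    "=": operator.eq, "!=": operator.ne,
    "<": operator.lt, ">": operator.gt,
    "<=": operator.le, ">=": operator.ge,
}


def coerce(value, value_type):
    """Convert a raw attribute string to its declared type."""
    if value_type == "int":
        return int(value)
    if value_type == "version":
        # Compare versions numerically, component by component.
        return tuple(int(part) for part in value.split("."))
    return value  # "string": compare as-is


def evaluate(node_attrs, attr, op, value=None, value_type="string"):
    """Evaluate one expression against a node's name=value attributes."""
    if op == "defined":
        return attr in node_attrs
    if op == "undefined":
        return attr not in node_attrs
    if attr not in node_attrs:
        return False  # assumption: a missing attribute never matches
    return COMPARISONS[op](coerce(node_attrs[attr], value_type),
                           coerce(value, value_type))
```

With `{"kernel": "2.6.10"}`, the test `kernel < 2.6.9` is false as a version comparison but true as a plain string comparison, which is why a version type is useful.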

Advanced Constraints (cont'd)

Each constraint is associated with a particular resource, and is evaluated in the context of a particular node.

A given constraint has a boolean predicate, built from the expressions described earlier, and is associated with a weight and a condition.

If the predicate is true, then the condition is used to compute the weight associated with locating the given resource on the given node.

All conditions are given weights, positive or negative. Additionally there are special values for modeling must-have conditions

+INFINITY

-INFINITY
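The weighting scheme can be sketched as follows: for each node, sum the weights of the constraints whose predicate is true, then place the resource on the node with the highest total. The treatment of the infinite values here (-INFINITY vetoes a node, +INFINITY otherwise wins) is an assumption for illustration, as are all names:

```python
INFINITY = float("inf")


def node_score(weights):
    """Combine the weights of all constraints that matched on one node."""
    if -INFINITY in weights:
        return -INFINITY  # assumption: a "must not" always wins
    if INFINITY in weights:
        return INFINITY
    return sum(weights)


def choose_node(constraints, nodes):
    """Pick the node with the highest score, or None if every node is vetoed.

    constraints: list of (predicate, weight); predicate(name, attrs) -> bool
    nodes: dict mapping node name to its attribute dict
    """
    best_name, best_score = None, -INFINITY
    for name in sorted(nodes):
        matched = [weight for predicate, weight in constraints
                   if predicate(name, nodes[name])]
        score = node_score(matched)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

A node that matches no constraint scores zero, so it still beats any node carrying a -INFINITY weight.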

DRBD – RAID1 over the LAN

DRBD is a block-level replication technology

Every time a block is written on the master side, it is copied over the LAN and written on the slave side

Typically, a dedicated replication link is used

It is extremely cost-effective – common with xSeries

Worst-case around 10% throughput loss

Recent versions have very fast “full” resync

Security Considerations

Cluster: A computer whose backplane is the Internet

If this isn't scary, you don't understand...

You may think you have a secure cluster network

You're probably mistaken now

You will be in the future

Secure Networks are Difficult Because...

Security is not often well-understood by admins

Security is well-understood by “black hats”

Network security is easy to breach accidentally

Users bypass it

Hardware installers don't fully understand it

Most security breaches come from “trusted” staff

Staff turnover is often a big issue

Virus/Worm/P2P technologies will create new holes especially for Windows machines

Security Advice

Good HA software should be designed to assume insecure networks

Not all HA software assumes insecure networks

Good HA installation architects use dedicated (secure?) networks for intra-cluster HA communication

Crossover cables are reasonably secure – all else is suspect

Part II – Hands-On Cluster Configuration

Overview of hardware

Overview of Logical cluster Setup

Overview of Services to be Configured

Detailed Configuration:

ha.cf

authkeys

Cluster Information Base (CIB)

Overview of Cluster Hardware

Each cluster will consist of:

Two logical computers (LPARs)

Each logical computer consists of:

One logical CPU

One logical boot/root disk (sda)

One logical disk sharable by the other LPAR (sd[bc])

One logical disk shared from the other LPAR (sd[bc])

One logical ethernet connection (eth0)

One administrative IP address + a service address

The observant will note the lack of significant redundancy in this test environment

Logical Cluster Setup – Part I

Communications:

Heartbeat has unicast, broadcast, and multicast communications modes.

Broadcast communications will NOT work in this environment – all clusters are on the same subnet.

Multicast communications are simplest for this environment

Unicast communications are probably the best choice for this environment.
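As a rough illustration of the unicast choice, the sketch below renders a few ha.cf lines for a two-node cluster. The directives shown (ucast, node, crm) are quoted from memory of the heartbeat documentation and should be checked against it; the node names, interface, and address in the usage note are hypothetical, and this is not the configuration used in the tutorial handout:

```python
def render_ha_cf(node_names, interface, peer_ip):
    """Render a minimal ha.cf fragment using unicast heartbeats.

    Assumed directive forms:
      ucast <interface> <peer-ip>   send heartbeats to one peer
      node <name>                   declare a cluster member
      crm yes                       enable the Release 2 cluster manager
    """
    lines = [f"ucast {interface} {peer_ip}"]
    lines.extend(f"node {name}" for name in node_names)
    lines.append("crm yes")
    return "\n".join(lines) + "\n"
```

Calling `render_ha_cf(["lpar1", "lpar2"], "eth0", "10.0.0.2")` yields a four-line fragment; with unicast, each node's file names the other node's address, so the two files differ.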

Resources To Be Configured

For this cluster we will configure the following resources (services):

IP address using the IPaddr resource agent

Filesystem mounts using the Filesystem resource agent

Apache web server using the ha_apache resource agent

SAMBA file sharing using the smb and nmb LSB-style init scripts that come with SLES 9

NFS file sharing using the nfslock and nfsserver LSB-style init scripts that come with SLES 9

Shared Disk Access - Naming

Earlier, the shared logical disks were listed as sd[bc]

When configuring resources, it is necessary for a given resource to come up consistently on all nodes in a cluster

To avoid problems, one can mount the shared filesystems by label rather than by /dev/sdb or /dev/sdc

This is accomplished using “-Llabel” in the Filesystem resource.

We do not do this in our sample files :-).

/etc/ha.d/ha.cf Configuration File

clusternumberlclusternumber+1

See also:

/etc/ha.d/authkeys

This file provides authentication information

File MUST be mode 0600 or 0400
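The mode requirement can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the helper name is illustrative:

```python
import os
import stat

ALLOWED_MODES = {0o600, 0o400}


def authkeys_mode_ok(path):
    """Return True if the file's permission bits are exactly 0600 or 0400."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode in ALLOWED_MODES
```

For example, `authkeys_mode_ok("/etc/ha.d/authkeys")` returns False if the file is readable by group or others.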

See:

Exercise for those who find time:

Cluster Information Base (CIB) Intro

The CIB is an XML file containing:

Configuration Information

Cluster Node information

Resource Information

Resource Constraints

Status Information

Which nodes are up / down

Which resources are running where

We only provide configuration information

An Empty CIB

The CIB section

The CIB nodes section

We let it get the nodes information from the membership layer

This makes things much easier on us :-)

The resources section of the CIB

The resources section is one of the most important sections.

It consists of a set of individual resource records

Each resource record represents a single resource

Classes of Resource Agents in R2

OCF – Open Cluster Framework - http://opencf.org/

Take parameters as name/value pairs through the environment

Can be monitored well by R2

Heartbeat – R1-style heartbeat resources

Take parameters as command line arguments

Can be monitored by status action

LSB – Standard LSB Init scripts

Take no parameters

Can be monitored by status action

Stonith – Node Reset Capability

Very similar to OCF resources
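For the OCF class, "through the environment" means each parameter reaches the agent as an environment variable whose name carries the OCF_RESKEY_ prefix. A minimal sketch of both sides of that convention; the helper names are hypothetical:

```python
OCF_PREFIX = "OCF_RESKEY_"


def ocf_environment(params):
    """Turn resource parameters into the environment an OCF agent would see."""
    return {f"{OCF_PREFIX}{name}": str(value) for name, value in params.items()}


def read_ocf_params(environ):
    """Agent side: recover the name/value pairs from an environment mapping."""
    return {key[len(OCF_PREFIX):]: value
            for key, value in environ.items()
            if key.startswith(OCF_PREFIX)}
```

By contrast, an R1-style heartbeat agent would receive the same values as positional command-line arguments, and an LSB init script would receive none.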

IPaddr resource Agent

Class: OCF

Parameters:

ip – IP address to bring up

nic – NIC to bring address up on (optional)

netmask – netmask for ip in CIDR form (optional)

broadcast – broadcast address (optional)

Filesystem resource Agent

Class: OCF

Parameters:

device – “devicename” to mount

directory – where to mount the filesystem

fstype – type of filesystem to mount

options – mount options (optional)

This is essentially an /etc/fstab entry – expressed as a resource
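The correspondence with /etc/fstab can be shown by rendering the four parameters as an fstab line. A small sketch; the function name and the device and mount point in the usage note are hypothetical:

```python
def as_fstab_line(device, directory, fstype, options="defaults"):
    """Render Filesystem resource parameters as the equivalent fstab entry."""
    # The trailing "0 0" are the dump and fsck-pass fields of fstab.
    return f"{device} {directory} {fstype} {options} 0 0"
```

For example, `as_fstab_line("/dev/sdb1", "/data", "ext3")` gives `/dev/sdb1 /data ext3 defaults 0 0`. The difference is that the cluster, not the boot sequence, decides when and where the mount happens.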

ClusterMon resource Agent

Class: OCF

Parameters:

htmlfile – name of the HTML status file to write (required)

update – how often to update the HTML file (required)

user – who to run crm_mon as

extra_options – Extra options to pass to crm_mon (optional)

Update must be in seconds

htmlfile must be located in the Apache docroot

Suggested value for extra_options: “-n -r”

ha_apache resource Agent

Class: OCF

Parameters:

configfile – name of apache configuration file (required)

port – the port the server is running on (optional)

statusurl – URL to use in monitor operation (optional)

Values for optional parameters are deduced from reading the configuration file.

Configfile and html directories must go on shared media

smb and nmb resources

Class: LSB (i.e., normal init script)

They take no parameters

Must be started after the IP address resource is started

Must be started after the filesystem they are exporting is started

Their configuration files should go on the shared media

nfslock and nfsserver Resources

Class: LSB (i.e., normal init script)

Neither takes any parameters

NFS config and lock info must be on shared media

NFS filesystem data must be on shared media

Inodes of mount devices and all files must match (!)

Must be started before IP address is acquired

Making the inodes of disks match can sometimes be a bit tricky (LVM and DRBD help here)

ibmhmc STONITH Resource

Class: stonith

Parameters:

ip – IP address of the HMC controlling the node in question

This resource talks to the “management console” for the OpenPower machine we're using

An OCF object
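The XML for this slide did not survive in the transcript. As a rough idea of what a resource record of this kind looks like, the sketch below builds one with Python's standard XML library. The element and attribute names (primitive, instance_attributes, attributes, nvpair) are assumptions based on the Release 2 CIB layout as commonly documented, and the ids and values are hypothetical; check the CIB DTD shipped with heartbeat before relying on them:

```python
import xml.etree.ElementTree as ET


def ocf_primitive(resource_id, agent_type, params, provider="heartbeat"):
    """Build an OCF primitive resource element with its name/value parameters."""
    primitive = ET.Element("primitive", {
        "id": resource_id,
        "class": "ocf",
        "type": agent_type,
        "provider": provider,
    })
    instance = ET.SubElement(primitive, "instance_attributes",
                             {"id": f"{resource_id}_attrs"})
    attributes = ET.SubElement(instance, "attributes")
    for name, value in params.items():
        ET.SubElement(attributes, "nvpair", {
            "id": f"{resource_id}_{name}",
            "name": name,
            "value": str(value),
        })
    return primitive


if __name__ == "__main__":
    element = ocf_primitive("webserver_ip", "IPaddr", {"ip": "10.0.0.50"})
    print(ET.tostring(element, encoding="unicode"))
```

The nvpair entries carry the same name/value parameters listed on the resource agent slides (ip, nic, netmask, and so on).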

A STONITH object

An LSB object (i.e., an init script)

Resource Groups

Resources can be put together in groups a lot like R1 resource groups

Groups are simple to manage, but less powerful than individual resources with constraints

Resource “clone” Units

If you want a resource to run in several places, then you can “clone” the resource

CIB

information

We prefer to run on host

Our Configuration: Two resource groups

Webserver

Filesystem

IPaddr

apache

Fileserver

Filesystem

NFS

IPaddr

smb (Samba) / nmb (Samba)

STONITH “clone” resource(s)

Detailed Handouts

The complete configuration is too detailed to present further in slides.

Remaining details are provided in smaller fonts in the supplied paper handout

References

http://linux-ha.org/

http://linux-ha.org/download/

http://linux-ha.org/SuccessStories

http://linux-ha.org/Certifications

http://linux-ha.org/NewHeartbeatDesign

www.linux-mag.com/2003-11/availability_01.html

Legal Statements

IBM is a trademark of International Business Machines Corporation.

Linux is a registered trademark of Linus Torvalds.

Other company, product, and service names may be trademarks or service marks of others.

This work represents the views of the author and does not necessarily reflect the views of the IBM Corporation.