Disclaimer
This document is for informational purposes only and is subject to change at any time without notice. The information in this document is proprietary to Actian and no part of this document may be reproduced, copied, or transmitted in any form or for any purpose without the express prior written permission of Actian.
This document is not intended to be binding upon Actian to any particular course of business, pricing, product strategy, and/or development. Actian assumes no responsibility for errors or omissions in this document. Actian shall have no liability for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials. Actian does not warrant the accuracy or completeness of the information, text, graphics, links, or other items contained within this material. This document is provided without a warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, or non-infringement.
Actian Hybrid Data Conference 2018 London
Setting Up High Availability Clusters for Actian X in Linux
Steffen Harre
Director, Development Projects
Agenda
Setting Up High Availability Clusters for Actian X in Linux
– VM setup
• Overview of the machines used
– Cluster services setup
• Pacemaker, STONITH and GFS2
– Actian X
• Installation and cluster services
VM setup
Overview of the machines used
Machine setup
▪ Using VMware ESX
▪ Create a VM sufficient to run Actian X
▪ Install the OS
▪ Shut down and clone the VM into the number of nodes needed
▪ Create a multi-writer, thick provisioned eager zeroed disk on the storage the VMs are on
▪ Mount that disk on all nodes
▪ Download the vSphere server root certificates and import them to the nodes
6 © 2018 Actian Corporation
Machine setup
Node configuration
▪ Power up a node and give it its hostname and static IP
▪ Repeat for all the nodes – it may be useful to add the nodes to the hosts file
▪ Register the OS and repositories, if needed
▪ For RHEL these are called High Availability and Resilient Storage
▪ What we need is Corosync, Pacemaker, GFS2, clvmd, dlm and their dependencies
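The hosts-file step can be scripted so every node gets identical entries. A minimal sketch; the IP addresses here are hypothetical examples, substitute your own, then append the output to /etc/hosts on each node:

```shell
#!/bin/sh
# Emit /etc/hosts entries for the cluster nodes so they can resolve
# each other without DNS.  The IP addresses are made-up examples.
gen_hosts() {
    printf '%s\t%s\n' \
        192.168.0.11 gril-qarhclu1 \
        192.168.0.12 gril-qarhclu2 \
        192.168.0.13 gril-qarhclu3
}

gen_hosts    # e.g. run as:  gen_hosts >> /etc/hosts  (as root, per node)
```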
Cluster services setup
Pacemaker, STONITH and GFS2
Setting up the cluster services based on RHEL
▪Based on RHEL tutorials
▪https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_administration/ch-startup-haaa
▪https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/global_file_system_2/ch-clustsetup-gfs2
▪ Install and node setup
# yum install pcs pacemaker fence-agents-all
# yum install lvm2-cluster gfs2-utils
▪Add exceptions to the firewall, if needed
# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --add-service=high-availability
Setting up the cluster services based on RHEL 2
▪ The cluster admin user needs a password, preferably the same on all nodes
# passwd hacluster
▪ This starts the cluster service
# systemctl start pcsd.service
# systemctl enable pcsd.service
▪ The nodes are authenticated
# pcs cluster auth gril-qarhclu1 gril-qarhclu2 gril-qarhclu3
Setting up the cluster services based on RHEL 3
▪ The cluster configuration is created and distributed
# pcs cluster setup --start --name gril-qarhclu \
      gril-qarhclu1 gril-qarhclu2 gril-qarhclu3
▪ and enabled to start at boot
# pcs cluster enable --all
Setting up STONITH fencing
▪ Fencing will isolate malfunctioning nodes and restart them
▪ On ESX this will be done via the vCenter server
▪ The machine UUIDs are needed for that
▪ # fence_vmware_soap -a <vcenter.server> -l <user> -p <password> --ssl --action list
▪ This generates a list of VMs accessible to the user
qarhclu2,423741ab-437b-b710-7b64-3b65259ab575
qarhclu3,4237d573-e453-6653-e23e-199ff0860e0b
uksl-qa-cdh3,423d7bff-6099-4687-a8b7-ad36af05e4f0
ukslqaw2k8vm,564d194d-5178-f38f-2a3b-b1518e397dbd
ukslingqajts3,4237b99d-6da3-f549-54dc-d22ce10fe387
ukslingqajts2,42375969-b003-0102-2e2c-31be6e849744
ukslingqajts1,42370bc0-b005-e1ff-b767-90233e6ccaef
qarhclu1,4237ab6f-1394-705d-99ba-1c9f13f3af04
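Rather than copying the UUIDs by hand, the name,UUID pairs above can be filtered mechanically into the pcmk_host_map value used in the fencing command. A sketch, assuming the fence_vmware_soap listing has been saved to a file; the hard-coded "gril-" prefix reflects that the cluster node names carry a prefix the VM names in the listing lack:

```shell
#!/bin/sh
# Build a pcmk_host_map value ("node:uuid;node:uuid;...") from the
# 'name,uuid' lines printed by fence_vmware_soap --action list.
# Usage: build_host_map <csv-file> <vm-name>...
# Assumption: each cluster node name is the VM name with a "gril-"
# prefix, matching the listing above.
build_host_map() {
    file="$1"; shift
    map=""
    for vm in "$@"; do
        # Pick the UUID of the first line whose VM name matches exactly.
        uuid=$(awk -F, -v n="$vm" '$1 == n { print $2; exit }' "$file")
        map="${map}gril-${vm}:${uuid};"
    done
    printf '%s\n' "${map%;}"    # strip the trailing semicolon
}
```

For example, `build_host_map vms.csv qarhclu1 qarhclu2 qarhclu3` emits the map string in node order, ignoring the unrelated VMs in the listing.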
Setting up STONITH fencing 2
▪ The command to set up fencing will have to include
– The SSL data set up earlier
– The UUIDs from the vCenter
▪ # pcs stonith create vmfence fence_vmware_soap delay=30 \
      ipaddr=<vcenter.server> ipport=443 login=<user> passwd=<password> \
      pcmk_host_map="gril-qarhclu1:4237ab6f-1394-705d-99ba-1c9f13f3af04;gril-qarhclu2:423741ab-437b-b710-7b64-3b65259ab575;gril-qarhclu3:4237d573-e453-6653-e23e-199ff0860e0b" \
      ssl=1 pcmk_host_list="gril-qarhclu1, gril-qarhclu2, gril-qarhclu3"
▪ For larger ESX environments it is advisable to increase PCMK_ipc_buffer to a higher value (15125524 bytes suggested)
▪ /etc/sysconfig/pacemaker: PCMK_ipc_buffer=15125524
GFS2 setup
▪ Distributed lock manager
# pcs property set no-quorum-policy=freeze
# pcs resource create dlm ocf:pacemaker:controld op monitor \
      interval=30s on-fail=fence clone interleave=true ordered=true
# /sbin/lvmconf --enable-cluster
▪ cluster logical volume management daemon
# pcs resource create clvmd ocf:heartbeat:clvm op monitor \
      interval=30s on-fail=fence clone interleave=true ordered=true
# pcs constraint order start dlm-clone then clvmd-clone
# pcs constraint colocation add clvmd-clone with dlm-clone
GFS2 setup 2
▪ Physical volume, volume group and logical volume
# pvcreate /dev/sdb
# vgcreate -Ay -cy cluster_vg /dev/sdb
# lvcreate -L200G -n cluster_lv cluster_vg
▪ File system creation (-j sets the journal count; one journal is needed per node that will mount the file system)
# mkfs.gfs2 -j2 -p lock_dlm -t gril-qarhclu:qa1 \
      /dev/cluster_vg/cluster_lv
GFS2 setup 3
▪ GFS2 is registered as a cluster resource
# pcs resource create clusterfs Filesystem \
      device="/dev/cluster_vg/cluster_lv" directory="/qa1" fstype="gfs2" \
      "options=noatime" op monitor interval=10s on-fail=fence clone \
      interleave=true
# pcs constraint order start clvmd-clone then clusterfs-clone
# pcs constraint colocation add clusterfs-clone with clvmd-clone
Cluster setup done
▪ # pcs status
Cluster name: gril-qarhclu
Stack: corosync
Current DC: gril-qarhclu3 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with
quorum
Last updated: Thu Oct 11 11:24:30 2018
Last change: Thu Oct 11 11:24:07 2018 by hacluster via cibadmin on gril-qarhclu1
3 nodes configured
10 resources configured
Online: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Full list of resources:
Clone Set: dlm-clone [dlm]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
vmfence (stonith:fence_vmware_soap): Started gril-qarhclu1
Clone Set: clvmd-clone [clvmd]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Clone Set: clusterfs-clone [clusterfs]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Actian X
Installation and cluster services
Actian X installation
▪ Based on the Actian Installation Guide
▪http://docs.actian.com/ingres/11.0/index.html
▪ Copy the tarball and extract it
# tar xfz actianx-*.tgz
# cd actianx-*
▪ The host name needs to be set to the cluster name to set up the DBMS in cluster mode
# export II_HOSTNAME=gril-qarhclu
# ./express_install.sh /qa1/installs/CC CC -licdir \
      /qa1/savesets/ -acceptlicense
▪A license file needs to be accessible to every node
▪All nodes need to be licensed
Actian X installation 2
▪ The host name is used in the config.dat…
ii.gril-qarhclu.config.replacement_from_unicode: 003F
ii.gril-qarhclu.config.replacement_into_unicode: FFFD
ii.gril-qarhclu.config.star.ii1100a64lnx100: complete
ii.gril-qarhclu.config.string_truncation: ignore
ii.gril-qarhclu.createdb.delim_id_case: lower
ii.gril-qarhclu.createdb.iidbdb_page_size: 8192
ii.gril-qarhclu.createdb.real_user_case: lower
ii.gril-qarhclu.createdb.reg_id_case: lower
ii.gril-qarhclu.dbms.*.*.config.dmf_connect: 32
ii.gril-qarhclu.dbms.*.active_limit: 32
ii.gril-qarhclu.dbms.*.ambig_replace_64compat: OFF
ii.gril-qarhclu.dbms.*.batch_copy_optim: ON
ii.gril-qarhclu.dbms.*.blob_etab_page_size: 8192
…
Actian X installation 3
▪ stop the installation after it has been initialized
# su ingres
# source /qa1/installs/CC/ingres/.ingCCsh
# ingstop
▪ Note that it says "Executing ingstop against gril-qarhclu": the installation is registered to the cluster, not a single node
▪back to root
# exit
Actian X installation 4
▪ let's make sure the instance script remembers the hostname
# nano /qa1/installs/CC/ingres/.ingCCsh
insert at the beginning
II_HOSTNAME=gril-qarhclu
export II_HOSTNAME
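The same edit can be done non-interactively instead of in nano. A sketch that prepends the two lines to the environment script (the path is the one used above) and skips the edit if the export is already present:

```shell
#!/bin/sh
# Prepend the II_HOSTNAME export to an environment script so the
# cluster name is set whenever the script is sourced.  Idempotent:
# a second run leaves the file unchanged.
prepend_hostname() {
    f="$1"; host="$2"
    grep -q '^II_HOSTNAME=' "$f" && return 0
    tmp=$(mktemp)
    { printf 'II_HOSTNAME=%s\nexport II_HOSTNAME\n' "$host"; cat "$f"; } > "$tmp"
    mv "$tmp" "$f"
}

# prepend_hostname /qa1/installs/CC/ingres/.ingCCsh gril-qarhclu
```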
Actian X installation 5
▪ setting up the Actian LSB service scripts
# source /qa1/installs/CC/ingres/.ingCCsh
# mkrc
# su -c "$II_SYSTEM/ingres/utility/mkrc -i"
# mkrc -s ha
# su -c "$II_SYSTEM/ingres/utility/mkrc -s ha -i"
# mkrc -s iimgmtsvc
# su -c "$II_SYSTEM/ingres/utility/mkrc -s iimgmtsvc -i"
Actian X installation 6
▪ edit the rc script to include the cluster name
# nano /etc/init.d/ingresCC
insert export II_HOSTNAME=gril-qarhclu
next to export II_SYSTEM=/qa1/installs/CC
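This edit, too, can be scripted. A sketch using GNU sed's append command (available on RHEL), with a grep guard to keep it idempotent:

```shell
#!/bin/sh
# Insert "export II_HOSTNAME=<cluster>" directly after the
# "export II_SYSTEM=..." line in the generated rc script, so the
# service starts against the cluster name.  Requires GNU sed (-i).
add_cluster_name() {
    f="$1"; host="$2"
    grep -q "II_HOSTNAME=$host" "$f" && return 0
    sed -i "/^export II_SYSTEM=/a export II_HOSTNAME=$host" "$f"
}

# add_cluster_name /etc/init.d/ingresCC gril-qarhclu
```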
▪ copy the scripts to the GFS2 drive
# cp /etc/init.d/*CC* /qa1/tmp
▪ Copy the environment scripts to the 'ingres' home directories and the LSB scripts to /etc/init.d
▪on each node run
# cp /qa1/installs/CC/ingres/.ingCCsh /home/ingres
# cp /qa1/tmp/*CC* /etc/init.d
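Doing the two copies on every node by hand is error-prone; the per-node commands can be generated instead. A sketch that only prints the commands (pipe to sh to execute, assuming passwordless ssh between the nodes; /qa1 is the shared GFS2 mount, so the source paths are visible everywhere):

```shell
#!/bin/sh
# Print the per-node copy commands for the environment script and the
# LSB scripts.  Printing rather than executing keeps the sketch safe
# to run anywhere; pipe the output to sh to actually perform it.
print_copy_cmds() {
    for node in "$@"; do
        echo "ssh root@$node cp /qa1/installs/CC/ingres/.ingCCsh /home/ingres"
        echo "ssh root@$node 'cp /qa1/tmp/*CC* /etc/init.d'"
    done
}

print_copy_cmds gril-qarhclu1 gril-qarhclu2 gril-qarhclu3
```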
Actian X installation 7
▪ there should now be LSB scripts on all nodes
# systemctl start ha_ingresCC
# systemctl status ha_ingresCC
# systemctl stop ha_ingresCC
▪ These should now start, show the status of, and stop the DBMS on each node
Actian X installation 8
▪ # systemctl status ingresCC
● ingresCC.service - LSB: Start Ingres RDBMS - CC instance
Loaded: loaded (/etc/rc.d/init.d/ingresCC; bad; vendor preset: disabled)
Active: active (running) since Wed 2018-10-10 16:52:54 CEST; 7s ago
Docs: man:systemd-sysv-generator(8)
Process: 4075 ExecStart=/etc/rc.d/init.d/ingresCC start (code=exited, status=0
/SUCCESS)
Tasks: 65
CGroup: /system.slice/ingresCC.service
├─4636 /qa1/installs/CC/ingres/bin/iigcn CC gcn
├─4813 /qa1/installs/CC/ingres/bin/iidbms recovery (dmfrcp) CC
├─4837 /qa1/installs/CC/ingres/bin/dmfacp CC
├─4854 /qa1/installs/CC/ingres/bin/iidbms dbms (default) CC
├─4884 /qa1/installs/CC/ingres/bin/iigcc CC gcc
├─4944 /qa1/installs/CC/ingres/bin/iigcd CC gcd
├─4963 /qa1/installs/CC/ingres/bin/iistar star (default) CC
├─4983 /qa1/installs/CC/ingres/bin/rmcmd CC rmcmd
└─5099 /qa1/installs/CC/ingres/bin/mgmtsvr CC mgmtsvr
Oct 10 16:52:43 gril-qarhclu2.actian.com systemd[1]: Starting LSB: Start Ingr...
Oct 10 16:52:43 gril-qarhclu2.actian.com runuser[4088]: pam_unix(runuser:sess...
Oct 10 16:52:44 gril-qarhclu2.actian.com runuser[4499]: pam_unix(runuser:sess...
Actian X installation 9
▪ registering the resource in pacemaker
# pcs resource list lsb
lsb:ha_ingresCC - ingres Cluster Service - CC instance
lsb:iimgmtsvcCC - Ingres Mgmt Tools Service - CC instance
lsb:ingresCC - Start Ingres RDBMS - CC instance
# pcs resource create CC lsb:ha_ingresCC meta resource-stickiness=500 \
      op monitor interval=60s restart interval=60s start interval=60s \
      stop interval=60s force-reload interval=60s
▪ The stickiness avoids the DBMS being moved around to a more 'optimal' node
▪Alternatively a constraint can be used
– # pcs constraint location CC prefers gril-qarhclu1=-INFINITY \
        gril-qarhclu2=500 gril-qarhclu3=400
Actian X installation result
▪ # pcs status
Cluster name: gril-qarhclu
Stack: corosync
Current DC: gril-qarhclu3 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with
quorum
Last updated: Thu Oct 11 11:41:16 2018
Last change: Thu Oct 11 11:40:24 2018 by hacluster via crmd on gril-qarhclu1
3 nodes configured
11 resources configured
Online: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Full list of resources:
Clone Set: dlm-clone [dlm]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
vmfence (stonith:fence_vmware_soap): Started gril-qarhclu1
Clone Set: clvmd-clone [clvmd]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Clone Set: clusterfs-clone [clusterfs]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
CC (lsb:ha_ingresCC): Started gril-qarhclu2
Actian X in the cluster web interface
Actian X installation failover
Actian X installation: 2-node cluster
▪ # pcs status
Cluster name: actianx_clus
Stack: corosync
Current DC: rh7-clus02 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Tue Oct 30 23:45:45 2018
Last change: Tue Oct 30 23:43:27 2018 by hacluster via crmd on rh7-clus02
2 nodes configured
5 resources configured
Online: [ rh7-clus01 rh7-clus02 ]
Full list of resources:
actian_vmfence (stonith:fence_vmware_soap): Started rh7-clus02
Resource Group: actianxresgrp
actianx_lvm (ocf::heartbeat:LVM): Started rh7-clus02
actianx_lvol (ocf::heartbeat:Filesystem): Started rh7-clus02
ha_ingresHC (lsb:ha_ingresHC): Started rh7-clus02
actianx_VIP (ocf::heartbeat:IPaddr2): Started rh7-clus02
Failed Actions:
* ha_ingresHC_monitor_15000 on rh7-clus02 'not running' (7): call=498, status=complete, exitreason='',
last-rc-change='Tue Oct 30 23:45:43 2018', queued=0ms, exec=85ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Two node cluster web interface
Two node cluster web interface 2
Thank you!