Disclaimer
This document is for informational purposes only and is subject to change at any time without notice. The information in this document is proprietary to Actian and no part of this document may be reproduced, copied, or transmitted in any form or for any purpose without the express prior written permission of Actian.
This document is not intended to be binding upon Actian to any particular course of business, pricing, product strategy, and/or development. Actian assumes no responsibility for errors or omissions in this document. Actian shall have no liability for damages of any kind including without limitation direct, special, indirect, or consequential damages that may result from the use of these materials. Actian does not warrant the accuracy or completeness of the information, text, graphics, links, or other items contained within this material. This document is provided without a warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability, fitness for a particular purpose, or non-infringement.
Actian Hybrid Data Conference 2018 London
Setting Up High Availability Clusters for Actian X in Linux
Steffen Harre
Director, Development Projects
Agenda
Setting Up High Availability Clusters for Actian X in Linux
– VM setup
• Overview of the machines used
– Cluster services setup
• Pacemaker, STONITH and GFS2
– Actian X
• Installation and cluster services
VM setup
Overview of the machines used
Machine setup
▪ Using VMware ESX
▪ Create a VM sufficient to run Actian X
▪ Install the OS
▪ Shut down and clone the VM into the number of nodes needed
▪ Create a multi-writer, thick provisioned eager zeroed disk on the storage the VMs are on
▪ Mount that disk on all nodes
▪ Download the vSphere server root certificates and import them to the nodes
6 © 2018 Actian Corporation
Machine setup
Node configuration
▪ Power up a node and give it its hostname and static IP
▪ Repeat for all the nodes – it may be useful to add the nodes to the hosts file
▪ Register the OS and repositories, if needed
▪ For RHEL these are called High Availability and Resilient Storage
▪ What we need is Corosync, Pacemaker, GFS2, clvmd, dlm and their dependencies
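The hosts-file step can be scripted so every node gets identical entries. A minimal sketch; the IP addresses here are hypothetical examples, substitute your own, then append the output to /etc/hosts on each node:

```shell
#!/bin/sh
# Emit /etc/hosts entries for the cluster nodes so they can resolve
# each other without DNS.  The IP addresses are made-up examples.
gen_hosts() {
    printf '%s\t%s\n' \
        192.168.0.11 gril-qarhclu1 \
        192.168.0.12 gril-qarhclu2 \
        192.168.0.13 gril-qarhclu3
}

gen_hosts    # e.g. run as:  gen_hosts >> /etc/hosts  (as root, per node)
```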
Cluster services setup
Pacemaker, STONITH and GFS2
Setting up the cluster services based on RHEL
▪Based on RHEL tutorials
▪https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/high_availability_add-on_administration/ch-startup-haaa
▪https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/global_file_system_2/ch-clustsetup-gfs2
▪ Install and node setup
# yum install pcs pacemaker fence-agents-all
# yum install lvm2-cluster gfs2-utils
▪Add exceptions to the firewall, if needed
# firewall-cmd --permanent --add-service=high-availability
# firewall-cmd --add-service=high-availability
Setting up the cluster services based on RHEL 2
▪ The cluster admin user needs a password, preferably the same on all nodes
# passwd hacluster
▪ This starts the cluster service
# systemctl start pcsd.service
# systemctl enable pcsd.service
▪ The nodes are authenticated
# pcs cluster auth gril-qarhclu1 gril-qarhclu2 gril-qarhclu3
Setting up the cluster services based on RHEL 3
▪ The cluster configuration is created and distributed
# pcs cluster setup --start --name gril-qarhclu \
      gril-qarhclu1 gril-qarhclu2 gril-qarhclu3
▪ and enabled to start at boot
# pcs cluster enable --all
Setting up STONITH fencing
▪ Fencing will isolate malfunctioning nodes and restart them
▪ On ESX this will be done via the vCenter server
▪ The machine UUIDs are needed for that
▪ # fence_vmware_soap -a <vcenter.server> -l <user> -p <password> --ssl --action list
▪ This generates a list of VMs accessible to the user
qarhclu2,423741ab-437b-b710-7b64-3b65259ab575
qarhclu3,4237d573-e453-6653-e23e-199ff0860e0b
uksl-qa-cdh3,423d7bff-6099-4687-a8b7-ad36af05e4f0
ukslqaw2k8vm,564d194d-5178-f38f-2a3b-b1518e397dbd
ukslingqajts3,4237b99d-6da3-f549-54dc-d22ce10fe387
ukslingqajts2,42375969-b003-0102-2e2c-31be6e849744
ukslingqajts1,42370bc0-b005-e1ff-b767-90233e6ccaef
qarhclu1,4237ab6f-1394-705d-99ba-1c9f13f3af04
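Rather than copying the UUIDs by hand, the name,UUID pairs above can be filtered mechanically into the pcmk_host_map value used in the fencing command. A sketch, assuming the fence_vmware_soap listing has been saved to a file; the hard-coded "gril-" prefix reflects that the cluster node names carry a prefix the VM names in the listing lack:

```shell
#!/bin/sh
# Build a pcmk_host_map value ("node:uuid;node:uuid;...") from the
# 'name,uuid' lines printed by fence_vmware_soap --action list.
# Usage: build_host_map <csv-file> <vm-name>...
# Assumption: each cluster node name is the VM name with a "gril-"
# prefix, matching the listing above.
build_host_map() {
    file="$1"; shift
    map=""
    for vm in "$@"; do
        # Pick the UUID of the first line whose VM name matches exactly.
        uuid=$(awk -F, -v n="$vm" '$1 == n { print $2; exit }' "$file")
        map="${map}gril-${vm}:${uuid};"
    done
    printf '%s\n' "${map%;}"    # strip the trailing semicolon
}
```

For example, `build_host_map vms.csv qarhclu1 qarhclu2 qarhclu3` emits the map string in node order, ignoring the unrelated VMs in the listing.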
Setting up STONITH fencing 2
▪ The command to set up fencing will have to include
– The SSL data set up earlier
– The UUIDs from the vCenter
▪ # pcs stonith create vmfence fence_vmware_soap delay=30 \
      ipaddr=<vcenter.server> ipport=443 login=<user> passwd=<password> \
      pcmk_host_map="gril-qarhclu1:4237ab6f-1394-705d-99ba-1c9f13f3af04;gril-qarhclu2:423741ab-437b-b710-7b64-3b65259ab575;gril-qarhclu3:4237d573-e453-6653-e23e-199ff0860e0b" \
      ssl=1 pcmk_host_list="gril-qarhclu1, gril-qarhclu2, gril-qarhclu3"
▪ For larger ESX environments it is advisable to increase PCMK_ipc_buffer to a higher value (15125524 bytes suggested)
▪ /etc/sysconfig/pacemaker: PCMK_ipc_buffer=15125524
GFS2 setup
▪ Distributed lock manager
# pcs property set no-quorum-policy=freeze
# pcs resource create dlm ocf:pacemaker:controld op monitor \
      interval=30s on-fail=fence clone interleave=true ordered=true
# /sbin/lvmconf --enable-cluster
▪ cluster logical volume management daemon
# pcs resource create clvmd ocf:heartbeat:clvm op monitor \
      interval=30s on-fail=fence clone interleave=true ordered=true
# pcs constraint order start dlm-clone then clvmd-clone
# pcs constraint colocation add clvmd-clone with dlm-clone
GFS2 setup 2
▪ Physical volume, volume group and logical volume
# pvcreate /dev/sdb
# vgcreate -Ay -cy cluster_vg /dev/sdb
# lvcreate -L200G -n cluster_lv cluster_vg
▪ File system creation (-j sets the journal count; one journal is needed per node that will mount the file system)
# mkfs.gfs2 -j2 -p lock_dlm -t gril-qarhclu:qa1 \
      /dev/cluster_vg/cluster_lv
GFS2 setup 3
▪ GFS2 is registered as a cluster resource
# pcs resource create clusterfs Filesystem \
      device="/dev/cluster_vg/cluster_lv" directory="/qa1" fstype="gfs2" \
      "options=noatime" op monitor interval=10s on-fail=fence clone \
      interleave=true
# pcs constraint order start clvmd-clone then clusterfs-clone
# pcs constraint colocation add clusterfs-clone with clvmd-clone
Cluster setup done
▪ # pcs status
Cluster name: gril-qarhclu
Stack: corosync
Current DC: gril-qarhclu3 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with
quorum
Last updated: Thu Oct 11 11:24:30 2018
Last change: Thu Oct 11 11:24:07 2018 by hacluster via cibadmin on gril-qarhclu1
3 nodes configured
10 resources configured
Online: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Full list of resources:
Clone Set: dlm-clone [dlm]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
vmfence (stonith:fence_vmware_soap): Started gril-qarhclu1
Clone Set: clvmd-clone [clvmd]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Clone Set: clusterfs-clone [clusterfs]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Actian X
Installation and cluster services
Actian X installation
▪ Based on the Actian Installation Guide
▪http://docs.actian.com/ingres/11.0/index.html
▪ Copy the tarball and extract it
# tar xfz actianx-*.tgz
# cd actianx-*
▪ The host name needs to be set to the cluster name to set up the DBMS in cluster mode
# export II_HOSTNAME=gril-qarhclu
# ./express_install.sh /qa1/installs/CC CC -licdir \
      /qa1/savesets/ -acceptlicense
▪A license file needs to be accessible to every node
▪All nodes need to be licensed
Actian X installation 2
▪ The host name is used in the config.dat…
ii.gril-qarhclu.config.replacement_from_unicode: 003F
ii.gril-qarhclu.config.replacement_into_unicode: FFFD
ii.gril-qarhclu.config.star.ii1100a64lnx100: complete
ii.gril-qarhclu.config.string_truncation: ignore
ii.gril-qarhclu.createdb.delim_id_case: lower
ii.gril-qarhclu.createdb.iidbdb_page_size: 8192
ii.gril-qarhclu.createdb.real_user_case: lower
ii.gril-qarhclu.createdb.reg_id_case: lower
ii.gril-qarhclu.dbms.*.*.config.dmf_connect: 32
ii.gril-qarhclu.dbms.*.active_limit: 32
ii.gril-qarhclu.dbms.*.ambig_replace_64compat: OFF
ii.gril-qarhclu.dbms.*.batch_copy_optim: ON
ii.gril-qarhclu.dbms.*.blob_etab_page_size: 8192
…
Actian X installation 3
▪ stop the installation after it has been initialized
# su ingres
# source /qa1/installs/CC/ingres/.ingCCsh
# ingstop
▪ Note that it says "Executing ingstop against gril-qarhclu": the installation is registered to the cluster, not a single node
▪back to root
# exit
Actian X installation 4
▪ let's make sure the instance script remembers the hostname
# nano /qa1/installs/CC/ingres/.ingCCsh
insert at the beginning
II_HOSTNAME=gril-qarhclu
export II_HOSTNAME
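The same edit can be done non-interactively instead of in nano. A sketch that prepends the two lines to the environment script (the path is the one used above) and skips the edit if the export is already present:

```shell
#!/bin/sh
# Prepend the II_HOSTNAME export to an environment script so the
# cluster name is set whenever the script is sourced.  Idempotent:
# a second run leaves the file unchanged.
prepend_hostname() {
    f="$1"; host="$2"
    grep -q '^II_HOSTNAME=' "$f" && return 0
    tmp=$(mktemp)
    { printf 'II_HOSTNAME=%s\nexport II_HOSTNAME\n' "$host"; cat "$f"; } > "$tmp"
    mv "$tmp" "$f"
}

# prepend_hostname /qa1/installs/CC/ingres/.ingCCsh gril-qarhclu
```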
Actian X installation 5
▪ setting up the Actian LSB service scripts
# source /qa1/installs/CC/ingres/.ingCCsh
# mkrc
# su -c "$II_SYSTEM/ingres/utility/mkrc -i"
# mkrc -s ha
# su -c "$II_SYSTEM/ingres/utility/mkrc -s ha -i"
# mkrc -s iimgmtsvc
# su -c "$II_SYSTEM/ingres/utility/mkrc -s iimgmtsvc -i"
Actian X installation 6
▪ edit the rc script to include the cluster name
# nano /etc/init.d/ingresCC
insert export II_HOSTNAME=gril-qarhclu
next to export II_SYSTEM=/qa1/installs/CC
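This edit, too, can be scripted. A sketch using GNU sed's append command (available on RHEL), with a grep guard to keep it idempotent:

```shell
#!/bin/sh
# Insert "export II_HOSTNAME=<cluster>" directly after the
# "export II_SYSTEM=..." line in the generated rc script, so the
# service starts against the cluster name.  Requires GNU sed (-i).
add_cluster_name() {
    f="$1"; host="$2"
    grep -q "II_HOSTNAME=$host" "$f" && return 0
    sed -i "/^export II_SYSTEM=/a export II_HOSTNAME=$host" "$f"
}

# add_cluster_name /etc/init.d/ingresCC gril-qarhclu
```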
▪ copy the scripts to the GFS2 drive
# cp /etc/init.d/*CC* /qa1/tmp
▪ Copy the environment scripts to the 'ingres' home directories and the LSB scripts to /etc/init.d
▪on each node run
# cp /qa1/installs/CC/ingres/.ingCCsh /home/ingres
# cp /qa1/tmp/*CC* /etc/init.d
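Doing the two copies on every node by hand is error-prone; the per-node commands can be generated instead. A sketch that only prints the commands (pipe to sh to execute, assuming passwordless ssh between the nodes; /qa1 is the shared GFS2 mount, so the source paths are visible everywhere):

```shell
#!/bin/sh
# Print the per-node copy commands for the environment script and the
# LSB scripts.  Printing rather than executing keeps the sketch safe
# to run anywhere; pipe the output to sh to actually perform it.
print_copy_cmds() {
    for node in "$@"; do
        echo "ssh root@$node cp /qa1/installs/CC/ingres/.ingCCsh /home/ingres"
        echo "ssh root@$node 'cp /qa1/tmp/*CC* /etc/init.d'"
    done
}

print_copy_cmds gril-qarhclu1 gril-qarhclu2 gril-qarhclu3
```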
Actian X installation 7
▪ there should now be LSB scripts on all nodes
# systemctl start ha_ingresCC
# systemctl status ha_ingresCC
# systemctl stop ha_ingresCC
▪ These should now start, show the status of, and stop the DBMS on each node
Actian X installation 8
▪ # systemctl status ingresCC
● ingresCC.service - LSB: Start Ingres RDBMS - CC instance
Loaded: loaded (/etc/rc.d/init.d/ingresCC; bad; vendor preset: disabled)
Active: active (running) since Wed 2018-10-10 16:52:54 CEST; 7s ago
Docs: man:systemd-sysv-generator(8)
Process: 4075 ExecStart=/etc/rc.d/init.d/ingresCC start (code=exited, status=0
/SUCCESS)
Tasks: 65
CGroup: /system.slice/ingresCC.service
├─4636 /qa1/installs/CC/ingres/bin/iigcn CC gcn
├─4813 /qa1/installs/CC/ingres/bin/iidbms recovery (dmfrcp) CC
├─4837 /qa1/installs/CC/ingres/bin/dmfacp CC
├─4854 /qa1/installs/CC/ingres/bin/iidbms dbms (default) CC
├─4884 /qa1/installs/CC/ingres/bin/iigcc CC gcc
├─4944 /qa1/installs/CC/ingres/bin/iigcd CC gcd
├─4963 /qa1/installs/CC/ingres/bin/iistar star (default) CC
├─4983 /qa1/installs/CC/ingres/bin/rmcmd CC rmcmd
└─5099 /qa1/installs/CC/ingres/bin/mgmtsvr CC mgmtsvr
Oct 10 16:52:43 gril-qarhclu2.actian.com systemd[1]: Starting LSB: Start Ingr...
Oct 10 16:52:43 gril-qarhclu2.actian.com runuser[4088]: pam_unix(runuser:sess...
Oct 10 16:52:44 gril-qarhclu2.actian.com runuser[4499]: pam_unix(runuser:sess...
Actian X installation 9
▪ registering the resource in pacemaker
# pcs resource list lsb
lsb:ha_ingresCC - ingres Cluster Service - CC instance
lsb:iimgmtsvcCC - Ingres Mgmt Tools Service - CC instance
lsb:ingresCC - Start Ingres RDBMS - CC instance
# pcs resource create CC lsb:ha_ingresCC meta resource-stickiness=500 \
      op monitor interval=60s restart interval=60s start interval=60s \
      stop interval=60s force-reload interval=60s
▪ The stickiness avoids the DBMS being moved around to a more 'optimal' node
▪Alternatively a constraint can be used
– # pcs constraint location CC prefers gril-qarhclu1=-INFINITY \
        gril-qarhclu2=500 gril-qarhclu3=400
Actian X installation result
▪ # pcs status
Cluster name: gril-qarhclu
Stack: corosync
Current DC: gril-qarhclu3 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with
quorum
Last updated: Thu Oct 11 11:41:16 2018
Last change: Thu Oct 11 11:40:24 2018 by hacluster via crmd on gril-qarhclu1
3 nodes configured
11 resources configured
Online: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Full list of resources:
Clone Set: dlm-clone [dlm]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
vmfence (stonith:fence_vmware_soap): Started gril-qarhclu1
Clone Set: clvmd-clone [clvmd]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
Clone Set: clusterfs-clone [clusterfs]
Started: [ gril-qarhclu1 gril-qarhclu2 gril-qarhclu3 ]
CC (lsb:ha_ingresCC): Started gril-qarhclu2
Actian X in the cluster web interface
Actian X installation failover
Actian X installation: 2-node cluster
▪ # pcs status
Cluster name: actianx_clus
Stack: corosync
Current DC: rh7-clus02 (version 1.1.18-11.el7_5.3-2b07d5c5a9) - partition with quorum
Last updated: Tue Oct 30 23:45:45 2018
Last change: Tue Oct 30 23:43:27 2018 by hacluster via crmd on rh7-clus02
2 nodes configured
5 resources configured
Online: [ rh7-clus01 rh7-clus02 ]
Full list of resources:
actian_vmfence (stonith:fence_vmware_soap): Started rh7-clus02
Resource Group: actianxresgrp
actianx_lvm (ocf::heartbeat:LVM): Started rh7-clus02
actianx_lvol (ocf::heartbeat:Filesystem): Started rh7-clus02
ha_ingresHC (lsb:ha_ingresHC): Started rh7-clus02
actianx_VIP (ocf::heartbeat:IPaddr2): Started rh7-clus02
Failed Actions:
* ha_ingresHC_monitor_15000 on rh7-clus02 'not running' (7): call=498, status=complete, exitreason='',
last-rc-change='Tue Oct 30 23:45:43 2018', queued=0ms, exec=85ms
Daemon Status:
corosync: active/enabled
pacemaker: active/enabled
pcsd: active/enabled
Two node cluster web interface
Two node cluster web interface 2
Thank you!