© 2019 IBM Corporation
Thursday, 5th Dec 2019, 10:00 AM Eastern Daylight Time
Informix 14.10 Tuning Tips to Go Beyond Secondary Performance Improvements
Vladimir Kolobrodov
Informix Performance Engineer
HCL Software
Disclaimers
IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal
without notice at IBM’s sole discretion.
Information regarding potential future products is intended to outline our general product direction and
it should not be relied on in making a purchasing decision.
The information mentioned regarding potential future products is not a commitment, promise, or legal
obligation to deliver any material, code or functionality. Information about potential future products may
not be incorporated into any contract.
The development, release, and timing of any future features or functionality described for our products
remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a
controlled environment. The actual throughput or performance that any user will experience will vary
depending upon many factors, including considerations such as the amount of multiprogramming in
the user’s job stream, the I/O configuration, the storage configuration, and the workload processed.
Therefore, no assurance can be given that an individual user will achieve results similar to those
stated here.
Introduction
This presentation highlights changes in the Informix
Remote Standalone Secondary (RSS), Shared Disk Secondary (SDS),
and High Availability Data Replication (HDR) technologies.
The logical log transfer and replay mechanism was completely overhauled
in Informix 14.10, resulting in greatly improved replication performance.
Replication can now support 5 to 8 times higher transactional rates
with no or minimal delay compared to previous (12.10 and earlier)
Informix releases.
As a side benefit, Informix can recover from a crash much faster,
since crash recovery also depends on logical log replay speed.
Performance improvement: RSS, SDS
[Chart: RSS log replay rate, Informix 14.10 compared to Informix 12.10]
[Chart: SDS log replay rate, Informix 14.10 compared to Informix 12.10]
Performance improvement: HDR and Crash Recovery
[Chart: HDR rate of transactions, Informix 14.10 compared to Informix 12.10]
[Chart: Crash recovery time, Informix 14.10 compared to Informix 12.10]
Why now?
Increased processing power in current CPUs, from both higher clock speeds and a larger number of
available cores, has led to Informix deployments producing much higher transactional throughput.
Replication technologies in 12.10 and earlier products that rely on logical log replay
(HDR, RSS, SDS) reached a design limit.
Tuning prerequisites
Standalone Informix instance – configured for a workload with a high rate of transactions
CPU
- multiprocessor / CPUVP
Memory
- BUFFERPOOL
Disk
- high IOPS - logging, checkpoint, layout
Network
- bandwidth
Tuning workload
- representative
- repeatable
- scalable
- short unit of execution
An OLTP benchmark similar to the industry-standard TPC-C was a good fit: it allowed
generating varying levels of transactional activity by changing the number of active
sessions and produced fairly stable results for a given configuration, with an execution
time under 30 minutes to collect data for a single point.
Tuning workload highlights
- Up to 1.2 million (new order) transactions per minute
- 50,000 total transactions per second
- over a million logical log records per second
- 100 MB / sec logical log I/O (mostly writes) for primary / standalone
- enough logical log space for a 10-minute benchmark at the peak rate of transactions
- SSD and flash storage for data and logs
- CPU: close to 100% utilization of 16 cores / 128 logical processors
- Data: working set fully cached in memory
- Connectivity: loopback interface for the network
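The peak figures above imply a few derived numbers that are useful for sizing. The sketch below is back-of-the-envelope arithmetic only: it assumes constant peak rates for the whole run and decimal megabytes, and the function names are illustrative, not part of any Informix tool. Because the slide says "over a million" log records per second, the per-transaction count is a lower bound and the average record size an upper bound.

```python
# Peak figures quoted on the slide (rounded).
TXN_PER_SEC = 50_000             # total transactions per second
LOG_RECORDS_PER_SEC = 1_000_000  # "over a million", so treat as a lower bound
LOG_MB_PER_SEC = 100             # logical log I/O on the primary
BENCHMARK_MINUTES = 10


def log_space_mb(mb_per_sec, minutes):
    """Logical log volume written over the run at a constant rate."""
    return mb_per_sec * minutes * 60


def records_per_txn(records_per_sec, txn_per_sec):
    """Average number of logical log records per transaction."""
    return records_per_sec / txn_per_sec


def avg_record_bytes(mb_per_sec, records_per_sec, bytes_per_mb=1_000_000):
    """Average logical log record size (assumes decimal MB)."""
    return mb_per_sec * bytes_per_mb / records_per_sec


if __name__ == "__main__":
    print("log space (MB):", log_space_mb(LOG_MB_PER_SEC, BENCHMARK_MINUTES))
    print("records per txn:", records_per_txn(LOG_RECORDS_PER_SEC, TXN_PER_SEC))
    print("avg record bytes:", avg_record_bytes(LOG_MB_PER_SEC, LOG_RECORDS_PER_SEC))
```

Under these assumptions, a 10-minute run at the peak rate writes about 60,000 MB of logical log, with roughly 20 log records per transaction at about 100 bytes each.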
Stats / metrics – primary Informix instance
- Rate of transactions -> logical log rate
Stats / metrics – secondary Informix instance
- Logical log replay rate / transaction latency (RSS, SDS)
RSS – Remote Standalone Secondary
- RSS log replay rate vs logical log rate on primary
Several 8-minute benchmark runs were executed with varying workloads using Informix 14.10 and
12.10, in each case waiting until the RSS caught up with the primary. The log replay rate was then
calculated both for the window when the benchmark was active and for the period after it had finished,
when the primary was idle aside from transferring the remaining transactions (logs) to the secondary.
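The two-window calculation can be sketched as follows. This is a minimal illustration, not the tooling used for the benchmark: it assumes the replay progress was sampled as (seconds since start, cumulative log records replayed) pairs sorted by time, and the sample values are invented.

```python
def window_rate(samples, start, end):
    """Average replay rate (records/sec) between the first and last
    sample whose timestamp falls inside [start, end]."""
    inside = [s for s in samples if start <= s[0] <= end]
    if len(inside) < 2:
        raise ValueError("need at least two samples in the window")
    (t0, n0), (t1, n1) = inside[0], inside[-1]
    if t1 <= t0:
        raise ValueError("samples must span a non-zero interval")
    return (n1 - n0) / (t1 - t0)


if __name__ == "__main__":
    # Hypothetical samples: an 8-minute (480 s) run, then 2 minutes of catch-up.
    samples = [(0, 0), (240, 120_000_000), (480, 240_000_000), (600, 312_000_000)]
    benchmark_end = 480
    print("active window :", window_rate(samples, 0, benchmark_end))
    print("catch-up window:", window_rate(samples, benchmark_end, 600))
```

Reporting the two windows separately keeps the rate achieved under load apart from the rate reached once the primary is idle.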
RSS – Remote Standalone Secondary
- Multiple benchmark runs: log replay rate vs time
RSS – log I/O patterns during benchmark
The log I/O on RSS (physical log) and Primary (logical log)
for one of the benchmark runs correlated with the log replay
rate.
- Multiple benchmark runs where the maximum possible secondary log
  replay rate exceeds the logical log rate on the primary.
  By the end of each benchmark run the secondary is in sync with the primary.
RSS – log I/O patterns during benchmark
- The benchmark runs where the secondary cannot replay log records as fast as
  they are logged on the primary.
  The secondary continues replaying log records after the load on the primary has stopped.
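How long the secondary keeps replaying after the load stops can be estimated with simple arithmetic. The sketch below assumes constant rates throughout, which is a simplification, and all names and numbers are illustrative rather than measured values.

```python
def backlog_records(primary_rate, replay_rate, load_seconds):
    """Log records still waiting on the secondary when the load stops.
    Zero when the secondary can keep up with the primary."""
    return max(0, primary_rate - replay_rate) * load_seconds


def catch_up_seconds(backlog, replay_rate):
    """Time needed to drain the backlog once the primary is idle."""
    return backlog / replay_rate


if __name__ == "__main__":
    # Hypothetical run: primary logs 1,000,000 records/s for 480 s,
    # the secondary replays 800,000 records/s.
    backlog = backlog_records(1_000_000, 800_000, 480)
    print("backlog:", backlog)
    print("catch-up seconds:", catch_up_seconds(backlog, 800_000))
```

With these example numbers the secondary is 96 million records behind when the load stops and needs about two more minutes to catch up.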
SDS – Shared Disk Secondary
- Multiple benchmark runs: log replay rate vs time
Physical logging, which affects the RSS replay rate at the start of the benchmark, is not an issue with SDS.
The plots of SDS replay rate vs time in minutes, for varying levels of transactional
activity, show that the SDS secondary can reach the replay rate, up to the maximum supported
by the configuration, without delay.
SDS – Shared Disk Secondary
- SDS log replay rate vs logical log rate on primary
When the rate of transactions / logical log records on the primary exceeds the maximum replay rate
supported by the configuration, the replay rate itself is negatively affected.
HDR – High Availability Data Replication
With HDR, the transactional rate on the primary is limited by the secondary's apply rate, so
the performance metric is the maximum achievable benchmark throughput.
In the benchmark environment the best results for Informix 14.10 were observed when
HDR_TXN_SCOPE was set to ASYNC (DRINTERVAL = 0).
Coming up with a single number for the HDR performance improvement is a challenge,
since the benchmark throughput depends on the workload specifics, and maximum throughput may
not be achieved in the same setup for 12.10 and 14.10. Overall, in all tested HDR configurations
Informix 14.10 demonstrated a significant performance improvement compared to Informix 12.10.
Configuration and tuning
- SMX_NUMPIPES 2
- LOGBUFF 2048
- LTAPEBLK 20480
- SEC_DR_BUFS 24
- SEC_LOGREC_MAXBUFS 1000
- OFF_RECVRY_THREADS 11
- SEC_APPLY_POLLTIME 0
- RSS_FLOW_CONTROL / SDS_FLOW_CONTROL -1
- RSS_NONBLOCKING_CKPT 1
- DRINTERVAL / HDR_TXN_SCOPE 0 / ASYNC
- LOG_STAGING_DIR ?
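The values listed above can be rendered as onconfig-style "NAME value" lines for review. This is a small formatting sketch, not an IBM or HCL utility; it does not record which server (primary or secondary) each setting belongs on, and LOG_STAGING_DIR is left out because the slide gives no value for it.

```python
# Values copied from the slide. LOG_STAGING_DIR is omitted on purpose:
# the slide shows "?" instead of a value.
SLIDE_VALUES = {
    "SMX_NUMPIPES": "2",
    "LOGBUFF": "2048",
    "LTAPEBLK": "20480",
    "SEC_DR_BUFS": "24",
    "SEC_LOGREC_MAXBUFS": "1000",
    "OFF_RECVRY_THREADS": "11",
    "SEC_APPLY_POLLTIME": "0",
    "RSS_FLOW_CONTROL": "-1",
    "SDS_FLOW_CONTROL": "-1",
    "RSS_NONBLOCKING_CKPT": "1",
    "DRINTERVAL": "0",
    "HDR_TXN_SCOPE": "ASYNC",
}


def render_onconfig(values):
    """Return one 'NAME value' line per parameter, names padded to align."""
    width = max(len(name) for name in values)
    return "\n".join(f"{name.ljust(width)} {value}" for name, value in values.items())


if __name__ == "__main__":
    print(render_onconfig(SLIDE_VALUES))
```

These are the settings used in the benchmark environment; treat them as a starting point to test against your own workload rather than as universal defaults.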
Configuration and tuning
- SMX_NUMPIPES 2
Number of SMX network connections between primary and secondary servers.
Usually a reasonably small number is sufficient. Recommended value is 2.
- LOGBUFF 2048
The size of the logical log buffer, in KB.
Also determines size of the replication buffer (see SEC_DR_BUFS).
Is a smaller log buffer better with unbuffered logging? Not entirely true.
OLTP workloads with a high number of active concurrent sessions
achieve better performance using a larger log buffer.
- SEC_DR_BUFS 24
Introduced in 14.10.xC1 - number of replication buffers.
Applicable to RSS, SDS, HDR secondaries and primary with HDR.
Each buffer has the size set by the LOGBUFF configuration parameter.
Minimum (default) value = 12, Maximum value = 128
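Since each replication buffer has the LOGBUFF size, the memory taken by the replication buffers follows directly from the two settings. The sketch below only multiplies them and checks the range given on the slide; the function name is illustrative.

```python
SEC_DR_BUFS_MIN = 12   # minimum (default) per the slide
SEC_DR_BUFS_MAX = 128  # maximum per the slide


def replication_buffer_kb(sec_dr_bufs, logbuff_kb):
    """Total replication buffer memory in KB: SEC_DR_BUFS x LOGBUFF."""
    if not SEC_DR_BUFS_MIN <= sec_dr_bufs <= SEC_DR_BUFS_MAX:
        raise ValueError(
            f"SEC_DR_BUFS must be between {SEC_DR_BUFS_MIN} and {SEC_DR_BUFS_MAX}"
        )
    return sec_dr_bufs * logbuff_kb


if __name__ == "__main__":
    kb = replication_buffer_kb(24, 2048)  # values used in the benchmark
    print(f"{kb} KB = {kb // 1024} MB")
```

With the benchmark values (SEC_DR_BUFS 24, LOGBUFF 2048) this comes to 49152 KB, or 48 MB.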
Configuration and tuning
- LTAPEBLK 20480
The primary server uses this configuration parameter for backing up logical logs.
On the secondary, several buffers of LTAPEBLK size are used for replaying
the log records.
- SEC_LOGREC_MAXBUFS 1000
Introduced in 14.10.xC1 - number of 16k log buffers used for replaying log records.
On standalone / primary instance it is in effect during crash recovery.
Minimum (default) value is 4 x OFF_RECVRY_THREADS.
Values over 5000 may negatively impact performance.
- RSS_FLOW_CONTROL / SDS_FLOW_CONTROL -1
For better performance flow control can be disabled by setting this parameter
on primary server for the appropriate type of secondary to "-1".
Can be changed dynamically with "onmode -wm" or "onmode -wf".
Default value is "0" (12 times log buffer size).
Can be set to <start>,<stop>, where <start> is larger than <stop>.
Supports scale factors 'K', 'M', 'G', for example 10000K,9000K
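The three forms of the flow control value described above can be told apart with a small parser. This is an illustrative sketch, not how the server parses the parameter: it assumes K, M and G are binary multiples, and it rejects thresholds without a suffix because the slide does not state the unit of a bare number.

```python
import re

# Assumption: binary multiples, expressed here in KB.
SCALE_KB = {"K": 1, "M": 1024, "G": 1024 * 1024}


def _to_kb(text):
    """Convert '<number><K|M|G>' to KB."""
    match = re.fullmatch(r"(\d+)([KMG])", text.strip().upper())
    if not match:
        raise ValueError(f"expected <number><K|M|G>, got {text!r}")
    return int(match.group(1)) * SCALE_KB[match.group(2)]


def parse_flow_control(value, logbuff_kb):
    """Classify an RSS_FLOW_CONTROL / SDS_FLOW_CONTROL value.

    Returns (mode, start_kb, stop_kb)."""
    value = value.strip()
    if value == "-1":
        return ("disabled", None, None)
    if value == "0":
        # Slide: default is 12 times the log buffer size.
        return ("default", 12 * logbuff_kb, None)
    start_text, sep, stop_text = value.partition(",")
    if not sep:
        raise ValueError("expected -1, 0 or <start>,<stop>")
    start, stop = _to_kb(start_text), _to_kb(stop_text)
    if start <= stop:
        raise ValueError("start must be larger than stop")
    return ("custom", start, stop)


if __name__ == "__main__":
    for setting in ("-1", "0", "10000K,9000K"):
        print(setting, "->", parse_flow_control(setting, 2048))
```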
Configuration and tuning
- OFF_RECVRY_THREADS 11
The number of logical recovery threads used for log replay.
On the secondary it should not exceed the number of CPUVPs and/or the
number of processor cores available to the instance, especially
when SEC_APPLY_POLLTIME is set to a value other than "0".
During crash recovery this parameter is ignored, and the number
of recovery threads is set to twice the number of configured CPUVPs.
Recommended minimum is 5.
- SEC_APPLY_POLLTIME 0
Introduced in 14.10.xC1 - time, in microseconds, the log replay thread
polls for events before yielding. May reduce thread context switch overhead
while replaying log records.
Setting it to a value other than zero may result in increased CPU utilization on
the secondary during log replay and may require changes to other
Informix configuration parameters:
- poll threads may need to be moved to NETVP (NETTYPE)
- when this parameter is not 0 it is recommended to set the number of OFF_RECVRY_THREADS
to be less than the number of CPUVPs and/or the number of processor cores
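The sizing rules for recovery threads on this slide can be collected into a quick sanity check. The sketch below simply restates those rules; it takes the smaller of CPUVPs and cores as the limit, which is one reading of "and/or", and the function names are illustrative.

```python
def check_recovery_threads(threads, cpu_vps, cores, polltime_us=0):
    """Return warnings for an OFF_RECVRY_THREADS value on a secondary."""
    warnings = []
    limit = min(cpu_vps, cores)  # assumption: the stricter bound applies
    if threads < 5:
        warnings.append("below the recommended minimum of 5")
    if polltime_us != 0 and threads >= limit:
        warnings.append(
            "with SEC_APPLY_POLLTIME non-zero, use fewer threads than CPUVPs/cores"
        )
    elif polltime_us == 0 and threads > limit:
        warnings.append("exceeds the number of CPUVPs/cores")
    return warnings


def crash_recovery_threads(cpu_vps):
    """During crash recovery the setting is ignored: twice the CPUVPs."""
    return 2 * cpu_vps


if __name__ == "__main__":
    print(check_recovery_threads(11, cpu_vps=16, cores=16))
    print(check_recovery_threads(16, cpu_vps=16, cores=16, polltime_us=100))
    print(crash_recovery_threads(16))
```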
Configuration and tuning
- RSS_NONBLOCKING_CKPT 1
Introduced in 14.10.xC1. For the remote standalone secondary server (RSS) only -
enables non-blocking checkpoints. Log replay continues while the
checkpoint is in progress.
Default is "0" (log replay on RSS is blocked for the duration of the checkpoint).
- DRINTERVAL / HDR_TXN_SCOPE 0 / ASYNC
Only applicable to the HDR secondary server. In the benchmark environment these settings resulted in
overall better performance. If HDR_TXN_SCOPE must be set to NEAR_SYNC, an optimization
present in Informix 14.10 may result in better performance with the unbuffered log mode.
The HDR_TXN_SCOPE parameter is not applicable when DRINTERVAL is not 0.
With DRINTERVAL = 1, performance in the benchmark was close to optimal, but
with DRINTERVAL = 0 and ASYNC transaction scope the transaction delay on the secondary
was noticeably lower.
- LOG_STAGING_DIR ?
Must be set to a valid directory location to enable log staging with HDR while the buffer pool
is flushed to disk during checkpoint processing. For an RSS server, a log staging directory is
required to enable delay apply or stop apply.
Summary
Considerable development effort resulted in the delivery of the latest Informix release with major
performance improvements in most areas of replication and high availability.
With Informix 14.10, RSS, SDS and HDR clusters can now support much higher transactional
loads, with transactions propagated to secondary servers practically in real time.
On a secondary server, reads and writes to the logical and physical log spaces require synchronous I/O
operations. For performance reasons, it is recommended to use storage supporting a high number of
IOPS (I/O operations per second), preferably flash-based or, at least, SSD (solid state disk) media
for the log spaces.
Questions