8/14/2019 RAC Best Practices - RAC SIG 9Dec05[1]
Best Practices for 10g RAC
December 2005
Roy Rossebo
RAC Pack
Oracle Corporation
This presentation is for informational purposes only and may not be incorporated into a contract or agreement.
-
1. Define Service Level Objectives
Cannot be Successful without Objectives
Realistic and Measurable Objectives
Driven by and linked to business objectives
Existing service levels typically provide the baseline
Max planned/unplanned downtime
Performance requirements: number of users, response times, jobs/transactions per time period
Consolidation, cost saving
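A maximum-downtime objective is easier to discuss once it is expressed in minutes. The sketch below converts an availability percentage into a downtime budget for a period; the function name and the sample figures are illustrative assumptions, not values from the presentation.

```shell
#!/bin/sh
# Hypothetical helper: downtime budget in minutes for an availability target.
# Usage: downtime_minutes <availability_percent> <period_days>
downtime_minutes() {
    awk -v a="$1" -v d="$2" \
        'BEGIN { printf "%.1f\n", (100 - a) / 100 * d * 24 * 60 }'
}
```

For example, `downtime_minutes 99.9 365` gives the yearly budget for a 99.9% objective, covering planned and unplanned downtime together.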
-
Realistic Expectations
If your application will scale transparently on
SMP, then it is realistic to expect it to scale well on
RAC, without having to make any changes to the
application code.
RAC eliminates the database instance, and the
node itself, as a single point of failure, and ensures
database integrity in the case of such failures
-
2. Keep It Simple
Use Proven Configurations
Certified Components
Metalink Certification matrix
Vendor certifications
Minimize number of Components
Other Cluster Manager required anymore?
Volume Managers needed anymore?
Partner with Vendors
Don't be too Creative
-
3. Get ready for the Grid
2 node clusters give no/few Grid benefits
30+ customers with 6+ nodes in cluster
4+ node clusters are the norm for new systems
Large Clusters = Large HW Savings
Provisioning, Capacity on Demand
RAC performance scales better than SMP for large number of CPUs
Grid Control: RAC System Management very similar to Single Instance
-
4. Use Oracle10g Release 2
OCR and Voting Disk mirroring
Simpler Install Process for Large Clusters
Significant Performance Improvements
Framework for Workload Management in Large Clusters: Services, FAN and LBA
Clusterware High Availability API
Etc.
-
Before Oracle10g Release 2
[Diagram: an object (e.g. a table), its data blocks and two buffer caches; labels: "Read from disk", "GCS message to master"]
Messages are sent to remote node when reading into cache
-
Object Affinity
[Diagram: an object (e.g. a table), its data blocks and the buffer caches]
NO messages are sent to remote node when reading into cache
-
5. Implement Workload Mgt.
Workload Management for Clusters is more than just load balancing
Effective Utilization of Cluster Resources
Minimize Synchronization Cost in Cluster
Monitor and Optimize Response Times
Reduce Network Delays During Failover
Take advantage of Services, FAN and LBA
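Services are the entry point for the workload management features listed above. As a minimal sketch, assuming a database named ORCL with instances ORCL1 to ORCL3 and a service named OLTP (all hypothetical names), a service with preferred and available instances could be defined with `srvctl`. The wrapper prints the commands by default so they can be reviewed before anything is run; FAN and load balancing advisory settings are not covered here.

```shell
#!/bin/sh
# Print commands when DRY_RUN=1 (the default in this sketch), else execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: define_service <db> <service> <preferred_instances> <available_instances>
# Instance lists are comma separated; all names are supplied by the caller.
define_service() {
    db=$1; svc=$2; preferred=$3; available=$4
    run srvctl add service -d "$db" -s "$svc" -r "$preferred" -a "$available"
    run srvctl start service -d "$db" -s "$svc"
}
```

Calling `define_service ORCL OLTP ORCL1,ORCL2 ORCL3` prints the two `srvctl` commands; setting `DRY_RUN=0` would execute them on a node where `srvctl` is available.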
-
Initial State
[Diagram: Connection Pool with connections C1 to C6 to a Service across RAC on Instance 1 and Instance 2]
-
Instance Join
[Diagram: Connection Pool with connections C1 to C6 to a Service across RAC on Instance 1, Instance 2 and Instance 3]
-
Instance Join: Undesired Cases
Option 1: Nothing Happens
[Diagram: Connection Pool with connections C1 to C6 to a Service across RAC on Instance 1, Instance 2 and Instance 3]
-
Instance Join: Undesired Cases
Option 2: Add New Connections Randomly
[Diagram: Connection Pool with connections C1 to C9 to a Service across RAC on Instance 1, Instance 2 and Instance 3]
-
Instance Join: Desired Result
[Diagram: Connection Pool with connections C1 to C9 to a Service across RAC on Instance 1, Instance 2 and Instance 3]
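The instance-join slides contrast a pool that ignores the new instance, or adds connections at random, with one that ends up balanced. As a rough sketch only, assuming instances of equal capacity and no load advisory data (a simplification; the slides do not state how connections are assigned), the target number of connections per instance can be computed as an even split. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: spread <total> connections evenly over <n> instances.
# Prints one count per instance, separated by spaces.
even_split() {
    total=$1; n=$2
    base=$((total / n))
    extra=$((total % n))
    i=1; out=""
    while [ "$i" -le "$n" ]; do
        c=$base
        # The first <extra> instances take one additional connection.
        if [ "$i" -le "$extra" ]; then
            c=$((c + 1))
        fi
        out="$out${out:+ }$c"
        i=$((i + 1))
    done
    echo "$out"
}
```

With nine connections and three instances, `even_split 9 3` yields three connections per instance; a pool driven by load balancing advice would weight the split by the advised percentages instead.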
-
Node Leaves
[Diagram: Connection Pool with connections C1 to C9 to a Service across RAC on Instance 1 and Instance 2]
-
6. Configure Interconnect Correctly
Use UDP over Gigabit Ethernet for Interconnect!
OS Bonding for Failover & Load Balancing
NIC bonding, multiple switches
Set UDP buffers to max
Check:
$ /sbin/sysctl net.core.rmem_max net.core.wmem_max net.core.rmem_default net.core.wmem_default
net.core.rmem_max = 262144
net.core.wmem_max = 262144
net.core.rmem_default = 262144
net.core.wmem_default = 262144
Use switch(es) - crossover cable not supported
Eliminate any Transmission Problems
LMS CPU Resources
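The UDP buffer check above can be scripted so that it runs the same way on every node. The sketch below reads `sysctl` output on standard input and reports any of the four settings that is below a given minimum; the function name is hypothetical, and the threshold passed in is the caller's choice (262144 is simply the value shown on the slide).

```shell
#!/bin/sh
# Hypothetical helper: verify the four net.core buffer settings from sysctl output.
# Usage: sysctl ... | check_udp_buffers <minimum_bytes>
# Returns non-zero if a setting is below the minimum or one of the four is missing.
check_udp_buffers() {
    awk -v min="$1" -F' *= *' '
        $1 ~ /^net\.core\.[rw]mem_(max|default)$/ {
            seen++
            if ($2 + 0 < min + 0) {
                print $1 " = " $2 " (below " min ")"
                bad++
            }
        }
        END { if (bad > 0 || seen < 4) exit 1 }
    '
}
```

A typical call would be `/sbin/sysctl net.core.rmem_max net.core.wmem_max net.core.rmem_default net.core.wmem_default | check_udp_buffers 262144`, repeated on each cluster node.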
-
Correct network used?
select * from gv$cluster_interconnects;
INST_ID NAME IP_ADDRESS IS_PUBLIC SOURCE
------- ---- ------------- --------- -------------------------
1 eth2 138.2.238.74 NO Oracle Cluster Repository
2 eth2 138.2.238.75 NO Oracle Cluster Repository
SQL> oradebug setmypid
Statement processed.
SQL> oradebug ipc
Information written to trace file.
SQL> oradebug tracefile_name
SSKGXPT 0xcc1c78c flags SSKGXPT_READPENDING info for network 0
socket no 7 IP 138.2.238.74 UDP 33594
sflags SSKGXPT_UP
-
Correct network used?
$ oifcfg getif
eth0 138.2.236.114 global public
eth2 138.2.238.74 global cluster_interconnect
$ ocrdump
$ view OCRDUMPFILE
...
[SYSTEM.css.node_numbers.node2.privatename]
ORATEXT : stnsp014-rac
SECURITY : {USER_PERMISSION : PROCR_ALL_ACCESS, GROUP_PERMISSION :
PROCR_READ, OTHER_PERMISSION : PROCR_READ, USER_NAME : root,
GROUP_NAME : root}
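The `oifcfg getif` listing shown on this slide can be filtered to confirm which interface is registered as the cluster interconnect. This is a small sketch that assumes the four-column output format shown on the slide; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list interfaces registered as cluster_interconnect.
# Usage: oifcfg getif | interconnect_ifaces
# Prints the first two columns of each matching line; returns non-zero if none match.
interconnect_ifaces() {
    awk '
        $NF == "cluster_interconnect" { print $1, $2; found = 1 }
        END { if (!found) exit 1 }
    '
}
```

A non-zero return means no interface is registered for the interconnect, which is worth investigating before trusting the network configuration.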
-
Interconnect Jumbo Frames (NIC and switch): test properly first!
$ /sbin/ifconfig eth1
eth1 Link encap:Ethernet HWaddr 00:0E:0C:09:6C:81
inet6 addr: fe80::20e:cff:fe09:6c81/64 Scope:Link
UP BROADCAST RUNNING NOARP MULTICAST MTU:1500 Metric:1
$ ping -s 8972 -M do stnsp013-rac
From stnsp014-rac (138.2.238.74) icmp_seq=0 Frag needed and DF set (mtu = 1500)
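The payload size of 8972 used in the ping test follows from a 9000-byte MTU minus the 20-byte IPv4 header (without options) and the 8-byte ICMP header. The helper below makes that arithmetic explicit; its name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: largest ICMP payload that fits in one unfragmented IPv4 packet.
# Usage: ping_payload <mtu_bytes>
ping_payload() {
    mtu=$1
    # 20 bytes IPv4 header (no options) + 8 bytes ICMP header
    echo $((mtu - 20 - 8))
}
```

On a path where jumbo frames are enabled end to end, `ping -s "$(ping_payload 9000)" -M do <host>` should succeed; the "Frag needed" reply above shows an interface still at MTU 1500.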
Full Duplex
$ /sbin/mii-tool -v eth0
eth0: 100 Mbit, full duplex, link ok
product info: vendor 00:50:43, model 2 rev 3
basic mode: 100 Mbit, full duplex
basic status: link ok
capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
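The duplex check can also be automated. The sketch below parses only the "basic mode" line in the form shown in the listing above and flags links that are slower than a requested speed or not full duplex; it does not handle other `mii-tool` output variants (for example, a link reported as autonegotiating is flagged), and the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: check speed and duplex from "mii-tool -v" output.
# Usage: /sbin/mii-tool -v <iface> | check_link <minimum_mbit>
# Prints the detected mode; returns non-zero if too slow, not full duplex, or not found.
check_link() {
    awk -v want="$1" '
        /basic mode:/ {
            found = 1
            speed = $3 + 0
            duplex = ($0 ~ /full duplex/) ? "full" : "half"
            print speed " Mbit, " duplex " duplex"
            if (speed < want + 0 || duplex != "full") bad = 1
        }
        END { if (!found || bad) exit 1 }
    '
}
```

Run against the listing above with a minimum of 1000, the check fails because the interface reports 100 Mbit.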
-
7. Use Cluster Verification Utility
[Diagram: installation steps with the matching cluvfy stage checks]
User sets up the hardware, network & storage: -post hwos
Sets up OCFS (optional): -pre cfs, -post cfs
Installs CRS: -pre crsinst, -post crsinst
Installs RAC: -pre dbinst
Configures RAC DB: -pre dbcfg
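Each stage check is run as `cluvfy stage` with a `-pre` or `-post` flag, the stage name and a node list. The sketch below only builds the command line for stages that need nothing beyond `-n` (hwos, crsinst, dbinst); the cfs and dbcfg stages take additional arguments and are left out. The function name and the node names used in the example are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the cluvfy command for a stage check.
# Usage: cvu_stage_cmd <pre|post> <hwos|crsinst|dbinst> <comma_separated_nodes>
cvu_stage_cmd() {
    when=$1; stage=$2; nodes=$3
    case "$when" in
        pre|post) ;;
        *) echo "first argument must be pre or post" >&2; return 1 ;;
    esac
    case "$stage" in
        hwos|crsinst|dbinst) ;;
        *) echo "stage not covered by this sketch: $stage" >&2; return 1 ;;
    esac
    echo "cluvfy stage -$when $stage -n $nodes -verbose"
}
```

For instance, `cvu_stage_cmd pre crsinst node1,node2` prints the check to run before installing CRS on two nodes.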
-
List of Components
$> ./cluvfy comp -list
Valid components are:
nodereach : checks reachability between nodes
nodecon : checks node connectivity
cfs : checks CFS integrity
ssa : checks shared storage accessibility
space : checks space availability
sys : checks minimum system requirements
clu : checks cluster integrity
clumgr : checks cluster manager integrity
ocr : checks OCR integrity
crs : checks CRS integrity
nodeapp : checks node applications existence
admprv : checks administrative privileges
peer : compares properties with peers
-
CVU locations
Oracle DVD
clusterware/cluvfy/runcluvfy.sh
clusterware/rpm/cvuqdisk-1.0.1-1.rpm
CRS Home
$ORA_CRS_HOME/bin/cluvfy
$ORA_CRS_HOME/cv/rpm/cvuqdisk-1.0.1-1.rpm
Oracle Home
$ORACLE_HOME/bin/cluvfy
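Because cluvfy can live in more than one home, a script may want to pick whichever installed copy exists. This is a minimal sketch that checks the CRS home first and then the Oracle home, using the two installed locations listed above; the DVD copy is not considered because its mount point is site specific. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the first executable cluvfy found.
# Checks $ORA_CRS_HOME/bin first, then $ORACLE_HOME/bin; unset variables are skipped.
find_cluvfy() {
    for candidate in "${ORA_CRS_HOME:+$ORA_CRS_HOME/bin/cluvfy}" \
                     "${ORACLE_HOME:+$ORACLE_HOME/bin/cluvfy}"; do
        if [ -n "$candidate" ] && [ -x "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}
```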
-
Deployment of cluvfy (CVU)
User installs on local node
Install only on local node. Tool deploys itself on remote nodes during execution, as required.
Issues verification command for multiple nodes
Tool copies the required bits to the remote nodes
Executes verification tasks on all nodes and generates report
-
8. Test, Test, Test
Why?
Verify that System Infrastructure meets SLO
Verify Correct Install / Configuration
Build Skills
How?
Test Plan
Separate Test Cluster
Realistic Configuration matching production
Realistic Workload
Best Insurance, but can be difficult / expensive
Functional, Performance and Destructive Testing
Include Normal and Exception Operational Procedures
-
9. Monitor Performance
Establish Performance Baseline
AWR / Statspack
ADDM
Active Session History
-
ADDM (excerpt)
FINDING 5: 7.1% impact (5377 seconds)
-------------------------------------
Read and write contention on database blocks was consuming significant
database time in the cluster.
RECOMMENDATION 1: Schema, 2.7% benefit (2074 seconds)
ACTION: Consider partitioning the INDEX "FHUS_NEW.PK_OBJECT_STORE" with
object id 101859 in a manner that will evenly distribute concurrent
DML across multiple partitions.
RELEVANT OBJECT: database object with id 101859
RATIONALE: The INSERT statement with SQL_ID "gf99f4tkwcvt9" was
significantly affected by "global cache buffer busy".
RELEVANT OBJECT: SQL statement with SQL_ID gf99f4tkwcvt9
insert into OBJECT_STORE (VERSION, TYPE_ID, CREATE_DATE, UPDATE_DATE,
STATE_ID, ENCODING_TYPE, COMPRESSION_TYPE, STORED_DATA_SIZE,
ORIGINAL_DATA_SIZE, FORWARD_ID, OBJECT_ID) values (:1, :2, :3, :4,
:5, :6, :7, :8, :9, :10, :11)
RATIONALE: The INSERT statement with SQL_ID "cfpgqrsxcfhg1" was
significantly affected by "global cache buffer busy".
RELEVANT OBJECT: SQL statement with SQL_ID cfpgqrsxcfhg1
insert into OBJECT_REFERENCE (OBJECT_ID, REFERENCE_ID) values (:1,
:2)
Identifies SQL and objects with serializing contention
@?/rdbms/admin/addmrpt.sql
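When an ADDM report contains many findings, listing them by impact helps decide where to look first. The sketch below assumes finding headers in the form shown in the excerpt ("FINDING n: x% impact ...") and reads a saved report on standard input; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list ADDM findings sorted by impact, highest first.
# Usage: addm_findings < addm_report.txt
addm_findings() {
    awk '/^ *FINDING [0-9]+:/ {
        num = $2
        sub(/:$/, "", num)          # "5:" -> "5"
        print $3 " finding " num    # e.g. "7.1% finding 5"
    }' | LC_ALL=C sort -rn
}
```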
-
Active Session History Report
@?/rdbms/admin/ashrpt.sql
-
10. Prepare Diagnostics Procedures, just in case
Capture max possible diagnostics before restarting system etc.
Tradeoff with recovery time
Oracle side
Hanganalyze
Systemstate dumps
gv$ views
Statspack / AWR
OS setup, e.g. Linux
Netdump Utility (Metalink note 226057.1)
Alt SysRq Keys Utility (Metalink note 228203.1)
Serial Console (Metalink note 228204.1)
Performance / load data
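Having the diagnostic commands scripted in advance shortens the time between noticing a hang and restarting. The sketch below prints a SQL*Plus command sequence for a cluster-wide hanganalyze and systemstate dump; the dump levels (3 and 266) are commonly used values chosen here as an assumption, not taken from the presentation, and should be agreed with Oracle Support. The function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: emit SQL*Plus commands for cluster-wide hang diagnostics.
# Levels 3 (hanganalyze) and 266 (systemstate) are assumed defaults for this sketch.
hang_diag_script() {
    cat <<'EOF'
oradebug setmypid
oradebug unlimit
oradebug -g all hanganalyze 3
oradebug -g all dump systemstate 266
exit
EOF
}
```

The output is meant to be piped into a SYSDBA session, for example `hang_diag_script | sqlplus -S "/ as sysdba"`, before the system is restarted.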
-
And Don't Forget
Synchronize Time across cluster: NTP
Use ASM for Database Storage
Use Grid Control
Cache Sequences
Use ASSM for all Tablespaces
Linux hugetlb for large memory
Etc.
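For the hugetlb item, the number of huge pages to reserve follows from the SGA size and the huge page size. This is a sizing sketch only, assuming the whole SGA should fit in huge pages and ignoring any safety margin; the function name and the sample sizes are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: huge pages needed to hold an SGA of <sga_mb> megabytes.
# Usage: hugepages_needed <sga_mb> <hugepage_size_kb>
hugepages_needed() {
    sga_kb=$(( $1 * 1024 ))
    page_kb=$2
    # Round up so that a partially used page is still counted.
    echo $(( (sga_kb + page_kb - 1) / page_kb ))
}
```

On Linux the page size is reported on the `Hugepagesize` line of `/proc/meminfo`; a 4096 MB SGA with 2048 kB pages, for example, needs `hugepages_needed 4096 2048` pages.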
-
Summary
1. Define Service Level Objectives
2. Keep It Simple
3. Get ready for the Grid
4. Use Oracle10g Release 2
5. Implement Workload Management
6. Configure the Interconnect Correctly
7. Use Cluster Verification Utility
8. Test, Test, Test
9. Monitor Performance
10. Prepare Diagnostics Procedures, just in case
-
Q & A: Questions and Answers