Deploying Oracle RAC Best Practices

Upload: myron

Post on 22-Oct-2015

© 2007-2008 Oracle Corporation 1

Deploying Oracle Real Application Clusters: Best Practices (S298766)
Room S104, Wed 5 PM
• Saar Maoz & Philip Newlan
• RACPACK – Server Technologies

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Agenda
• Installation Overview
• Cluster Verification Utility
• Cluster Interconnect
• Clusterware
• Storage / MPIO
• ASM / Shared filesystem
• Patching
• Monitoring / Tuning
• Tips & Troubleshooting

Oracle RAC Architecture
[Diagram: Node 1 through Node n are attached to the public network and linked to each other by a private network. Each node runs the operating system, Oracle Clusterware, a database instance, ASM, a VIP, a listener, and services. Shared storage holds the redo / archive logs of all instances and the database / control files (managed by ASM), plus the OCR and voting disks (RAW/block devices).]

Oracle RAC Installation Overview
• Validate and prepare hardware & OS
  • Check RAC Technology Matrix on OTN (Unix, Linux, Windows) and Certify on Metalink
  • Linux only:
    • Consult Oracle Validated Configurations on OTN
    • Use oracle-validated rpm to set/install kernel rpms/parameters (Note: 728346.1)
• Determine cluster interconnect
• Determine shared storage methodology
• Install and configure the Oracle Clusterware software
• Install the Oracle Database RAC software
  • Can install ASM and create a database automatically

Cluster Verification Utility (CVU, cluvfy)
• Allows customers to verify the cluster during various stages of its deployment: hardware setup, Clusterware install, database install, storage, etc.
• Extensible framework
• Command line only
  $ ./cluvfy comp peer -n node1,node2 | more
• Does not take any corrective action following the failure of a verification task
• Non-intrusive verification

Deployment of cluvfy
• Install only on the local node. The tool deploys itself on remote nodes during execution, as required.
[Diagram: CVU workflow]
• User installs on local node
• Issues verification command for multiple nodes
• Tool copies the required bits to the remote nodes
• Executes verification tasks on all nodes and generates report

cluvfy Stage List
• Valid stage options and stage names are:
  $ ./cluvfy stage -list
  -post hwos    : post-check for hardware & operating system
  -pre cfs      : pre-check for CFS setup
  -post cfs     : post-check for CFS setup
  -pre crsinst  : pre-check for Clusterware installation
  -post crsinst : post-check for Clusterware installation
  -pre dbinst   : pre-check for database installation
  -pre dbcfg    : pre-check for database configuration

cluvfy Stage List - Graphical
[Diagram: stage checks placed along the deployment timeline]
• User sets up the hardware, network & storage: -post hwos
• Sets up OCFS (optional): -pre cfs, -post cfs
• Installs Oracle Clusterware: -pre crsinst, -post crsinst
• Installs RAC: -pre dbinst
• Configures RAC DB: -pre dbcfg

cluvfy Component List
• Valid components are:
  $ ./cluvfy comp -list
  nodereach : checks reachability between nodes
  nodecon   : checks node connectivity
  cfs       : checks CFS integrity
  ssa       : checks shared storage accessibility
  space     : checks space availability
  sys       : checks minimum system requirements
  clu       : checks cluster integrity
  clumgr    : checks cluster manager integrity
  ocr       : checks OCR integrity
  crs       : checks CRS integrity
  nodeapp   : checks node applications existence
  admprv    : checks administrative privileges
  peer      : compares properties with peers

cluvfy Stage or Component
• Use stage checks during installation of Oracle Clusterware and RAC.
  • Use the appropriate -pre and -post check for the stages, e.g.:
    $ ./cluvfy stage -pre crsinst -n node1,node2 -verbose
• To verify a particular component while the stack is running, or to isolate a cluster subsystem for diagnosis, use the appropriate component checks.
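
As a rough sketch of how the stage checks might be scripted, the helper below builds the cluvfy command line for a given phase, stage, and node list. The function name cvu_stage_cmd and the CLUVFY variable are illustrative assumptions and not part of CVU; only the "cluvfy stage -pre|-post <stage> -n <nodes> -verbose" form is taken from the slides.

```shell
#!/bin/bash
# Hypothetical helper (not part of CVU): print the cluvfy stage command
# for a phase (pre|post), a stage name, and a comma-separated node list.
CLUVFY="${CLUVFY:-./cluvfy}"   # assumed location of the cluvfy binary

cvu_stage_cmd() {
    local phase="$1" stage="$2" nodes="$3"
    case "$phase" in
        pre|post) ;;
        *) echo "phase must be pre or post" >&2; return 1 ;;
    esac
    echo "$CLUVFY stage -$phase $stage -n $nodes -verbose"
}

# Example: print the command for review, then run it and keep the report.
#   cvu_stage_cmd pre crsinst node1,node2
#   $(cvu_stage_cmd pre crsinst node1,node2) | tee cvu_pre_crsinst.log
```

Printing the command before running it keeps the check non-intrusive and makes it easy to record which stage was verified at which point of the install.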

CVU locations
• Pre-installation:
  • Cluster Verification Utility on OTN:
    • http://www.oracle.com/technology/products/database/clustering/cvu/cvu_download_homepage.html
  • Oracle DVD
    • clusterware/cluvfy/runcluvfy.sh
• Clusterware Home
  • <crs_home>/bin/cluvfy
• Oracle Home
  • $ORACLE_HOME/bin/cluvfy

Cluster Interconnect Best Practices
• Most cases: use UDP over 1 Gigabit Ethernet
  • Windows uses TCP (see Note: 278132.1)
  • Heavier workloads may benefit from InfiniBand/IP or 10 Gigabit Ethernet
• Use OS bonding/teaming to “virtualize” the interconnect
  • Failover; load balancing; improved bandwidth
  • Linux: Note: 434375.1, Solaris: Note: 368464.1
• Set UDP send/receive buffers high enough
  • Platform dependent; typically 256K is adequate
  • Linux: net.core.rmem_max, net.core.wmem_max, net.core.rmem_default, net.core.wmem_default
• Use a private, dedicated, non-routable switch or VLAN
  • Crossover cables are not supported
• Eliminate any transmission problems
  • Packet errors/drops can manifest as more serious outages
  • Do not use any firewall/iptables on the interconnect (Note: 554781.1)
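
The buffer recommendation above could be checked on Linux with something like the following sketch. It reads the four net.core values named on the slide and compares them with a minimum; the 262144 default is simply the slide's "256K" figure, and the function name and MIN_BYTES variable are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: compare Linux socket buffer limits against a minimum.
# 262144 (256K) is the "typically adequate" figure from the slide; the
# right value is platform dependent, so treat MIN_BYTES as an assumption.
MIN_BYTES="${MIN_BYTES:-262144}"

# check_udp_buffers [dir]: dir defaults to /proc/sys/net/core.
# Prints one line per parameter; returns non-zero if any value is
# below MIN_BYTES or cannot be read.
check_udp_buffers() {
    local dir="${1:-/proc/sys/net/core}" rc=0 name value
    for name in rmem_max wmem_max rmem_default wmem_default; do
        if [ ! -r "$dir/$name" ]; then
            echo "net.core.$name: cannot read $dir/$name"
            rc=1
            continue
        fi
        value=$(cat "$dir/$name")
        if [ "$value" -lt "$MIN_BYTES" ]; then
            echo "net.core.$name=$value is below $MIN_BYTES"
            rc=1
        else
            echo "net.core.$name=$value ok"
        fi
    done
    return "$rc"
}

# Example: check_udp_buffers || echo "review the net.core settings"
```

The directory argument exists only so the check can be pointed at a copy of the values; on a live node it would normally be called with no arguments.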

Cluster Interconnect (Cont’d)
• Best practice: Clusterware and database (GCS/DLM/PQ) communications on the same underlying transport (NIC)
• These could be split if the need arises
  • Might be needed with many databases on the same cluster
  • Clusterware uses: olsnodes -p
  • DB uses: oifcfg getif
    • select * from v$cluster_interconnects;
  • DB may override Clusterware via the cluster_interconnects init.ora parameter on a per-database basis
• Private NICs and public NICs should keep the same name/order on all cluster members

Clusterware Misscount
• Oracle Clusterware has two heartbeats
  • Network: misscount, defaults to 30 sec (Linux 10g: 60 sec)
  • Disk (IOT): function of misscount, varies by release
• Problematic approach to tie the two together (not granular enough)
• 11g, 10.2.0.2+ and 10.1.0.4+ decouple the above timeouts (disk/network)
  • Introduce the css disktimeout parameter; defaults to 200 seconds
  • Reconfiguration/reboot only if misscount is exceeded for the network or disktimeout is exceeded for the voting disk
  • Prior releases get patch 4896338
• Note: 294430.1: Misscount definitions
• Note: 284752.1: Change misscount/reboottime/disktimeout
• Best practice: Do NOT change misscount or disktimeout unless on the recommendation of Support
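
To make the decoupled timeouts concrete, here is a toy model of the rule stated above: a reconfiguration is triggered only when the network heartbeat gap exceeds misscount or the voting disk heartbeat gap exceeds disktimeout. This is a simplified illustration and not Clusterware's actual logic; the function name is hypothetical and the defaults are the slide's figures.

```shell
#!/bin/bash
# Toy model of the decoupled timeouts (not Clusterware code).
# Defaults are the slide's figures: misscount 30 s, disktimeout 200 s.
MISSCOUNT="${MISSCOUNT:-30}"
DISKTIMEOUT="${DISKTIMEOUT:-200}"

# needs_reconfig <secs_since_network_heartbeat> <secs_since_disk_heartbeat>
# Succeeds (exit 0) when either timeout has been exceeded.
needs_reconfig() {
    local net_gap="$1" disk_gap="$2"
    [ "$net_gap" -gt "$MISSCOUNT" ] || [ "$disk_gap" -gt "$DISKTIMEOUT" ]
}

# Examples:
#   needs_reconfig 10 120   -> false: a 120 s I/O stall alone is tolerated
#   needs_reconfig 45 0     -> true: network heartbeat missed for > 30 s
```

Under the older coupled scheme a long I/O stall could count against a limit derived from misscount; with separate limits the second example still triggers a reconfiguration but the first does not.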

VIP IP in RAC
• Used to mitigate TCP/IP timeout delays on client connections
• When configuring (VIPCA), choose only the public interfaces
  • Watch out: the default subnet mask is 255.255.255.0; correct it if needed
  • 10gR2 on SLES10 / RHEL5 / OEL5: VIPCA fails during root.sh; see Note: 414163.1
• The VIP must be a DNS-known IP address
  • Clients connect to the VIP address from the tnsnames connect description
  • Listeners listen on the VIP for client connections
• Use ifconfig (on most platforms) to verify the VIP interface is configured after Clusterware is running
  • The IP address on the new VIP interface (e.g. bond0:1) should respond to pings
• The VIP is stored within the OCR
  • To modify the VIP IP, see Note: 276434.1
• Bond NICs for VIP (IP multipathing: IPMP)
  • Linux: Note: 298891.1; Solaris: Note: 283107.1

Mirror OCR/Voting disk
• Oracle Cluster Registry (OCR), and split-brain resolution mechanism (Voting Disk)
• Storage options: block, RAW, CFS or certified NFS
• 10gR2 & 11g: Oracle mirroring is recommended at install time
  • Best practice: configure 3 voting disks and 2 OCR devices
  • Post install:
    # crsctl add css votedisk path
    # ocrconfig -replace ocrmirror destination file/disk
• 10gR1: limited to hardware RAID and OS LVM
• Split-brain logic requires a majority of disks for a sub-cluster to continue
  • Stretched clusters may place a voting disk at a 3rd location over NFS (Linux, AIX, Solaris, HP-UX)
• Auto OCR backups: # ocrconfig -showbackup
  • New in 11g: # ocrconfig -manualbackup
• Relocate OCR/Voting: Note: 428681.1
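
The majority requirement explains why three voting disks are recommended rather than two. A minimal sketch of the arithmetic, with hypothetical function names, might look like this:

```shell
#!/bin/bash
# Sketch of the majority rule for voting disks (illustrative only).

# votedisk_majority <total>: smallest count that is a strict majority.
votedisk_majority() {
    echo $(( $1 / 2 + 1 ))
}

# can_continue <total> <accessible>: succeeds if the sub-cluster still
# sees a strict majority of the configured voting disks.
can_continue() {
    [ "$2" -ge "$(votedisk_majority "$1")" ]
}

# With 3 voting disks one can be lost; with 2, losing either one is fatal.
#   can_continue 3 2   -> true
#   can_continue 2 1   -> false
```

An even number of disks adds cost without adding tolerance: two disks need both, and four tolerate only one loss, the same as three.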

Determine Storage Methodology
• ASM (Automatic Storage Management) - RECOMMENDED
• Clustered filesystems:
  • Oracle Cluster Filesystem (OCFS1 for 2.4 kernels; OCFS2 for 2.6 kernels) [FREE]
  • Certified 3rd party clustered filesystems
• NFS: certified NFS server
• iSCSI provides block devices (for ASM, OCFS2, etc.)
• Check RAC Technology Matrix on OTN (Unix, Linux, Windows) and Certify on Metalink
• Raw devices
  • Avoid! Limited to 255; needed only for OCR/Voting in 10g, not needed in 11g, use block devices instead
  • Raw devices will be deprecated in the next major release

IO Multipathing
• Device driver automatically or manually combines multiple paths to the same device
  • Two HBAs (Host Bus Adapters) become one virtual HBA
  • Failover, bandwidth aggregation, path rediscovery
• On Linux (open source, FREE)
  • 2.6 kernels: Device Mapper (DM) (decent)
    • Fixes all the shortcomings of MD and then some
  • 2.4 kernels: Multipath Device (MD) (mdadm): AVOID
    • Long timeout (90 seconds) for failover to kick in
    • Manual configuration of paths, no path rediscovery
    • Use 3rd party instead
• Third-party (HP, EMC, IBM, Sun, HDS, Veritas, Qlogic) multipathing on Linux
  • No Unbreakable support from Oracle

3rd Party IO Multipathing
• HP – Secure Path, Auto Path XP
  • Only HP storage
• IBM (varies by storage/OS)
  • MPIO Driver (Multi-Path Input Output): AIX
  • SDD (Subsystem Device Driver): AIX, Linux, HP-UX, Solaris, Windows
  • RDAC (Redundant Disk Array Controller): AIX, Linux, Windows
• Sun – StorEdge Traffic Manager (Sun storage only)
• Microsoft – MPIO software dev kit (not AIX’s MPIO)
• EMC – PowerPath
  • Compatible with many storage arrays
• Qlogic – must use Qlogic HBAs
• Symantec/Veritas – Dynamic Multipathing (DMP, VxVM)
  • Must use Veritas CLVM, create logical volumes for ASM to use

When NOT to use Multipathing
• MP is transparent to ASM; avoid these cases:
  • When MP requires root access to the MP device
  • When MP requires a non-cluster-aware LVM in the path
  • When the 3rd party vendor does not certify for the OS

ASM Recommendations
• Install ASM in a separate Oracle Home
  • NEW in 11g: ASM rolling upgrades are possible, 11g onwards
• Set INIT.ORA on ASM and DB as per recommendations
• Remove ASM dependency on VIP
  • If the VIP fails, the ASM instance remains operational
  • Fixed in 11g and the 10.2.0.3 patchset; download fix for 10.2.0.2
• If mirroring is done in the storage array, set REDUNDANCY=EXTERNAL for the diskgroup
• On Linux use ASMLib (Migrate: Note: 394955.1)
  • Protects against device name changes across reboots without compromising security (Note: 394959.1)
  • Fewer kernel resources, no configs to modify as disks are added
  • Global open/close for ASM devices

Shared Oracle Home
• Shared Oracle Home requires a shared filesystem
  • OCFS2, certified NFS device, etc.
• Only one copy of the software to maintain & faster installation, but with the following drawbacks:
  • Cannot perform rolling upgrades of patches/patchsets
  • Binaries have local dependencies
    • Requires cross-node OS compatibility
  • Single point of failure
• Avoid using a Shared Home
  • Especially for the Oracle Clusterware Home
  • It is required when implementing SAP on RAC
• Oracle Homes in RAC Whitepaper on OTN

Summary: Install Oracle RAC 10g / 11g
• Linux: Consult Oracle Validated Configurations on OTN
• OS at latest revision + set kernel parameters correctly
• Run Cluster Verification Utility (CVU) at various stages
• Install and configure the Oracle Clusterware software
• Install the Oracle RDBMS RAC software
  • Can install ASM and create a database automatically

Patching/SW Maintenance
• Stay current with:
  • CPUs (Critical Patch Updates; apply only to RDBMS & ASM, not Clusterware)
  • RDBMS patchsets; Clusterware bundle patches
• Use latest OPatch; download from Metalink
  • New OPatch placeholder for all platforms/versions: bug 6880880
  • Old: 10.2 placeholder bug 4898608; 10.1 placeholder bug 2617419
• Review Support/Metalink for known issues & patches (e.g. for 10.2.0.3 see Notes: 391116.1, 401435.1)
• Read individual patch readmes carefully
  • Not all patches install exactly the same way
• Confirm patch successfully applied to all nodes
  $ opatch lsinventory -detail -oh <home location>
• Patch first in test/QA environment
• NEW in OPatch 10.2.0.3: apply/remove N patches at once
• NEW in 11g: Online Patching; some patches can be applied to running code
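
Confirming the patch on every node could be scripted along these lines. This is a sketch under stated assumptions: SSH user equivalence is already set up, opatch lives under <home>/OPatch, and the function name, node names, and home path are placeholders. Only the "opatch lsinventory -detail -oh" invocation comes from the slide.

```shell
#!/bin/bash
# Sketch: list the patch inventory of one Oracle home on every node.
# Assumes SSH user equivalence between nodes and that opatch lives under
# <home>/OPatch; node names and the home path below are placeholders.

# lsinventory_all <oracle_home> <node>...
# Returns non-zero if the command fails on any node.
lsinventory_all() {
    local home="$1" node rc=0
    shift
    for node in "$@"; do
        echo "== $node =="
        ssh "$node" "$home/OPatch/opatch lsinventory -detail -oh $home" || rc=1
    done
    return "$rc"
}

# Example (hypothetical names):
#   lsinventory_all /u01/app/oracle/product/10.2.0/db_1 node1 node2
```

Comparing the per-node sections of the output side by side is a quick way to spot a node where the patch did not apply.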

Patchsets (10.1.0.4, 10.2.0.4, etc.)
• Consist of two portions (Clusterware & RDBMS/ASM)
• Install using Oracle Universal Installer (OUI)
• Latest patchset (10.2.0.4) is always advised
• Oracle Clusterware must be the same version as, or newer than, any RDBMS or ASM installed
• Oracle Clusterware portion can always be installed in a rolling upgrade fashion; HOWTO: Note: 338706.1
• New in 11g: ASM can be upgraded as a rolling upgrade, 11g onwards
• RDBMS portion can only be installed in a rolling upgrade fashion if a logical standby exists
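
The version rule above (Clusterware at the same or a newer version than every RDBMS or ASM home) can be sanity-checked before patching with a small comparison helper. The sketch below relies on GNU sort -V for version ordering; the function names and the version strings in the example are illustrative assumptions.

```shell
#!/bin/bash
# Sketch: check the rule "Clusterware version >= every RDBMS/ASM version".
# Relies on GNU sort -V (version sort); version strings are examples.

# version_ge <a> <b>: succeeds if dotted version a >= b.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# clusterware_ok <crs_version> <home_version>...
# Fails on the first RDBMS/ASM home that is newer than Clusterware.
clusterware_ok() {
    local crs="$1" v
    shift
    for v in "$@"; do
        version_ge "$crs" "$v" || {
            echo "Clusterware $crs is older than $v"
            return 1
        }
    done
}

# Example: clusterware_ok 10.2.0.4 10.2.0.4 10.1.0.5   # succeeds
```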

Patching Mixed Oracle Homes
• A Mixed Oracle Home is when Clusterware and RDBMS/ASM versions are not identical
  • Fully supported; Clusterware always the higher version
• Patching is slightly different, as follows
  • Clusterware patches consist of two portions
    • One applied to Clusterware Home
    • Second applied to ASM or RDBMS Home
  • ASM & RDBMS treated equally for this purpose
  • Attempt to install a 10.2 patch in a 10.1 RDBMS will fail
    • Patches must always be applied to the exact version

Patching Mixed Oracle Homes (Cont’d)
Refer to Metalink Note 363254.1 for full details
• You may skip the RDBMS portion of the fix
  • Bug may still be visible on that RDBMS Home
• Or: request a one-off for the needed older RDBMS version
• Remember:
  • Never force a patch to be installed into an incorrect version home
  • A single one-off (patch) zip will always contain exact versions for both portions of the patch (Clusterware, RDBMS)

Tuning Philosophy
• Philosophies differ
  • Tuning for new or existing database
  • Tend to start with things we know
  • Perception of a problem may sway your philosophy
• Here’s mine...
  • Go for the best bang for your buck
    • Translation: Go after the big things first

Monitoring: General
• Vital to have a good baseline to compare with
• Correlate I/O timing reported by Oracle to I/O timing reported by OS utilities & hardware
  • For example: if the database says I/O takes 60ms but the hardware says 10ms, investigate why
• OS and database statistics should be collected over the same time periods to have a meaningful comparison
• Run OSWatcher and statspack (or AWR with Diagnostic Pack license) continuously

Monitoring Tools: Linux Specific
• Overall tools
  sar, vmstat
• CPU
  /proc/cpuinfo, mpstat, top
• Memory
  /proc/meminfo, /proc/slabinfo
• Disk I/O
  iostat, sar
• Network
  iptraf, netstat, ethtool
• Individual process debugging
  strace, ltrace, lsof

Monitoring Tools: RAC
• Oracle Enterprise Manager (recommended)
  • DB Control or Grid Control
  • With a Diagnostics Pack license, provides Automatic Database Diagnostic Monitor (ADDM) and Automatic Workload Repository (AWR)
    • Comprehensive & concrete tuning recommendations
• Statspack (Note: 94224.1), or AWR if you own a Diagnostics Pack license
  • Manual snapshot/reporting, similar to AWR reports
  • No recommendations; user must draw conclusions based on the report
• OS Watcher (Note: 301137.1)
  • Continuous collection of OS metrics automatically
• LTOM & LTOMg (Notes: 352363.1 & 461050.1)
  • Real-time system profiler & diagnostic tool with graphical front end

RAC Performance Recommendations
• Good SQL
• Reduce hot spots
  • Same as you would for a single instance
  • Set sequence cache to 1000 or more
• Scalable I/O sub-system
  • Implement multipathing
• Confirm interconnect is actually being used
  • Jumbo Frames help in most cases
• Use Automatic Segment Space Management
  • “SEGMENT SPACE MANAGEMENT AUTO” in create tablespace
  • Remove PCTUSED, FREELISTS & FREELIST GROUPS
• Allocate SGA memory from non-swappable memory
  • Hugepages (Note: 361323.1)
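
For the Hugepages item, a rough sizing sketch is shown below: it divides the SGA size by the huge page size and rounds up. This is only a back-of-the-envelope estimate that ignores other shared memory users; the function name is hypothetical, and the referenced note remains the authoritative procedure.

```shell
#!/bin/bash
# Rough sizing sketch: huge pages needed to hold an SGA of a given size.
# Ignores other shared memory users; see Note: 361323.1 for the
# supported procedure.

# hugepages_needed <sga_mb> [hugepage_kb]
# hugepage_kb defaults to Hugepagesize from /proc/meminfo, else 2048.
hugepages_needed() {
    local sga_mb="$1" hp_kb="$2"
    if [ -z "$hp_kb" ]; then
        hp_kb=$(awk '/^Hugepagesize:/ {print $2}' /proc/meminfo 2>/dev/null)
        hp_kb="${hp_kb:-2048}"
    fi
    # Ceiling division: (SGA in KB + page size - 1) / page size
    echo $(( (sga_mb * 1024 + hp_kb - 1) / hp_kb ))
}

# Example: hugepages_needed 4096 2048   # prints 2048
```

On Linux the resulting count is the kind of value that would be set through the vm.nr_hugepages kernel parameter, usually with some headroom added.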

Tips & Troubleshooting 1
• How to convert single instance to RAC?
  • DBCA, EM, rconfig (Note: 387046.1)
• Follow install requirements (e.g. Note: 438766.1)
  • For example: SELinux should be set to Permissive (or Disabled) mode on EL5, Note: 457458.1
• Oracle RAC 11g (all Linux/Unix ports) & Linux 10.2.0.4 and above use oprocd to detect hangs
  • Linux: hangcheck-timer still picks up lower-level (device driver) hangs
• Avoid false reboots
  • Set diagwait to 13 (all platforms), Note: 559365.1
  • Overcome OS scheduling latencies
• Linux: make sure glibc is updated, Note: 731599.1

Tips & Troubleshooting 2
• Correctly mount clustered filesystems
  • OCFS2: “datavolume” for database mount points (Note: 428356.1)
  • NFS: correct NFS mount options (Note: 359515.1)
• Clusterware needs storage and network UP
  • Verify host startup sequence of network & I/O drivers, iSCSI
• Use NTP (Network Time Protocol)
  • Easier debugging/diagnostics as time is in sync on all nodes
  • Some issues may exist for Clusterware & DBMS_SCHEDULER if time drifts wildly
    • Jobs get scheduled incorrectly
    • May reboot nodes as misscount calculations will be incorrect
  • Use -x (or equivalent) to prevent time from moving backwards

Tips & Troubleshooting 3
• Ensure IO storage scalability for multiple nodes early on
  • As nodes are added, more storage bandwidth should be added
  • ORION: Oracle tool on OTN (Linux, Windows, Solaris, AIX)
  • IOzone: freeware on the Internet (cross platform)
• If OS stack size is set too high (e.g. 200MB), Oracle Clusterware fails to start
  • Each thread consumes stack-size (200MB!!)
  • Leave at port-specific defaults

Tips & Troubleshooting 4
• On Windows: disable Media Sense (Note: 243549.1)
• Increase SYS.AUDSES$ sequence cache (Note: 395314.1):
  alter sequence sys.audses$ cache 10000;
  • Affects 9i up to and including 10.2.0.2
• Clusterware relies on OS authentication; if using LDAP, ensure it’s at High Availability standards or decouple the RAC nodes from LDAP

Tips & Troubleshooting 5
• Linux: Oracle Validated RPM now available for non-ULN customers
  • Can help achieve a reduced package installation, Note: 728346.1
• Set up SSH equivalency using runSSHSetup.sh in the install directory of the Clusterware CD/DVD
• Use pdsh (Parallel Distributed Shell) to run commands on all nodes
• Want silent OUI installs?
  • Use the -record flag to generate a response file
• Installing on a cluster with many nodes?
  • Use a cluster configuration file (text file with node names)
  • See Note: 336912.1

Tips & Troubleshooting 6
• Collect RAC traces/diagnostics
  • Remote Diagnostic Agent (RDA) 4.2 or above: Note: 359395.1
  • RAC Diagnostic Data Tool (RAC-DDT): Note: 360926.1
  • <CRS_home>/bin/diagcollection.pl
  • Procwatcher Note: 459694.1 (heavy-duty debugging/tracing)
    • Only when working on a bug with help from Support
• Cluster Deconfig/Deinstall tool on OTN (10g: Linux x86)
  • Helps deinstall RAC 10g software for a clean reinstallation

Questions & Answers

More RAC Sessions Thursday!
• 12:00 PM South 306 : S298787 Migrating to Oracle Real Application Clusters: From POC to Production.
• 1:30 PM South 306 : S298771 Increase Your Organization’s Efficiency with a Dynamic Shared Infrastructure Grid
• 3:00 PM South 306 : S299069 Oracle Real Application Clusters and Qlogic InfiniBand: Yahoo! Large-Scale Data Warehouse
