Proven Techniques for Maximizing Availability
Maximum Availability Architecture
Lawrence To, Shari Yamaguchi
High Availability Systems Group Systems Technologies
Oracle Corporation
Session id: 40180
Agenda
Achieving High Availability
Maximum Availability Architecture (MAA)
Solutions to Real World Questions
Real MAA Deployments
MAA in 10g
Future MAA
Q & A
Achieving High Availability
Prevent outages before they occur.
Tolerate outages, planned or unplanned, so they are transparent to the business.
Recover quickly if an outage does occur.
Causes of Downtime
Unplanned Downtime
– Computer Failures
– Data Failures
– Human Error, Corruption, Storage Failure, Site Failure
Planned Downtime
– System Changes
– Data Changes
– System Maintenance, Software Maintenance, Application Changes
High Availability is …
Maximum Availability Architecture
Best Oracle High Availability Architecture
Best Practices
– Building the configuration.
– Managing the configuration.
– Recovering from outages quickly.
– Restoring full fault tolerance.
Continual Testing
Evolves with new Oracle versions and features
Maximum Availability Architecture
What to Use:
– High Availability Blueprint for Database, Oracle Application Server, Enterprise Manager, and more.
How to Build, Manage, and Recover:
– Following configuration and operational best practices.
– Understanding outages and detailed recovery solutions.
– Restoring fault tolerance after an outage.
Unbreakable Architecture + Best Practices = Maximum Availability
Maximum Availability Architecture
[Diagram: Primary Site and Secondary Site, each with Oracle Application Server and RAC, together with a WAN Traffic Manager, a Dedicated Network, and Data Guard.]
MAA Was Created Based on…
Real world customer requests and questions:
– What issues should we consider when choosing the optimal high availability architecture?
– What is Oracle’s best high availability architecture?
– How can we manage this high availability environment?
– What are the performance trade-offs?
– How do we recover from various outages?
Examples of Issues That Have Been Addressed
What is the best solution to avoid service disruption for host and instance failures?
Which Disaster Recovery solution should we adopt?
What is the best way to configure the standby database over a network?
How do you configure Oracle Application Server for high availability?
Best Solution to Avoid Service Disruption
Real Application Clusters
Fast Failover
– Protection from local site system failures
– Faster than cold cluster failover solution
– Fast-start fault recovery (instance failure MTTR)
Availability and Accessibility
– Allows for scheduled outages
– Add and remove nodes transparently
– Transparent Application Failover (TAF) provides uninterrupted service
Best Solution to Avoid Service Disruption
Real Application Clusters
Higher Scalability
– All system resources from all nodes are leveraged
– Cache fusion eliminates need to partition data or modify the application – fully application transparent
– Connection load balancing distributes connection requests from application tier
Manageability
– Provides a single image of the database to manage
Fast Instance Recovery
Performance stays constant as recovery gets faster.
[Chart: writes/sec and tps plotted against the fast_start_mttr_target setting (disabled, 300, 180, 90).]
Which Disaster Recovery Option?
• Storage or Remote Mirroring, Geo-Clusters
– Vulnerable to human error and data failures.
– Latency.
• Streams and Replication
– Ideal for active-active configurations that may involve heterogeneous environments.
– Offers finer granularity on what gets replicated and when.
• Data Guard
– Provides comprehensive data protection, data availability, and data recovery benefits, along with an integrated management framework.
Data Guard Architecture
[Diagram: Primary Database (Transactions, LGWR, Online Redo Logs, ARCH, Archived Redo Logs) connected over Oracle Net to the Physical/Logical Standby Database (RFS, Standby Redo Logs, ARCH, Archived Redo Logs, MRP/LSP).]
Choosing: Physical or Logical Standby
Questions and Recommendations
1. Do you require strict zero data loss?
– Yes: use a physical standby database
– No: go to next question
2. Do you have any unsupported logical standby data types? Run this query:
SELECT DISTINCT OWNER, TABLE_NAME FROM DBA_LOGSTDBY_UNSUPPORTED ORDER BY OWNER, TABLE_NAME;
– Rows returned: use a physical standby or investigate switching to a supported data type
– No rows returned: go to next question
3. Do you need to have the standby database open for read and/or write access?
– Yes: evaluate logical standby database
– No: evaluate physical standby database
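The three questions above can be read as a small decision procedure. The following is a minimal sketch of that reading in Python, not an Oracle tool: the names `Standby` and `recommend_standby` and all parameter names are hypothetical, and `unsupported_rows` is assumed to be the number of rows returned by the DBA_LOGSTDBY_UNSUPPORTED query. Where the slide says "evaluate", the function only names the option to evaluate.

```python
from enum import Enum


class Standby(Enum):
    """Hypothetical labels for the two standby types named on the slide."""
    PHYSICAL = "physical standby"
    LOGICAL = "logical standby"


def recommend_standby(zero_data_loss: bool,
                      unsupported_rows: int,
                      needs_open_access: bool) -> Standby:
    """Walk the three questions in order and return the option to use or evaluate."""
    # Question 1: strict zero data loss points to a physical standby.
    if zero_data_loss:
        return Standby.PHYSICAL
    # Question 2: any unsupported data types point to a physical standby
    # (the slide's alternative is to switch to supported data types first).
    if unsupported_rows > 0:
        return Standby.PHYSICAL
    # Question 3: open read and/or write access suggests evaluating a
    # logical standby; otherwise evaluate a physical standby.
    return Standby.LOGICAL if needs_open_access else Standby.PHYSICAL


if __name__ == "__main__":
    print(recommend_standby(False, 0, True).value)
```

The order of the checks mirrors the order of the questions, so an earlier "yes" short-circuits the later ones.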
Configuring Standby Over the Network
Performance Case Examples
– Primary database in Tokyo and standby database in Kyoto (229 miles and 7ms RTT) in Maximum Protection mode ensured no data loss even in the face of a disaster, with minimal performance impact (2-3%).
– Primary database in San Francisco and standby database in New York (2582 miles and 78ms RTT) in Maximum Performance mode had only seconds of data loss, with minimal performance impact (1%).
Best Practices are Key
– Assess bandwidth and latency
– Pick the appropriate transport mechanism and protection mode: ARCH, LGWR SYNC or LGWR ASYNC
– Set TCP Socket Buffer Sizes = Bandwidth x Round Trip Latency
– Set SDU = 32K
– Evaluate SSH port forwarding with compression
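The socket buffer rule above is a bandwidth-delay product. The sketch below works it through in Python using the two round trip times from the case examples; the 100 Mbit/s link speed is an assumption made only for illustration (the slides do not state the bandwidth), and the function name is hypothetical.

```python
def socket_buffer_bytes(bandwidth_bits_per_sec: int, rtt_ms: int) -> int:
    """Return bandwidth x round trip latency, expressed in bytes.

    Integer arithmetic is used throughout: bits/sec * ms / 1000 gives
    bits in flight, and dividing by 8 converts bits to bytes.
    """
    return bandwidth_bits_per_sec * rtt_ms // 8000


# Assumed link speed, NOT taken from the slides.
ASSUMED_BANDWIDTH = 100_000_000  # 100 Mbit/s

if __name__ == "__main__":
    # RTT values come from the Tokyo-Kyoto and San Francisco-New York examples.
    for label, rtt_ms in (("Tokyo-Kyoto", 7), ("San Francisco-New York", 78)):
        size = socket_buffer_bytes(ASSUMED_BANDWIDTH, rtt_ms)
        print(f"{label}: {size} bytes")
```

Under that assumed bandwidth, the 7ms link needs 87,500 bytes and the 78ms link needs 975,000 bytes, which shows why the longer-distance configuration calls for much larger buffers.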
Fast Redo Apply
Redo apply outperforms high production redo rates.
[Chart: Production Redo Rate and Standby Redo Apply Rate in MB/sec, by transaction profile (High OLTP, Batch Load).]
Fast SQL Apply
SQL Apply can manage high transaction rates.
[Chart: TPS by consistency model (Full, Read Only, None).]
Oracle Application Server 10g High Availability
Middle Tier
– Oracle Application Server OC4J and Web Cache clustering
– Redundant mid-tier servers front-ended by a load balancer
Infrastructure
– Active Clusters, which incorporate Real Application Clusters
– Cold Failover Clusters
Oracle Application Server 10g HA Middle Tier
[Diagram: Clients, Load Balancer, Application Server Tier (Web Cache, OC4J Clusters), Database Tier.]
Oracle Application Server 10g Active Clusters Infrastructure
MAA in 10g
Continuing to Test and Validate Oracle Database and Application Server 10g
– Flashback capabilities, RAC, Data Guard with Real Time Apply
– Rolling upgrades and scheduled maintenance enhancements
– Incorporating best practices into the core 10g products
– Best practices formalized into Oracle Database and Application Server 10g documentation
– MAA White Paper updates
Future MAA
Incorporating E-Business Suite
Incorporating Collaboration Suite
Continuing to work with:
– Internal Deployments
– Outsourcing Deployments
– Consultants
– Partners
– External Customers
MAA Test Lab
[Diagram: the MAA configuration (Primary Site and Secondary Site, each with Oracle Application Server and RAC, together with a WAN Traffic Manager, a Dedicated Network, and Data Guard), labeled with F5 Networks, EMC, Hewlett-Packard, Sun Microsystems, and Shunra.]
MAA Information Sources
Oracle Technology Network
– http://otn.oracle.com/deploy/availability/htdocs/maa.htm
Maximum Availability Architecture
Oracle9i Media Recovery Best Practices
Oracle9i Data Guard: SQL Apply Best Practices
Oracle9i Data Guard Role Management Best Practices
Oracle9i Data Guard Primary Site and Network Configuration Best Practices
Oracle9iAS Cluster configuration
Oracle Consulting – Advanced Technologies Solutions (ATS) Group
– http://otn.oracle.com/consulting/9iServices
Next Steps
High Availability Sessions from Oracle
Tuesday in Moscone Room 304
11:00 AM – How Oracle Database 10g Revolutionizes Availability and Enables the Grid
3:30 PM – Oracle Recovery Manager (RMAN) 10g: Reloaded
5:00 PM – Proven Techniques for Maximizing Availability
Wednesday in Moscone Room 304
8:30 AM – Oracle Database 10g - RMAN and ATA Storage in Action
11:00 AM – Oracle Data Guard: Maximum Data Protection at Minimum Cost
1:00 PM – Oracle Database 10g Time Navigation: Human-Error Correction
4:30 PM – Data Guard SQL Apply: Back to the Future
For More Info On Oracle HA Go To http://otn.oracle.com/deploy/availability/
Next Steps
High Availability Sessions from Oracle
Thursday
8:30 AM in Moscone Room 304 – Oracle Database 10g Data Warehouse Backup and Recovery: Automatic, Simple, Reliable
8:30 AM in Moscone Room 104 – Building RAC Clusters over InfiniBand
Database HA Demos All Four Days in the Oracle Demo Campground
Real Application Clusters
Data Guard
Database Backup & Recovery
Flashback Recovery
LogMiner, Online Redefinition, and Cross Platform Transportable Tablespaces
Reminder – please complete the OracleWorld online session survey
Thank you.
Q & A
Questions and Answers