Copyright © 2013, Oracle and/or its affiliates. All rights reserved. 1
Oracle Exadata Management Deep Dive with Oracle Enterprise Manager 12c
Kurt Engeleiter
Principal Product Manager, Oracle
Safe Harbor
The following is intended to outline our general product direction. It
is intended for information purposes only, and may not be
incorporated into any contract. It is not a commitment to deliver
any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and
timing of any features or functionality described for Oracle’s
products remains at the sole discretion of Oracle.
Program Agenda
Exadata Component Monitoring
Exadata Configuration Monitoring
Common Performance Issues
Exadata Monitoring
Database
Storage Server
Infiniband Network
KVM, PDU, ILOM, CISCO SWITCH
Monitoring Architecture
OEM Agents with the Exadata Plug-in are
deployed on each Compute Node.
Storage Servers are monitored internally by
ILOM and MS (Management Server).
The Agent uses SSH and SNMP to monitor the
Storage Servers.
The Agent uses SSH to collect monitoring
information from the IB switches.
The Agent subscribes to SNMP traps to
monitor the other DBM components, such
as ILOM, PDU, and KVM.
How is the Exadata Database Machine monitored?
[Diagram: Oracle Database Machine. On Compute Node #1 (database server), the Oracle Enterprise Manager 12c Agent with the Exadata Plug-in reports to the OMS and uses ssh and SNMP to reach the Exadata Storage Servers, the Exadata Infiniband switches and network, and the other DBM devices (PDU, ILOM, Cisco switch, KVM).]
Exadata Monitoring
Hardware view
– Schematic of cells, compute nodes and switches
– Hardware components alerts
– Integrated resource utilization views
Exadata Plug-in release 12.1.0.4
adds support for
– SPARC Supercluster
– Multi-Rack
– Storage Expansion Rack
Configuration view
– Version summary and configuration information
of all components
Integrated view of Hardware and Software
Exadata Monitoring
Storage Cell monitoring and
administration support
- Cell Home page and performance
pages
- Execute CellCLI commands on a set
of cells or on all cells
- Performance and workload
distribution charts help analyze
performance contention
Management by Cell Group
- All cells used by a database are
automatically placed in a group, e.g.
a cellsys target
Storage Cell Management
Exadata Monitoring
Infiniband network and switches are
discovered as part of the Database
Machine target
Network home page and
performance page
- Real time and historical usage
information
Topology view of Network with
switch and port level details
Infiniband Network Management
Exadata Monitoring
Common metrics monitored
Power supply failure
Fan failure
Temperature out of range
Specific metrics monitored
Cisco Switch
– Configuration change tracking and
reporting
– Unauthorized SNMP access
Keyboard, Video, Mouse (KVM) for X2
– Server connected to KVM added/removed,
powered on/off
Monitoring other hardware components
Exadata Monitoring
Service Dashboard gives a single
pane of glass view of all
Exadata components
Out-of-the-box job creates the
dashboard to monitor
performance and usage
metrics
– Database Machine System
– Database Machine components
– Database Systems on Exadata
Exadata Service Dashboard
Program Agenda
Exadata Component Monitoring
Exadata Configuration Monitoring
Common Performance Issues
Exadata Configuration Monitoring
Audits configuration settings of the
following categories
- Software
- Hardware
- Configuration Best Practices
Based on the exachk utility (My Oracle Support note 1070954.1)
Pre-Installed in Exadata
deployments
Exachk check categories
– Other Best Practices checks: RAC, Exadata, MAA
– Configuration Best Practices: Operating System, Clusterware, ASM, Infiniband
– Hardware: Database Server, Infiniband, Exadata Storage Servers
– Software Checks: Firmware, Operating System, Clusterware, ASM, Database
Exadata Configuration Monitoring
Exadata health check plug-in
consumes the exachk output
Evaluates the output against
predefined health check templates
Generates relevant alerts
Health Check Plug-in
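[Diagram: Exachk (2.1.3 and above) is executed and its execution output is written as XML files. The Exadata Health Checks Plug-in in the EM 12c Agent performs metric evaluation on that output and passes the results to the OMS and the Console.]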
Exadata Configuration Monitoring
Review the
failures in the
Health Check
Plug-in Page
Sort by Status
to easily detect
the failures
Health Check Plug-in Page
Program Agenda
Exadata Component Monitoring
Exadata Configuration Monitoring
Common Performance Issues
Classifying the Performance Problems
Performance Problems
– Hardware: Network, Disk
– Software: SQL Performance Issues, Database System Issues
Types of Performance problems
Hardware Problems
Network Issues
Hardware Problems
A bad port or a loose cable can degrade database performance
Ports with errors are marked in red
Details of the problem can be found in Open Metric Events
Network Issue
Hardware Problems
Perform Infiniband Administration tasks to disable a bad port
Other tasks that can be performed are
– Enable Port
– Clear Performance counters
– Clear Error Counters
Resolving Network Issues
Hardware Problems
Disk Issues
Hardware Problems
Types of Disk Issues
– Disk Failures
– Over Utilized Hard Disks
– Under Utilized Flash Disks
Hardware Problems
Hard disk or flash disk failures degrade database performance
Cell health is determined by any open, critical, unsuppressed alerts from the “Cell
Generated Alert” SNMP metric
Disk Failure: Cell Health
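The cell health rule on this slide (any open, critical, unsuppressed alert marks the cell as unhealthy) can be sketched as a simple filter. This is only an illustration: the `CellAlert` record and its field names are hypothetical and do not reflect the actual layout of the “Cell Generated Alert” metric.

```python
from dataclasses import dataclass


@dataclass
class CellAlert:
    # Hypothetical record shape; field names are assumptions for illustration.
    cell: str
    state: str       # "open" or "closed"
    severity: str    # "critical", "warning" or "info"
    suppressed: bool


def unhealthy_cells(alerts):
    """Return sorted names of cells with an open, critical, unsuppressed alert."""
    return sorted({
        a.cell
        for a in alerts
        if a.state == "open" and a.severity == "critical" and not a.suppressed
    })


# Made-up sample data: only cel01 meets all three conditions.
alerts = [
    CellAlert("cel01", "open", "critical", False),
    CellAlert("cel02", "open", "critical", True),    # suppressed
    CellAlert("cel03", "closed", "critical", False), # already closed
    CellAlert("cel04", "open", "warning", False),    # not critical
]
print(unhealthy_cells(alerts))
```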
Hardware Problems
A bad disk causes I/O imbalance
The load imbalance metric indicates the percentage of maximum average I/O load from the cell disk.
Metric thresholds need to be set at the Storage Server level
Disk Failure: Load Imbalance
Hardware Problems
Network
• Infiniband Switch: Degraded Port / Port with Errors
Disk
• Disk failures
• Configuration Issues
• Load Imbalance
Software Setup
• ASM Disk group Issues
Exadata System Health
• Exadata System Health is computed using information collected from
Network, Disk and ASM.
Exadata System Health
Drill down from
database Performance
page
• Provides composite
view of all health
indicators
• The week, day, or
default 2-hour view
can be used to analyze
trends of various
issues.
Integration with Database Performance Page
Over Utilization of Hard Disks
Performance page of the Exadata Storage Server Target provides real-time and historical
utilization information
Exadata Cell Utilization Limit Lines were introduced in Exadata Plug-in 12.1.0.5
They help determine at what time of day the I/O bandwidth is exhausted
Cell Performance Page
Hard Disks Utilized
Identify the database which caused the increased I/O usage
Identify what caused the increased I/O activity
Make sure a single database is not running away with all the I/O bandwidth
Correlate with Database Workload Distribution
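As a rough illustration of the check described here, the sketch below computes each database's share of total I/O and flags any database above a threshold. The throughput figures, the database names other than those used in this deck, and the 70% threshold are all made-up assumptions, not values from Enterprise Manager.

```python
def io_shares(io_by_db):
    """Return each database's percentage share of the total I/O."""
    total = sum(io_by_db.values())
    if total == 0:
        return {db: 0.0 for db in io_by_db}
    return {db: 100.0 * value / total for db, value in io_by_db.items()}


def dominant_databases(io_by_db, threshold_pct=70.0):
    """Return databases whose share of I/O exceeds threshold_pct (assumed cutoff)."""
    shares = io_shares(io_by_db)
    return sorted(db for db, pct in shares.items() if pct > threshold_pct)


# Hypothetical MB/s per database on one storage cell.
sample = {"DBM": 900.0, "CRM": 60.0, "HR": 40.0}
print(io_shares(sample))
print(dominant_databases(sample))
```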
I/O Resource Utilization
Quiz: Do you see any problems with the I/O utilization pattern?
One database is running away with all I/O bandwidth
How do you prevent this from happening?
IO Resource Management
Makes sure one database is not running
away with all I/O bandwidth
Keep disks well utilized
Keep I/O latencies low
Prioritize log writes, control file I/Os
Control how much disk bandwidth each
DB, Category or Consumer Group uses
Goal: Ensure I/O bandwidth to all Databases
Exadata IORM
– Across databases: Inter-database Resource Plan, Category Resource Plan
– One database: Intra-database Resource Plan
Common IORM Setup
Based on the name of the database
initiating the request
Useful when you need to manage I/O
priorities across these databases
Allocate I/O resources across
databases by means of an IORM plan
configured on each storage cell.
IORM plans should be identically
configured on each storage cell.
Inter-database IORM
CellCLI> ALTER IORMPLAN -
         dbPlan=((name=DBM, level=1, allocation=60), -
                 (name=CRM, level=2, allocation=80), -
                 (name=other, level=3, allocation=100))
Database   Level 1   Level 2   Level 3
DBM        60%
CRM                  80%
OTHER                          100%
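Because the same plan should be configured identically on every storage cell, it can help to generate the command text from one definition. The sketch below is a hypothetical helper, not part of any Oracle tool: it checks that the allocations on each level add up to no more than 100 (our understanding of the IORM rule, stated here as an assumption) and renders the ALTER IORMPLAN text for the plan shown above.

```python
from collections import defaultdict


def validate_db_plan(directives):
    """Check allocations per level; return a {level: total allocation} dict."""
    totals = defaultdict(int)
    for d in directives:
        if not 0 <= d["allocation"] <= 100:
            raise ValueError(f"allocation out of range for {d['name']}")
        totals[d["level"]] += d["allocation"]
    for level, total in totals.items():
        # Assumed rule: allocations on one level may not exceed 100.
        if total > 100:
            raise ValueError(f"level {level} is over-allocated: {total}")
    return dict(totals)


def render_alter_iormplan(directives):
    """Render the plan as a single-line CellCLI ALTER IORMPLAN command string."""
    validate_db_plan(directives)
    items = [
        f"(name={d['name']}, level={d['level']}, allocation={d['allocation']})"
        for d in directives
    ]
    return "ALTER IORMPLAN dbPlan=(" + ", ".join(items) + ")"


plan = [
    {"name": "DBM", "level": 1, "allocation": 60},
    {"name": "CRM", "level": 2, "allocation": 80},
    {"name": "other", "level": 3, "allocation": 100},
]
print(render_alter_iormplan(plan))
```

The rendered string only builds the command text; running it against the cells is left to whatever tool is normally used for that.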
Disk Objectives
Day Time Plan
              Allocation
OLTP          80
Reports       10
Low-Priority  10
IORM distinguishes between small (less than 128K in size) and large I/O requests
Low-Latency OLTP type requests are usually small requests
High-throughput DW type requests are usually large requests
Comparing large (LG) and small (SM) I/O requests in the IORM metrics helps to determine
the type of workload
Objective        Description
LOW_LATENCY      For applications that are extremely sensitive to I/O latency. DW applications are impacted
HIGH_THROUGHPUT  Best possible throughput for DW transactions
BALANCED         Strikes a balance between low latency and high throughput
AUTO             IORM determines the best optimization objective
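The small versus large distinction above can be illustrated with a short sketch that classifies request sizes against the 128K boundary and gives a rough workload hint. It assumes 128K means 128 * 1024 bytes and that a simple majority of request counts is a good enough indicator; real IORM metrics report SM and LG figures directly, so this is only a way to reason about them.

```python
SMALL_IO_LIMIT_BYTES = 128 * 1024  # assumption: "128K" read as 128 KiB


def classify_request(size_bytes):
    """Return "SM" for requests under 128K, otherwise "LG"."""
    return "SM" if size_bytes < SMALL_IO_LIMIT_BYTES else "LG"


def workload_hint(request_sizes):
    """Count small and large requests and guess the workload type."""
    counts = {"SM": 0, "LG": 0}
    for size in request_sizes:
        counts[classify_request(size)] += 1
    if counts["SM"] > counts["LG"]:
        hint = "OLTP-like"
    elif counts["LG"] > counts["SM"]:
        hint = "DW-like"
    else:
        hint = "mixed"
    return counts, hint


# Made-up request sizes: two 8K reads and one 1M read.
print(workload_hint([8192, 8192, 1048576]))
```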
Set Up IORM Using EM 12c
Navigation: Exadata Storage Server > Administration > Manage IO Resource
Under Utilized Flash
Common DW problem scenario:
– Hard disks are busy but flash is idle because large reads issued by smart scans
bypass the flash cache
Solution:
– Use flash for KEEP objects so large reads can be offloaded to flash
– Execute the following steps:
1. Run the I/O intensity report: @?/rdbms/admin/spawrio
2. Ensure the total size of KEEP objects does not overwhelm the flash cache size
– Be aware that the allowed KEEP size is restricted to 80% of the flash cache size
– Target small tables with lots of reads for KEEP
3. Mark each candidate table as KEEP
Improve Flash Utilization
Under Utilized Flash
I/O Intensity Report - spawrio.sql
Under Utilized Flash
Analyze the flash cache utilization rate before marking objects as KEEP
Ensure that newly marked KEEP objects do not displace other critical workloads that already use the flash cache effectively
From Previous Example
                                                     Space           IO
Id                                       Type           GB    Intensity
---------------------------------------- ---------- ------ ------------
EDW_ATS.TECS_PHC(P2011)                  TABLE PART   67.8      1,284.9
EDW_ATS.TECS_PHC(P2011)                  TABLE PART   67.8      1,284.9
EDW_ATS.ENTITY_ADDR                      TABLE        83.6        408.1
Total KEEP size = 67.8 + 67.8 + 83.6 = 219.2 GB
Default flash cache size per cell: X3 = 1609.14 GB, X2 = 364.75 GB
Evaluate Total Keep Size
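The arithmetic on this slide can be checked with a short sketch that compares the total KEEP size against the 80% limit mentioned earlier. The per-cell flash cache sizes and the candidate sizes come from the slide; the `cell_count` parameter is an assumption added here, since the slide does not say how many cells the objects are spread across, and the default of one cell gives the most conservative budget.

```python
KEEP_FRACTION = 0.80  # allowed KEEP size as a fraction of flash cache size
FLASH_CACHE_GB_PER_CELL = {"X3": 1609.14, "X2": 364.75}  # defaults from the slide


def keep_budget_gb(model, cell_count=1):
    """Return the KEEP budget in GB; cell_count is an assumed parameter."""
    return KEEP_FRACTION * FLASH_CACHE_GB_PER_CELL[model] * cell_count


# Sizes (GB) of the three candidate segments from the I/O intensity report.
candidates_gb = [67.8, 67.8, 83.6]
total_keep_gb = round(sum(candidates_gb), 1)

for model in ("X3", "X2"):
    budget = keep_budget_gb(model)
    verdict = "fits" if total_keep_gb <= budget else "exceeds"
    print(f"{model}: KEEP total {total_keep_gb} GB {verdict} budget of {budget:.1f} GB")
```

Fitting under the limit is only the first check; the slide's point about not displacing other workloads that already use the flash cache still applies.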
Under Utilized Flash
Run the following SQL statements:
ALTER TABLE TECS_PHC MODIFY PARTITION P2011
  STORAGE (CELL_FLASH_CACHE KEEP);
ALTER TABLE ENTITY_ADDR
  STORAGE (CELL_FLASH_CACHE KEEP);
Mark Objects as KEEP
Database Problems
SQL Performance Issues
Exadata Aware SQL Monitoring
Real time monitoring of application SQL
I/O performance graphs with Exadata information
- Cell offload efficiency
- Cell smart scan
Rich metric data
- CPU
- I/O requests
- I/O throughput
- PGA Usage
- Temp Usage
Quiz: SQL Performance Problem
What is wrong with this execution in Exadata?
SQL Performance Problem
Smart Scan
Database Problems
Database System Issues
Database System Issue
Parallel Downgrades
What do you see in the Parallel column?
Use Parallel Queuing for Consistent Parallel execution
Database System Issue
Enable by setting parallel_degree_policy = 'auto'
– Automatic setting of DOP
– Parallel Statement Queuing
Availability
– Introduced: 11.2.0.1
– Integrated with Resource Manager: 11.2.0.2
Objective
– Run enough parallel statements to keep the system very busy
– Queue any subsequent parallel statements – avoid DOP downgrades
Configure by a Resource Plan
Parallel Statement Queuing
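The difference between queuing and downgrading can be illustrated with a toy model. This is not Oracle's algorithm: it treats each statement as needing exactly its requested DOP in parallel servers, uses a single capacity number, and looks at one moment in time. The function names and the capacity of 40 are made up for the example.

```python
def admit_with_queuing(requested_dops, servers_target):
    """Run statements at full DOP while capacity lasts; queue the rest in FIFO order."""
    in_use = 0
    running, queued = [], []
    for dop in requested_dops:
        # Once anything is queued, later statements wait behind it (FIFO).
        if not queued and in_use + dop <= servers_target:
            running.append(dop)
            in_use += dop
        else:
            queued.append(dop)
    return running, queued


def admit_with_downgrade(requested_dops, max_servers):
    """Start every statement at once, granting only the servers still free."""
    free = max_servers
    granted = []
    for dop in requested_dops:
        got = min(dop, free)  # 0 means the statement would run serially
        granted.append(got)
        free -= got
    return granted


requests = [16, 16, 16]
print(admit_with_queuing(requests, 40))    # third statement waits at full DOP
print(admit_with_downgrade(requests, 40))  # third statement is downgraded
```

In the queuing case the third statement waits and later runs at its full DOP of 16; in the downgrade case it starts immediately with only 8 servers, which is the inconsistent execution the slide warns about.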
Summary
Cell Health Indicator
Exadata System Health
How to identify over utilization of hard disks
How to identify objects for Flash Keep
How to identify SQL and database system issues with SQL Monitoring
Top Things to remember