Deployment Guidelines for Oracle Enterprise
Manager Grid Control
Agenda
• Enterprise Manager Deployment Lifecycle
• Pre-deployment Planning
• Deployment Phase
• Post-deployment Activities
• Implementation Considerations
Enterprise Manager Deployment Lifecycle
[Diagram: deployment lifecycle, shown for a POC/Staging Instance and a Production Instance.
Pre-Deployment Activities: Requirements Analysis, Capacity Planning & Sizing, Solution Architecture Design, Deployment Planning.
Deployment: Deploy Repository, Deploy OMS, Deploy Agents, HA/Security.
Post-Deployment Activities: Monitor, House Keep, Tune, Manage Change.]
Functional Requirements
• Scope and Objectives
• What do you expect to be monitored?
• What do you expect to be managed?
• Who are the users of the system?
• How are these different components located in the data-center?
• What are the current system management processes/practices?
• What are the key objectives of Grid Control?
Requirement ↔ Feature Mapping
• Map the requirements to Enterprise Manager Grid Control
• Standard Features
• Management Packs
• Management Plug-Ins
• Configuration Settings
• Customizations / Extensions
Infrastructure Requirements
• High Availability requirements impact Enterprise Manager Grid Control Architecture
• Oracle Management Repository
• High availability – Real Application Clusters
• Disaster recovery - Data Guard
• Oracle Management Service (OMS)
• Multiple OMS
• Fronted by Server Load Balancer (SLB)
• Shared Receive Directory for uploaded files
• Shared NFS directory for Software Library
Infrastructure Requirements (Continued)
• Understand the security requirements of your Grid Control implementation
• Framework Communication Security
• Authentication & Authorization
• Auditing
• Other Non functional requirements
• Performance
• Scalability
• Backup & Recovery
Sizing Grid Control
Required Inputs
• Number of Hosts to be monitored (Agent)
• Number of types of targets (DB, AS, Application etc)
• Frequency of data collection
• Estimated size of violations / notifications
• Number of scheduled jobs
• Planned automation using Grid Control (Corrective Actions, Jobs etc.)
Sizing - Rough Guidelines
OMS
#Targets          Hosts   3 GHz CPU/Host   Memory (GB)   Starting Disk (GB)
Small (100)       1       1                2             2
Medium (1000)     1       2                2             5
Large (10,000)    2       2                2             10
Repository Database
#Targets          Hosts   3 GHz CPU/Host   Memory (GB)   Starting Disk (GB)
Small (100)       1       1                2             10
Medium (1000)     1       2                4             30
Large (10,000)    2       4                6             150
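The rough guidelines above can be encoded as a small lookup, for example to sanity-check a planned deployment. This is a minimal Python sketch, not part of Grid Control: the names `SizingRow`, `OMS_GUIDE`, `REPOSITORY_GUIDE`, and `rough_sizing` are hypothetical, and treating each tier's target count (100 / 1000 / 10,000) as an upper bound is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SizingRow:
    hosts: int
    cpus_per_host: int      # 3 GHz CPUs per host
    memory_gb: int
    starting_disk_gb: int


# (max targets, sizing) pairs copied from the rough guideline tables.
OMS_GUIDE = [
    (100, SizingRow(1, 1, 2, 2)),
    (1000, SizingRow(1, 2, 2, 5)),
    (10000, SizingRow(2, 2, 2, 10)),
]
REPOSITORY_GUIDE = [
    (100, SizingRow(1, 1, 2, 10)),
    (1000, SizingRow(1, 2, 4, 30)),
    (10000, SizingRow(2, 4, 6, 150)),
]


def rough_sizing(targets, guide):
    """Return the first tier whose target count covers `targets`.

    Assumption: each tier's target count is treated as an upper bound.
    """
    if targets < 1:
        raise ValueError("targets must be positive")
    for max_targets, row in guide:
        if targets <= max_targets:
            return row
    raise ValueError("beyond the rough guidelines; size individually")


if __name__ == "__main__":
    print("OMS:", rough_sizing(750, OMS_GUIDE))
    print("Repository:", rough_sizing(750, REPOSITORY_GUIDE))
```

These numbers are starting points only; the required inputs listed earlier (collection frequency, jobs, notifications) still need to be weighed for a real deployment.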
Sizing – Network Requirements
[Diagram: network links among Console, OMS, Repository, and Repository storage, each labeled with minimum bandwidth / maximum latency: 1 Gbps / 30 ms, 1 Gbps / <1 ms, 300 Kbps DSL / 300 ms, and 300 Kbps DSL / 300 ms.]
Enterprise Manager Grid Control
Architecture Diagram
• Develop Architecture Diagram
• Target Information (What is monitored)
• Repository & OMS Host information (OS, Version, CPU, Memory)
• Repository Storage information
• Network connectivity between Targets, OMS, and Repository
• High availability servers/storage/network
• Server Load Balancer
• Firewalls between components
Validate Solution Architecture
• Map diagram to requirements and capacity
• Ensure high availability is included if it is specified as a requirement
• Ensure bandwidth needs are met by the physical infrastructure
• Using Chronos to gather HTTP timing/hit information? Use Web Cache
• Multiple OMSs required? Include a shared receive directory and a server load balancer
Prepare a Deployment Plan
• Part 1 - Deploy Grid Control Central Infrastructure
• Deploy Management Repository Database
• Deploy OMS
• Part 2 - Deploy Agents
• Deploy Agents
• Standardize Monitoring
• Automate Regular Tasks
Deployment – Repository DB
• Latest certified database version (10.2.0.4 or 11.1.0.7) with RAC option enabled
• Choose ASM as underlying storage technology
• Install DBMS_SHARED_POOL package to help improve throughput of the OMR
• Best Practices for Database MAA
• Enable ARCHIVELOG Mode
• Enable Block Checksums
• Configure the Size of Redo Log Files and Groups Appropriately
• Use a Flash Recovery Area
• Enable Flashback Database
• Use Fast-Start Fault Recovery to Control Instance Recovery Time
• Enable Database Block Checking
• Set DISK_ASYNCH_IO
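The MAA checklist above can be turned into a simple offline check against settings gathered from the repository database. This is a minimal Python sketch under stated assumptions: the caller supplies a dictionary of values (for example collected from V$PARAMETER and from the LOG_MODE and FLASHBACK_ON columns of V$DATABASE), and the pass conditions (any value other than OFF/FALSE counts as "enabled", a positive FAST_START_MTTR_TARGET, DISK_ASYNCH_IO set to TRUE) are illustrative interpretations of the bullets, not official thresholds. `maa_findings` is a hypothetical helper name.

```python
def _enabled(value):
    """True when a setting is neither empty nor switched off."""
    return value.strip().upper() not in ("", "OFF", "FALSE", "NONE")


# Assumed pass conditions, one per checklist item.
CHECKS = {
    "log_mode": lambda v: v.strip().upper() == "ARCHIVELOG",
    "flashback_on": lambda v: v.strip().upper() == "YES",
    "db_block_checksum": _enabled,
    "db_block_checking": _enabled,
    "db_recovery_file_dest": _enabled,   # Flash Recovery Area location
    "fast_start_mttr_target": lambda v: v.strip().isdigit() and int(v) > 0,
    "disk_asynch_io": lambda v: v.strip().upper() == "TRUE",
}


def maa_findings(settings):
    """Return a list of human-readable findings; empty means all checks pass."""
    findings = []
    for name, ok in CHECKS.items():
        value = settings.get(name)
        if value is None:
            findings.append(f"{name}: not supplied")
        elif not ok(str(value)):
            findings.append(f"{name}: {value!r} does not match the checklist")
    return findings


if __name__ == "__main__":
    sample = {
        "log_mode": "NOARCHIVELOG",
        "flashback_on": "YES",
        "db_block_checksum": "TYPICAL",
        "db_block_checking": "FALSE",
        "db_recovery_file_dest": "+FRA",
        "fast_start_mttr_target": "300",
        "disk_asynch_io": "TRUE",
    }
    for finding in maa_findings(sample):
        print(finding)
```

Redo log sizing is left out on purpose because "appropriately" depends on the workload and cannot be reduced to a fixed rule here.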
Deployment – OMS servers
• Deploy OMS
• Install Software-only EM 10g GC Release 2 (10.2.0.1 for Linux or 10.2.0.2 for Microsoft Windows)
• Apply 10.2.0.5.0 GC patch set to OMS
• Apply 10.2.0.5.0 GC patch set to Agent on OMS host
• Configure Grid Control by running the ConfigureGC.pl script from the Oracle home directory of the OMS
• Apply latest PSU and critical patches as mentioned in note https://support.us.oracle.com/oip/faces/secure/km/DocumentDisplay.jspx?id=853691.1
Agent Deployment Methods
• Mass agent deployment
• Agent Push
• Agent Download
• Agent install using OUI
• Agent Cloning
• NFS agent install
• Active/Passive Configuration
• Cluster Agent
Agent Deployment Considerations
Agent Push (1)
• Centrally managed
• Requires SSH setup and pushes the agent binaries to the target machines
• Requires access to the EM console
• Preferable if an administrator is performing all the agent installations
• Agent install software should be staged in the OMS home; this can be done using the EM GUI
Agent Download (2)
• Locally managed
• Requires wget on the agent host machine and uses pull
• The agent download script can be downloaded from a URL accessible to all
• Preferable if the individual host/target owners have the onus of installing the agents
• Agent install software should be staged in the OMS home; this can be done using the EM GUI
Agent Install Using OUI (3)
• Locally managed
• Agent install software must be downloaded manually on each host machine or centrally staged on NFS
• Cumbersome method, good only if very few agents are to be installed
Agent Cloning (a)
• Centrally/Locally managed
• Preferable if the source agent has been patched to the standard level and can be treated as a gold image
NFS/Shared Agent (b)
• Centrally/Locally managed
• Agent binaries are shared and stored in a central place for all hosts, so this requires the least space
• Preferable if all hosts in the environment have a standard/similar configuration
Active/Passive mode (c)
• Centrally/Locally managed
• Used for Primary/standby targets
RAC Agent (d)
• Centrally/Locally managed
• Used for Cluster targets
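The considerations above can be read as a rough decision procedure. This is a minimal Python sketch of one possible reading: the function name `suggest_agent_method`, its parameters, the order in which the criteria are checked, and the "very few agents" cutoff of 3 are all assumptions made for illustration, not an Oracle recommendation.

```python
def suggest_agent_method(target_kind="standalone", hosts_similar=False,
                         gold_image=False, central_admin=True,
                         agent_count=10, few_agents_limit=3):
    """Suggest an agent deployment method from the comparison above.

    target_kind: "standalone", "active_passive", or "cluster".
    The precedence of the checks is an assumption of this sketch.
    """
    if target_kind == "active_passive":
        return "Active/Passive mode"      # used for primary/standby targets
    if target_kind == "cluster":
        return "RAC Agent"                # used for cluster targets
    if hosts_similar:
        return "NFS/Shared Agent"         # shared binaries, least space
    if gold_image:
        return "Agent Cloning"            # patched source agent as gold image
    if agent_count <= few_agents_limit:
        return "Agent Install Using OUI"  # acceptable only for very few agents
    # An administrator doing every install favors push; host owners favor pull.
    return "Agent Push" if central_admin else "Agent Download"


if __name__ == "__main__":
    print(suggest_agent_method(agent_count=200, central_admin=True))
    print(suggest_agent_method(agent_count=200, central_admin=False))
    print(suggest_agent_method(target_kind="cluster"))
```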
Agent Deployment
• Mass Deployment of Agents saves time and effort for agent installation. It can be done using:
• Agent Deploy Application from the OMS
• Downloadable Agent scripts
• Mass deployment of agents requires the agent kits for the different platforms to be staged on the OMS. Kits can be staged on the OMS by using “Download Agent Software” from the Deployment page in the Grid Control GUI.
• Mass deployment of agents requires similar machines
Agent Deployment: Active/Passive Configuration
• Install 1 agent per host
• Install the EMCLI
• Install Active/Passive targets using shared storage and clusterware
• Discover targets as normal
• Use the EMCLI relocate_targets verb to move monitoring between agents
• emcli relocate_targets
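The relocation step is typically scripted so that the cluster failover procedure can call it. This is a minimal Python sketch that only assembles the command line; it does not run it. The agent names, target name, and target type are hypothetical placeholders, and the option names (`-src_agent`, `-dest_agent`, `-target_name`, `-target_type`, `-copy_from_src`, `-force=yes`) should be confirmed against `emcli help relocate_targets` for the installed release.

```python
import shlex


def relocate_command(src_agent, dest_agent, target_name, target_type):
    """Build the argument list for relocating one target between agents."""
    return [
        "emcli", "relocate_targets",
        f"-src_agent={src_agent}",
        f"-dest_agent={dest_agent}",
        f"-target_name={target_name}",
        f"-target_type={target_type}",
        "-copy_from_src",   # reuse the target properties known to the source agent
        "-force=yes",       # relocate associated composite targets as well
    ]


if __name__ == "__main__":
    cmd = relocate_command(
        src_agent="node1.example.com:3872",    # hypothetical agent names
        dest_agent="node2.example.com:3872",
        target_name="PRODDB",                  # hypothetical target
        target_type="oracle_database",
    )
    # Print the command for review; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a single string avoids shell-quoting problems when target names contain spaces.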
[Diagram: Grid Control Console, Oracle Management Service (OMS), Oracle Management Repository (OMR), and Oracle Management Agent (OMA) monitoring DB targets.]
HA Configuration
• Use multiple OMSs for high availability
• Configure shared directory accessible by all OMSs
• Configure Shared File system Loader on each OMS
• Front the OMSs with a server load balancer so that Grid Control remains usable if one OMS goes down
• Also useful for scaling
• It is always preferable to have a POC/staging EM instance with Repository and OMS
MAA – Customer Topology (1): Active/Passive (blackout during failover)
[Diagram: Grid Control Console and Oracle Management Agent (monitoring a DB target) connected to an Active Oracle Management Service with a Standby Oracle Management Service, and an Active Oracle Management Repository with a Standby Oracle Management Repository, each pair on Shared Storage.]
MAA – Customer Topology (2): Active/Active Mode (brownout during failover)
[Diagram: Grid Control Console and Oracle Management Agent (monitoring a DB target) connected through a Load Balancer to the Oracle Management Service, backed by a RAC repository on storage.]
• Latest certified DB version with RAC
• ASM as storage technology
• Best practices for Database MAA
• Configure Fast Connection Failover
• Appropriate OMS installation location (network latency)
• Best practices for Application Server MAA
• Configure the shared file system loader directory
• Configure connection string
• Agent HA feature: watchdog process
• Configure Agent to communicate through SLB
• Configure Agent to allow retrofitting SLB
MAA – Customer Topology (3): Disaster Recovery
[Diagram: Primary site with Oracle Management Service and a RAC repository on storage; Secondary site with a Standby OMS and a RAC Standby repository; Oracle Data Guard between the Primary and Standby repositories.]
• Physical standby database
• Fast-Start Failover and the Oracle observer
• Data Guard Broker for management
• Redundant OMS
• Extend FCF/ONS to automate connection switchover
• See the Advanced Configuration Guide for High Availability Best Practices
MAA @ Customer XYZ
Build High Availability for each component in the EM Infrastructure
Disaster Scenario - XYZ
[Diagram: Agents, OMS Nodes, and RAC Nodes across Primary and Standby sites, with Redo Shipping between the sites.]
Backup and Recovery: Oracle Management Repository
• Use standard database tools for any database backup and for recovery
• Case 1: Full recovery
• On the same host: no special consideration for EM
• On a new host: modify the repository target
• Case 2: Partial/point-in-time recovery
• Agents are the source of truth; state information is resynchronized from them
• emctl resync repos
Backup and Recovery: Oracle Management Service
• Oracle Management Service is (mostly) stateless
• Protect the receive directory with some form of disk mirroring
• Back up the OMS configuration with:
• emctl exportconfig oms
• Method 1: Back up and restore the software directory structure
• Restore it to the same directory path
• Method 2: Reinstall from the original media to get a baseline
• Restore the saved OMS configuration
• emctl importconfig oms
Backup and Recovery: Oracle Management Agent
• Method 1: Disk backup and restore
• Method 2: Reinstall from the original media
• The Repository is considered the source of truth
• Rebuild state information from the Repository
• emcli resyncAgent
Maximum Availability Architecture
References
• Metalink Note 330072.1 – High Availability for EM
• Check out the EM Grid Control Certification checker (Metalink Note 412431.1)
• Documentation Set
• Installation & Configuration Guide
• Chapter 17 “Grid Control Common Configurations”
• Chapter 18 “Configuring Enterprise Manager for Active Passive Environments”
• Administrators Guide
• Chapter 10 “Backup, Recovery and Disaster Recovery”
• http://www.oracle.com/technology/deploy/availability/htdocs/maa.htm
• Contains best practice guidance for database and Grid Control published between documentation set revisions/updates
EM Infrastructure Security: Secure Enterprise Manager as a Post-Install Step
[Diagram: Oracle Management Agent (OMA), monitoring Database, Application Server, and Applications targets; Oracle Management Service (OMS); and Oracle Management Repository (OMR), with links labeled SSL v3, SSL v3, and ASO.]
• Harden the Oracle Management Service and Repository machines by removing insecure services
• Secure Enterprise Manager Components
• Apply Database security best practices for the Repository and Oracle Application Server security best practices for the OMS
• Put the OMS and Repository behind firewalls
• Restrict network access to the OMS and Repository through IP addresses
• Use impersonation-based access to the owner of Oracle Homes
• Configure secure communication
• Enable Advanced Security Option for communication between OMS and Repository
• Use certificates from a well-known Certificate Authority
Agent Security: Keep Enterprise Manager Secure As It Grows
• New (Mass) agent deployment
• Use ‘Agent Deploy’ which uses SSH between OMS and host server
• Secure the agents
• emctl secure agent
• Secures communication with the OMS
• Prevents unauthorized agent from uploading data
• Ensure latest CPUs/PSUs are applied on all Enterprise Manager components
[Diagram: Grid Control Console, Oracle Management Service (OMS), Oracle Management Repository (OMR), and Oracle Management Agent (OMA) monitoring Database, Application Server, and Applications targets; labels include "…Agent Product Stage" and "SSH".]
Enterprise Manager User Security: Grid Control Authentication
• Repository-based authentication (Default)
• Enable a password profile to enforce password controls
• Supports Oracle Single Sign-On (SSO) or Enterprise User Security (EUS) user authentication
• Simplify the identity management across the enterprise
• Both SSO and EUS enable your users to authenticate to Grid Control by using their credentials stored in an LDAP server
[Diagram: authentication paths to the Oracle Management Repository (OMR): Default (repository-based), OSSO, and EUS, with an LDAP Server.]
Enterprise Manager User Security: Authorization and Auditing
• Grant only the minimum set of privileges
• Follow the principle of least privilege
• New fine-grained privileges in 10.2.0.5
• Simplify the access control management
• Grant roles instead of individual privileges to users
• Use roles along with Privilege Propagation groups
• Monitor user actions in Enterprise Manager
• Enable Enterprise Manager Auditing
• Configure the externalization service to move audit data from the Repository to an external system
[Diagram: Oracle Enterprise Manager access control, with Roles and Privilege Propagating Groups governing Resources (Targets, Jobs, Reports….) and Operations (Blackout, Configure….), plus Auditing.]
Enterprise Manager Security: Firewall Considerations
Monitoring Management Agents
• Agent Log Files
• emagent.log, emagent.trc, emagent.nohup
• Management Services and Repository page
• Agent subtab
• Use EMDiag reports to generate alerts
• Listener issues contribute heavily to down targets
Monitoring Management Servers
• Monitoring in Enterprise Manager
• OPMN, J2EE and OMS log files
• $OH/opmn/logs/*
• $OH/j2ee/OC4J_EM/log/*/*
• $OH/sysman/log/*
• Management Services and Repository page
• Backlog indicators
• Loading data file backlog
• Notifications waiting to be delivered
• Job step backlog
Monitoring Target Configuration
• Improperly configured targets
• Wrong ORACLE_HOME
• Wrong database password
• Wrong role (ASM/Data Guard)
• Agent Configuration
• Clock skew (reported by EMDiag)
• Communication problems
Monitoring Management Repository
• Monitoring in Enterprise Manager
• Alert log
• Database diagnostics
• ADDM
• AWR
• ASH
• Space Monitoring
• Oracle home free space
• Daily health checks
• Resource/Performance
• Repository operations
• Data processing
• Internal housekeeping
Enterprise Manager Diagnostics (EMDiag)
AGTVFY
• Agent diagnostics
– Verification/Detection
$ agtvfy verify all -level 9
– Agent statistics
$ agtvfy show targets
• To get the help screen:
$ agtvfy -h
REPVFY
• Repository diagnostics
– Verification/Detection
$ repvfy verify all -level 9
– Dumps
$ repvfy dump health
• To get the help screen:
$ repvfy -h
• More information can be found on My Oracle Support: Note 421053.1, EMDiagkit Download and Master Index
Enterprise Manager Diagnostics (EMDiag)
• Get the right data from the right component of Enterprise Manager to streamline communication with Oracle Support
• RDA for the generic cases or cases involving multiple components
• Information from the repository:
– Error log
$ repvfy dump errors
– System information
$ repvfy dump system
• Information needed from the OMS:
– Log files (OPMN, J2EE and SYSMAN logs)
– System information
$ repvfy dump system
• Information needed from the Agent:
– Log files
$ agtvfy zip log
– Config files
$ agtvfy zip config
Diagnosing Network Communication Issues
• All the Enterprise Manager tiers communicate via the network
- OMS -> Repository: Oracle Net
- Agent <-> OMS: HTTP(S)
- Administrator <-> OMS: HTTP(S)
• Available network tools to test the communication between the tiers
- Statistics and throughput: netstat
- Testing connectivity: traceroute
- Network latency: ping, tnsping
• Available web tools to test the response of the Enterprise Manager application
- URL page grabbers: wget, lynx, …
- Scripting: perl
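Where the tools above are not available on a host, a short script can give comparable first-pass numbers. This is a minimal Python sketch (the slide itself suggests perl for scripting) that times a TCP connect, as a rough stand-in for ping/tnsping, and an HTTP(S) page fetch, as a stand-in for wget. The function names are hypothetical, and any host names, ports, or URLs passed in are site-specific assumptions.

```python
import socket
import time
import urllib.request


def tcp_connect_ms(host, port, timeout=5.0):
    """Time a TCP connect to host:port, in milliseconds.

    This measures connection setup only; it is not a full Oracle Net handshake.
    """
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def http_fetch_ms(url, timeout=10.0):
    """Fetch a URL and return (HTTP status, elapsed milliseconds)."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read()
        status = response.status
    return status, (time.perf_counter() - start) * 1000.0
```

For example, `tcp_connect_ms("repo.example.com", 1521)` run from the OMS host approximates OMS-to-Repository latency, and `http_fetch_ms` pointed at the console URL approximates what an administrator's browser sees; both names and the listener port are placeholders to adapt.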
Enterprise Manager Housekeeping
• Why Routine Maintenance
• Critical for EM Grid Control performance
• Prevent Outages by
• Monitoring all the tiers
• Watching for trends
• Usage / Resources
• Availability
• Reducing frequent “oscillation” or “floods” of alerts
• Deleting permanently “down” targets
• Detecting early warning signs
• Messages in log files
• Repeated errors/messages
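Detecting repeated errors in log files, one of the early warning signs listed above, is easy to automate. This is a minimal Python sketch, assuming plain-text log lines in which the word ERROR marks a problem; the keyword, the threshold of three repeats, and the function name `repeated_messages` are illustrative choices, not EM conventions.

```python
import re
from collections import Counter

_DIGITS = re.compile(r"\d+")


def repeated_messages(lines, keyword="ERROR", min_count=3):
    """Return (message, count) pairs for messages seen at least min_count times.

    Digits are masked so that timestamps and ids do not split identical messages.
    """
    counts = Counter()
    for line in lines:
        if keyword in line:
            counts[_DIGITS.sub("#", line.strip())] += 1
    return [(msg, n) for msg, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    sample = [
        "2010-01-01 10:00:01 ERROR upload failed",
        "2010-01-01 10:00:05 ERROR upload failed",
        "2010-01-01 10:01:09 ERROR upload failed",
        "2010-01-01 10:02:00 INFO heartbeat ok",
    ]
    for message, count in repeated_messages(sample):
        print(count, message)
```

To scan a real log, pass an open file object as `lines`.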
Enterprise Manager Housekeeping
• Weekly “Online” tasks
• Resolve system errors
• Resolve metric collection errors
• Resolve all open alerts
• Resolve always down, duplicate, or unconfigured targets
• Resolve database alert log errors for the repository database
• Monthly “Offline” tasks
• Review and consider implementing any AWR Segment Advisor recommendations
Enterprise Manager Performance Tuning
• Management Repository
• Monitor repository system CPU, memory, and disk utilization
• Monitor the Repository database itself
• Typical performance bottleneck suspects:
• Neglected housekeeping tasks
• Incorrect hardware/software configuration
• Insufficient hardware resources
• Trend Vital Signs to identify the root cause
Enterprise Manager Vital Signs for Tuning
• Loader Vital Signs
• Percent of hour runs
• Loader throughput per second as Rows/second/thread
• Rollup Vital signs
• Percent of hour runs for Rollup Job
• Rollup throughput per second as Rows/second
• Job, Notification, Alert Vital signs
• I/O Vital Signs
• Disk I/O from the Management Repository instance to its data files
• Network I/O between the Management Server and Management Repository
• RAC interconnect (network) I/O
Note: See the Oracle Enterprise Manager Advanced Configuration guide (10g Release 5) for details on EM Vital Signs
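The loader and rollup vital signs reduce to simple ratios once the raw numbers are known. This is a minimal Python sketch under assumed definitions: "percent of hour run" is taken to be the share of a 3600-second hour that the loader or rollup job was busy, and throughput is rows divided by busy seconds (and by thread count for the loader). The function names are hypothetical, and the definitions should be checked against the Advanced Configuration guide before being used for trending.

```python
def percent_of_hour_run(run_seconds):
    """Share of one hour spent running, as a percentage (assumed definition)."""
    if run_seconds < 0:
        raise ValueError("run_seconds must not be negative")
    return 100.0 * run_seconds / 3600.0


def loader_rows_per_second_per_thread(rows, run_seconds, threads):
    """Loader throughput in rows/second/thread."""
    if run_seconds <= 0 or threads <= 0:
        raise ValueError("run_seconds and threads must be positive")
    return rows / run_seconds / threads


def rollup_rows_per_second(rows, run_seconds):
    """Rollup throughput in rows/second."""
    if run_seconds <= 0:
        raise ValueError("run_seconds must be positive")
    return rows / run_seconds


if __name__ == "__main__":
    # Hypothetical figures for one hour of loader activity.
    print(percent_of_hour_run(900))
    print(loader_rows_per_second_per_thread(360000, 1800, 2))
    print(rollup_rows_per_second(72000, 600))
```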
Use Enterprise Manager to Tune Enterprise Manager
• Slow UI pages
• End-to-end page monitors at OC4J servlet level
• OC4J performance for each URL
• Servlet Time
• JSP Time
• EJB Method Time
• JDBC Time
• JDBC time broken down to SQL query
• SQL queries pinpoint bottleneck
Extrapolate Enterprise Manager Vital Signs for Change Management
• Getting a baseline during good performance is key
• Allow 31 days for the full data impact of targets to appear
• Examine trends from baselines
• Rate of change allows accurate forecast for future resource requirements
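Extrapolating from a baseline can be as simple as fitting a straight line to vital-sign samples and reading it forward. This is a minimal Python sketch using ordinary least squares; the sample data, the choice of a linear model, and the function names `linear_trend` and `forecast` are assumptions for illustration, and real growth may not be linear.

```python
def linear_trend(samples):
    """Fit y = slope * x + intercept to (x, y) samples by least squares."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    if sxx == 0:
        raise ValueError("samples must span more than one x value")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(samples, future_x):
    """Extrapolate the fitted line to future_x."""
    slope, intercept = linear_trend(samples)
    return slope * future_x + intercept


if __name__ == "__main__":
    # Hypothetical baseline: (day, repository size in GB).
    baseline = [(0, 100.0), (10, 120.0), (20, 140.0)]
    slope, _ = linear_trend(baseline)
    print("growth per day (GB):", slope)
    print("projected size at day 90 (GB):", forecast(baseline, 90))
```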
Change Management
• Any changes to number of targets, types of targets
• Any changes to network, storage and server infrastructure
• Any changes to monitoring policies, thresholds
• Any new features / functionality
• Planning a release upgrade
Implementation Considerations
• Deployment of Repository Database
• 10.2.0.4 or 11.1.0.7
• SLB Configuration – Use best practices
• Agent Deployment
• Deployment Method selection
• Target Monitoring, policy definition
• Based on organization’s needs
• Configuring Enterprise Manager for High Availability and periodic testing
• Initial setup, testing and on-going testing
• Securing Enterprise Manager
• Communications and encryption key
• Preferred credentials
Implementation Considerations
• Objectives & Plan Alignment – Use a proven methodology
• Business Case
• Integrating into overall Data Center management
• Customizations, extensions & Process Alignment
• Training at various levels
Questions?
For More Information
Search for “Enterprise Manager” on search.oracle.com, or visit oracle.com