CERN - IT Department
CH-1211 Genève 23, Switzerland
www.cern.ch/it

Oracle and Streams Diagnostics and Monitoring

Eva Dafonte Pérez
Florbela Tique Aires Viegas

Agenda

• Oracle Enterprise Manager
• Streams monitoring tool
• Local monitoring tools
• Network diagnostic tools
• Open questions

Oracle Enterprise Manager

Overview

• Set of centralized management tools
  – administration
  – configuration management
  – end-to-end monitoring
  – security capabilities
• Proactive monitoring and alerting
• Monitoring of service performance and usage
• Automation, job scheduling, patch management

Oracle Enterprise Manager

Open questions

• Thresholds configuration
• Metrics for the servers’ load
• Run some advisors to try and pinpoint performance or configuration issues
• Can Tier-1 sites use the CERN OEM to monitor their databases?

Streams monitoring tool

Overview

• Objectives:
  – Replication topology
  – Status of Streams connections
  – Error notifications
  – Monitor Streams performance (latency, throughput, …)
  – Monitor resources related to Streams performance (Streams Pool memory, redo generation)
• Architecture:
  – ‘Strmmon’ daemon written in Python
    • collects Streams and instance info + repository
    • errors and warnings
  – End-user web application: http://oms3d.cern.ch:4889/streams/main.php
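
The latency and throughput metrics listed in the objectives can be illustrated with a minimal sketch. The slides do not show the daemon's code, so this is not the ‘Strmmon’ implementation: the `Sample` fields, the function names and the sample values are all hypothetical, and a real collector would read these values from the databases it monitors.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Sample:
    """One hypothetical observation of a replication stream (assumed schema)."""
    taken_at: datetime             # when the monitor took the sample
    last_change_created: datetime  # creation time of the latest change at the source
    last_change_applied: datetime  # time that same change was applied at the destination
    total_lcrs: int                # cumulative number of LCRs streamed so far


def latency_seconds(sample: Sample) -> float:
    """Latency: delay between a change being created and being applied."""
    return (sample.last_change_applied - sample.last_change_created).total_seconds()


def throughput_lcrs_per_second(previous: Sample, current: Sample) -> float:
    """Throughput: LCRs streamed between two samples divided by the elapsed time."""
    elapsed = (current.taken_at - previous.taken_at).total_seconds()
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")
    return (current.total_lcrs - previous.total_lcrs) / elapsed


if __name__ == "__main__":
    # Illustrative values only: two samples taken 60 seconds apart.
    t0 = datetime(2000, 1, 1, 12, 0, 0)
    first = Sample(t0, t0 - timedelta(seconds=5), t0 - timedelta(seconds=2), 1_000)
    second = Sample(t0 + timedelta(seconds=60), t0 + timedelta(seconds=50),
                    t0 + timedelta(seconds=58), 7_000)
    print(latency_seconds(second))                    # 8.0
    print(throughput_lcrs_per_second(first, second))  # 100.0
```

Working from cumulative counters and differencing successive samples keeps the collector stateless between polls apart from the previous sample.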

Streams monitoring tool

Screenshots: Monitor view, Connection view, Database list, Connection dashboard view, Detailed Streams view, Graph generator


Streams monitoring tool

• New features
  – Error tab (web application)
    • list of errors that have been reported by Streams processes
  – Availability tab (web application)
    • percentage availability of each instance, provided with availability plots
  – New metrics (monitor)
    • CPU consumption
    • Physical bytes
      – read
      – written
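
As an illustration of the "percentage availability of each instance" figure, the sketch below computes it from a history of periodic up/down checks. How the tool actually derives the number is not stated in the slides; the check history, the instance names and the function names here are assumptions.

```python
from typing import Dict, List


def availability_percent(checks: List[bool]) -> float:
    """Share of successful checks as a percentage (0.0 when there are no checks)."""
    if not checks:
        return 0.0
    return 100.0 * sum(checks) / len(checks)


def availability_by_instance(history: Dict[str, List[bool]]) -> Dict[str, float]:
    """Apply the percentage calculation to every monitored instance."""
    return {name: availability_percent(checks) for name, checks in history.items()}


# Hypothetical history: True means the instance answered the periodic check.
history = {
    "instance_a": [True, True, True, False],
    "instance_b": [True, True, True, True],
}
print(availability_by_instance(history))
# {'instance_a': 75.0, 'instance_b': 100.0}
```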

Streams monitoring tool

Screenshots: Errors list, CPU consumption, Availability

Streams monitoring tool

Open questions

• Proposed future features
  – Weekly reports (number of transactions applied, number of LCRs streamed, etc.)?
  – More notifications via mail (high latency, high CPU utilization, etc.)?
  – Some automation in Streams administration?
    • Detecting common failures (e.g. propagation hangs)
    • Procedure to resolve the failures
• Streams errors report:
  – Any action necessary at Tier-1?
  – Who is testing what?
• Email alerts
  – RAL still does not receive notifications
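
Two of the proposed features, threshold-based mail notifications and detection of propagation hangs, could look roughly like the sketch below. Nothing here comes from the existing tool: the threshold values, the function names and the "counter has stopped moving" heuristic are assumptions made only for illustration.

```python
from typing import List

# Assumed thresholds; real values would have to be agreed with the sites.
LATENCY_LIMIT_SECONDS = 300.0
CPU_LIMIT_PERCENT = 90.0


def threshold_alerts(latency_seconds: float, cpu_percent: float) -> List[str]:
    """Return one message per metric that exceeds its (assumed) limit."""
    alerts = []
    if latency_seconds > LATENCY_LIMIT_SECONDS:
        alerts.append(f"high latency: {latency_seconds:.0f} s")
    if cpu_percent > CPU_LIMIT_PERCENT:
        alerts.append(f"high CPU utilization: {cpu_percent:.0f} %")
    return alerts


def propagation_looks_hung(propagated_counts: List[int], min_samples: int = 3) -> bool:
    """Heuristic: the cumulative propagated-message counter has not moved
    over the last `min_samples` samples.

    A flat counter can also mean there was simply nothing to propagate, so a
    real check would need to confirm that messages are waiting at the source.
    """
    if len(propagated_counts) < min_samples:
        return False
    recent = propagated_counts[-min_samples:]
    return recent[0] == recent[-1]


print(threshold_alerts(latency_seconds=600.0, cpu_percent=95.0))
print(propagation_looks_hung([5, 9, 9, 9]))
```

The returned messages could then be handed to whatever mail mechanism the tool already uses for its error notifications.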

Local and Network monitoring

Open questions

• Is OEM sufficient?
• Which other tools?
• Which metrics should we pay attention to?
• “Homemade” tools for backup monitoring:
  – RAL, …
• Local monitoring with Nagios
  – Is this reasonable?
  – Any experience?
• TRIUMF (slides)
• BNL (slides)
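
To make the Nagios question more concrete, here is a minimal sketch of what a local check might look like. It relies only on the standard Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN); the choice of latency as the checked metric, the thresholds and the function name are assumptions, not something any site is known to run.

```python
from typing import Optional, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_latency(latency_seconds: Optional[float],
                  warn: float = 60.0,
                  crit: float = 300.0) -> Tuple[int, str]:
    """Map a latency value to a Nagios status code and a one-line message.

    `warn` and `crit` are assumed thresholds, in seconds.
    """
    if latency_seconds is None:
        return UNKNOWN, "UNKNOWN - latency not available"
    if latency_seconds >= crit:
        return CRITICAL, f"CRITICAL - latency {latency_seconds:.0f}s"
    if latency_seconds >= warn:
        return WARNING, f"WARNING - latency {latency_seconds:.0f}s"
    return OK, f"OK - latency {latency_seconds:.0f}s"


code, message = check_latency(120.0)
print(code, message)  # 1 WARNING - latency 120s
```

A real plugin script would print the message and end with `sys.exit(code)` so that Nagios picks up the status.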

Overall

• What, specifically, should Tier-1 sites monitor on their own databases?
• What does CERN want to know about the sites?
• What do Tier-1 sites need to know about the CERN databases?
• What do Tier-1 sites need to know about other Tier-1 sites?