Nagios in Power Transmission Utilities
Introduction & Agenda
• Brazilian Electrical Sector Overview
• CEEE-GT experience with Nagios Core
• Motivation for different areas of the company
– Telecommunications
– Automation
– Protection and Control
– Supervision
• Results and Future Plans
Brazilian Electrical Sector
• Regionalized until the mid-1990s.
• Regional companies controlled their respective areas and could hold vertical expertise:
– Generation, transmission and distribution of electricity.
• In the second half of the 1990s, the rules changed.
• The increasing interconnection between states created the need to regulate and discipline the electrical sector.
• ANEEL (the regulatory agency) and ONS (the national system operator) were created.
Brazilian Electrical Sector
• National Interconnected System (SIN).
• Biggest of its kind in the world.
• More than 100,000 km of transmission lines (230 kV and above).
• Only 1.7% of the energy used in the country is outside the interconnected system.
• A failure in a substation or transmission line can affect the whole country (blackout).
Company Presentation
• CEEE was founded in 1943.
• Operates in the three main areas of the Brazilian electrical sector: power generation (G), transmission (T) and distribution (D).
• The state government holds equity control of the company.
• Eletrobras, the federal government's main electricity provider, holds a considerable stake (~32%).
Company Presentation
• 3,800 employees.
• 6th largest company in Rio Grande do Sul state (117th largest in Brazil).
• Generates 75% of the state's hydroelectricity.
• Owns 5,781 km of transmission lines.
• Distributes electrical energy to one third of the state (3.5 million people).
Supervision Area
• The division was founded in the mid-1970s.
• Initially focused on the status data of the electrical system.
• As the system grew, greater demands accumulated.
• New devices were installed.
• New demands were made by the regulator.
• Need to reduce equipment downtime.
• Need to remotely control substations.
Supervision Area
• Composed mainly of electronic/electrical engineers and technicians.
• Weak computer knowledge among team members (none holds a degree in IT).
• Large gap between new and old employees, due to a long period without new hires.
• Old concepts and techniques are very difficult to change.
Motivation
• The amount of data has been growing exponentially.
• Much of this data is not directly linked to real-time operation.
• Growing number of data points to supervise versus selective user interest.
• Several of these points remain in alarm for a long time.
• This disrupts the work of the real-time staff.
• The maintenance staff is not informed of problems.
• This leads the system to become discredited.
Motivation
• Reduced revenue led to a long-term reduction in employees (retirements and no new hires).
• Telecontrol of substations became a priority in order to reduce the substation operator workforce.
• Higher system availability is required when telecontrol is used.
Motivation Overview
Substation Field Devices
Substation Protection Relays
Substation Automation Devices
Operation Center EMS
Operation Center HMI
National System Operator
Database Servers
Corporate Network
Motivation for Substation Devices
• Online graphical supervision of failures in Ethernet-based devices inside substations.
• Because of redundancy and the use of RSTP (or other redundancy protocols), faults often go unnoticed and failed devices are not replaced.
• Preventive maintenance, mainly in substations implemented with IEC 61850.
• Supervision also where there is a second communication channel to the Operation Center.
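One way to keep redundancy from hiding a fault is to check each redundant link on its own and raise a warning as soon as one member is down, even though traffic still flows. The sketch below is only an illustration of that idea: the function name and the link names are hypothetical, and how the up/down state is obtained (ping, SNMP, etc.) is left out. The exit codes follow the standard Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from enum import IntEnum


class NagiosState(IntEnum):
    """Standard Nagios plugin exit codes."""
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def check_redundant_links(link_up: dict[str, bool]) -> tuple[NagiosState, str]:
    """Evaluate a group of redundant links.

    `link_up` maps a (hypothetical) link name to True if the link is up.
    A single failed member is reported as WARNING so that the loss of
    redundancy is noticed before the remaining link also fails.
    """
    if not link_up:
        return NagiosState.UNKNOWN, "UNKNOWN - no links configured"

    down = sorted(name for name, up in link_up.items() if not up)
    if not down:
        return NagiosState.OK, f"OK - all {len(link_up)} links up"
    if len(down) < len(link_up):
        return (NagiosState.WARNING,
                "WARNING - redundancy lost, down: " + ", ".join(down))
    return NagiosState.CRITICAL, "CRITICAL - all links down: " + ", ".join(down)
```

A real plugin would print the message and exit with `int(state)`; whether a lost redundant member deserves WARNING or CRITICAL is a site policy choice.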
Motivation for Telecommunications
• Different communication devices and vendors.
– Multiplexers (SDH and SONET).
– Switches (Ethernet).
– Analog and digital radio (serial communication).
– Power line communication.
• Different management software.
• Architecture only drawn (no online state).
• Most susceptible to failures (links shared with other companies, weather, ...).
Nagios Usage: Data Excess
• The excessive number of points became a critical problem.
• There were a few attempts to solve the problem by reducing the number of points.
• It did not work for obvious reasons.
• We need to monitor an ever-increasing number of data points.
• Another attempt was to add alarm filters.
• These alarm filters end up making users forget most of the filtered points.
Nagios Usage: Data Separation
• Monitoring a growing number of points requires more elaborate solutions.
• Interest in this information is not the same for all teams.
• Nor does the monitoring frequency need to be the same.
• Thus, we sought to separate the information roughly into real-time and maintenance.
• The real-time system remains SAGE.
• Nagios was introduced as the maintenance system.
Nagios Usage: Why Nagios?
• Stable, developed over an extensive period of time.
• Expandable and customizable, with a wide range of add-ons.
• Open-source software, which meets the team's preferences.
• Active community of developers and users.
Nagios Usage: First Attempts
• In the past decade, the Telecommunications area made an attempt at monitoring with Nagios.
• This experiment was not successful.
– Lack of interest from potential users.
– Fine tuning of Nagios was needed.
– A person was needed to give daily attention to the maturation process, and there was none.
– Only part of the company's telecommunications system was monitored.
Nagios Usage: Installation Conditions
• In early 2011, 50% of the team was renewed.
• The entry of new members brought new ideas.
• The telecommunications system expanded considerably, with new multiplexers and switches.
• The telecommunications team experienced an influx of new employees.
• In 2012, the company lost more than 60% of its revenue due to the renewal of contracts by the Federal Government.
• This led to a pressing need for increased monitoring of the system and preventive action.
Nagios Usage: Installation
• Nagios was again considered as a way to monitor the status of various devices.
• Installation started in mid-2012.
• The primary focus was monitoring Linux systems.
• Soon it expanded to other systems and areas, such as the communication status of remote systems.
• Several features were added over these two years.
• Usage is growing in other areas of the company, such as Substation Automation.
• In June 2014 the system expanded to a second version, now installed in Telecommunications.
Nagios Usage: Panorama
• Supervision
• Telecommunication
Nagios Usage: Customized Services
• Script to check RAID disks.
• Configuration backup (manually changed devices).
• Configuration check (differences between the database and the Operation Center configuration).
• Serial communication state (RX/TX bytes).
• Proprietary protocols of telecommunication system devices (via telnet).
• Expect language scripts.
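As an illustration of what a custom RAID check can look like, here is a minimal sketch. It is not the script used at CEEE-GT: it assumes Linux software RAID, whose status is exposed as text in /proc/mdstat, and the function name `check_mdstat` is hypothetical. It returns a Nagios exit code (0 OK, 2 CRITICAL, 3 UNKNOWN) plus a one-line message; treating a degraded array as CRITICAL rather than WARNING is also an assumption.

```python
import re

# "md0 : active raid1 sdb1[1] sda1[0]" starts the description of an array.
ARRAY_RE = re.compile(r"^(md\w+)\s*:")
# "[UU]" means every member is up; "_" marks a missing or failed member.
STATUS_RE = re.compile(r"\[([U_]+)\]")


def check_mdstat(text: str) -> tuple[int, str]:
    """Return (nagios_exit_code, message) for the contents of /proc/mdstat."""
    arrays = 0
    degraded = []
    current = None  # name of the array whose status line we are waiting for

    for line in text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            arrays += 1
            continue
        status = STATUS_RE.search(line)
        if status and current:
            if "_" in status.group(1):
                degraded.append(f"{current} [{status.group(1)}]")
            current = None

    if arrays == 0:
        return 3, "UNKNOWN - no md arrays found"
    if degraded:
        return 2, "CRITICAL - degraded arrays: " + ", ".join(degraded)
    return 0, f"OK - {arrays} md array(s) healthy"
```

A plugin wrapper would read /proc/mdstat, print the message and exit with the returned code. Hardware RAID controllers need their vendor tools instead.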
Results
• It has provided notices of failures that would not be detected in a normal situation.
– Failure of one of the disks in a RAID system.
– Failure of the Emergency Control Scheme system.
– Failure of backup devices.
– Failure of backup communication channels.
• Reduced response time of maintenance teams in attending occurrences.
• Fault location with an integrated view.
Future and Beyond
• Transferring real-time data points to Nagios.
• Can be expanded to obtain more data from protection relays.
• Integration within the substation (IEC 61850, DNP LAN).
• Increasingly networked devices per substation, easily reaching 50 with today's equipment.
• Trend toward increasing the number of substation devices.
• Usage of Nagios in smart grids (bigger networks).
Future and Beyond
• Using Nagios reports to analyze potential points of future failure.
• Provides perspective on where to invest budget resources.
• Relieving the burden of repetitive work.
• Using Nagios as a "management" tool: email to decentralized teams to carry out maintenance on failed devices.
• Integration with other tools, such as automatic generation of maps, simulators, wiki, etc.
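A simple starting point for spotting likely future failure points is to count how often each device raised a confirmed problem. The sketch below is only an illustration under assumptions: it expects alert lines in the classic Nagios Core log layout (`[timestamp] SERVICE ALERT: host;service;state;state_type;attempt;output` and `[timestamp] HOST ALERT: host;state;state_type;attempt;output`), which should be verified against the local nagios.log, and the function name and host names are hypothetical.

```python
import re
from collections import Counter

ALERT_RE = re.compile(r"^\[(\d+)\] (HOST|SERVICE) ALERT: (.+)$")
PROBLEM_STATES = {"DOWN", "UNREACHABLE", "WARNING", "CRITICAL"}


def count_hard_problems(log_lines):
    """Count HARD problem alerts per host from Nagios log lines."""
    counts = Counter()
    for line in log_lines:
        match = ALERT_RE.match(line.strip())
        if not match:
            continue  # not an alert line
        kind = match.group(2)
        fields = match.group(3).split(";")
        if kind == "HOST":
            # host;state;state_type;attempt;output
            if len(fields) < 3:
                continue
            host, state, state_type = fields[0], fields[1], fields[2]
        else:
            # host;service;state;state_type;attempt;output
            if len(fields) < 4:
                continue
            host, state, state_type = fields[0], fields[2], fields[3]
        # SOFT states are retries in progress; only HARD states are confirmed.
        if state_type == "HARD" and state in PROBLEM_STATES:
            counts[host] += 1
    return counts
```

Calling `count_hard_problems(open(path))` on a log file and then `.most_common(10)` on the result gives a rough ranking of the devices that fail most often, which can feed maintenance and budget decisions.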
Questions?
Any questions?
Thanks!