server monitoring document
7/29/2019 Server Monitoring Document
Server Monitoring
Outline
What is a server? What does it do?
What is server monitoring?
Goals of monitoring
Benefits of monitoring
Components of monitoring
Monitoring parameters or counters
Monitoring tools
Choosing a tool
How does a tool work?
Conclusion
What is a server? What does it do?
A server is a physical computer dedicated to running one or more
services that serve the needs of users on a network or on the same
computer. Different types of servers are available, and the right one
depends on the service required. Some of them are listed below.
Application server
Web server
Database server
Proxy server
File Server
Mail Server, etc
Web Server: Serves web pages to the computers (clients) that connect to it.
Application Server: Handles all application operations between users and an organization's back-end applications or databases.
Mail Server: Stores users' email accounts and sends and receives email.
File Server: Stores files that can be accessed by other computers (clients).
Proxy Server: Sits between a client program and a server. It provides filtering, translation, connection sharing, etc.
Database Server: The back-end system of a database application that uses a client/server architecture. It performs tasks such as data analysis, storage, data manipulation, archiving, and other non-user-specific tasks.
In today's internet economy, a well-designed and smoothly operating
application provides a distinct competitive advantage: the ability to
reach customers around the world 24x7. Ensuring that all the elements
of the application residing on the server function properly is critical
to maximizing the company's investment. Monitoring the server is
therefore a critical element of any web presence.
Server monitoring is:
The process of automatically scanning servers on a network for irregularities or failures.
Monitoring a server's system resources such as CPU usage, memory consumption, I/O, network, disk usage, processes, etc.
Understanding server resource usage, which helps with capacity planning and provides a better end-user experience.
Ensuring that the server is capable of hosting the application.
Making sure that a server is active, healthy and responding to requests appropriately.
Identifying issues and fixing unexpected problems before they impact end-user productivity.
Getting real-time internal statistics from the server, such as CPU usage, number of open connections, amount of free memory, number of cache hits, etc.
Goals of Monitoring:
Determining whether server performance can be improved. For example,
by monitoring the server response time for frequently used requests,
you can determine whether changes are needed.
Troubleshooting problems. For example, if a page in the application
fails to download, the goal is to track down the cause using the
available resources.
Optimizing performance: achieving minimal response time and maximum
throughput by minimizing network traffic, disk I/O and CPU time.
Establishing a server performance baseline: this is done by taking
performance measurements over time. Each measurement should be
compared against the same measurement taken earlier.
For example, if the time needed to perform a set of actions increases,
examine those actions and take steps to improve server performance.
Once a server performance baseline exists, compare it with the current
server performance. The comparison may indicate areas where the server
needs to be reconfigured for better performance.
At a minimum, the following measurements should be taken to determine the baseline:
Peak and off-peak hours of operation
Query or response times
Server backup and restore completion times
Server reaction during downtime
Server response at overload
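The baseline comparison can be sketched in a few lines of Python. This is only an illustration: the metric names, the sample values and the 20% tolerance are assumptions made for the example, not figures from this presentation.

```python
from statistics import mean

def compare_to_baseline(baseline, current, tolerance=0.20):
    """Return metrics whose current mean exceeds the baseline mean
    by more than the given tolerance (0.20 means 20%)."""
    regressions = {}
    for name, base_samples in baseline.items():
        if name not in current:
            continue  # nothing to compare for this metric
        base = mean(base_samples)
        now = mean(current[name])
        if base > 0 and (now - base) / base > tolerance:
            regressions[name] = (base, now)
    return regressions

# Hypothetical measurements taken earlier (baseline) and today (current).
baseline = {"response_ms": [100, 110, 90], "backup_minutes": [30, 32, 31]}
current = {"response_ms": [150, 160, 140], "backup_minutes": [31, 30, 32]}

regressions = compare_to_baseline(baseline, current)
for name, (base, now) in regressions.items():
    print(f"{name}: baseline {base}, now {now}")
```

Each metric is compared against the same metric taken earlier, so only like-for-like measurements are flagged.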
Benefits of Monitoring:
Intrusion detection
Ensuring continuity and performance
Security considerations
Automatic overload prevention
Server downtime reaction
Scalability
To detect internal link errors, etc
Intrusion Detection:
Intrusion detection is a major reason for monitoring a server. More
often than not, servers are compromised without anyone knowing.
By monitoring the server for intrusion, an administrator becomes aware
that someone is trying to compromise server security, can take steps
to prevent it in the future, and may even find out who is intruding.
For example, if an intruder logs into the application and posts a huge
number of queries in a short span of time, the server can be pushed
into a denial of service.
The APSRTC website was hacked on Jan 13th, 2013:
Aryan Hackers, a Bangladeshi hacker group, broke into the APSRTC
server and controlled it for about 1 hour on Jan 13th, 2013.
The RTC's IT personnel got into action and restored the server.
Ensuring Continuity and Performance:
It is important that an application is available to customers 24x7.
An application that is frequently inaccessible is likely to lose
business and destroy customer loyalty. The server might go down for
reasons such as hardware failure, application failure, network traffic, etc.
By monitoring the server we can find problems before they impact the
business and provide continuous service.
Application unavailability means the business is effectively closed
for that time. It leads not only to business losses but also to damage
to the company's reputation.
Troubleshooting server performance is another reason. For example, if
users are not able to connect to the server, you may want to monitor
the server to troubleshoot the problem.
If a component such as a drive, motherboard or controller fails, the
server stays down. With monitoring, an administrator learns as soon as
possible that hardware has failed, so that the component can be replaced.
Security Considerations:
There are many other security features that an administrator may need
to monitor. Some possible examples are:
1. Denial of service filtering: a DoS filter rejects a connection if
the request is not authenticated within the monitored time.
2. Unused services monitoring: decide which services are needed and
disable the rest.
3. Client management: carefully manage clients by removing users who
no longer need access to the servers.
Overload prevention:
Every server has defined load limits, because it can handle only a
limited number of concurrent client connections (users). When a server
is near or over its limits, the situation is called overload.
Causes of overload:
1. Too much legitimate web traffic: a huge number of clients
connecting to the server within a short interval.
2. Computer worms, which sometimes cause abnormal traffic.
3. XSS viruses (cross-site scripting viruses).
4. Slow internet connections: requests are served very slowly and the
number of open connections increases until the server limits are reached.
5. Servers partially unavailable.
For the above reasons the server may become overloaded, which causes
business loss. To prevent this, monitoring lets the administrator know
about the overload so that the right action can be taken at the right time.
For instance, if the administrator identifies that the server is
overloaded, moving key services to another server can prevent the problem.
Avoid server downtime:
Server downtime means that the server is unavailable to provide service.
Downtime can result from overloaded processors, rapidly expanding
memory usage, disk errors and other problems.
For example, if you are in a video conference with an important client
and the server suddenly becomes busy and slows down, it either creates
a gap in the communication or can lead to losing the valuable client.
Moreover, it affects the reputation of the company.
Close monitoring and management of key server metrics prevents
downtime. Administrators can reduce server downtime with monitoring
utilities that alert when critical thresholds are passed, so that the
administrator can take corrective action.
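As a rough sketch of threshold-based alerting, the Python below compares one sample of metrics against a table of limits. The metric names, the sample values and the limits are hypothetical and chosen only for illustration; a real tool would collect the values from the server itself.

```python
def check_thresholds(metrics, thresholds):
    """Return one alert message for every metric at or above its limit."""
    alerts = []
    for name, limit in thresholds.items():
        value = metrics.get(name)
        if value is not None and value >= limit:
            alerts.append(f"{name} at {value} (threshold {limit})")
    return alerts

# Hypothetical critical thresholds, in percent.
thresholds = {"cpu_percent": 90, "memory_percent": 85, "disk_percent": 95}

# Hypothetical sample read from the server.
sample = {"cpu_percent": 97, "memory_percent": 60, "disk_percent": 95}

alerts = check_thresholds(sample, thresholds)
for message in alerts:
    print("ALERT:", message)
```

Keeping the limits in a separate table lets an administrator tune them without touching the checking logic.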
Detecting internal link errors:
It is not possible to check the application manually all the time. Good
server monitoring notifies you of any problems or errors in the
application's internal links, such as dead links or database errors,
so that you can resolve them before customers find them.
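A dead-link check could look roughly like the Python sketch below. It collects the links on a page and asks a status function for each one. The page content and the status table are made up for the example, and the status function is a stand-in (an assumption) for a real HTTP request.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)

def find_dead_links(html, get_status):
    """Return links whose status code, as reported by get_status, is 400 or above."""
    collector = LinkCollector()
    collector.feed(html)
    return [link for link in collector.links if get_status(link) >= 400]

# Hypothetical page and status codes; a real check would fetch each link.
page = '<a href="/home">Home</a> <a href="/old-report">Report</a>'
fake_statuses = {"/home": 200, "/old-report": 404}

dead = find_dead_links(page, lambda link: fake_statuses.get(link, 404))
print(dead)
```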
Components of Monitoring:
Monitoring a server involves the following components:
Identifying the events (parameters or counters) that must be monitored
Determining the event data to capture
Applying filters to limit the captured data
Monitoring (capturing) events
Saving captured event data
Analyzing the captured data
Replaying the captured event data
Generating the reports
Server performance is estimated based on the reports, and further actions are taken accordingly.
Identify the parameters to be monitored:
The parameters determine the activities that are monitored and
captured. These parameters depend on what is being monitored and why.
For example, when monitoring disk activity, it is not necessary to
monitor database server locks.
Determining the counter data to capture:
The event data describes each instance of a counter as it occurred.
For example, when capturing database lock events, it is useful to
capture data that describes the tables, users and connections affected
by the lock event.
1. Apply filters to limit the counter data collected: limiting the
counter data allows the system to focus on the specific types of
events relevant to the monitoring scenario.
2. Capturing (monitoring) events: this is the process of actively
monitoring the application to see what is occurring.
3. Save captured data: this allows the data to be analyzed at a later time.
4. Analyzing captured data: analyzing event data involves determining
what is happening and why. This analysis makes it possible to make
changes that can improve performance.
5. Replaying captured data: this makes it possible to set up a test copy
of the server environment in which the captured events are replayed as
they originally occurred on the real system. To determine the effect of
the parameters, replay allows the exact events that occur on a
production system to be analyzed in a test environment.
6. Generating the reports: based on the analysis, reports should be
generated for future reference.
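The six steps can be sketched end to end in Python. Everything here is an assumption made for illustration: the event fields, the event types and the use of JSON as the storage format are not taken from any particular monitoring tool.

```python
import json
from collections import Counter

def capture(events, wanted_types):
    """Steps 1 and 2: keep only events of the types relevant to the scenario."""
    return [e for e in events if e["type"] in wanted_types]

def save(events):
    """Step 3: serialize the captured events so they can be analyzed later."""
    return json.dumps(events)

def analyze(saved):
    """Step 4: count how often each event type occurred."""
    return Counter(e["type"] for e in json.loads(saved))

def replay(saved, handler):
    """Step 5: feed the saved events, in original order, to a test handler."""
    return [handler(e) for e in json.loads(saved)]

def report(counts):
    """Step 6: turn the analysis into report lines."""
    return [f"{etype}: {n} event(s)" for etype, n in sorted(counts.items())]

# Hypothetical raw events seen on the server.
raw_events = [
    {"type": "lock", "table": "orders", "user": "app"},
    {"type": "disk_read", "bytes": 4096},
    {"type": "lock", "table": "users", "user": "batch"},
    {"type": "login", "user": "admin"},
]

saved = save(capture(raw_events, {"lock", "login"}))
lines = report(analyze(saved))
replayed = replay(saved, lambda e: e["type"])
print("\n".join(lines))
```

Saving before analyzing is what makes later replay possible: the same stored events can be run against a test copy of the server.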
Estimate the server performance:
Server performance should be estimated based on the reports. If the
estimate shows positive results, the server is healthy. If it reveals
poor performance, the server configuration should be altered to
improve performance.
Monitoring Parameters or Counters
Monitoring Parameters or Counters:
Server performance monitoring is a complex subject, and it can be
daunting to be faced with a large set of performance counters to
choose from. Which ones are important to monitor? The choice of
counters depends on the role of the system being monitored. Counters
determine what to monitor and why.
Counters to check server availability:
System up time: tells how many seconds it has been since the server
was last rebooted.
Process (instance) elapsed time: tells how long that particular process has been running on your machine.
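Because the up time counter is reported in seconds, a small helper makes it easier to read. The Python below is a minimal sketch; the function name and the sample value are made up for the example.

```python
def format_uptime(seconds):
    """Convert an uptime in seconds into days, hours, minutes and seconds."""
    days, rest = divmod(int(seconds), 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"

# Hypothetical counter value: 93784 seconds since the last reboot.
uptime_text = format_uptime(93784)
print(uptime_text)  # 1d 2h 3m 4s
```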
Counters to determine how busy the server is:
% processor time: measures the total utilization of your processor by
all running processes.
% processor privileged time: tells processor utilization by kernel-mode code.
% processor user time: tells processor utilization by user-mode code.
Processor queue length: gives an indication of how many threads are
waiting for execution.
Requests queued: number of requests waiting to be processed by the
server.
Counters to determine availability of Memory/RAM:
Memory pages/sec: indicates the number of paging operations to disk
during the measuring interval. This is the primary counter to watch
for an indication of insufficient RAM to meet your server's needs.
Memory available bytes: if this counter is greater than 10% of the
actual RAM in your machine, then you probably have more than enough
RAM and don't need to worry.
Process (instance) working set: helps determine which process is
consuming larger and larger amounts of RAM.
Memory transition faults/sec: measures how often recently trimmed
pages on the standby list are re-referenced. If this counter slowly
starts to rise over time, it could also indicate that the server is
reaching a point where it no longer has enough RAM to function well.
Counters to check hardware:
System context switches/sec: measures how frequently the processor has
to switch from user mode to kernel mode to handle a request from a
thread running in user mode.
This counter generally rises with the workload, but over the long term
its value should remain fairly constant. If it suddenly starts
increasing, however, it may be an indication of a malfunctioning device.
Counters to check disk speed:
Physical disk transfers/sec: indicates how busy the disk is. If it goes
above 25 disk I/Os per second, then you've got poor response time for
your disk.
Physical disk (instance) % idle time: measures the percentage of time
that your hard disk is idle during the measurement interval.
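Two of the rules of thumb given for these counters (available memory above 10% of RAM, and no more than 25 disk transfers per second) can be written as a simple check. The Python below is a sketch only: the function name and the sample readings are hypothetical, and the limits are the ones quoted in these slides rather than universal values.

```python
def evaluate_counters(available_bytes, total_ram_bytes, disk_transfers_per_sec):
    """Apply the 10% available-memory rule and the 25 transfers/sec disk rule."""
    findings = []
    if available_bytes > 0.10 * total_ram_bytes:
        findings.append("memory: ok")
    else:
        findings.append("memory: low")
    if disk_transfers_per_sec > 25:
        findings.append("disk: poor response time")
    else:
        findings.append("disk: ok")
    return findings

GIB = 1024 ** 3

# Hypothetical readings: 1 GiB available out of 16 GiB, 40 disk transfers/sec.
findings = evaluate_counters(1 * GIB, 16 * GIB, 40)
print(findings)
```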
Server Monitoring Tools
Tools provide:
Management of application services without impacting the
infrastructure.
Automatic resolution of problems, such as re-establishing a network
connection or restarting an application.
Automatically generated daily scheduled reports.
The ability to quickly identify hardware and application issues
that may cause harm to the operating system.
Aid in capacity planning for reconfigurations.
Instant alerts through multiple media such as SMS, email, phone,
instant messenger and others.
24x7 monitoring.
Minimized IT cost.
Simpler detection and resolution of server and network problems.
Reduced downtime and business loss.
Commercial Monitoring Tools :
1. HP Network Node Manager
2. Observer by Network Instruments
3. Nimsoft Monitoring Solution
4. PacketTrap (part of Dell)
5. PRTG Network Monitor (free and commercial)
6. ServersCheck
7. SolarWinds
8. SevOne
9. WhatUpGold
10. Zyrion Traverse
Open Source Monitoring Tools :
1. Cacti
2. Nagios
3. Argus
4. PandoraFMS
5. Zenoss
6. Zabbix
7. AggreGate Network Manager (limited free)
8. isyVmon
9. NetXMS
10. InterMapper (limited free)
Choosing a tool:
A comprehensive set of tools is available for monitoring. The choice
of tool depends on the events to be monitored, the cost of the tool,
the type of monitoring, etc.
How does a tool work?
Monitoring a server with a tool involves:
Monitoring: IT staff configure the tool to monitor critical IT infrastructure, including
system metrics, network protocols, applications, services, servers, etc.
Alerting: if any infrastructure component fails, the tool notifies the administrator
about the failure.
Response: IT staff can acknowledge alerts and begin resolving them; otherwise
the alerts are sent repeatedly.
Reports: reports provide a historical record of failures, events, notifications
and alerts for later review.
Maintenance: Scheduled downtime prevents.
Planning: Trending and capacity planning graphs and reports allow you to
identify necessary infrastructure upgrades before failures occur.
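The alerting and response steps can be illustrated with a short Python sketch in which an alert keeps being sent on every monitoring cycle until someone acknowledges it. The class, the function and the component name are hypothetical and do not represent the behavior of any specific tool listed earlier.

```python
class Alert:
    """A failure notice that repeats until it is acknowledged."""
    def __init__(self, component):
        self.component = component
        self.acknowledged = False
        self.notifications_sent = 0

def run_cycle(alerts, notify):
    """One monitoring cycle: re-send every alert that is not yet acknowledged."""
    for alert in alerts:
        if not alert.acknowledged:
            notify(f"{alert.component} has failed")
            alert.notifications_sent += 1

sent = []                        # stands in for SMS or email delivery
alert = Alert("web-01")          # hypothetical failed component

run_cycle([alert], sent.append)  # first notification
run_cycle([alert], sent.append)  # repeated, nobody has responded yet
alert.acknowledged = True        # IT staff acknowledge the alert
run_cycle([alert], sent.append)  # nothing more is sent

print(sent)
```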
Conclusion:
[Diagram: server monitoring process, analysis and report generation, leading to action taken against server failure based on the reports]
Monitoring a server is important because it leads to better
performance and higher customer satisfaction. If problems are
identified as soon as possible, we can act on a failure before it
impacts the business.
Thank you
Presented By
N.V.Narasimha Rao