www.dell.com/powersolutions Reprinted from Dell Power Solutions, February 2006. Copyright © 2006 Dell Inc. All rights reserved. DELL POWER SOLUTIONS 87
SYSTEMS MANAGEMENT
Diagnostics that help identify hardware failures and
changes in a system’s condition can be critical for
administrators who must minimize server maintenance
and maximize uptime and reliability. Deploying an effi-
cient, effective diagnostics program can help reduce
system downtime considerably.
The Dell OpenManage Online Diagnostics pro-
gram is an integral part of Dell OpenManage Server
Administrator software. This program consists of a
suite of diagnostic test modules that run locally on a
Dell PowerEdge server and can be accessed remotely
over a network. Diagnostic tests can be selected from
a hierarchical menu representing the hardware that
the Online Diagnostics program discovers on a Dell
PowerEdge server. Tests can be run simultaneously or
sequentially in a single session. In addition, adminis-
trators can view progress and results for each selected
test or hardware component. Figure 1 shows the Diag-
nostic Selection screen in Dell OpenManage Server
Administrator.
Benefits of Dell OpenManage Online Diagnostics
Administrators can install and use Dell OpenManage
Online Diagnostics to diagnose performance issues with
their hardware. For instance, a Dell-supplied serial modem
might not be performing at the specified speed. An admin-
istrator can use the device tree in Online Diagnostics and
run diagnostics on the entire subset of devices that might
be causing the modem issue.
In this example, the administrator would run diagnos-
tics on the modem and its parent, the serial port. If the
modem is at fault, the modem diagnostic program would
fail one or more of the commands written to it and flag
an error to the administrator. However, if the problem is
caused by an improper serial port setting (such as a low
baud rate), then the serial port diagnostics would fail,
indicating a meaningful error to the administrator. In this
manner, the Online Diagnostics program can help admin-
istrators and Dell tech support staff to isolate hardware
issues and prevent false service dispatches, thus helping
to reduce service costs.
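The modem scenario above amounts to selecting a device together with its ancestors in the device tree and testing them as a group. A minimal sketch in Python, assuming a hypothetical `Device` class and `devices_to_test` helper (neither is part of Online Diagnostics, and the device names are illustrative only):

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Device:
    """Hypothetical node in a diagnostics device tree (not a Dell API)."""
    name: str
    parent: Optional["Device"] = None


def devices_to_test(device: Device) -> List[Device]:
    """Return the device and all of its ancestors, topmost ancestor first.

    Testing the parent first helps separate a misconfigured port from a
    faulty device attached to it.
    """
    chain = []
    node = device
    while node is not None:
        chain.append(node)
        node = node.parent
    chain.reverse()
    return chain


# Illustrative tree: serial port -> modem
serial_port = Device("serial port")
modem = Device("modem", parent=serial_port)

for dev in devices_to_test(modem):
    print(dev.name)
```

Ordering the parent before the child is a design choice in this sketch: a failure reported by the serial port test points to configuration, while a failure only in the modem test points to the modem itself.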
BY PRATHAP THATHIREDDY AND SRIKRISHNA SRIDHAR MURTHY
Introduction to Online Diagnostics for Dell PowerEdge Servers
Dell™ OpenManage™ Online Diagnostics is a comprehensive, cross-platform diagnos-
tics program designed to enhance operation of Dell PowerEdge™ servers and help
reduce service costs. This article introduces Dell OpenManage Online Diagnostics,
describes its features, and discusses administration scenarios.
Related Categories:
Dell OpenManage
Dell PowerEdge servers
Diagnostics
Systems management
Troubleshooting
Visit www.dell.com/powersolutions
for the complete category index.
Devices supported by Dell OpenManage Online Diagnostics
The Dell OpenManage Online Diagnostics program provides
diagnostics for several Dell-supplied and Dell-certified hard-
ware devices.
CMOS RAM diagnostics. CMOS (complementary metal-oxide
semiconductor) memory contains system configuration information.
The CMOS diagnostic program performs a checksum test of the
CMOS memory to determine whether any bytes are corrupt.
CD/DVD diagnostics. The CD/DVD diagnostic test identifies
drive-related mechanical problems such as those affecting the drive
door, spindle motor, and fault-sector and read functions.
Serial port diagnostics. Serial port diagnostics identify any
issues, such as baud rate, related to serial port configurations. They
also cover communication issues such as internal loopback,
interrupt handling, and internal registers. These diagnostics can be used
to diagnose performance-related issues that result from improper
configuration of serial devices.
Parallel port diagnostics. This test diagnoses parallel port
configurations and communication-related issues. It can be used
to diagnose performance-related issues that result from improper
configuration of parallel devices.
Modem diagnostics. Modem diagnostics analyze communica-
tion registers on the modem. This also helps to diagnose any modem
hardware issues in sending and receiving commands.
Network interface controller diagnostics. This test is used
to diagnose any network communication and configuration-
related issues. These advanced diagnostics can be very helpful
to administrators in resolving issues with network configuration
on Dell servers.
Memory diagnostics. Memory diagnostics are designed to
test the system memory’s storage integrity and its ability to store
data accurately. This test verifies that data paths, error-correction
circuits, and memory devices are working correctly.
Dell Remote Access Controller (DRAC) diagnostics. The DRAC
provides IT administrators with continuous access to remote
Dell servers. DRAC diagnostics analyze DRAC hardware and
communication issues.
USB controller diagnostics. USB controller diagnostics are
designed to identify hardware and communication-related issues
of USB controllers and any attached devices.
Floppy disk diagnostics. Floppy disk diagnostics detect prob-
lems with floppy disk controllers and their related components, such
as the motor, read/write mechanism, and floppy disk sectors.
PCI diagnostics. PCI diagnostics detect any driver-related
errors or interrupt request (IRQ) sharing warnings for PCI devices
in the system.
RAID controller diagnostics. Dell PowerEdge RAID controller
diagnostics report problems with RAID controller hardware, batter-
ies, and attached disks.
SCSI controller diagnostics. SCSI controller diagnostics detect
problems with SCSI controller hardware and attached devices such
as SCSI hard drives, tape drives, and autoloaders.
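As an illustration of the checksum test described for CMOS RAM above, the sketch below recomputes a checksum and compares it with the stored value. It assumes the classic PC/AT CMOS layout (bytes 0x10 through 0x2D summed, with the 16-bit result stored at 0x2E and 0x2F). That layout and the function names are assumptions for illustration, not a description of how the Dell diagnostic is implemented, and the code works on an in-memory copy rather than real hardware.

```python
CHECKSUM_START = 0x10   # first byte covered by the checksum (assumed layout)
CHECKSUM_END = 0x2D     # last byte covered by the checksum (inclusive)
CHECKSUM_HIGH = 0x2E    # stored checksum, high byte
CHECKSUM_LOW = 0x2F     # stored checksum, low byte


def compute_checksum(cmos: bytearray) -> int:
    """Sum the covered bytes, truncated to 16 bits."""
    return sum(cmos[CHECKSUM_START:CHECKSUM_END + 1]) & 0xFFFF


def checksum_ok(cmos: bytearray) -> bool:
    """Compare the recomputed checksum with the stored one."""
    stored = (cmos[CHECKSUM_HIGH] << 8) | cmos[CHECKSUM_LOW]
    return compute_checksum(cmos) == stored


# Build a 64-byte image with a valid checksum, then corrupt one byte.
image = bytearray(64)
image[0x10:0x2E] = bytes(range(30))
total = compute_checksum(image)
image[CHECKSUM_HIGH], image[CHECKSUM_LOW] = total >> 8, total & 0xFF
print(checksum_ok(image))   # True

image[0x15] ^= 0xFF         # simulate a corrupt byte
print(checksum_ok(image))   # False
```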
Features of Dell OpenManage Online Diagnostics
The Dell OpenManage Online Diagnostics program provides IT
administrators with various features to perform diagnostic tests.
CLI- and GUI-based tests
Diagnostic tests can be performed using OS-supported command-
line interfaces (CLIs) locally or remotely over Telnet or Secure Shell
(SSH). This feature can be helpful to administrators when they are
scripting diagnostics using well-known scripting tools. Graphical
user interface (GUI)–based diagnostics can be performed via HTTP
over Secure Sockets Layer (HTTPS) ports using well-known brows-
ers such as Microsoft Internet Explorer or Mozilla Firefox.
Device enumeration and test/device inventory
This feature enables system administrators to re-inventory all sup-
ported devices after hardware reconfiguration. This feature can
prove helpful after adding components such as plug-and-play
devices, installing drivers for supported hardware components,
or implementing hot-swappable devices. The test/device selection
feature allows administrators to select desired tests and specify the
devices on which to run them.
Diagnostic scheduler
The diagnostic scheduler feature can help increase system uptime
by allowing administrators to select diagnostic tests to run at specific
times and frequencies. This can help administrators schedule server
maintenance and run appropriate diagnostics without affecting
business deliverables. Figure 2 displays the Dell OpenManage
Server Administrator Diagnostic Scheduling screen.
Figure 1. Dell OpenManage Server Administrator Diagnostic Selection screen
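Running a test "at specific times and frequencies" can be pictured as a start time plus a repeat interval. The sketch below is only an illustration of that idea; the `next_runs` helper and the weekly Sunday 02:00 window are hypothetical and do not reflect the scheduler's actual options.

```python
from datetime import datetime, timedelta
from typing import List


def next_runs(first_run: datetime, every: timedelta, count: int) -> List[datetime]:
    """Return `count` run times starting at `first_run`, spaced `every` apart."""
    return [first_run + i * every for i in range(count)]


# Example: a weekly run in a Sunday 02:00 maintenance window.
start = datetime(2006, 2, 5, 2, 0)   # 5 February 2006 was a Sunday
for run in next_runs(start, timedelta(weeks=1), 3):
    print(run.isoformat())
```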
Diagnostic test review, test status, and result history
The diagnostic test review option lets administrators review the
selected diagnostic tests before submission for execution. This
means that administrators can change settings that are predefined.
Other options include choosing test-specific settings such as halt-
on-error, specifying the number of iterations a test should run, or
even scheduling a test to run at a later time. Figure 3 displays the
Diagnostic Selection Review screen.
The diagnostic test status feature enables administrators to
monitor the status of the diagnostic tests that are running, allowing
them to view the progress of each test under execution. Figure 4
displays the Diagnostic Status screen.
The diagnostic result history feature allows administrators to
view the result history log. This log file contains the results of
previously run diagnostic tests. This log can help administrators
monitor events such as warnings, device failures, and the work-
ing condition of the device under test. The log file size can be
configured to a maximum of 5 MB. Figure 5 shows the Diagnostic
Result History screen.
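The article does not say what happens when the result history log reaches its configured limit. One common approach, shown here purely as an assumption, is to discard the oldest entries so that the newest results always fit. `CappedLog` is a hypothetical class for illustration, not part of Online Diagnostics.

```python
from collections import deque

MAX_LOG_BYTES = 5 * 1024 * 1024  # 5 MB maximum mentioned in the article


class CappedLog:
    """Keep the newest entries whose total encoded size fits under a byte limit."""

    def __init__(self, max_bytes: int = MAX_LOG_BYTES) -> None:
        self.max_bytes = max_bytes
        self.entries = deque()   # (line, encoded_size) pairs, oldest first
        self.size = 0

    def append(self, line: str) -> None:
        encoded = len(line.encode("utf-8")) + 1  # +1 for the newline
        self.entries.append((line, encoded))
        self.size += encoded
        # Assumed policy: drop the oldest entries until the log fits again.
        while self.size > self.max_bytes and self.entries:
            _, dropped = self.entries.popleft()
            self.size -= dropped

    def lines(self) -> list:
        return [line for line, _ in self.entries]


# Tiny limit so the trimming is visible.
log = CappedLog(max_bytes=20)
for result in ("modem ok", "cdrom ok", "nic fail"):
    log.append(result)
print(log.lines())
```

With a 20-byte limit and three 9-byte entries, the oldest entry is dropped and the two most recent results remain.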
Hardware configuration changes and change history
The hardware configuration changes feature gives administrators
the option to view changes that have occurred to testable devices
on the system since the last reboot, restart of the secure port server,
or re-enumeration. It also reports changes in system configuration
such as the addition or removal of a hard drive. Figure 6 displays
the Diagnostic Hardware Configuration Changes screen.
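Reporting configuration changes can be thought of as comparing the device inventory taken at the last enumeration with the current one. The sketch below illustrates that comparison under stated assumptions: the `inventory_changes` function, the device identifiers, and the description strings are all hypothetical, not the program's data model.

```python
from typing import Dict, List, Tuple


def inventory_changes(before: Dict[str, str],
                      after: Dict[str, str]) -> List[Tuple[str, str]]:
    """Compare two inventories keyed by a device identifier.

    Returns (change, device_id) pairs, where change is "added",
    "removed", or "changed".
    """
    changes = []
    for dev in sorted(after.keys() - before.keys()):
        changes.append(("added", dev))
    for dev in sorted(before.keys() - after.keys()):
        changes.append(("removed", dev))
    for dev in sorted(before.keys() & after.keys()):
        if before[dev] != after[dev]:
            changes.append(("changed", dev))
    return changes


# Illustrative inventories: a second hard drive was added since the last enumeration.
previous = {"disk0": "36 GB SCSI drive", "nic0": "Gigabit NIC"}
current = {"disk0": "36 GB SCSI drive", "disk1": "73 GB SCSI drive", "nic0": "Gigabit NIC"}
print(inventory_changes(previous, current))
```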
Figure 2. Dell OpenManage Server Administrator Diagnostic Scheduling screen
Figure 3. Dell OpenManage Server Administrator Diagnostic Selection Review screen
Figure 4. Dell OpenManage Server Administrator Diagnostic Status screen
Figure 5. Dell OpenManage Server Administrator Diagnostic Result History screen
The change history feature enables administrators to view a
log file that contains a history of hardware configuration changes.
The log file size can be configured to a maximum size of 5 MB.
Figure 7 shows the Diagnostic Hardware Configuration Change
History screen.
Configuration options
The Dell OpenManage Online Diagnostics program lets administrators specify
options for running diagnostic tests. Both application settings and
test-execution settings can be specified. Figure 8 shows the Diag-
nostic Application Settings screen.
Options available under application settings include remote
method invocation (RMI) registry port, result history log size, and
hardware change history log size. Options available under test-
execution settings include halt execution on first error, quick test,
the number of passes, and runtime.
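To show how two of these test-execution settings interact, the sketch below models "number of passes" and "halt execution on first error" in a simple test loop. The `run_tests` function and the lambda-based tests are hypothetical stand-ins; quick test and runtime are not modeled, and nothing here reflects the program's internals.

```python
from typing import Callable, Dict, List, Tuple

Test = Callable[[], bool]  # a test returns True on pass, False on failure


def run_tests(tests: Dict[str, Test], passes: int = 1,
              halt_on_first_error: bool = False) -> List[Tuple[int, str, bool]]:
    """Run every test `passes` times and collect (pass_number, name, ok) results."""
    results = []
    for pass_number in range(1, passes + 1):
        for name, test in tests.items():
            ok = test()
            results.append((pass_number, name, ok))
            if not ok and halt_on_first_error:
                return results  # stop the whole run at the first failure
    return results


# Illustrative tests: the serial port passes, the modem fails.
tests = {"serial port": lambda: True, "modem": lambda: False}

print(run_tests(tests, passes=2, halt_on_first_error=True))
print(len(run_tests(tests, passes=2)))
```

With halt-on-error enabled the run stops after the modem fails in the first pass; without it, both tests run in both passes and four results are recorded.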
A powerful systems management tool
The Dell OpenManage Online Diagnostics program is an integrated
component within Dell OpenManage Server Administrator enter-
prise server management software. It provides a rapid method by
which administrators can diagnose hardware malfunctions and
identify solutions to such problems. IT administrators can use the
Online Diagnostics program to help reduce management costs and
enhance server management.
Prathap Thathireddy is a senior engineering analyst in the Product Group Test Engineering Group at Dell. He has a B.S. in Computer Maintenance and Engineering from Osmania University in Hyderabad, India. He has seven years of IT experience as a system administrator, technology consultant for storage software, and test engineer.
Srikrishna Sridhar Murthy is an engineering analyst in the Online Diagnostics Group at Dell. He has a B.E. in Computer Science from Birla Institute of Technology and Science in Pilani, India.
FOR MORE INFORMATION
Dell OpenManage: www.dell.com/openmanage
Figure 6. Dell OpenManage Server Administrator Diagnostic Hardware Configuration Changes screen
Figure 7. Dell OpenManage Server Administrator Diagnostic Hardware Configuration Change History screen
Figure 8. Dell OpenManage Server Administrator Diagnostic Application Settings screen