Reviewing AIX Error and Boot Logs

July 2013 | by David Tansley

AIX provides comprehensive logging of events: some are errors requiring attention and others are just notifications. For system administrators tasked with making sure the system is running without major issues, logging provides alerts or apprises them of events as they happen.

AIX offers different logs depending on the action and when it occurred. These logs hold information on the boot-up process, console, hardware and system software events. It’s up to the system admin to take action on these events, because once AIX has published the log, its job is done.

Logs, Logs, Logs

AIX offers more than errpt; several other logs are available. Use the alog command to list them and pick one to view:

# alog -L
boot
bosinst
nim
console
cfg
mdmplog
lvmt
lvmcfg
dumpsymp

When issues arise during the boot-up process, for example, and you’re not at the console, you can review the start-up process messages, particularly the boot and console messages. To view one of the available logs, pass its name to:

alog -o -t <log type>

For example, to view the console log:

alog -o -t console
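When collecting evidence after a problem boot, it can be handy to capture every log in one pass. The following is a minimal sketch rather than an official procedure: the function name dump_alogs and the banner format are made up for illustration, and it relies only on the alog -L and alog -o -t invocations shown above.

```shell
# dump_alogs: print every alog log type, each preceded by a banner line.
# The function name and the "=====" banner are illustrative, not part of AIX.
dump_alogs() {
    alog -L | while read -r logtype; do
        # Skip blank lines in the listing.
        [ -n "$logtype" ] || continue
        printf '===== %s =====\n' "$logtype"
        # Redirect stdin so the inner alog cannot consume the list of names.
        alog -o -t "$logtype" </dev/null
    done
}
```

A typical call would redirect the result to a file, for example dump_alogs > /tmp/alogs.txt, so it can be reviewed away from the console.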

Logging Your Own Entries

The standard errpt listing shows hardware and software events that have occurred in AIX. However, you might want a message generated and inserted into the error log after some user interaction, for instance, when a system admin has made a change. This makes the change notification visible via errpt. Like the logger command, which writes to the system log (messages file), errlogger writes an operator notification entry to the error log. For example, having completed an AIX upgrade, you could post that to the error log so other users can see it, like so:

errlogger "AIX upgrade completed - no errors- test"
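If several admins post notes this way, a thin wrapper can record who made each change. This is only a sketch: the function name log_change and the "CHANGE by" prefix are illustrative conventions, not anything AIX defines, and the wrapper simply hands one string to errlogger as in the example above.

```shell
# log_change: write an operator note to the error log, prefixed with the
# name of the user who ran it. The function name and the "CHANGE by" prefix
# are illustrative conventions, not part of AIX.
log_change() {
    if [ $# -eq 0 ]; then
        echo "usage: log_change message..." >&2
        return 1
    fi
    # id -un prints the effective user name; "$*" joins all arguments.
    errlogger "CHANGE by $(id -un): $*"
}
```

Calling log_change "AIX upgrade completed - no errors" would then post the note, and the message text should appear in the detailed listing from errpt -a.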

http://www.ibmsystemsmag.com/CMSTemplates/IBMSystemsMag/Pri...


Working With errpt

The first thing AIX admins should do is arrange to receive event notifications by email. Errors and warnings will then be emailed as well as posted to the error log. First, create an email alias containing all system admins’ addresses in the /etc/mail/aliases file. Then insert the alias into the notification list using the following smit selections: smit diag, current shell diagnostic, task selection, automatic error log notification. From then on, you’ll receive an email for each entry as it’s posted to the error log.

The errpt summary listing has the following columns:

identifier, timestamp, type, class, resource name, description.

A typical list entry could be:

A6DF45AA 0410183413 I O RMCdaemon The daemon is started.
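The timestamp column packs the date and time as MMDDhhmmYY, so 0410183413 in the entry above reads as 10 April 2013 at 18:34. A small helper can make listings easier to scan. This is a sketch only: the name errpt_ts is hypothetical, and it assumes every two-digit year belongs to the 2000s.

```shell
# errpt_ts: convert an errpt timestamp (MMDDhhmmYY) to "YYYY-MM-DD hh:mm".
# Assumption: the two-digit year is in the 2000s.
# Prints nothing if the argument is not exactly ten digits.
errpt_ts() {
    printf '%s\n' "$1" |
        sed -n 's/^\([0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\)\([0-9][0-9]\)$/20\5-\1-\2 \3:\4/p'
}

errpt_ts 0410183413    # prints 2013-04-10 18:34
```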

Some system admins view the summary listing, then the full listing, and clear the whole error log when done, using the following commands:

errpt

errpt -a

errclear 0

However, you can be more selective. To clear entries older than two days, use:

errclear 2

To clear all software errors by using the error class, try:

errclear -d S 0

To clear down all ent0 entries:

errclear -N ent0 0

To clear all SYSPROC entries:

errclear -N SYSPROC 0

To clear by identifier (note the lowercase -j; uppercase -J selects by error label instead):

errclear -j <identifier> 0

In the last example, identifier is used to locate and clear an entry. It can also be used to view entries:

errpt -j <identifier>

To view the full entries by identifier:

errpt -a -j <identifier>

Of course, it’s OK to get information from errpt using the identifier, but sometimes you need to keep it simple. To extract all entries relating to, say, hdisk1, filter by resource name:

errpt -N hdisk1


To extract all entries relating to ent0, try:

errpt -N ent0

If you want to view entries based on hardware or software, simply supply the class type. To view any hardware-related issues, for instance, use:

errpt -d H

Similarly for software, which would include core dumps and shutdowns, use:

errpt -d S

For operator events, including notices, file system space issues and services that terminate:

errpt -d O

Another class, U (undetermined), covers events that don’t fall into any other category.
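To get a quick feel for which class dominates the log, the summary listing can be tallied by its class column. The following is a minimal sketch, assuming the column order shown earlier (class is the fourth field) and a header line that begins with IDENTIFIER; the function name count_by_class is illustrative.

```shell
# count_by_class: read an errpt summary listing on standard input and print
# one "class count" pair per line, sorted by class letter.
# The function name is illustrative, not part of AIX.
count_by_class() {
    awk '$1 != "IDENTIFIER" && NF >= 4 { n[$4]++ }   # field 4 is the class
         END { for (c in n) print c, n[c] }' | sort
}
```

On a live system the call would be errpt | count_by_class, giving one line each for whichever of H, S, O and U are present.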

Don’t Report These Errors

There are occasions when the errpt listing gets filled with notifications you don’t really care about. You still want AIX to log them, just not report them. This could be due to a rush of notifications that you don’t want reported until a certain issue has been fixed. To view the entries in the error template repository that currently have reporting disabled, use:

errpt -t -F Report=0

To view the current repository list containing the complete list of identifiers, labels, descriptions, etc., try:

errpt -t

Consider a scenario where you wish to stop the reporting of events for a disk RAID. The system repeatedly tries to rebuild, but you don’t need AIX to keep telling you. To disable the reporting of the RAID rebuild, first obtain the identifier, FE7D0EED in this case, by listing the errpt repository. Then disable reporting of that identifier:

# errupdate <hit return>
=FE7D0EED: <hit return>
Report=false
<hit CTRL-D>
<hit CTRL-D>
0 entries added.
0 entries deleted.
1 entries updated.
#

In the output above, the “=” sign tells errupdate to modify an existing entry. The text also shows where you should hit return and CTRL-D in the interactive errupdate utility. To confirm that reporting was disabled, use the errpt -t -F Report=0 command. At some point, you’ll want to re-enable this report. To do so:

# errupdate <hit return>
=FE7D0EED: <hit return>
Report=true
<hit CTRL-D>
<hit CTRL-D>
0 entries added.
0 entries deleted.
1 entries updated.
#

Again, review the repository to check identifiers that have been disabled/enabled from reporting.
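The same change can be scripted instead of typed. The sketch below builds the two-line stanza used in the sessions above and pipes it to errupdate; it assumes errupdate treats piped standard input the same way as typed input, and the function name set_report is hypothetical.

```shell
# set_report: turn reporting on or off for one error identifier by feeding
# a stanza to errupdate on standard input.
# The function name is illustrative; the stanza mirrors the interactive
# sessions above. Assumption: errupdate accepts the stanza from a pipe.
set_report() {
    errid=$1
    state=$2
    case $state in
        true|false) ;;
        *) echo "usage: set_report identifier true|false" >&2; return 1 ;;
    esac
    printf '=%s:\nReport=%s\n' "$errid" "$state" | errupdate
}
```

For example, set_report FE7D0EED false would disable reporting for the identifier above and set_report FE7D0EED true would restore it; errpt -t -F Report=0 remains the way to check the result.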

If Logging Stops

If your errlog stops logging or reporting events, chances are the log is full or corrupted. A quick fix is to start again with an empty file. First, stop the error logging daemon:

# /usr/lib/errstop

Next, remove the /var/adm/ras/errlog:

# rm /var/adm/ras/errlog

Restart it:

# /usr/lib/errdemon

You’re good to go. To view attributes relating to the errlog, use:

# /usr/lib/errdemon -l
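Deleting the log discards whatever history it still holds, so a more cautious variant of the steps above moves the file aside instead. This is a sketch under stated assumptions, not the article's procedure: the function name reset_errlog, the timestamp suffix and the ERRLOG, ERRSTOP and ERRDEMON override variables are all hypothetical, and the defaults are the paths used above.

```shell
# reset_errlog: stop the error daemon, move the current log aside, restart.
# The function name, the backup suffix and the ERRLOG/ERRSTOP/ERRDEMON
# override variables are illustrative; the defaults are the paths used above.
reset_errlog() {
    errlog=${ERRLOG:-/var/adm/ras/errlog}
    backup=$errlog.$(date +%Y%m%d%H%M%S)

    # Carry on even if the daemon was already stopped.
    "${ERRSTOP:-/usr/lib/errstop}" ||
        echo "warning: errstop failed; daemon may not have been running" >&2

    # Keep the old file for later inspection instead of deleting it.
    if [ -f "$errlog" ]; then
        mv "$errlog" "$backup" || return 1
        echo "saved old log as $backup"
    fi

    "${ERRDEMON:-/usr/lib/errdemon}"
}
```

If the saved copy is not corrupted, it can usually still be read by pointing errpt at it with the -i flag.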

IBM Systems Magazine is a trademark of International Business Machines Corporation. The editorial content of IBM Systems Magazine is placed on this website by MSP TechMedia under license from International Business Machines Corporation.

©2015 MSP Communications, Inc. All rights reserved.
