FRU(1M)								       FRU(1M)

NAME
     fru - Field replacement unit analyzer for Challenge/Onyx systems

SYNOPSIS
     fru [-a] namelist corefile

DESCRIPTION
     fru is a hardware state analyzer that provides board replacement
     information based on system crash dumps.  The output provided by fru
     displays what system boards, if any, are the most likely suspects that
     might have induced a hardware failure.

     fru can be run on any namelist and corefile specified on the command
     line.  namelist contains symbol table information needed for symbolic
     access to the system memory image being examined.  This will typically be
     the unix.N kernel copied into /var/adm/crash, where N is the number of
     the crash dump you are analyzing.  corefile is a file containing the
     system memory image.  This will typically be the vmcore.N.comp file
     copied into /var/adm/crash by savecore(1) when the machine reboots after
     a system panic.  If the memory image being analyzed is from a system core
     dump (vmcore.N.comp), then namelist must be a copy of the unix file that
     was executing at the time (unix.N).
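
     The pairing of namelist and corefile for a given dump number can be
     sketched as a small shell helper.  This is only an illustration: the
     function name fru_cmd and the CRASHDIR variable are hypothetical and
     are not part of fru.  The sketch assumes that unix.N and vmcore.N.comp
     have already been saved into the crash directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of fru: print the fru command line
# for crash dump number N, after checking that both files exist.
CRASHDIR=${CRASHDIR:-/var/adm/crash}

fru_cmd() {
    n=$1
    namelist=$CRASHDIR/unix.$n
    corefile=$CRASHDIR/vmcore.$n.comp
    for f in "$namelist" "$corefile"; do
        if [ ! -f "$f" ]; then
            echo "fru_cmd: missing $f" >&2
            return 1
        fi
    done
    # The namelist and corefile carry the same dump number N.
    echo "fru $namelist $corefile"
}
```

     Calling fru_cmd 0 prints a command line that uses unix.0 and
     vmcore.0.comp from the same directory, or fails if either is missing.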

     Note that fru cannot be run against live systems, as there is no system
     board replacement information available while the system is running
     properly.

     The fru command accepts the following option.  By default, all
     information is sent to the standard output:

     -a		Print the entirety of the error dump buffer, which might not
		be complete in the kernel console buffer of the core dump.

NOTES
     If fru finds a hardware error state, it will try to report a confidence
     level for each system board (and in some cases, for the components on a
     board).  A confidence level indicates how strongly fru suspects that the
     reported board has a problem.  Typically, each board in the system that
     reports anything in the hardware error state is assigned a 10%
     confidence level.  Note that there are only a few levels of confidence,
     and it is important to recognize what the percentages mean:

	 10%	  The board was witnessed in the hardware error state only.
	 30%	  The board has a possible error, with a low likelihood.
	 40%	  The board has a possible error, with a medium likelihood.
	 70%	  The board has a *probable* error, with a high likelihood.
	 90%	  The board is a *definite* problem.
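
     The table above can be expressed as a lookup, which may be handy when
     annotating saved reports.  This is a minimal sketch; the function name
     fru_meaning is hypothetical and is not provided by fru.

```shell
#!/bin/sh
# Hypothetical helper: translate a fru confidence percentage into the
# meaning given in the table above.
fru_meaning() {
    case $1 in
        10) echo "witnessed in the hardware error state only" ;;
        30) echo "possible error, low likelihood" ;;
        40) echo "possible error, medium likelihood" ;;
        70) echo "probable error, high likelihood" ;;
        90) echo "definite problem" ;;
        *)  echo "not a documented confidence level" ;;
    esac
}
```

     Any value other than the five documented levels is reported as
     undocumented rather than rounded to the nearest level.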

     Given that multiple boards may be reported, care should be taken before
     replacing a board on the system.  For example, if two boards are
     reported at 10%, that is not enough confidence that the boards listed
     are bad.  If there is one board at 70% or better, however, there is a
     good likelihood that the board listed is a problem and should be
     replaced.  Boards at 30% to 40% are questionable, and should be reviewed
     based on how often the specific board (in the same slot) fails across
     system crashes.
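
     The replacement guidance above can be sketched as a threshold check.
     The function name fru_action and the wording of its answers are
     hypothetical; the thresholds are the 70%, 30% to 40%, and 10% levels
     described in this section.

```shell
#!/bin/sh
# Hypothetical helper: suggest an action for one reported confidence
# level, following the guidance in this section.
fru_action() {
    pct=${1%\%}    # accept "70" as well as "70%"
    if [ "$pct" -ge 70 ]; then
        echo "replace"
    elif [ "$pct" -ge 30 ]; then
        echo "review failure history"
    else
        echo "insufficient evidence"
    fi
}
```

     The result is only a starting point; it does not replace the review of
     the hardware error state printed by fru.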

     The objective is to catch real hardware problems, rather than just
     replacing boards on systems where there isn't a problem.
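
     One way to see whether the same board in the same slot keeps being
     reported is to save the fru output for each crash dump and count the
     board and slot pairs across the saved files.  The sketch below is an
     assumption-laden illustration: fru_tally is a hypothetical name, the
     saved report files are whatever the administrator chose to keep, and
     the pattern assumes summary lines of the form "on the MC3 board in
     slot 3: 90% confidence." as in the sample output in this page.

```shell
#!/bin/sh
# Hypothetical helper: count how often each board/slot pair appears in
# the summary lines of saved fru reports given as arguments.
# Assumes summary lines of the form:
#   ++   on the MC3 board in slot 3: 90% confidence.
fru_tally() {
    sed -n 's/.*on the \(.*\) board in slot \([0-9][0-9]*\): [0-9][0-9]*% confidence.*/\1 slot \2/p' "$@" |
        sort | uniq -c |
        # Move the count produced by uniq -c to the end of the line.
        awk '{ n = $1; $1 = ""; sub(/^ /, ""); print $0 ": " n }'
}
```

     Each output line names a board and slot followed by the number of
     matching summary lines, so a pair that recurs across crash dumps
     stands out.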

     Here is some sample output from a fru analysis on a system crash dump:

	 # fru -a /var/adm/crash/unix.0 /var/adm/crash/vmcore.0.comp
	 ---------------------------------------------------------------
	     FRU ANALYZER (2.2):
	     ++ MEMORY BANK: leaf 1 bank 0 (B)
	     ++	  on the MC3 board in slot 3: 90% confidence.
	     ++ END OF ANALYSIS
	 ---------------------------------------------------------------

	 HARDWARE ERROR STATE:
	 +  IP19 in slot 2
	 +    CC in IP19 Slot 2, cpu 3
	 +	CC ERTOIP  Register: 0x10
	 +	  4: Parity Error on Data from D-chip
	 +  MC3 in slot 3
	 +	MA EBus Error register: 0x4
	 +	  2: My EBus Data Error
	 +	MA Leaf 1 Error Status Register: 0x2
	 +	  1: Read Uncorrectable (Multiple Bit) Error
	 +	MA Leaf 1 Bad Memory Address: 0x3fb27380
	 +	  Slot 3, leaf 1, bank 0 (B)
	 +  IO4 board in slot 15
	 +	IA EBUS Error Register: 0x201
	 +	   0: Sticky Error
	 +	   9: DATA_ERROR Received

     In this example, it would be a good idea to have the memory in leaf 1,
     bank 0 (B) changed, and to have the MC3 examined.  If the memory and
     board in that slot have been replaced before, further analysis of the
     hardware on the machine should be completed instead.
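
     When the output has been saved to a file, the analyzer summary can be
     separated from the hardware error state by keeping only the lines
     marked "++" in the sample above.  This is a minimal sketch; fru_summary
     is a hypothetical name, and the marker convention is taken from the
     single sample shown here.

```shell
#!/bin/sh
# Hypothetical helper: print only the analyzer summary from saved fru
# output (files given as arguments, or standard input), that is, the
# lines marked "++", without the marker and without the closing
# "END OF ANALYSIS" line.
fru_summary() {
    sed -n -e '/END OF ANALYSIS/d' \
           -e 's/^[[:space:]]*++[[:space:]]*//p' "$@"
}
```

     Applied to the sample output, this leaves the memory bank line and the
     line naming the MC3 board in slot 3 with its confidence level.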

     Please also note that the system problem being reported might be
     something unknown to the version of fru you are currently running.
     There might also be bugs within fru that SGI is unaware of that will
     keep field replacement unit analysis from being completed.
