NAME¶
bmc-watchdog - BMC watchdog timer daemon and control utility
SYNOPSIS¶
bmc-watchdog command [OPTION...] [COMMAND_OPTIONS...]
DESCRIPTION¶
bmc-watchdog controls a Baseboard Management Controller (BMC) watchdog
timer. The bmc-watchdog tool typically executes as a cronjob or daemon
to manage the watchdog timer. A user must be root in order to run
bmc-watchdog.
Listed below are bmc-watchdog details, option details, examples, and
known issues. For a general introduction to FreeIPMI please see
freeipmi(7).
BMC WATCHDOG DETAILS¶
A BMC watchdog timer is part of the Intelligent Platform Management Interface
(IPMI) specification and is only available to BMCs that are compliant with
IPMI. When a BMC watchdog timer is started, it begins counting down to zero
from some positive number of seconds. When the timer hits zero, the timer will
execute a pre-configured pre-timeout interrupt and/or timeout action.
To stop the pre-timeout interrupt or timeout action from being
executed, the watchdog timer must be periodically reset to its initial
value.
The BMC watchdog timer automatically stops itself when the machine is rebooted.
Therefore, when a machine is brought up, the BMC watchdog timer must be set up
again before it can be used.
Typically, a BMC watchdog timer is used to automatically reset a machine that
has crashed. When the operating system first starts up, the BMC timer is set
to its initial countdown value. At periodic intervals, when the operating
system is functioning properly, the watchdog timer can be reset by the OS or a
userspace program. Thus, the timer never counts down to zero. When the system
crashes, the timer cannot be reset by the OS or userspace program. Eventually,
the timer will count down to zero and reset the machine.
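The countdown-and-reset cycle described above can be sketched as a toy simulation. This is only an illustration of the idea and does not talk to a BMC; the function name simulate_watchdog and its arguments are hypothetical, and real timers are driven by the BMC rather than by a shell loop.

```shell
#!/bin/sh
# Toy model of a watchdog countdown; it never touches real hardware.
# simulate_watchdog INITIAL RESET_EVERY CRASH_AT TOTAL
#   INITIAL      initial countdown value in seconds
#   RESET_EVERY  how often the simulated OS resets the timer
#   CRASH_AT     second at which the simulated OS crashes and stops resetting
#   TOTAL        number of seconds to simulate
simulate_watchdog() {
    initial=$1 reset_every=$2 crash_at=$3 total=$4
    remaining=$initial
    t=1
    while [ "$t" -le "$total" ]; do
        remaining=$((remaining - 1))
        if [ "$remaining" -le 0 ]; then
            echo "t=$t: timer expired, timeout action fires"
            return 0
        fi
        # A healthy OS periodically puts the timer back to its initial value.
        if [ "$t" -lt "$crash_at" ] && [ $((t % reset_every)) -eq 0 ]; then
            remaining=$initial
        fi
        t=$((t + 1))
    done
    echo "t=$total: timer still running, $remaining seconds left"
}

simulate_watchdog 10 5 1000 30   # OS stays healthy for the whole run
simulate_watchdog 10 5 12 60     # OS crashes at t=12, timer later expires
```

In the second call the last successful reset happens at t=10, so the 10-second countdown runs out at t=20.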
See EXAMPLES below for examples of how bmc-watchdog is commonly used.
COMMANDS¶
The following commands are available to bmc-watchdog.
- -s, --set
- Set BMC Watchdog Configuration. BMC watchdog timer configuration values
can be set using the set command options listed below under SET OPTIONS.
If a particular configuration parameter is not specified on the command
line, the current configuration of that parameter will not be
changed.
- -g, --get
- Get BMC Watchdog Configuration and State. The current configuration and
state is printed to standard output.
- -r, --reset
- Reset BMC Watchdog Timer.
- -t, --start
- Start BMC Watchdog Timer. Does nothing if the timer is currently running.
Identical to --reset command when the timer is stopped with the
exception of the start command options listed below under START
OPTIONS.
- -y, --stop
- Stop BMC Watchdog Timer. Stops the current timer.
- -c, --clear
- Clear BMC Watchdog Configuration. Clears all configuration values for the
watchdog timer, except for timer use, which is kept at its current
value.
- -d, --daemon
- Run bmc-watchdog as a daemon. Configurable BMC watchdog timer
options are listed below under DAEMON OPTIONS. The configuration values
are set once, then the daemon will reset the timer at specified periodic
intervals. The daemon can be stopped using the --stop command,
--clear command, or by setting the stop_timer flag on the
--set command.
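As a rough sketch of how the commands above map onto day-to-day operations, the helper below prints the command line for a named action instead of running it. The function name watchdog_cmd, the action names, and the WATCHDOG_RUN variable are made up for this illustration; only the long options come from the list above.

```shell
#!/bin/sh
# Dry-run helper: prints the bmc-watchdog command for an action.
# Set WATCHDOG_RUN=1 to actually execute it (requires root and a BMC).
watchdog_cmd() {
    case "$1" in
        status) flag=--get ;;
        start)  flag=--start ;;
        reset)  flag=--reset ;;
        stop)   flag=--stop ;;
        clear)  flag=--clear ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
    if [ "${WATCHDOG_RUN:-0}" = 1 ]; then
        bmc-watchdog "$flag"
    else
        echo "bmc-watchdog $flag"
    fi
}

watchdog_cmd status
watchdog_cmd reset
```

Keeping the default as a dry run avoids stopping or clearing a live timer by accident while experimenting.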
GENERAL OPTIONS¶
The following options are general options for configuring IPMI communication and
executing general tool commands. These options are generic and can be used by
any command.
- -D IPMIDRIVER, --driver-type=IPMIDRIVER
- Specify the driver type to use instead of doing an auto selection. The
currently available inband drivers are KCS, SSIF, OPENIPMI, SUNBMC, and
INTELDCMI.
- --disable-auto-probe
- Do not probe in-band IPMI devices for default settings.
- --driver-address=DRIVER-ADDRESS
- Specify the in-band driver address to be used instead of the probed value.
DRIVER-ADDRESS should be prefixed with "0x" for a hex
value and '0' for an octal value.
- --driver-device=DEVICE
- Specify the in-band driver device path to be used instead of the probed
path.
- --register-spacing=REGISTER-SPACING
- Specify the in-band driver register spacing instead of the probed value.
Argument is in bytes (e.g. 32-bit register spacing = 4).
- --target-channel-number=CHANNEL-NUMBER
- Specify the in-band driver target channel number to send IPMI requests
to.
- --target-slave-address=SLAVE-ADDRESS
- Specify the in-band driver target slave address to send IPMI requests
to.
- -v, --verbose-logging
- Increase verbosity of logging.
- -n, --no-logging
- Turns off all logging done by bmc-watchdog.
- --config-file=FILE
- Specify an alternate configuration file.
- -W WORKAROUNDS, --workaround-flags=WORKAROUNDS
- Specify workarounds to vendor compliance issues. Multiple workarounds can
be specified separated by commas. A special command line flag of
"none" will indicate no workarounds (may be useful for
overriding configured defaults). See WORKAROUNDS below for a list of
available workarounds.
- --debug
- Turn on debugging.
- -?, --help
- Output a help list and exit.
- --usage
- Output a usage message and exit.
- -V, --version
- Output the program version and exit.
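The "0x" and '0' prefixes accepted by --driver-address follow the usual C convention for integer constants. The sketch below shows the same convention using the shell's own printf; it illustrates the notation only and is not a statement about how bmc-watchdog parses its argument internally. The helper name to_decimal is hypothetical, and 0xca2 (the conventional KCS I/O base) is used purely as a sample value.

```shell
#!/bin/sh
# Print the decimal value of a number written with a C-style prefix.
to_decimal() {
    printf '%d\n' "$1"
}

to_decimal 0xca2   # hexadecimal notation
to_decimal 06242   # octal notation for the same value
to_decimal 3234    # plain decimal

# Register spacing is given in bytes: 32-bit spacing is 32 / 8 bytes.
echo $((32 / 8))
```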
SET OPTIONS¶
The following options can be used by the set command to set or clear various BMC
watchdog configuration parameters.
- -u INT, --timer-use=INT
- Set timer use. The timer use value can be set to one of the following: 1 =
BIOS FRB2, 2 = BIOS POST, 3 = OS_LOAD, 4 = SMS OS, 5 = OEM.
- -m INT, --stop-timer=INT
- Set Stop Timer Flag. A flag value of 0 stops the current BMC watchdog
timer. A value of 1 doesn't turn off the current watchdog timer.
- -l INT, --log=INT
- Set Log Flag. A flag value of 0 turns logging on. A value of 1 turns
logging off.
- -a INT, --timeout-action=INT
- Set timeout action. The timeout action can be set to one of the following:
0 = No action, 1 = Hard Reset, 2 = Power Down, 3 = Power Cycle.
- -p INT, --pre-timeout-interrupt=INT
- Set pre-timeout interrupt. The pre-timeout interrupt can be set to one of
the following: 0 = None, 1 = SMI, 2 = NMI, 3 = Messaging Interrupt.
- -z SECONDS, --pre-timeout-interval=SECONDS
- Set pre-timeout interval in seconds.
- -F, --clear-bios-frb2
- Clear BIOS FRB2 Timer Use Flag.
- -P, --clear-bios-post
- Clear BIOS POST Timer Use Flag.
- -L, --clear-os-load
- Clear OS Load Timer Use Flag.
- -S, --clear-sms-os
- Clear SMS/OS Timer Use Flag.
- -O, --clear-oem
- Clear OEM Timer Use Flag.
- -i SECONDS, --initial-countdown=SECONDS
- Set initial countdown in seconds.
- -w, --start-after-set
- Start timer after set command if timer is stopped. This is typically used
when bmc-watchdog is used as a cronjob. This can be used to
automatically start the timer after it has been set the first time.
- -x, --reset-after-set
- Reset timer after set command if timer is running.
- -j, --start-if-stopped
- Don't execute set command if timer is stopped, just start timer.
- -k, --reset-if-running
- Don't execute set command if timer is running, just reset timer. This is
typically used when bmc-watchdog is used as a cronjob. This can be
used to reset the timer after it has been initially started.
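The cronjob usage mentioned for --start-after-set and --reset-if-running can be sketched as a generated crontab entry. The following is an assumption-laden illustration: the install path /usr/sbin/bmc-watchdog, the function name cron_line, and the choice of -u 4 -a 1 (borrowed from the EXAMPLES section) are not prescribed by this page.

```shell
#!/bin/sh
# Assumed install location; adjust for your system.
BMC_WATCHDOG=${BMC_WATCHDOG:-/usr/sbin/bmc-watchdog}

# cron_line INTERVAL_MINUTES COUNTDOWN_SECONDS
# Prints a crontab entry that sets and starts the timer on the first run
# (-w) and only resets it on later runs while it is running (-k).
cron_line() {
    interval_min=$1 countdown_sec=$2
    # The job has to fire before the countdown can reach zero.
    if [ $((interval_min * 60)) -ge "$countdown_sec" ]; then
        echo "cron interval must be shorter than the countdown" >&2
        return 1
    fi
    echo "*/$interval_min * * * * $BMC_WATCHDOG -s -u 4 -a 1 -i $countdown_sec -w -k"
}

cron_line 5 900
```

The interval check reflects the requirement that the timer be reset before it counts down to zero; a generous margin between the two values is advisable.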
START OPTIONS¶
The following options can be used by the start command.
- -G INT, --gratuitous-arp=INT
- Suspend or don't suspend gratuitous ARPs while the BMC timer is running. A
flag value of 1 suspends gratuitous ARPs. A value of 0 will not suspend
gratuitous ARPs. If this option is not specified, gratuitous ARPs will not
be suspended.
- -A INT, --arp-response=INT
- Suspend or don't suspend BMC-generated ARP responses while the BMC timer
is running. A flag value of 1 suspends ARP responses. A value of 0 will
not suspend ARP responses. If this option is not specified, ARP responses
will not be suspended.
DAEMON OPTIONS¶
The following options can be used by the daemon command to set the initial BMC
watchdog configuration parameters.
- -u INT, --timer-use=INT
- Set timer use. The timer use value can be set to one of the following: 1 =
BIOS FRB2, 2 = BIOS POST, 3 = OS_LOAD, 4 = SMS OS, 5 = OEM.
- -l INT, --log=INT
- Set Log Flag. A flag value of 0 turns logging on. A value of 1 turns
logging off.
- -a INT, --timeout-action=INT
- Set timeout action. The timeout action can be set to one of the following:
0 = No action, 1 = Hard Reset, 2 = Power Down, 3 = Power Cycle.
- -p INT, --pre-timeout-interrupt=INT
- Set pre-timeout interrupt. The pre-timeout interrupt can be set to one of
the following: 0 = None, 1 = SMI, 2 = NMI, 3 = Messaging Interrupt.
- -z SECONDS, --pre-timeout-interval=SECONDS
- Set pre-timeout interval in seconds.
- -F, --clear-bios-frb2
- Clear BIOS FRB2 Timer Use Flag.
- -P, --clear-bios-post
- Clear BIOS POST Timer Use Flag.
- -L, --clear-os-load
- Clear OS Load Timer Use Flag.
- -S, --clear-sms-os
- Clear SMS/OS Timer Use Flag.
- -O, --clear-oem
- Clear OEM Timer Use Flag.
- -i SECONDS, --initial-countdown=SECONDS
- Set initial countdown in seconds.
- -G INT, --gratuitous-arp=INT
- Suspend or don't suspend gratuitous ARPs while the BMC timer is running. A
flag value of 1 suspends gratuitous ARPs. A value of 0 will not suspend
gratuitous ARPs. If this option is not specified, gratuitous ARPs will not
be suspended.
- -A INT, --arp-response=INT
- Suspend or don't suspend BMC-generated ARP responses while the BMC timer
is running. A flag value of 1 suspends ARP responses. A value of 0 will
not suspend ARP responses. If this option is not specified, ARP responses
will not be suspended.
- -e, --reset-period
- Time interval to wait before resetting timer. The default is 60
seconds.
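Because the daemon resets the timer once per reset period, the reset period needs to be shorter than the initial countdown or the timer would expire between resets. A minimal sanity check, assuming both values are whole seconds, might look like the sketch below; the function name check_reset_period is hypothetical and is not part of bmc-watchdog.

```shell
#!/bin/sh
# check_reset_period COUNTDOWN_SECONDS [RESET_PERIOD_SECONDS]
# The reset period defaults to 60 seconds, matching the documented default.
check_reset_period() {
    countdown=$1 period=${2:-60}
    if [ "$period" -ge "$countdown" ]; then
        echo "reset period ${period}s is not shorter than countdown ${countdown}s" >&2
        return 1
    fi
    # Integer division: a rough count of reset attempts per countdown.
    echo "about $((countdown / period)) reset attempts fit in one ${countdown}s countdown"
}

check_reset_period 900
```

A larger ratio leaves more room for a delayed or missed reset before the timeout action fires.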
ERRORS¶
Errors are logged to syslog.
WORKAROUNDS¶
With so many different vendors implementing their own IPMI solutions, different
vendors may implement their IPMI protocols incorrectly. The following
describes a number of workarounds currently available to handle discovered
compliance issues. When possible, workarounds have been implemented so they
will be transparent to the user. However, some will require the user to
specify a workaround be used via the -W option.
The hardware listed below may indicate only the hardware on which a problem
was discovered. Newer versions of hardware may fix the problems indicated
below. Similar machines from vendors may or may not exhibit the same problems.
Different vendors may license their firmware from the same IPMI firmware
developer, so it may be worthwhile to try workarounds listed below even if
your motherboard is not listed.
If you believe your hardware has an additional compliance issue that needs a
workaround to be implemented, please contact the FreeIPMI maintainers at
<freeipmi-users@gnu.org> or <freeipmi-devel@gnu.org>.
assumeio - This workaround flag will assume inband interfaces communicate
with system I/O rather than being memory-mapped. This will work around systems
that report invalid base addresses. Those hitting this issue may see
"device not supported" or "could not find inband device"
errors. Issue observed on HP ProLiant DL145 G1.
spinpoll - This workaround flag will inform some inband drivers (most
notably the KCS driver) to spin while polling rather than putting the process
to sleep. This may significantly improve the wall clock running time of tools
because an operating system scheduler's granularity may be much larger than
the time it takes to perform a single IPMI message transaction. However, by
spinning, your system may be performing less useful work by not context
switching the tool out for a more useful task.
ignorestateflag - This workaround option will ignore the BMC timer state
flag (indicating if the timer is running or stopped) when running in daemon
mode. On some BMCs, the flag is broken and will never report that a BMC timer
is running, even if it is. The workaround will take notice of changes in the
countdown seconds to determine if a timer is running or stopped. With this
type of implementation, the reset-period must be large enough to ensure minor
fluctuations in the countdown will not affect the workaround. Due to the
implementation of this workaround, if another process stops the watchdog
timer, it may be detectable. This option is confirmed to work around
compliance issues on Sun x4100, x4200, and x4500.
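Since -W takes a comma-separated list, a small helper can assemble it from individual flag names. This is a sketch only: the function name join_workarounds is hypothetical, and the printed command simply combines the workaround names above with the daemon invocation from EXAMPLES for illustration.

```shell
#!/bin/sh
# Join workaround names with commas; with no arguments, fall back to the
# special value "none" described under the -W option.
join_workarounds() {
    if [ "$#" -eq 0 ]; then
        echo none
        return 0
    fi
    result=$1
    shift
    for w in "$@"; do
        result="$result,$w"
    done
    echo "$result"
}

# Print (not run) a daemon command line using two workarounds.
echo "bmc-watchdog -d -u 4 -p 0 -a 1 -i 900 -W $(join_workarounds spinpoll ignorestateflag)"
```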
EXAMPLES¶
Set up a bmc-watchdog daemon that resets the machine after 15 minutes (900
seconds) if the OS has crashed (see default bmc-watchdog rc script
/etc/init.d/bmc-watchdog for a more complete example):
bmc-watchdog -d -u 4 -p 0 -a 1 -i 900
DIAGNOSTICS¶
Upon successful execution, exit status is 0. On error, exit status is 1.
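A wrapper script can branch on this exit status. The sketch below is generic: report_status is a hypothetical helper that runs whatever command it is given, and the commands true and false stand in for a succeeding and a failing invocation so that the example runs without a BMC.

```shell
#!/bin/sh
# Run a command and report whether it exited with status 0.
report_status() {
    if "$@"; then
        status=0
    else
        status=$?
    fi
    if [ "$status" -eq 0 ]; then
        echo "ok: $*"
    else
        echo "error (exit status $status): $*" >&2
    fi
    return "$status"
}

report_status true            # stands in for a successful run
report_status false || :      # stands in for a failing run
```

On a machine with a BMC, the same helper could wrap a real call such as report_status bmc-watchdog --get (as root).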
KNOWN ISSUES¶
bmc-watchdog may fail to reset the watchdog timer if it is not scheduled
properly. It is always recommended that bmc-watchdog be executed with a
high scheduling priority.
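One common way to raise scheduling priority on Unix-like systems is the standard nice utility with a negative niceness, which requires root. The sketch below only prints such a command line; the helper name prioritized_cmd and the value -10 are arbitrary choices for illustration, not a recommendation from the FreeIPMI authors.

```shell
#!/bin/sh
# prioritized_cmd NICENESS COMMAND [ARGS...]
# Prints a command line that would run COMMAND under nice(1).
# Lower niceness means higher priority; the usual range is -20 to 19.
prioritized_cmd() {
    niceness=$1
    shift
    if [ "$niceness" -lt -20 ] || [ "$niceness" -gt 19 ]; then
        echo "niceness must be between -20 and 19" >&2
        return 1
    fi
    echo "nice -n $niceness $*"
}

prioritized_cmd -10 bmc-watchdog -d -u 4 -p 0 -a 1 -i 900
```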
On some machines, the hardware-based SMI Handler may disable a processor after a
watchdog timer timeout if the timer use is set to something other than SMS/OS.
REPORTING BUGS¶
Report bugs to <freeipmi-users@gnu.org> or <freeipmi-devel@gnu.org>.
COPYRIGHT¶
Copyright (C) 2007-2014 Lawrence Livermore National Security, LLC.
Copyright (C) 2004-2007 The Regents of the University of California.
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 3 of the License, or (at your option) any later
version.
SEE ALSO¶
freeipmi(7)
http://www.gnu.org/software/freeipmi/