NAME¶
watchdog.conf - configuration file for the watchdog daemon
DESCRIPTION¶
This file carries all configuration options for the Linux watchdog daemon. Each
option has to be written on a line for itself. Comments start with '#'. Blanks
are ignored except after the '=' sign. An empty text after the '=' sign
disables the feature as long as that makes sense.
OPTIONS¶
- interval = <interval>
- Set the highest possible interval between two writes to the watchdog
device. The device is triggered after each check regardless of the time it
took. After finishing all checks watchdog goes to sleep for a full cycle
of <interval> seconds. Default value is 1 second. The kernel drivers
expects a write command every minute. Otherwise the system will be
rebooted. Therefore an interval of more than a minute can only be used
with the -f command-line option.
- logtick = <logtick>
- If you enable verbose logging, a message is written into the syslog or a
logfile. While this is nice, it is not necessary to get a message every 10
seconds which really fills up disk and needs CPU. logtick allows
adjustment of the number of intervals skipped before a log message is
written. If you use logtick = 60 and interval = 10, only every 10 minutes
(600 seconds) a message is written. This may make the exact time of a
crash harder to find but greatly reduces disk usage and administrator
nerves if you're looking for a particular syslog entry in between of
watchdog messages.
- max-load-1 = <load1>
- Set the maximal allowed load average for a 1 minute span. Once this load
average is reached the system is rebooted. Default value is 0. That means
the load average check is disabled. Be careful not to this parameter too
low. To set a value less then the predefined minimal value of 2, you have
to use the -f commandline option.
- max-load-5 = <load5>
- Set the maximal allowed load average for a 5 minute span. Once this load
average is reached the system is rebooted. Default value is
3/4*max-load-1. Be careful not to this parameter too low. To set a value
less then the predefined minimal value of 2, you have to use the -f
commandline option.
- max-load-15 = <load15>
- Set the maximal allowed load average for a 15 minute span. Once this load
average is reached the system is rebooted. Default value is
1/2*max-load-1. Be careful not to this parameter too low. To set a value
less then the predefined minimal value of 2, you have to use the -f
commandline option.
- min-memory = <minpage>
- Set the minimal amount of virtual memory that has to stay free. Note that
this is in pages. Default value is 0 pages which means this test is
disabled. The page size is taken from the system include files.
- allocatable-memory = <minpage>
- Set the minimum amount of allocatable memory available on the system. Note
that this is in pages. Default value is 0 pages which means the test is
disabled. As with min-memory, the page size is taken from the system
include files.
- max-temperature = <temp>
- Set the maximal allowed temperature. Once this temperature is reached the
system is halted. Default value is 120. There is no unit conversion, so
make sure you use the same unit as your hardware. Watchdog will issue
warnings once the temperature increases 90%, 95% and 98% of this
temperature.
- watchdog-device = <device>
- Set the watchdog device name. Default is to disable keep alive
support.
- watchdog-timeout = <timeout>
- Set the watchdog device timeout during startup. If not set, a default is
used that should be set to the kernel timer margin at compile time.
- temperature-device = <temp-dev>
- Set the temperature device name. Default is to disable temperature
checking.
- file = <filename>
- Set file name for file mode. This option can be given as often as you like
to check several files.
- change = <mtime>
- Set the change interval time for file mode. This options always belongs to
the active filename, that is when finding a 'change =' line watchdog
assumes it belongs to the most recently read 'file =' line. They don't
neccessarily have to follow each other directly. But you cannot specify a
'change =' before a 'file ='. The default is to only stat the file and
don't look for changes. Using this feature to monitor changes in
/var/log/messages might require some special syslog daemon configuration,
e.g. rsyslog needs "$ActionWriteAllMarkMessages on" to be set to
make sure the marks are written no matter what.
- pidfile = <pidfilename>
- Set pidfile name for server test mode. This option can be given as often
as you like to check several servers.
- ping = <ip-addr>
- Set IP address for ping mode. This option can be used more than once to
check different connections.
- interface = <if-name>
- Set interface name for network mode. This option can be used more than
once to check different interfaces.
- test-binary = <testbin>
- Execute the given binary to do some user defined tests.
- test-timeout = <timeout in seconds>
- User defined tests may only run for <timeout> seconds. Set to 0 for
unlimited.
- repair-binary = <repbin>
- Execute the given binary in case of a problem instead of shutting down the
system.
- repair-timeout = <timeout in seconds>
- repair command may only run for <timeout> seconds. Set to 0 for
unlimited.
- admin = <mail-address>
- Email address to send admin mail to. That is, who shall be notified that
the machine is being halted or rebooted. Default is 'root'. If you want to
disable notification via email just set admin to en empty string.
- realtime = <yes|no>
- If set to yes watchdog will lock itself into memory so it is never swapped
out.
- priority = <schedule priority>
- Set the schedule priority for realtime mode.
- test-directory = <test directory>
- Set the directory to run user test/repair scripts. Default is
'/etc/watchdog.d' See the Test Directory section in watchdog(8) for more
information.
- log-dir = <log directory>
- Set the log directory to capture the standard output and standard error
from repair-binary and test-binary execution. Default is
'/var/log/watchdog'.
FILES¶
- /etc/watchdog.conf
- The watchdog configuration file
- /etc/watchdog.d
- A directory containing test-or-repair commands. See the Test Directory
section in watchdog(8) for more information.
SEE ALSO¶
watchdog(8)