NAME¶
hobbitd_alert - hobbitd worker module for sending out alerts
SYNOPSIS¶
hobbitd_channel --channel=page hobbitd_alert [options]
DESCRIPTION¶
hobbitd_alert is a worker module for hobbitd, and as such it is normally run via
the
hobbitd_channel(8) program. It receives hobbitd page- and
ack-messages from the "page" channel via stdin, and uses these to
send out alerts about failed and recovered hosts and services.
The operation of this module is controlled by the
hobbit-alerts.cfg(5)
file. This file holds the definition of rules and recipients, that determine
who gets alerts, how often, for what servers etc.
OPTIONS¶
- --config=FILENAME
- Sets the filename for the hobbit-alerts.cfg file.
The default value is "etc/hobbit-alerts.cfg" below the Xymon
server directory.
- --dump-config
- Dumps the configuration after parsing it. May be useful to
track down problems with configuration file errors.
- --checkpoint-file=FILENAME
- File where the current state of the hobbitd_alert module is
saved. When starting up, hobbitd_alert will also read this file to restore
the previous state.
- --checkpoint-interval=N
- Defines how often (in seconds) the checkpoint-file is
saved.
- --cfid
- If this option is present, alert messages will include a
line with "cfid:N" where N is the linenumber in the
hobbit-alerts.cfg file that caused this message to be sent. This can be
useful to track down problems with duplicate alerts.
- --test HOST SERVICE [options]
- Shows which alert rules matches the given HOST/SERVICE
combination. Useful to debug configuration problems, and see what rules
are used for an alert.
The possible options are:
--color=COLORNAME The COLORNAME parameter is the color of the alert:
red, yellow or purple.
--duration=SECONDS The SECONDS parameter is the duration of the alert
in seconds.
--group=GROUPNAME The GROUPNAME paramater is a groupid string from
the hobbit-clients.cfg file.
--time=TIMESTRING The TIMESTRING parameter is the time-of-day for the
alert, expressed as an absolute time in the epoch format (seconds since
Jan 1 1970). This is easily obtained with the GNU date utility using the
"+%s" output format.
- --debug
- Enable debugging output.
HOW HOBBIT DECIDES WHEN TO SEND ALERTS¶
The hobbitd_alert module is responsible for sending out all alerts. When a
status first goes to one of the ALERTCOLORS, hobbitd_alert is notified of this
change. It notes that the status is now in an alert state, and records the
timestamp when this event started, and adds the alert to the list
statuses that may potentially trigger one or more alert messages.
This list is then matched against the hobbit-alerts.cfg configuration. This
happens at least once a minute, but may happen more often. E.g. when status
first goes into an alert state, this will always trigger the matching to
happen.
When scanning the configuration, hobbitd_alert looks at all of the configuration
rules. It also checks the DURATION setting against how long time has elapsed
since the event started - i.e. against the timestamp logged when hobbitd_alert
first heard of this event.
When an alert recipient is found, the alert is sent and it is recorded when this
recipient is due for his next alert message, based on the REPEAT setting
defined for this recipient. The next time hobbitd_alert scans the
configuration for what alerts to send, it will still find this recipient
because all of the configuration rules are fulfilled, but an alert message
will not be generated until the repeat interval has elapsed.
It can happen that a status first goes yellow and triggers an alert, and later
it goes red - e.g. a disk filling up. In that case, hobbitd_alert clears the
internal timer for when the next (repeat) alert is due for all recipients. You
generally want to be told when something that has been in a warning state
becomes critical, so in that case the REPEAT setting is ignored and the alert
is sent. This only happens the first time such a change occurs - if the status
switches between yellow and red multiple times, only the first transition from
yellow->red causes this override.
When an status recovers, a recovery message may be sent - depending on the
configuration - and then hobbitd_alert forgets everything about this status.
So the next time it goes into an alert state, the entire process starts all
over again.
ENVIRONMENT¶
- MAIL
- The first part of a command line used to send out an e-mail
with a subject, typically set to "/usr/bin/mail -s" .
hobbitd_alert will add the subject and the mail recipients to form the
command line used for sending out email alerts.
- MAILC
- The first part of a command line used to send out an e-mail
without a subject. Typically this will be "/usr/bin/mail".
hobbitd_alert will add the mail recipients to form the command line used
for sending out email alerts.
FILES¶
- ~xymon/server/etc/hobbit-alerts.cfg
-
SEE ALSO¶
hobbit-alerts.cfg(5),
hobbitd(8),
hobbitd_channel(8),
xymon(7)