.\" -*- mode: troff; coding: utf-8 -*-
.\" Automatically generated by Pod::Man 5.01 (Pod::Simple 3.43)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" \*(C` and \*(C' are quotes in nroff, nothing in troff, for use with C<>.
.ie n \{\
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "COLLECTD-THRESHOLD 5"
.TH COLLECTD-THRESHOLD 5 2024-04-19 5.12.0.git collectd
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH NAME
collectd\-threshold \- Documentation of collectd's Threshold plugin
.SH SYNOPSIS
.IX Header "SYNOPSIS"
.Vb 11
\& LoadPlugin "threshold"
\& <Plugin "threshold">
\&   <Type "foo">
\&     WarningMin 0.00
\&     WarningMax 1000.00
\&     FailureMin 0.00
\&     FailureMax 1200.00
\&     Invert false
\&     Instance "bar"
\&   </Type>
\& </Plugin>
.Ve
.SH DESCRIPTION
.IX Header "DESCRIPTION"
Starting with version \f(CW4.3.0\fR \fIcollectd\fR has support for \fBmonitoring\fR.
By that we mean that the values are not only stored or sent somewhere, but that
they are judged and, if a problem is recognized, acted upon. The only action
the \fIThreshold plugin\fR takes itself is to generate and dispatch a
\&\fInotification\fR. Other plugins can register to receive notifications and
perform appropriate further actions.
.PP
Since systems and what you expect them to do differ a lot, you can configure
\&\fIthresholds\fR for your values freely. This gives you a lot of flexibility but
also a lot of responsibility.
.PP
Every time a value is out of range, a notification is dispatched. This means
that the idle percentage of your CPU needs to be less than the configured
threshold only once for a notification to be generated. There's no such thing
as a moving average or similar \- at least not now.
.PP
Also, all values that match a threshold are considered to be relevant or
"interesting". As a consequence collectd will issue a notification if they are
not received for \fBTimeout\fR iterations. The \fBTimeout\fR configuration option is
explained in section "GLOBAL OPTIONS" in \fBcollectd.conf\fR\|(5). If, for example,
\&\fBTimeout\fR is set to "2" (the default) and some host sends its CPU
statistics to the server every 60 seconds, a notification will be dispatched
after about 120 seconds. It may take a little longer because the timeout is
checked only once each \fBInterval\fR on the server.
.PP
When a value comes within range again or is received after it was missing, an
"OKAY-notification" is dispatched.
.SH CONFIGURATION
.IX Header "CONFIGURATION"
Here is a configuration example to get you started. Read below for more
information.
.PP
.Vb 10
\& LoadPlugin "threshold"
\& <Plugin "threshold">
\&   <Type "foo">
\&     WarningMin 0.00
\&     WarningMax 1000.00
\&     FailureMin 0.00
\&     FailureMax 1200.00
\&     Invert false
\&     Instance "bar"
\&   </Type>
\&
\&   <Plugin "interface">
\&     Instance "eth0"
\&     <Type "if_octets">
\&       FailureMax 10000000
\&       DataSource "rx"
\&     </Type>
\&   </Plugin>
\&
\&   <Host "hostname">
\&     <Type "cpu">
\&       Instance "idle"
\&       FailureMin 10
\&     </Type>
\&
\&     <Plugin "memory">
\&       <Type "memory">
\&         Instance "cached"
\&         WarningMin 100000000
\&       </Type>
\&     </Plugin>
\&
\&     <Type "load">
\&       DataSource "midterm"
\&       FailureMax 4
\&       Hits 3
\&       Hysteresis 3
\&     </Type>
\&   </Host>
\& </Plugin>
.Ve
.PP
There are basically two types of configuration statements: The \f(CW\*(C`Host\*(C'\fR,
\&\f(CW\*(C`Plugin\*(C'\fR, and \f(CW\*(C`Type\*(C'\fR blocks select the value for which a threshold should be
configured. The \f(CW\*(C`Plugin\*(C'\fR and \f(CW\*(C`Type\*(C'\fR blocks may be specified further using the
\&\f(CW\*(C`Instance\*(C'\fR option. You can combine the blocks by nesting them, though they
must be nested in the above order, i.e. \f(CW\*(C`Host\*(C'\fR may contain both
\&\f(CW\*(C`Plugin\*(C'\fR and \f(CW\*(C`Type\*(C'\fR blocks, \f(CW\*(C`Plugin\*(C'\fR may only contain \f(CW\*(C`Type\*(C'\fR blocks and
\&\f(CW\*(C`Type\*(C'\fR may not contain other blocks. If multiple blocks apply to the same
value the most specific block is used.
.PP
The other statements specify the threshold to configure. They \fBmust\fR be
included in a \f(CW\*(C`Type\*(C'\fR block. Currently the following statements are recognized:
.IP "\fBFailureMax\fR \fIValue\fR" 4
.IX Item "FailureMax Value"
.PD 0
.IP "\fBWarningMax\fR \fIValue\fR" 4
.IX Item "WarningMax Value"
.PD
Sets the upper bound of acceptable values. If unset defaults to positive
infinity. If a value is greater than \fBFailureMax\fR a \fBFAILURE\fR notification
will be created. If the value is greater than \fBWarningMax\fR but less than (or
equal to) \fBFailureMax\fR a \fBWARNING\fR notification will be created.
.IP "\fBFailureMin\fR \fIValue\fR" 4
.IX Item "FailureMin Value"
.PD 0
.IP "\fBWarningMin\fR \fIValue\fR" 4
.IX Item "WarningMin Value"
.PD
Sets the lower bound of acceptable values.
If unset defaults to negative
infinity. If a value is less than \fBFailureMin\fR a \fBFAILURE\fR notification
will be created. If the value is less than \fBWarningMin\fR but greater than (or
equal to) \fBFailureMin\fR a \fBWARNING\fR notification will be created.
.IP "\fBDataSource\fR \fIDSName\fR" 4
.IX Item "DataSource DSName"
Some data sets have more than one "data source". Interesting examples are the
\&\f(CW\*(C`if_octets\*(C'\fR data set, which has received (\f(CW\*(C`rx\*(C'\fR) and sent (\f(CW\*(C`tx\*(C'\fR) bytes and
the \f(CW\*(C`disk_ops\*(C'\fR data set, which holds \f(CW\*(C`read\*(C'\fR and \f(CW\*(C`write\*(C'\fR operations. The
system load data set, \f(CW\*(C`load\*(C'\fR, even has three data sources: \f(CW\*(C`shortterm\*(C'\fR,
\&\f(CW\*(C`midterm\*(C'\fR, and \f(CW\*(C`longterm\*(C'\fR.
.Sp
Normally, all data sources are checked against a configured threshold. If this
is undesirable, or if you want to specify different limits for each data
source, you can use the \fBDataSource\fR option to have a threshold apply only to
one data source.
.IP "\fBInvert\fR \fBtrue\fR|\fBfalse\fR" 4
.IX Item "Invert true|false"
If set to \fBtrue\fR the range of acceptable values is inverted, i.e. values
between \fBFailureMin\fR and \fBFailureMax\fR (\fBWarningMin\fR and \fBWarningMax\fR) are
not okay. Defaults to \fBfalse\fR.
.IP "\fBPersist\fR \fBtrue\fR|\fBfalse\fR" 4
.IX Item "Persist true|false"
Sets how often notifications are generated. If set to \fBtrue\fR one notification
will be generated for each value that is out of the acceptable range. If set to
\&\fBfalse\fR (the default) then a notification is only generated if a value is out
of range but the previous value was okay.
.Sp
This applies to missing values, too: If set to \fBtrue\fR a notification about a
missing value is generated once every \fBInterval\fR seconds. If set to \fBfalse\fR
only one such notification is generated until the value appears again.
.IP "\fBPersistOK\fR \fBtrue\fR|\fBfalse\fR" 4
.IX Item "PersistOK true|false"
Sets how OKAY notifications act. If set to \fBtrue\fR one notification will be
generated for each value that is in the acceptable range. If set to \fBfalse\fR
(the default) then a notification is only generated if a value is in range but
the previous value was not.
.IP "\fBPercentage\fR \fBtrue\fR|\fBfalse\fR" 4
.IX Item "Percentage true|false"
If set to \fBtrue\fR, the minimum and maximum values given are interpreted as
percentage values, relative to the other data sources. This is helpful for
example for the "df" type, where you may want to issue a warning when less than
5\ % of the total space is available. Defaults to \fBfalse\fR.
.IP "\fBHits\fR \fIValue\fR" 4
.IX Item "Hits Value"
Sets the number of times the threshold must be exceeded before a notification
is dispatched or, in other words, the number of \fBInterval\fRs for which the
threshold must match before any notification is dispatched.
.IP "\fBHysteresis\fR \fIValue\fR" 4
.IX Item "Hysteresis Value"
Sets the hysteresis value for the threshold. Hysteresis is a method to prevent
flapping between states: the failure (or warning) state is kept until a newly
received value for a previously matched threshold falls below the threshold
condition (\fBWarningMax\fR, \fBFailureMin\fR or any other) minus the hysteresis
value.
.IP "\fBInteresting\fR \fBtrue\fR|\fBfalse\fR" 4
.IX Item "Interesting true|false"
If set to \fBtrue\fR (the default), a notification with severity \f(CW\*(C`FAILURE\*(C'\fR will
be created when a matching value list is no longer updated and purged from the
internal cache. When this happens depends on the \fIinterval\fR of the value list
and the global \fBTimeout\fR setting. See the \fBInterval\fR and \fBTimeout\fR settings
in \fBcollectd.conf\fR\|(5) for details. If set to \fBfalse\fR, this event will be
ignored.
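.PP
As a rough illustration of how several of the statements above combine, the
following sketch monitors the \f(CW\*(C`shortterm\*(C'\fR data source of the \f(CW\*(C`load\*(C'\fR data
set. The limits are arbitrary values chosen for this example, not
recommendations.
.PP
.Vb 11
\& <Plugin "threshold">
\&   <Type "load">
\&     DataSource "shortterm"
\&     WarningMax 4
\&     FailureMax 8
\&     Hits 3
\&     Hysteresis 1
\&     Persist true
\&     Interesting false
\&   </Type>
\& </Plugin>
.Ve
.PP
Read against the descriptions above, and assuming they match the behaviour of
your collectd version, a value above 4 would count towards a \fBWARNING\fR and a
value above 8 towards a \fBFAILURE\fR, but because of \fBHits\fR a notification
would only be dispatched after the threshold has matched three times. Once in
the failure state, \fBHysteresis\fR would keep that state until the load has
fallen back to about 7 (8 minus 1). With \fBPersist\fR set to \fBtrue\fR the
notification would be repeated for each out-of-range value, and with
\&\fBInteresting\fR set to \fBfalse\fR no notification would be created if the
load values stop arriving.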
.SH "SEE ALSO"
.IX Header "SEE ALSO"
\&\fBcollectd\fR\|(1),
\&\fBcollectd.conf\fR\|(5)
.SH AUTHOR
.IX Header "AUTHOR"
Florian Forster