CustoSec:Ranges

Ranges and Thresholds

Most checks in CustoSec (and ARANSEC) are based on numeric values, for example a temperature delivered by a temperature sensor or the amount of disk space returned by an SNMP request to a system. For a check to work properly, two values have to be entered as arguments: the "Warning" threshold and the "Critical" threshold, both entered as integer numbers.

Conveniently, most checks also accept a range as a threshold, which gives much more flexibility in how the checks can be used.

What is a Range

First of all, a definition of a range:
A range is defined as a start and an end point (inclusive) on a numeric scale (possibly negative or positive infinity).
The value returned by a check is compared against the ranges on this numeric scale. Depending on how the range is entered, a value inside the range or outside the range can lead to a WARNING or CRITICAL.

Entering a Range

A range is entered into the "Warning" or the "Critical" Argument of a check configuration.
The general format for ranges is "[@]start:end".

Basic rules for entering ranges:

  • "Start" must be ≤ "End". If this is violated, most checks will return an error
  • "Start" and ":" can be omitted when "Start" is 0, so "15" means the same as "0:15"
  • If the range has the format "start:" and "end" is not specified, "end" is assumed to be infinity
  • To specify negative infinity as "start", "~" must be used
  • Alerting:
    • An alert is raised if the value is outside the range from start to end (inclusive endpoints)
    • If the range starts with "@", an alert is raised when the value is inside the range (inclusive endpoints)
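
The rules above can be sketched in a few lines of Python. This is only an illustration, not the code the checks actually use: the names `Range`, `parse_range` and `raises_alert` are hypothetical, floats are used so that infinity can be represented, and treating an empty start (as in ":15") as 0 is an assumption.

```python
import math
from typing import NamedTuple


class Range(NamedTuple):
    start: float
    end: float
    inside: bool  # True when the range was entered with a leading "@"


def parse_range(spec: str) -> Range:
    """Parse a threshold in the format "[@]start:end" (hypothetical helper)."""
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
        if start_text == "~":
            start = -math.inf          # "~" stands for negative infinity
        elif start_text == "":
            start = 0.0                # assumption: empty start means 0
        else:
            start = float(start_text)
        # "start:" without an end means the range is open towards infinity
        end = math.inf if end_text == "" else float(end_text)
    else:
        # A single value such as "15" is read as the range 0 .. 15
        start, end = 0.0, float(spec)
    if start > end:
        raise ValueError(f"start must not be greater than end: {spec!r}")
    return Range(start, end, inside)


def raises_alert(spec: str, value: float) -> bool:
    """Return True when the value should raise an alert for this range."""
    r = parse_range(spec)
    in_range = r.start <= value <= r.end   # endpoints are inclusive
    return in_range if r.inside else not in_range
```

With this sketch, `raises_alert("15:25", 20)` is False because 20 lies inside the range, while `raises_alert("@15:25", 20)` is True because the "@" inverts the rule.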

Interpretation of Thresholds and Ranges

It is important to understand how the monitoring system looks at the thresholds of a check.
Basically, a threshold is always interpreted as a range. If only one value is given (as in most use cases), it is read as the range from 0 to this value.

Overview on how thresholds and ranges are interpreted

| Threshold (Range) | Interpretation of the range | Explanation | Create alarm when |
|---|---|---|---|
| 15 | 0 .. 15 | Value is outside range 0 .. 15 (incl. endpoints). Since only one value is given, "0" is assumed as start. | Value is < 0 or > 15 |
| 15: | 15 .. ∞ | Value is outside range 15 .. ∞ (incl. endpoints) | Value is < 15 |
| ~:15 | -∞ .. 15 | Value is outside range -∞ .. 15 (incl. endpoints) | Value is > 15 |
| 15:25 | 15 .. 25 | Value is outside range 15 .. 25 (incl. endpoints) | Value is < 15 or > 25 |
| @15:25 | 15 .. 25 | Value is inside range 15 .. 25 (incl. endpoints) | Value is ≥ 15 and ≤ 25 |

A check always evaluates these ranges in the following order:
First it checks whether the returned value violates the CRITICAL range (outside the range, or inside it if the range starts with "@"). If yes, a CRITICAL is issued.
Otherwise it checks whether the returned value violates the WARNING range in the same way. If yes, a WARNING is issued.
If neither applies, an OK is issued.
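
The order of evaluation can be illustrated with a minimal Python sketch. The function `evaluate` and the two helper predicates are hypothetical names chosen for this example. The predicates hard-code the thresholds "30" (Warning) and "33" (Critical), which are read as the ranges 0 .. 30 and 0 .. 33.

```python
from typing import Callable


def evaluate(value: float,
             warning_alerts: Callable[[float], bool],
             critical_alerts: Callable[[float], bool]) -> str:
    """Return the check status; CRITICAL takes precedence over WARNING."""
    if critical_alerts(value):
        return "CRITICAL"
    if warning_alerts(value):
        return "WARNING"
    return "OK"


def warning_30(v: float) -> bool:
    # Warning threshold "30": alert when the value is outside 0 .. 30
    return v < 0 or v > 30


def critical_33(v: float) -> bool:
    # Critical threshold "33": alert when the value is outside 0 .. 33
    return v < 0 or v > 33


for temperature in (-5, 20, 31, 34):
    print(temperature, evaluate(temperature, warning_30, critical_33))
```

A value of -5 violates both ranges, but because the CRITICAL range is checked first, the result is CRITICAL and not WARNING.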

Examples

The following examples should complete the story. They are based on a standard temperature sensor as used in ARANSEC and CustoSec.

Check: "check_snmp". Example Configuration: !custosec!.1.3.6.1.4.1.14848.2.1.2.1.4.1!30!33

| Warning Argument | Critical Argument | Behavior / Explanation |
|---|---|---|
| 30 | 33 | Standard example. CRITICAL if temperature is over 33 or below 0, else WARNING if temperature is over 30. The check returns OK when the temperature is between 0 and 30 (inclusive). |
| ~:30 | ~:33 | Same as above, but a temperature below 0 degrees is OK too. |
| 30: | 33 | CRITICAL if temperature is over 33 or below 0, else WARNING if temperature is below 30. OK if temperature is ≥ 30 and ≤ 33 degrees. |
| @25:30 | @35: | CRITICAL if temperature is 35 or above, else WARNING if temperature is between 25 and 30 (inclusive). OK if temperature is below 25, or above 30 but below 35 degrees. |

In some checks, multiple values (separated by comma or semicolon) can be entered for a WARNING or CRITICAL argument. In most cases, these can also be entered as ranges. For further details and examples, refer to the check descriptions.