CustoSec:Check SNMP Network Interface

From CustosecWiki
Basic Information on Check
Name of Check: SNMP Network Interface
Technical Name: check_snmp_netintf
Available in: Standard
Number of Arguments: 10
From Version: ARANSEC 2.32
Compatibility: All ARANSEC and CustoSec



Scope of Check

A standard check to monitor the status, traffic and utilization of network interfaces (network ports) on servers, switches and routers.

Cisco port names, port link data and STP status (for Cisco as well as other switches) are supported.


Requirements

For the check to work properly the following requirements must be met:

  • The check is configured as a service check on the target host that should be monitored
  • SNMP must be activated on the target host (read-only, with a community name). It is also recommended to allow only the ARANSEC/CustoSec IP address to read SNMP information on the host. SNMP access can easily be verified by opening a second session in a second browser tab and running an SNMP walk with ARANSEC's SNMP-Walk function (bottom entry in the left-hand menu).
  • When working with OIDs, it can be useful to have an explanation of individual OIDs at hand. A good resource for this is, for example, the OID Repository, where more information on OIDs can be found.
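The SNMP reachability test described above can also be sketched outside ARANSEC. The following Python snippet only assembles a command line for the Net-SNMP `snmpwalk` tool; it assumes that Net-SNMP is installed, that the host speaks SNMP v2c, and that 192.0.2.10 stands in for the real host address. The function name is hypothetical, and the OID is the interface description table (ifDescr).

```python
import shlex

IF_DESCR_OID = "1.3.6.1.2.1.2.2.1.2"  # ifDescr: interface description table


def build_snmpwalk_command(host, community, port=161, oid=IF_DESCR_OID):
    """Return the argument list for a Net-SNMP snmpwalk call (SNMP v2c assumed)."""
    return ["snmpwalk", "-v", "2c", "-c", community, f"{host}:{port}", oid]


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address used as a placeholder host.
    cmd = build_snmpwalk_command("192.0.2.10", "custosec")
    print(shlex.join(cmd))
    # To actually run it (requires network access and Net-SNMP):
    # subprocess.run(cmd, capture_output=True, text=True, timeout=10)
```

If the walk returns a list of interface descriptions, the community name, port and access restrictions are most likely set up correctly for the check.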


Arguments

To configure the check, the following arguments are available:

Argument No. Argument Name Allowed Arguments Explanation Examples
Arg1 snmp-community string
Community name for the SNMP agent. It is strongly recommended to change the default community, which is "public" on most systems, to something like "aransec".
Must be entered; otherwise the check cannot find the OID.
Example: custosec
Arg2 port string
Port number of the SNMP service on the target host. The default is 161. Must be entered; otherwise the check cannot find the OID.
Example: 161
Arg3 delta string
The preferred time (in seconds) between two values that is used to calculate the traffic or errors. This delta should be larger than the "Normal Check Interval" in the check's configuration.
The calculation of the check is basically:
Difference of counters divided by the difference of time stamps (bits/sec)
This is repeated for all available time stamps matching the delta. The results are then added up and divided by the number of results, which yields the average. The check accepts a "Normal Check Interval" from 10% below the delta up to 300% of the delta.
For example, if the delta is 5 minutes, the "Normal Check Interval" can be between 4'30" and 15 minutes. Values with a time stamp difference outside these limits are not counted.
Example: 300
Arg4 Interface NAME, optional arguments string
Name in the description OID (eth0, ppp0, ...). This is treated as a regular expression, which means "eth" will match eth0, eth1, and so on. Test it before use, because there are known bugs (e.g. a trailing / has to be masked).
Additional arguments can be:
* nr = Do not use a regular expression to match NAME in the description OID.
* in = Return critical when the interface is up (inverted status monitoring).
* ad = Use the administrative status instead of the operational status (see below).
* ig = Ignore the interface status and return OK regardless.
Note: When multiple interfaces are selected with a regular expression, all must be up (or down with "in") to get an "OK" result.
Note: Use the OID 1.3.6.1.2.1.2.2.1.2 in the system's SNMP-Walk to find out all interfaces and their proper names.
Note: If a specific port on a switch should be monitored, just enter the port name, but be aware of the regex. For example, if Port#1 should be monitored on a switch with 24 ports, but not Port#10, Port#11, ..., there are several ways to achieve this:
* Enter the exact name with the additional argument "nr": \"Port#1\",nr (the double quotes are masked with a \ because of the special character # in the port name)
* Enter the name as a regular expression: \#1$
* An entry like \#[125]$ monitors the ports Port#1, Port#2 and Port#5.
* If a port, e.g. Port#1, should be monitored by its administrative status, use e.g. \"Port#1\",nr,ad (see more information below).
Example: \"Port#1\",nr
Arg5 bandwidth (optional) string
Optional bandwidth-specific arguments:
- = Do not check bandwidth (WARNING/CRITICAL thresholds will be ignored).
ba = Check the average input/output bandwidth of the interface [in KB/s].
be = Also check the input/output errors and discards [in KB/s].
Example: ba
Arg6 Warn (optional) string
WARN = The input/output warning level:
-> with ba: Warning level for input/output bandwidth in KB/s (0 for no warning). Input: <In Warn>,<Out Warn>
-> with be: Warning level for input/output bandwidth (including errors and discards) in KB/s (0 for no warning). Input: <In bytes>,<Out bytes>,<In error>,<Out error>,<In disc>,<Out disc>
Example: 300,500
Arg7 Critical (optional) string
CRIT = The input/output critical level:
-> with ba: Critical level for input/output bandwidth in KB/s (0 for no critical). Input: <In Crit>,<Out Crit>
-> with be: Critical level for input/output bandwidth (including errors and discards) in KB/s (0 for no critical). Input: <In bytes>,<Out bytes>,<In error>,<Out error>,<In disc>,<Out disc>
Example: 500,800
Arg8 Performance Data (optional) string
Optional performance data specific arguments:
- = Do not process performance data.
pr = Perfparse-compatible output (no output when the interface is down). Streamlined performance data for use in PnP4Nagios (CustoSec plugin). Useful for ReportBase.
er = Add errors and discards to the performance data output.
is = Also show the interface speed in b/sec in the performance data. Useful for further processing of the performance data in ReportBase.
:speed = (Theoretical) port speed. Optional; must be entered exactly. Returns critical when the actual port speed differs from this (theoretical) value (example: is:100Mb).
Example: pr,is:100MB
Arg9 Optional Cisco specific arguments string
Optional Cisco specific arguments:
- = Do not use Cisco specific arguments.
. = Use the Cisco option without optional arguments.
[oper,][addoper,][linkfault,][use_portnames|show_portnames]
This enables Cisco SNMP hacks which, among other things, provide more details on the operational and fault status of physical ports. Three tables are available: "operStatus", "AdditionalOperStatus" and "LinkFaultStatus" (some switches have one, some may have all three). If you do not specify any, an attempt is made for all of them; if caching is used, what is actually available is cached for future requests.
With the optional argument "use_portnames", the given interface name is matched not against the normal SNMP description OID table but against the port description names that you set with "set port name". This does, however, restrict the check to Cisco module ports (the ifindex may be larger and also include non-port interfaces such as VLANs). Using "show_portnames" causes port names to be shown as comments.
Example: oper, linkfault
Arg10 Optional STP specific arguments string
Optional Spanning Tree Protocol specific arguments:
- = Do not use STP specific arguments.
. = Use the STP option without optional arguments.
[disabled|blocking|listening|learning|forwarding|broken]
This enables reporting of the STP (Spanning Tree Protocol) state of switch ports. If the STP port state changes, the plugin reports WARNING for a period of time (default 15 minutes). The optional parameter is the expected STP state of the port; the plugin returns CRITICAL if the state is anything else.
Example: blocking, learning
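The averaging described for Arg3 (delta) can be illustrated with a short Python sketch. This is not the check's actual implementation: the function name, the layout of the historic samples and the reading of the accepted window as 90% to 300% of the delta are assumptions derived from the 4'30" to 15 minutes example given for a delta of 5 minutes.

```python
def average_rate(samples, now_ts, now_counter, delta):
    """Average counter rate over historic samples whose age fits the delta window.

    samples: list of (timestamp_seconds, counter_value) tuples (hypothetical layout).
    Returns counter units per second, or None when no sample is usable.
    """
    low, high = 0.9 * delta, 3.0 * delta  # assumed window: 4'30" to 15' for delta=300
    rates = []
    for ts, counter in samples:
        age = now_ts - ts
        # Simplification: samples taken before a counter reset or wrap are skipped.
        if low <= age <= high and now_counter >= counter:
            rates.append((now_counter - counter) / age)
    if not rates:
        return None  # comparable to a "no usable data" result
    return sum(rates) / len(rates)


if __name__ == "__main__":
    history = [(0, 1000), (300, 4000), (590, 6900), (660, 7600)]
    # The sample at t=660 is only 240 s old and is therefore ignored.
    print(average_rate(history, now_ts=900, now_counter=10000, delta=300))
```

A freshly configured check has no history yet, so a function like this would return None until enough samples have been collected.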


Examples

The following examples should explain the usage of the check and how the arguments should be entered in ARANSEC or CustoSec.
(Please note: The pipe character in the fields of this table separates different options. Exception: within the "Output" lines in the "Output" field, the pipe character is literal and marks the division between the check's output and the check's performance data.)
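The examples in this section enter the ten arguments as one string, separated by exclamation marks. As a rough aid, the following Python sketch assembles such a string. The function name and its defaults are hypothetical: "-" is used for unused optional arguments as described in the argument table, and "0,0" for the thresholds is an assumption based on "0 for no warning".

```python
def build_check_args(community, port, delta, interface, bandwidth="-",
                     warn="0,0", crit="0,0", perfdata="-", cisco="-", stp="-"):
    """Join Arg1..Arg10 into the '!'-separated form used in the examples."""
    args = [community, str(port), str(delta), interface, bandwidth,
            warn, crit, perfdata, cisco, stp]
    for position, value in enumerate(args, start=1):
        if "!" in value:
            # "!" is the separator, so it cannot appear inside an argument here.
            raise ValueError(f"Arg{position} must not contain '!'")
    return "!" + "!".join(args)


if __name__ == "__main__":
    print(build_check_args("custosec", 161, 300, r'\"Port #6\",nr',
                           "ba", "300,500", "500,800", "pr,is:100b"))
```

Called with the values shown, the helper reproduces the argument string of the first example in the table.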

Example Description Output
Example: !custosec!161!300!\"Port #6\",nr!ba!300,500!500,800!pr,is:100b!-!-
Description: This example monitors the traffic on port 6 of a standard Zyxel switch.
* The delta parameter is set to 300, which means the traffic is averaged over 5 minutes.
* The interface name argument is set to \"Port #6\",nr, which selects only port number 6.
* The bandwidth parameter is set to ba, which means that errors and discarded packets are not included; only the input and the output are measured.
* The Warn value is set to 300 KB/s for input and 500 KB/s for output.
* The Critical value is set to 500 KB/s for input and 800 KB/s for output.
* The performance data arguments are set to pr, which tells the check to include the input and output values in the performance data. The is:100b tells the check to also include the interface speed in the performance data, in b/sec.
Status: OK
Output: Port #6:UP (in=0.1KBps/out=0.1KBps):(1 UP): OK | Port #6_in_Bps=54;307200;512000;0;12.5 Port #6_out_Bps=59;512000;819200;0;12.5 Port #6_speed_bps=100
Example: !custosec!161!300!\"Port #6\",nr!ba!300,500!500,800!pr,er,is:100b!-!-
Description: This is the same example as above, except that er is added to the performance data parameters, which tells the check to also include errors and discards in the performance data.
Note: In this example, both errors and discards are 0.
Status: OK
Output: Port #6:UP (in=0.0KBps/out=0.1KBps):(1 UP): OK | Port #6_in_Bps=46;307200;512000;0;12.5 Port #6_out_Bps=51;512000;819200;0;12.5 Port #6_in_error=0c Port #6_in_discard=0c Port #6_out_error=0c Port #6_out_discard=0c Port #6_speed_bps=100
Example: !custosec!161!300!\"Port #1\",nr!ba!300,500!500,800!pr,er,is:100b!-!-
Description: This is the same example as above, but for port 1, which is the port the firewall is connected to.
This example does not deliver any traffic data, because the check has just been set up on this port and the "check-the-check" has been carried out for the first time. This means there is no previous data for the check to calculate the delta from. Therefore, the check returns empty strings.
Status: OK
Output: Port #1:UP (no usable data - 20 rows) :(1 UP): OK | Port #1_in_error=0c Port #1_in_discard=1c Port #1_out_error=0c Port #1_out_discard=0c Port #1_speed_bps=100
Example: ... the same, just 20 minutes later ...
Description: This is the same example as above, but now the check has been tested a couple of times within about 20 minutes to create some "historic" data.
This time the check has enough "historic" data to calculate the delta.
Status: OK
Output: Port #1:UP (in=0.4KBps/out=0.3KBps):(1 UP): OK | Port #1_in_Bps=452;307200;512000;0;12.5 Port #1_out_Bps=327;512000;819200;0;12.5 Port #1_in_error=0c Port #1_in_discard=1c Port #1_out_error=0c Port #1_out_discard=0c Port #1_speed_bps=10
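The "Output" lines above can be taken apart with a few lines of Python. The sketch below splits a line at the pipe character and reads each performance data item as label=value;warn;crit;... . The function names are hypothetical, and the parsing rules are inferred only from the sample outputs shown here, in which labels contain spaces and are not quoted.

```python
import re

# One perf item: a label (may contain spaces), "=", then a whitespace-free value.
PERF_ITEM = re.compile(r"([^=]+)=(\S+)\s*")
# A number followed by an optional unit such as "c" or "%".
NUMBER = re.compile(r"(-?\d+(?:\.\d+)?)([A-Za-z%]*)$")


def _to_float(text):
    """Return text as float, or None when it is empty or not a plain number."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_check_output(line):
    """Split a check output line into its status text and a dict of metrics."""
    text, _, perf = line.partition(" | ")
    metrics = {}
    for label, raw in PERF_ITEM.findall(perf):
        fields = raw.split(";")
        number = NUMBER.match(fields[0])
        if number is None:
            continue  # skip items this sketch cannot interpret
        fields += [""] * (5 - len(fields))  # pad missing warn/crit/min/max
        metrics[label.strip()] = {
            "value": float(number.group(1)),
            "unit": number.group(2),
            "warn": _to_float(fields[1]),
            "crit": _to_float(fields[2]),
        }
    return text.strip(), metrics


if __name__ == "__main__":
    sample = ("Port #6:UP (in=0.1KBps/out=0.1KBps):(1 UP): OK | "
              "Port #6_in_Bps=54;307200;512000;0;12.5 "
              "Port #6_out_Bps=59;512000;819200;0;12.5 Port #6_speed_bps=100")
    status, metrics = parse_check_output(sample)
    print(status)
    print(metrics["Port #6_in_Bps"])
```

Note that the thresholds in the performance data are in bytes per second: the configured 300 KB/s warning level appears as 307200, which is 300 x 1024.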

Explanations

Regular Expressions

Find out more about regular expressions in our Explanations on Regular Expressions.
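The effect of the interface name patterns from Arg4 can be tried out with Python's re module. This is only an approximation: the check may use a different regular expression engine, the port names Port#1 to Port#24 are hypothetical, and treating "nr" as an exact string comparison is an assumption.

```python
import re


def select_interfaces(names, pattern, no_regex=False):
    """Return the interface names selected by pattern.

    no_regex=True mimics the "nr" option as an exact string comparison.
    """
    if no_regex:
        return [name for name in names if name == pattern]
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


if __name__ == "__main__":
    ports = [f"Port#{number}" for number in range(1, 25)]
    print(select_interfaces(ports, "Port#1"))                 # also matches Port#10 to Port#19
    print(select_interfaces(ports, "Port#1", no_regex=True))  # only Port#1
    print(select_interfaces(ports, r"\#1$"))                  # only Port#1
    print(select_interfaces(ports, r"\#[125]$"))              # Port#1, Port#2 and Port#5
```

The first call shows why an unanchored pattern is risky on a 24-port switch: it selects eleven ports instead of one.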


Administrative Status of a Network Interface

  • Operational status: A network interface is shown as "down" if the cable is not plugged in or the device is shut down. This allows for an indirect monitoring of hosts...
  • Administrative status: A network interface is shown as "down" when the interface is turned off or when it is not working at all.
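How the options "in", "ad" and "ig" from Arg4 interact with these two status types can be sketched as follows. The function and the data layout are hypothetical, and returning CRITICAL for every mismatch is an assumption; the sketch only follows the rule that all selected interfaces must be up (or down with "in") to get an OK result.

```python
def evaluate_status(interfaces, flags=()):
    """Return "OK" or "CRITICAL" for the selected interfaces.

    interfaces: list of dicts with "oper" and "admin" set to "up" or "down".
    flags: any of "in" (inverted), "ad" (administrative status), "ig" (ignore).
    """
    if "ig" in flags:
        return "OK"  # the interface status is ignored entirely
    key = "admin" if "ad" in flags else "oper"
    expected = "down" if "in" in flags else "up"
    if all(interface[key] == expected for interface in interfaces):
        return "OK"
    return "CRITICAL"


if __name__ == "__main__":
    unplugged = [{"oper": "down", "admin": "up"}]  # enabled port without a cable
    print(evaluate_status(unplugged))                 # CRITICAL
    print(evaluate_status(unplugged, flags=("ad",)))  # OK
    print(evaluate_status(unplugged, flags=("in",)))  # OK
```

The unplugged port illustrates the difference: it is operationally down but administratively up, so the result depends on whether "ad" is set.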