Nagios Plugin Scripts

For details on the syntax of Nagios threshold ranges, see http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT

The "check_swift_background_daemon" and "check_swift_sweep_time" plugins both take a Swift daemon name (see their usage messages for the set of possible values) as a positional argument to allow fine-grained control of which daemons are checked and to allow WARNING/CRITICAL to be more specific.

Each plugin's usage message states which metric the "-w" and "-c" ranges apply to. The default thresholds are plain numbers rather than "fancy" Nagios range syntax because, for all four plugins, a higher value is worse. However, all four plugins accept the full range syntax (specified here: http://nagiosplug.sourceforge.net/developer-guidelines.html#THRESHOLDFORMAT). Please let us know if you have any questions about the plugins.
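As a rough illustration of the threshold format referenced above, the following Python sketch parses a range string and reports whether a value should raise an alert. It is not the plugins' actual code: the function names parse_range and is_alert are hypothetical, and the rules are taken from the Nagios plugin developer guidelines.

```python
import math


def parse_range(spec):
    """Parse a Nagios threshold range into (start, end, inside).

    inside=True means the range was prefixed with '@', so the alert
    fires when the value is inside the range instead of outside it.
    """
    inside = spec.startswith("@")
    if inside:
        spec = spec[1:]
    if ":" in spec:
        start_text, end_text = spec.split(":", 1)
    else:
        # A bare number N is shorthand for "0:N".
        start_text, end_text = "0", spec
    if start_text == "~":
        start = -math.inf          # "~" means negative infinity
    elif start_text == "":
        start = 0.0                # an omitted start defaults to 0
    else:
        start = float(start_text)
    # An omitted end means positive infinity.
    end = math.inf if end_text == "" else float(end_text)
    if start > end:
        raise ValueError("range start must not exceed range end")
    return start, end, inside


def is_alert(value, spec):
    """Return True if value should raise an alert for this range."""
    start, end, inside = parse_range(spec)
    in_range = start <= value <= end
    return in_range if inside else not in_range
```

For instance, with the default -w value of 0 for check_drives_mounted, is_alert(1, "0") is true while is_alert(0, "0") is false, which matches the "higher is worse" behavior of the default thresholds.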

Check for unmounted SwiftStack devices

This Nagios plugin compares the count of unmounted Swift devices against the supplied -w and -c threshold ranges. With the default values, the plugin returns WARNING when more than 0 devices are unmounted (that is, exactly one) and CRITICAL when more than 1 device is unmounted (two or more).

Usage

check_drives_mounted [-h] [-w RANGE] [-c RANGE]

 -h, --help show this help message and exit
 -w RANGE WARNING if more than this many devices unmounted. (default: 0)
 -c RANGE CRITICAL if more than this many devices unmounted. (default: 1)

Sample

[root@srv1 ~]# check_drives_mounted
UNMOUNTED DEVICES OKAY- 0 unmounted;

Check for drive capacity utilization

This Nagios plugin compares the percent-full value of the single fullest drive against the supplied -w and -c threshold ranges. With the default values, the plugin returns WARNING when the fullest drive is more than 70% full and CRITICAL when the fullest drive is more than 85% full.
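The effect of checking only the fullest drive can be sketched in Python as follows. This is a minimal illustration, not the plugin's implementation: the function name utilization_status and the drive names in the usage note are hypothetical, and the thresholds are treated as plain "higher is worse" numbers rather than full Nagios ranges.

```python
def utilization_status(percent_full_by_drive, warn=70, crit=85):
    """Classify the fullest drive against simple numeric thresholds.

    percent_full_by_drive maps a drive name to its percent-full value.
    Returns a (status, worst_percent) tuple.
    """
    # Only the single fullest drive matters; an empty mapping counts as 0%.
    worst = max(percent_full_by_drive.values(), default=0)
    if worst > crit:
        return "CRITICAL", worst
    if worst > warn:
        return "WARNING", worst
    return "OKAY", worst
```

Under these assumptions, utilization_status({"d1": 12, "d2": 88}) yields CRITICAL even though d1 is nearly empty, because a single drive over the threshold is enough.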

Usage

check_drive_utilization [-h] [-w RANGE] [-c RANGE]

 -h, --help show this help message and exit
 -w RANGE WARNING if any drive over this % full (default: 70)
 -c RANGE CRITICAL if any drive over this % full (default: 85)

Sample

[root@srv1 ~]# check_drive_utilization
UTILIZATION OKAY- percent full: 1%;

Check for background Swift daemons

There is one required argument: the name of the background daemon to check. See the usage below for the set of valid choices. This Nagios plugin returns CRITICAL if the background Swift daemon is not running or if no log message can be found for the daemon in /var/log/swift/all.log or /var/log/swift/all.log.1. If the daemon is running, the time since its most recent log message is compared against the given Nagios -w and -c thresholds. Times are measured in (potentially fractional) hours. With the default thresholds, the plugin returns WARNING when the daemon's most recent log message is more than 4 hours old and CRITICAL when it is more than 12 hours old.
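The age comparison described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the plugin's code: the function name log_age_status is hypothetical, the timestamp of the last log message is assumed to have been extracted already (log parsing and the check that the daemon is running are not shown), and the thresholds are treated as plain numbers.

```python
from datetime import datetime, timedelta


def log_age_status(last_message_at, now, warn_hours=4.0, crit_hours=12.0):
    """Classify a daemon by the age of its most recent log message.

    last_message_at is a datetime, or None if no message was found.
    Returns a (status, age_in_hours) tuple; age is None without a message.
    """
    if last_message_at is None:
        # No log message found for the daemon is treated as CRITICAL.
        return "CRITICAL", None
    # Express the age as fractional hours, e.g. 4h30m becomes 4.5.
    age_hours = (now - last_message_at).total_seconds() / 3600.0
    if age_hours > crit_hours:
        return "CRITICAL", age_hours
    if age_hours > warn_hours:
        return "WARNING", age_hours
    return "OKAY", age_hours
```

A message logged four and a half hours ago therefore produces WARNING with an age of 4.5 hours under the default thresholds.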

Usage

check_swift_background_daemon [-h] [-w RANGE] [-c RANGE]
{account-reaper,account-replicator,account-auditor,container-replicator,container-updater,container-auditor,object-replicator,object-updater,object-auditor}

 -h, --help show this help message and exit
 -w RANGE WARNING if last log message older than this many hours. (default: 4)
 -c RANGE CRITICAL if last log message older than this many hours. (default: 12)

Sample

[root@srv1 ~]# check_swift_background_daemon object-auditor
SWIFT DAEMON object-auditor OKAY- last log message 0.0 hours old;

[root@srv1 ~]# check_swift_background_daemon object-replicator
SWIFT DAEMON object-replicator OKAY- last log message 0.0 hours old;

Check for the most recent Swift background daemon sweep time

There is one required argument: the name of the background daemon to check. See the usage below for the set of valid choices. This Nagios plugin returns UNKNOWN if the background Swift daemon is not running or if no log message can be found for the daemon in /var/log/swift/all.log or /var/log/swift/all.log.1. If the daemon is running, the sweep time of its most recent full run is compared against the given Nagios -w and -c thresholds. Times are measured in (potentially fractional) hours. With the default thresholds, the plugin returns WARNING when the daemon's most recent sweep time exceeds 8 hours and CRITICAL when it exceeds 12 hours.

Usage

check_swift_sweep_time [-h] [-w RANGE] [-c RANGE]
{account-reaper,account-replicator,account-auditor,container-replicator,container-updater,container-auditor,object-replicator,object-updater,object-auditor}

 -h, --help show this help message and exit
 -w RANGE WARNING if last sweep time greater than this many hours. (default: 8)
 -c RANGE CRITICAL if last sweep time greater than this many hours. (default: 12)

Sample

[root@srv1 ~]# check_swift_sweep_time object-replicator
SWIFT DAEMON object-replicator SWEEP TIME OKAY- sweep time: 0.00 hrs;

Tip

Sweep time for the object-auditor of large clusters may legitimately be multiple days long. Adjust the -w and -c parameters to reflect actual sweep times in your cluster.
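Because the thresholds are given in hours, multi-day sweep times have to be converted before they are passed to -w and -c. The Python sketch below builds such an argument list; the helper name sweep_time_args is hypothetical, and the day values in the usage note are placeholders, not recommendations.

```python
def sweep_time_args(daemon, warn_days, crit_days):
    """Build a check_swift_sweep_time argument list from thresholds in days."""
    return [
        "check_swift_sweep_time",
        # The plugin expects hours, so multiply days by 24.
        "-w", format(warn_days * 24, "g"),
        "-c", format(crit_days * 24, "g"),
        daemon,
    ]
```

For example, sweep_time_args("object-auditor", 3, 5) corresponds to running check_swift_sweep_time -w 72 -c 120 object-auditor.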