From 4071393bef7eef14857df6bfeaca107b8ac09047 Mon Sep 17 00:00:00 2001
From: Trey Dockendorf
Date: Mon, 5 Nov 2018 13:22:28 -0500
Subject: [PATCH] Add check_gpfs_health to README

---
 README.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/README.md b/README.md
index 0b375e0..9347ae1 100644
--- a/README.md
+++ b/README.md
@@ -452,6 +452,7 @@ The table below provides a list of the configuration variables which may be used
 | MCELOG_MAX_CORRECTED_RATE | `9` | Maximum number of **corrected** MCEs allowed before `check_hw_mcelog()` returns failure |
 | MCELOG_MAX_UNCORRECTED_RATE | `0` | Maximum number of **uncorrected** MCEs allowed before `check_hw_mcelog()` returns failure |
 | MDIAG_CMD | `mdiag` | Command to use to invoke Moab's `mdiag` command (may include path) |
+| MMHEALTH | `/usr/lpp/mmfs/bin/mmhealth` | Command to use to invoke GPFS' `mmhealth` command |
 | *NAME | `nhc` | Used to populate default paths/filenames for configuration |
 | NHC_AUTH_USERS | `root nobody` | Users authorized to have arbitrary processes running on compute nodes |
 | NHC_CHECK_ALL | `0` | Forces all checks to be non-fatal. Displays each failure message, reports total number of failed checks, and returns that number. |
@@ -768,6 +769,23 @@ _**Example**_: `check_fs_size /tmp 512m 4g`
 _**Example**_: `check_fs_used / 98%`
+
+
+##### check_gpfs_health
+`check_gpfs_health [-0] [-a] [-l] [-s] [-e <action>] <component>`
+
+Checks the health of a GPFS component. The value for _component_ must match a component reported by mmhealth.
+
+| **Check Option** | **Purpose** |
+| ---------------- | ----------- |
+| `-0` | Non-fatal. Failure of this check will be ignored. |
+| `-a` | Find, report, and act on all matching components. Default behavior is to fail check after first matching component. |
+| `-l` | Log unhealthy component (or components, if used with `-a`) to NHC log (`$LOGFILE`). |
+| `-s` | Log unhealthy component (or components, if used with `-a`) to the syslog. |
+| `-e`_`action`_ | Execute `/bin/bash -c`_`action`_ if component is NOT healthy. |
+
+_**Example**_: `check_gpfs_health NETWORK`
+