TrueNAS/FreeNAS Nagios check script. Checks for Alerts, Pool health, Pool capacity, Replication errors, TrueNAS software updates, etc.

StewLG/check_truenas_extended_play

Check TrueNAS - Extended Play

This is a TrueNAS/FreeNAS Nagios check script. Checks for Alerts, Pool health, Pool capacity, Replication errors, TrueNAS software updates, etc.

This is an updated version of check_truenas.py, written by Goran Tornqvist, and originally found here:

https://exchange.nagios.org/directory/Plugins/Hardware/Storage-Systems/SAN-and-NAS/Check-TrueNAS-Health-2FReplication/details

Checks a TrueNAS/FreeNAS server using the 2.0 API. Version 1.42

optional arguments:
  -h, --help            show this help message and exit
  -H HOSTNAME, --hostname HOSTNAME
                        Hostname or IP address
  -u USER, --user USER  Username, only root works, if not specified: use API Key
  -p PASSWD, --passwd PASSWD
                        Password or API Key
  -t TYPE, --type TYPE  Type of check, either alerts, zpool, zpool_capacity, repl, or update
  -pn ZPOOLNAME, --zpoolname ZPOOLNAME
                        For check type zpool, the name of zpool to check. Optional; defaults to all zpools.
  -ns, --no-ssl         Disable SSL (use HTTP); default is to use SSL (use HTTPS)
  -nv, --no-verify-cert
                        Do not verify the server SSL cert; default is to verify the SSL cert
  -ig, --ignore-dismissed-alerts
                        Ignore alerts that have already been dismissed in FreeNas/TrueNAS; default is to treat them as
                        relevant
  -d, --debug           Display debugging information; run script this way and record result when asking for help.
  -zw ZPOOL_WARN, --zpool-warn ZPOOL_WARN
                        ZPool warning storage capacity free threshold. Give a percent value in the range 1-100,
                        defaults to 80%. Used with zpool_capacity check.
  -zc ZPOOL_CRITICAL, --zpool-critical ZPOOL_CRITICAL
                        ZPool critical storage capacity free threshold. Give a percent value in the range 1-100,
                        defaults to 90%. Used with zpool_capacity check.
  -zp, --zpool-perfdata
                        Add Zpool capacity perf data to output. Used with zpool_capacity check.

Requirements

  • Python 3.7 or greater

    If you get an error like:

    File "./check_truenas_extended_play.py", line 48
      ZpoolName: str
    

    Check to make sure you are running an up-to-date Python.

  • python3-urllib3

  • python3-requests
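A runtime guard for the version requirement can be sketched as below. This is a minimal illustration, not the script's own code, and the helper name `python_is_supported` is hypothetical. A guard like this only helps when the interpreter can parse the file at all; an interpreter too old for the script's syntax fails before any check runs, which is why the problem above shows up as an error pointing at a source line.

```python
import sys

MINIMUM_PYTHON = (3, 7)  # taken from the Requirements list above


def python_is_supported(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True when the given interpreter version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("UNKNOWN - Python %d.%d or greater is required" % MINIMUM_PYTHON)
        sys.exit(3)  # 3 is the Nagios UNKNOWN exit code
```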

Usage Examples:

Check for alerts. This may be all the average user needs to set up, since TrueNAS/FreeNAS raises alerts for nearly all significant events.

Alerts normal operation - username/password authentication

check_truenas_extended_play.py -H apollo.yourdomain.local -u root -p RootPassy --type alerts -nv
OK - No problem alerts

Alerts normal operation - API Key authentication

check_truenas_extended_play.py -H apollo.yourdomain.local -p 1-weuiK4YY7OUduhpzKISIJJIDIJSJ4YgMwvea3dEhf3ITmoRRYZ3HBkDr2s1KZ1ft7M --type alerts -nv
OK - No problem alerts
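The two authentication modes above differ only in whether `-u` is given. The sketch below shows one way that choice could map onto a `requests` call. It rests on assumptions not stated in this README: that the alerts endpoint is `/api/v2.0/alert/list`, that a password is sent with HTTP basic auth, and that an API key is sent as a bearer token. The function name `build_request` is hypothetical.

```python
def build_request(hostname, secret, user=None, use_ssl=True,
                  verify_cert=True, resource="alert/list"):
    """Return (url, kwargs) suitable for requests.get(url, **kwargs)."""
    scheme = "https" if use_ssl else "http"          # -ns switches to HTTP
    url = f"{scheme}://{hostname}/api/v2.0/{resource}"  # assumed path layout
    kwargs = {"verify": verify_cert}                 # -nv sets this to False
    if user:
        # Username given: treat the secret as a password (HTTP basic auth).
        kwargs["auth"] = (user, secret)
    else:
        # No username: treat the secret as an API key (assumed bearer token).
        kwargs["headers"] = {"Authorization": f"Bearer {secret}"}
    return url, kwargs
```

The function only assembles the request, so it can be inspected without a live server; passing the result to `requests.get(url, **kwargs)` would perform the call.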

Alerts sample error condition

check_truenas_extended_play.py -H sicknas.yourdomain.local -u root -p 'RootPa$$worD' --type alerts -nv -ns
CRITICAL - (C) Space usage for pool "BigMediaThree" is 85%. Optimal pool performance requires
used space remain below 80%. - (W) New feature flags are available for volume BigMediaToo. Refer
to the "Upgrading a ZFS Pool" subsection in the User Guide "Installing and Upgrading" chapter
and "Upgrading" section for more instructions. - (W) New feature flags are available for volume 
BigMediaThree. Refer to the "Upgrading a ZFS Pool" subsection in the User Guide "Installing and Upgrading" 
chapter and "Upgrading" section for more instructions.

Check Zpool health

Check all Zpools

check_truenas_extended_play.py -H apollo.yourdomain.local -u root -p RootPassy --type zpool -nv
OK - No problem Zpools. Zpools examined:  ApolloZpoolOne ApolloZPoolEleven

Check a specifically named Zpool, ignoring any others

check_truenas_extended_play.py -H apollo.yourdomain.local -u root -p RootPassy --type zpool -nv --zpoolname ApolloZPoolEleven
OK - No problem Zpools. Zpools examined:  ApolloZPoolEleven

Example of what happens if Zpool is not present

check_truenas_extended_play.py -H apollo.yourdomain.local -u root -p RootPassy --type zpool -nv --zpoolname PoolNameWhichIsNotActuallyThere
CRITICAL - No Zpools found matching PoolNameWhichIsNotActuallyThere out of 2 pools (ApolloZpoolOne ApolloZPoolEleven)
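The name filter behind `--zpoolname` can be sketched as follows. The function name `select_zpools` is hypothetical, and exact, case-sensitive matching is an assumption; only the message format is taken from the sample output above.

```python
def select_zpools(all_names, wanted=None):
    """Return (pools_to_check, error_message).

    With no name given, every pool is checked. With a name given, only an
    exact match is kept; if nothing matches, a CRITICAL message is returned.
    """
    if wanted is None:
        return list(all_names), None
    matches = [name for name in all_names if name == wanted]
    if matches:
        return matches, None
    message = "CRITICAL - No Zpools found matching {} out of {} pools ({})".format(
        wanted, len(all_names), " ".join(all_names))
    return [], message
```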

Check Zpool capacity

Check all zpools for capacity issues

check_truenas_extended_play.py -H apollo.yourdomain.local -t zpool_capacity -p 1-weuiK4YY7OUdukdiejsijeiYFe4YgMwvea3dEhf3ITmoRRYZ3HBkDr2s1KZ1ft7M -nv 
OK - No Zpool capacity issues. ZPools examined: ApolloZpoolOne (75.8% used) - ApolloZPoolEleven (64.0% used) - Root level datasets examined: ApolloZpoolOne ApolloZPoolEleven

Note that the default warning level (80%) and default critical level (90%) will be used here.
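How the two thresholds might map a pool's used percentage onto a Nagios state is sketched below. The function name `capacity_state` is hypothetical, and whether the script compares with `>=` or `>` is an assumption. The numeric states follow the usual Nagios exit codes (0 OK, 1 WARNING, 2 CRITICAL).

```python
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin exit codes


def capacity_state(used_percent, warn_percent=80.0, critical_percent=90.0):
    """Classify a pool by used capacity; the critical level is tested first."""
    if used_percent >= critical_percent:
        return CRITICAL
    if used_percent >= warn_percent:
        return WARNING
    return OK
```

With the defaults, a pool at 75.8% used is OK; lowering the warning level to 30 makes it WARNING, and lowering the critical level to 40 makes it CRITICAL.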

Check specific zpool for capacity issues

check_truenas_extended_play.py -H apollo.yourdomain.local -t zpool_capacity -pn ApolloZpoolOne -p 1-weuiK4YY7OUdukdiejsijeiYFe4YgMwvea3dEhf3ITmoRRYZ3HBkDr2s1KZ1ft7M -nv 
OK - No Zpool capacity issues. ZPools examined: ApolloZpoolOne (75.8% used) - Root level datasets examined: ApolloZpoolOne

Check all zpools with custom warning level

check_truenas_extended_play.py -H apollo.yourdomain.local -t zpool_capacity -zw 30 -p 1-weuiK4YY7OUdukdiejsijeiYFe4YgMwvea3dEhf3ITmoRRYZ3HBkDr2s1KZ1ft7M -nv 
WARNING - Pool ApolloZpoolOne usage 75.8% exceeds warning value of 30% - Pool ApolloZPoolEleven usage 64.0% exceeds warning value of 30%

Check all zpools with custom critical level

check_truenas_extended_play.py -H apollo.yourdomain.local -t zpool_capacity -zc 40 -p 1-weuiK4YY7OUdukdiejsijeiYFe4YgMwvea3dEhf3ITmoRRYZ3HBkDr2s1KZ1ft7M -nv 
CRITICAL - Pool ApolloZpoolOne usage 75.8% exceeds critical value of 40% - Pool ApolloZPoolEleven usage 64.0% exceeds critical value of 40%

Check specific zpool for capacity issues, adding on perf data as well

check_truenas_extended_play.py -H apollo.yourdomain.local -t zpool_capacity -pn ApolloZpoolOne -zp -p 1-weuiK4YY7OUdukdiejsijeiYFe4YgMwvea3dEhf3ITmoRRYZ3HBkDr2s1KZ1ft7M -nv 
OK - No Zpool capacity issues. ZPools examined: ApolloZpoolOne (75.8% used) - Root level datasets examined: ApolloZpoolOne;| ApolloZpoolOne=294202.30MB;310479.52;155239.76;0;388099.40
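The part after the `|` is Nagios performance data in the form `label=value[UOM];warn;crit;min;max`. The sketch below shows how such a string could be built, assuming the warning and critical fields are the percentage thresholds converted to megabytes of the pool's total size (the warning field in the sample, 310479.52, is 80% of 388099.40). The function name `zpool_perfdata` is hypothetical.

```python
def zpool_perfdata(label, used_mb, total_mb,
                   warn_percent=80.0, critical_percent=90.0):
    """Format one pool as Nagios perfdata: label=value[UOM];warn;crit;min;max."""
    warn_mb = total_mb * warn_percent / 100.0
    critical_mb = total_mb * critical_percent / 100.0
    return "{}={:.2f}MB;{:.2f};{:.2f};0;{:.2f}".format(
        label, used_mb, warn_mb, critical_mb, total_mb)
```

Calling `zpool_perfdata("ApolloZpoolOne", 294202.30, 388099.40, critical_percent=40)` reproduces the sample string above; with the default 90% critical level the third field would be 349289.46 instead.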

Check replication health

check_truenas_extended_play.py -H apollo.yourdomain.local -u root -p RootPassy --type repl -nv
OK - No replication errors. Replications examined:  ApolloDatasetReplications: FINISHED

Check for TrueNAS updates

Check for TrueNAS updates - no updates available

check_truenas_extended_play.py -H apollo.yourdomain.local --type update -p 1-weuiK4YY7OUduhpzKISIJJIDIJSJ4YgMwvea3dEhf3ITmoRRYZ3HBkDr2s1KZ1ft7M -nv
OK - Update Status: UNAVAILABLE (no update available)

'UNAVAILABLE' is the normal update status, and does not indicate a problem.

Check for TrueNAS updates - possible updates available

check_truenas_extended_play.py -H apollo.yourdomain.local --type update -p 1-weuiK4YY7OUduhpzKISIJJIDIJSJ4YgMwvea3dEhf3ITmoRRYZ3HBkDr2s1KZ1ft7M -nv
WARNING - Update Status: AVAILABLE (an update is available). Update may be required. Go to TrueNAS Dashboard -> System -> Update to check for newer version.

All update issues are merely warnings, and not critical errors.
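The rule above (only `UNAVAILABLE` is OK, everything else is a warning) can be sketched as a small mapping. The function name `update_state` is hypothetical, and the wording for statuses other than the two shown in the samples is not reproduced here.

```python
OK, WARNING = 0, 1  # Nagios exit codes; update checks never return CRITICAL


def update_state(status):
    """Map a TrueNAS update status string to (exit_code, state_name)."""
    if status == "UNAVAILABLE":
        # The normal status: no update is available.
        return OK, "OK"
    # Any other status is reported as a warning, never as critical.
    return WARNING, "WARNING"
```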

As of December 15, 2021, update checks appear to fail when the iXsystems update servers are down: the relevant API call crashes with a cryptic error. I have filed a bug report with iXsystems:

https://jira.ixsystems.com/browse/NAS-113833

Version History

June 18, 2020 - Version 1.0

Initial release. Script was not explicitly versioned - no version number displayed.

August 14, 2020 - Version 1.1

Replication check problems corrected. This check wasn't fully tested by the author until this release.

December 3, 2021 - Version 1.2

Added API Key authentication. Thanks to Folke Ashberg.

December 4, 2021 - Version 1.3

Added update check, by request of @madtempest.

December 6, 2021 - Version 1.4

ZPool capacity check added.

Thanks to both Folke Ashberg (@fashberg) and @Cosmits. Each made an independent pull request with a suggested implementation of the zpool capacity check feature, demonstrating a strong demand. It has been a bit involved to implement correctly, so if anyone sees free/used values for their ZPools that don't seem right, please let us know.

March 1, 2022 - Version 1.41

Added a well-intentioned but likely futile version check for Python 3.7 or greater. Clarified in the docs that Python 3.7 or greater is required.

January 30, 2023 - Version 1.42

@juleslink found a typo in a hardcoded byte count - "1204" instead of "1024" as it should have been. This caused incorrect byte math in capacity checks. Apologies for not seeing this earlier, and for any disruption this causes to your checks or metrics tracking!
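One way to make this kind of typo harder to repeat is to define the factor once as a named constant and derive larger units from it. The sketch below is illustrative only, with hypothetical names; it does not show where the original typo sat or how large the resulting error was.

```python
BYTES_PER_KIB = 1024                       # single source of truth for the factor
BYTES_PER_MIB = BYTES_PER_KIB * BYTES_PER_KIB


def bytes_to_mb(byte_count):
    """Convert a byte count to megabytes (binary, 1024 * 1024 bytes each)."""
    return byte_count / BYTES_PER_MIB
```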

Feedback Welcome

If you have a suggestion or encounter a problem, please get in touch. I've found half-baked Nagios plugins to be a chore to deal with, and I'd like this not to be one of them.
