feat: add vsan stretched cluster health checks#126

Merged
GaryJBlake merged 2 commits into main from feat/vsan-stretched-cluster-health on Apr 18, 2023

@tenthirtyam
Contributor

In order to have a good experience with our community, we recommend that you read the contributing guidelines for making a pull request.

Summary of Pull Request

  • Updates Publish-VsanHealth to include the results for stretched cluster health status and stretched cluster tests.
  • Bumps the module version to v2.0.0.1011.
  • Updates CHANGELOG.md.

Ref: #104

Type of Pull Request

  • This is a bug fix.
  • This is an enhancement or feature.
  • This is a code style / formatting update.
  • This is a documentation update.
  • This is a refactoring update.
  • This is a chore update.
  • This is something else.
    Please describe:

Related to Existing Issues

Closes #104

Test and Documentation Coverage

  • Tests have been completed (for bug fixes / features).
  • Documentation has been added / updated (for bug fixes / features).

With vSAN Stretched Cluster

PS F:\> Publish-VsanHealth -json F:\Reporting\HealthReports\sfo-vcf01-all-health-results.json -failureOnly

Component Resource     Alert  Message
--------- --------     -----  -------
vCenter   sfo-m01-cl01 RED    The stretched cluster does not contain a valid witness host
vCenter   sfo-m01-cl01 RED    This health check will display the worst latency between any two hosts in data sites and witness. Warning threshold: 5ms between data sites, 200ms between data site and witness in NON-ROB...
vCenter   sfo-m01-cl01 RED    The stretched cluster does not contain two valid fault domains
vCenter   sfo-m01-cl01 RED    Stretched cluster contains a witness host without a valid disk group
vCenter   sfo-m01-cl01 RED    Cluster VSAN Health Check is not successful for following groups ['Capacity utilization']. Please refer to Vcenter Skyline Health for more details about the un-successful tests. Overall t...
vCenter   sfo-m01-cl01 YELLOW vSAN Capacity utilization has reached 70%
vCenter   sfo-w01-cl01 RED    This health check will display the worst latency between any two hosts in data sites and witness. Warning threshold: 5ms between data sites, 200ms between data site and witness in NON-ROB...
vCenter   sfo-w01-cl01 RED    The stretched cluster does not contain two valid fault domains
vCenter   sfo-w01-cl01 RED    The stretched cluster does not contain a valid witness host
vCenter   sfo-w01-cl01 YELLOW Cluster VSAN Health Check is not successful for following groups ['Online health (Disabled)']. Please refer to Vcenter Skyline Health for more details about the un-successful tests. Overa...
vCenter   sfo-w01-cl01 RED    Stretched cluster contains a witness host without a valid disk group


PS F:\> Publish-VsanHealth -json F:\Reporting\HealthReports\sfo-vcf01-all-health-results.json

Component Resource     Alert  Message
--------- --------     -----  -------
vCenter   sfo-m01-cl01 GREEN  Hosts should setup unicast agent so that they are able to communicate with the witness node
vCenter   sfo-m01-cl01 GREEN  The witness host should not be a part of the vCenter cluster, which forms the stretched cluster
vCenter   sfo-m01-cl01 GREEN  The stretched cluster contains multiple unicast agents. This means multiple unicast agents were set on non-witness hosts.
vCenter   sfo-m01-cl01 GREEN  Cluster contains hosts with invalid unicast agent
vCenter   sfo-m01-cl01 -      Stretched cluster is disabled on cluster: sfo-m01-cl01
vCenter   sfo-m01-cl01 GREEN  Stretched Cluster Health Status: sfo-m01-cl01
vCenter   sfo-m01-cl01 GREEN  The following (witness) hosts have invalid preferred fault domains.
vCenter   sfo-m01-cl01 RED    The stretched cluster does not contain a valid witness host
vCenter   sfo-m01-cl01 RED    This health check will display the worst latency between any two hosts in data sites and witness. Warning threshold: 5ms between data sites, 200ms between data site and witness in NON-ROB...
vCenter   sfo-m01-cl01 RED    The stretched cluster does not contain two valid fault domains
vCenter   sfo-m01-cl01 RED    Stretched cluster contains a witness host without a valid disk group
vCenter   sfo-m01-cl01 GREEN  The preferred fault domain is not set in the cluster for the witness host
vCenter   sfo-m01-cl01 GREEN  Cluster contains hosts whose ESXi version does not support stretched cluster
vCenter   sfo-m01-cl01 GREEN  The following witness node resides in one of the data fault domains
vCenter   sfo-m01-cl01 -      Compression is enabled on cluster : sfo-m01-cl01
vCenter   sfo-m01-cl01 -      Encryption is disabled on cluster : sfo-m01-cl01
vCenter   sfo-m01-cl01 YELLOW vSAN Capacity utilization has reached 70%
vCenter   sfo-m01-cl01 GREEN  There are no Active Re-Syncing happening
vCenter   sfo-m01-cl01 GREEN  All hosts are healthy
vCenter   sfo-m01-cl01 RED    Cluster VSAN Health Check is not successful for following groups ['Capacity utilization']. Please refer to Vcenter Skyline Health for more details about the un-successful tests. Overall t...
vCenter   sfo-m01-cl01 -      Deduplication is enabled on cluster : sfo-m01-cl01
vCenter   sfo-w01-cl01 GREEN  Hosts should setup unicast agent so that they are able to communicate with the witness node
vCenter   sfo-w01-cl01 GREEN  The witness host should not be a part of the vCenter cluster, which forms the stretched cluster
vCenter   sfo-w01-cl01 GREEN  The stretched cluster contains multiple unicast agents. This means multiple unicast agents were set on non-witness hosts.
vCenter   sfo-w01-cl01 GREEN  The preferred fault domain is not set in the cluster for the witness host
vCenter   sfo-w01-cl01 RED    The stretched cluster does not contain a valid witness host
vCenter   sfo-w01-cl01 RED    This health check will display the worst latency between any two hosts in data sites and witness. Warning threshold: 5ms between data sites, 200ms between data site and witness in NON-ROB...
vCenter   sfo-w01-cl01 RED    The stretched cluster does not contain two valid fault domains
vCenter   sfo-w01-cl01 GREEN  Cluster contains hosts whose ESXi version does not support stretched cluster
vCenter   sfo-w01-cl01 GREEN  The following witness node resides in one of the data fault domains
vCenter   sfo-w01-cl01 RED    Stretched cluster contains a witness host without a valid disk group
vCenter   sfo-w01-cl01 GREEN  Cluster contains hosts with invalid unicast agent
vCenter   sfo-w01-cl01 -      Compression is disabled on cluster : sfo-w01-cl01
vCenter   sfo-w01-cl01 GREEN  There are no Active Re-Syncing happening
vCenter   sfo-w01-cl01 -      Encryption is disabled on cluster : sfo-w01-cl01
vCenter   sfo-w01-cl01 -      Stretched cluster is disabled on cluster: sfo-w01-cl01
vCenter   sfo-w01-cl01 GREEN  Stretched Cluster Health Status: sfo-w01-cl01
vCenter   sfo-w01-cl01 YELLOW Cluster VSAN Health Check is not successful for following groups ['Online health (Disabled)']. Please refer to Vcenter Skyline Health for more details about the un-successful tests. Overa...
vCenter   sfo-w01-cl01 GREEN  The following (witness) hosts have invalid preferred fault domains.
vCenter   sfo-w01-cl01 -      Deduplication is disabled on cluster : sfo-w01-cl01
vCenter   sfo-w01-cl01 GREEN  vSAN Capacity utilization is well within thresholds
vCenter   sfo-w01-cl01 GREEN  All hosts are healthy

Without vSAN Stretched Cluster

PS F:\> Publish-VsanHealth -json F:\Reporting\HealthReports\sfo-vcf01-all-health-results.json -failureOnly

Component Resource     Alert  Message
--------- --------     -----  -------
vCenter   sfo-m01-cl01 YELLOW vSAN Capacity utilization has reached 70%
vCenter   sfo-m01-cl01 RED    Cluster VSAN Health Check is not successful for following groups ['Capacity utilization']. Please refer to Vcenter Skyline Health for more details about the un-successful tests. Overall t...
vCenter   sfo-w01-cl01 YELLOW Cluster VSAN Health Check is not successful for following groups ['Online health (Disabled)']. Please refer to Vcenter Skyline Health for more details about the un-successful tests. Overa...


PS F:\> Publish-VsanHealth -json F:\Reporting\HealthReports\sfo-vcf01-all-health-results.json

Component Resource     Alert  Message
--------- --------     -----  -------
vCenter   sfo-m01-cl01 -      Encryption is disabled on cluster : sfo-m01-cl01
vCenter   sfo-m01-cl01 -      Compression is enabled on cluster : sfo-m01-cl01
vCenter   sfo-m01-cl01 GREEN  There are no Active Re-Syncing happening
vCenter   sfo-m01-cl01 GREEN  Stretched Cluster Health Status: sfo-m01-cl01
vCenter   sfo-m01-cl01 -      Stretched cluster is disabled on cluster: sfo-m01-cl01
vCenter   sfo-m01-cl01 -      Deduplication is enabled on cluster : sfo-m01-cl01
vCenter   sfo-m01-cl01 YELLOW vSAN Capacity utilization has reached 70%
vCenter   sfo-m01-cl01 RED    Cluster VSAN Health Check is not successful for following groups ['Capacity utilization']. Please refer to Vcenter Skyline Health for more details about the un-successful tests. Overall t...
vCenter   sfo-m01-cl01 GREEN  All hosts are healthy
vCenter   sfo-w01-cl01 -      Deduplication is disabled on cluster : sfo-w01-cl01
vCenter   sfo-w01-cl01 GREEN  Stretched Cluster Health Status: sfo-w01-cl01
vCenter   sfo-w01-cl01 -      Stretched cluster is disabled on cluster: sfo-w01-cl01
vCenter   sfo-w01-cl01 YELLOW Cluster VSAN Health Check is not successful for following groups ['Online health (Disabled)']. Please refer to Vcenter Skyline Health for more details about the un-successful tests. Overa...
vCenter   sfo-w01-cl01 GREEN  There are no Active Re-Syncing happening
vCenter   sfo-w01-cl01 GREEN  vSAN Capacity utilization is well within thresholds
vCenter   sfo-w01-cl01 -      Compression is disabled on cluster : sfo-w01-cl01
vCenter   sfo-w01-cl01 -      Encryption is disabled on cluster : sfo-w01-cl01
vCenter   sfo-w01-cl01 GREEN  All hosts are healthy
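
Comparing the `-failureOnly` runs with the full runs above suggests that the switch keeps only rows whose Alert is RED or YELLOW and drops GREEN and informational (`-`) rows. The following is a minimal sketch of that filtering idea, not the module's actual implementation. The `$results` array and its rows are hypothetical stand-ins modeled on the table columns shown above, and the RED/YELLOW rule is an assumption inferred from the sample output.

```powershell
# Hypothetical sample rows mirroring the columns in the output above.
$results = @(
    [PSCustomObject]@{ Component = 'vCenter'; Resource = 'sfo-m01-cl01'; Alert = 'GREEN';  Message = 'All hosts are healthy' }
    [PSCustomObject]@{ Component = 'vCenter'; Resource = 'sfo-m01-cl01'; Alert = 'YELLOW'; Message = 'vSAN Capacity utilization has reached 70%' }
    [PSCustomObject]@{ Component = 'vCenter'; Resource = 'sfo-m01-cl01'; Alert = '-';      Message = 'Encryption is disabled on cluster : sfo-m01-cl01' }
    [PSCustomObject]@{ Component = 'vCenter'; Resource = 'sfo-w01-cl01'; Alert = 'RED';    Message = 'The stretched cluster does not contain a valid witness host' }
)

# Assumption: a "failure" is any row with an Alert of RED or YELLOW.
$failures = $results | Where-Object { $_.Alert -in 'RED', 'YELLOW' }

# Display the remaining rows grouped by cluster.
$failures | Sort-Object Resource, Alert | Format-Table -AutoSize
```

With these sample rows, only the YELLOW capacity row and the RED witness host row remain, which matches the shape of the `-failureOnly` output shown earlier.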

Breaking Changes?

  • Yes, there are breaking changes.
  • No, there are no breaking changes.

@tenthirtyam tenthirtyam added the enhancement Enhancement label Apr 14, 2023
@tenthirtyam tenthirtyam added this to the v2.0.0 milestone Apr 14, 2023
@tenthirtyam tenthirtyam requested a review from GaryJBlake April 14, 2023 18:40
@tenthirtyam tenthirtyam self-assigned this Apr 14, 2023
@tenthirtyam
Contributor Author

@GaryJBlake - Marked as draft until #125 is merged.

@tenthirtyam tenthirtyam marked this pull request as ready for review April 15, 2023 11:51
@tenthirtyam tenthirtyam requested a review from a team as a code owner April 15, 2023 11:51
@tenthirtyam
Contributor Author

Ready for review.

- Updated `Publish-VsanHealth` to include the results for stretched cluster health status and stretched cluster tests.
- Bumps the module version to v2.0.0.1011.
- Updates `CHANGELOG.md`.

Ref: #104

Signed-off-by: Ryan Johnson <johnsonryan@vmware.com>
@tenthirtyam tenthirtyam force-pushed the feat/vsan-stretched-cluster-health branch from c6f9ffa to 7248777 Compare April 17, 2023 21:17
Contributor

@GaryJBlake GaryJBlake left a comment

LGTM!

@vmwclabot vmwclabot added the dco-required DCO Required label Apr 18, 2023
@tenthirtyam tenthirtyam requested a review from GaryJBlake April 18, 2023 12:54
@tenthirtyam tenthirtyam removed the dco-required DCO Required label Apr 18, 2023
@vmware vmware deleted a comment from vmwclabot Apr 18, 2023
Contributor

@GaryJBlake GaryJBlake left a comment

LGTM!

@GaryJBlake GaryJBlake merged commit cd56b74 into main Apr 18, 2023
@tenthirtyam tenthirtyam deleted the feat/vsan-stretched-cluster-health branch April 18, 2023 12:57
@github-actions

I'm going to lock this pull request because it has been closed for 30 days. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions Bot locked as resolved and limited conversation to collaborators May 26, 2023

Labels

enhancement Enhancement

Development

Successfully merging this pull request may close these issues.

Add support for vSAN stretched cluster health checks

3 participants