Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make resource HealthStatus computed from HealthReports #6368

Merged

Conversation

adamint
Copy link
Member

@adamint adamint commented Oct 18, 2024

Description

The purpose of this PR is to make the HealthStatus field of a resource no longer settable, and instead have it computed from the HealthReports and State fields.

HealthStatus is computed as

  • null, if the resource is not Running
  • unhealthy, if the resource is running, and has a health check that has not returned a value yet
  • healthy, if there are no health checks
  • otherwise whatever the min of the returned health check values is

Fixes # (issue)

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No
  • Does the change require an update in our Aspire docs?
    • Yes
      • Link to aspire-docs issue:
    • No
Microsoft Reviewers: Open in CodeFlow

@JamesNK
Copy link
Member

JamesNK commented Oct 18, 2024

I tested your branch on health checks sandbox and got this error:

image

Known state propertiy could be set before health reports.

Could fix by defaulting it to an empty array:

private readonly ImmutableArray<HealthReportViewModel> _healthReports = [];

Edit: nevermind, you've fixed this

@JamesNK
Copy link
Member

JamesNK commented Oct 18, 2024

Health status is inconsistently being shown in resource details:

Visible:
image

Not visible:
image

@JamesNK
Copy link
Member

JamesNK commented Oct 18, 2024

While you're making changes in this area, I noticed the Health State casing is inconsistent with other properties. It should be Health state

image

@adamint
Copy link
Member Author

adamint commented Oct 18, 2024

Health status is inconsistently being shown in resource details:

Visible: image

Not visible: image

I was just thinking about this! The property does not appear when the resource has a health report with a null value. We encode that assumption in ResourceStateViewModel but it should really be encoded in HealthStatus itself, which fixes the bug. Made this change

@@ -62,12 +62,6 @@ internal static ResourceStateViewModel GetStateViewModel(ResourceViewModel resou
icon = new Icons.Filled.Size16.Circle();
color = Color.Info;
}
else if (resource.HealthStatus is null)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition is redundant

Copy link
Member

@JamesNK JamesNK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good. Needs tests.

src/Aspire.Dashboard/Model/ResourceViewModel.cs Outdated Show resolved Hide resolved
src/Aspire.Dashboard/Model/ResourceViewModel.cs Outdated Show resolved Hide resolved
@adamint
Copy link
Member Author

adamint commented Oct 18, 2024

Looking pretty good. Needs tests.

ComputeHealthStatus should be tested but otherwise it is pretty covered

@@ -210,6 +210,7 @@ message Resource {

// The aggregate health state of the resource. Generally reflects data from health_reports, but may differ.
optional HealthStatus health_status = 16;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was there offline discussion about adding this back? I'm not against it, just want to make sure that was an agreed decision.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No we want to remove it

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I said before I didn't want it, but sending the calculated value does reduce duplication. It lets the dashboard be more dumb and accept whatever status the service decides 😄

I haven't got a hard opinion either way, but the current code looks fine to me.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@davidfowl I had re-added it to the proto to do what james said - avoid re-calculating the status in the dashboard and avoid duplication

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove it, let the dashboard recalculate it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-removed :) is everything else ok?

@JamesNK
Copy link
Member

JamesNK commented Oct 22, 2024

Looks good. I have some questions, nits and changes to the sandbox.

@adamint
Copy link
Member Author

adamint commented Oct 22, 2024

Is this now in a merge-able state?

@adamint adamint requested a review from JamesNK October 22, 2024 05:49
@davidfowl
Copy link
Member

@adamint I suggest you do a squash and a manual backport. This has 36 commits.

Copy link
Member

@davidfowl davidfowl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great

@JamesNK
Copy link
Member

JamesNK commented Oct 22, 2024

@adamint I suggest you do a squash and a manual backport. This has 36 commits.

+1.

If you don't know how to squash, squash merge in GH does it automatically. Then just create a new branch from release/9.0 and cherry pick the squashed commit.

@adamint
Copy link
Member Author

adamint commented Oct 22, 2024

There was a health check scenario that has never been accounted for -

  1. Resource has multiple health checks
  2. Resource checks health, one or more reports return a new status but the overall health status has not changed (ie, Unhealthy+Healthy=Unhealthy -> Unhealthy+Degraded=Unhealthy)
  3. The report health is the same as the prior snapshot health status
  4. Resource health reports are not updated, even though the status of one of them changed.

I have updated the sandbox and ResourceHealthCheckService to account for this.

@davidfowl
Copy link
Member

Good catch @adamint !

@adamint adamint enabled auto-merge (squash) October 22, 2024 20:02
@adamint adamint merged commit a4ef97a into dotnet:main Oct 22, 2024
9 checks passed
adamint added a commit that referenced this pull request Oct 23, 2024
* Make resource HealthStatus computed from HealthReports
* Update health reports when they have changed but the status has not

---------

Co-authored-by: Adam Ratzman <adamratzman@microsoft.com>
Co-authored-by: Mitch Denny <midenn@microsoft.com>

(cherry picked from commit a4ef97a)
joperezr pushed a commit that referenced this pull request Oct 24, 2024
* Make resource HealthStatus computed from HealthReports
* Update health reports when they have changed but the status has not

---------

Co-authored-by: Adam Ratzman <adamratzman@microsoft.com>
Co-authored-by: Mitch Denny <midenn@microsoft.com>

(cherry picked from commit a4ef97a)
joperezr added a commit to joperezr/aspire that referenced this pull request Nov 13, 2024
#### AI description  (iteration 1)
#### PR Classification
Merge branch `release/9.0` into `internal/release/9.0` to integrate recent changes and address specific work items.

#### PR Summary
This pull request integrates changes from the `release/9.0` branch into `internal/release/9.0`, addressing several work items related to Azure storage and network isolation.
- `/tests/Aspire.Hosting.Azure.Tests/AzureContainerAppsTests.cs`: Added a new test `ConfigureCustomDomainsMutatesIngress` to validate custom domain configuration for Azure Container Apps.
- `/src/Aspire.Hosting.Azure.AppContainers/ContainerAppExtensions.cs`: Introduced a new extension method `ConfigureCustomDomain` to simplify the process of assigning custom domains to Azure Container Apps.

Related work items: dotnet#6368, dotnet#6391, dotnet#6433, dotnet#6458, dotnet#6467
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants