RFC: Add system health component #20436
We should also include (custom) addons/components configured for diagnosis.
Yes please. Despite the warnings in the logs, lots of people overlook custom components (or heck, forget they've installed them).
Components will be added in a future PR.
I think the actions need to be a separate part, so you can see all available checks and trigger one to see if there is a problem. For example, Hue registers a health check function. We can click on it to trigger the registered function, which checks whether the hub is connected (or other things) and returns one of 3 states. So the health component has 2 parts: static system information (which usually really is static) and a health check function per domain.
@pvizeli so what about info that we could run as an action but could also retrieve statically? For example, we will know whether we're connected to HA Cloud. One other thing I hope other components will add to this view is last interaction: when was the last interaction with Hue, when was the last error, etc.
What am I doing… this needs to be WS commands instead of HTTP views🤦♂️
The problem is that this code works nicely only when there is no issue. Right now every callback is called each time the frontend requests data. A component needs to check data on an external API or on hardware, and if there is an issue, you run into the default timeout. Also, some system APIs like Docker eat a lot of resources and can block other processes. I would prefer that we run the health check periodically, e.g. every hour, and have the frontend show the cached data from the last health check, with an option to trigger a new health check knowing that it can take up to 30-60 seconds until you see the result. That also allows us to slow down the checks and not trigger them all at once. With this mechanism, we can later add things like a trigger on background checks or a history of when the system had issues.
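The caching proposal in the comment above could be sketched roughly as follows. This is an illustration only, not Home Assistant's implementation: `HealthCache`, `CachedResult`, `register`, and the `"ok"`/`"timeout"`/`"error"` status strings are hypothetical names, and the interval, timeout, and stagger values are assumptions taken loosely from the numbers mentioned in the comment.

```python
import asyncio
import time
from dataclasses import dataclass
from typing import Awaitable, Callable, Dict

# Hypothetical type: a health check is an async function returning a status string.
CheckFunc = Callable[[], Awaitable[str]]


@dataclass
class CachedResult:
    status: str        # value returned by the check, or "timeout" / "error"
    checked_at: float  # time.time() when the check finished


class HealthCache:
    """Run registered checks in the background and serve cached results."""

    def __init__(self, interval: float = 3600.0, timeout: float = 60.0,
                 stagger: float = 1.0) -> None:
        self.interval = interval  # seconds between background runs
        self.timeout = timeout    # per-check budget in seconds
        self.stagger = stagger    # pause between checks so they do not all fire at once
        self._checks: Dict[str, CheckFunc] = {}
        self._results: Dict[str, CachedResult] = {}

    def register(self, domain: str, check: CheckFunc) -> None:
        self._checks[domain] = check

    def cached(self) -> Dict[str, CachedResult]:
        """What the frontend would read: last known results, no I/O."""
        return dict(self._results)

    async def run_check(self, domain: str) -> CachedResult:
        """Run one check now (the 'trigger a new health check' button)."""
        try:
            status = await asyncio.wait_for(self._checks[domain](), self.timeout)
        except asyncio.TimeoutError:
            status = "timeout"
        except Exception:
            status = "error"
        result = CachedResult(status, time.time())
        self._results[domain] = result
        return result

    async def run_all(self) -> None:
        """Run every check one after another, pausing between them."""
        for domain in list(self._checks):
            await self.run_check(domain)
            await asyncio.sleep(self.stagger)

    async def run_forever(self) -> None:
        """Background loop: refresh the cache once per interval."""
        while True:
            await self.run_all()
            await asyncio.sleep(self.interval)
```

Reading `cached()` never blocks on a device or external API, so a hanging integration only delays the background run, not the frontend request.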
@pvizeli The info command is for static info: firmware version, last interaction, connected to cloud, Lovelace storage mode. A future PR will add a diagnostics command that diagnoses things on demand when the user clicks the button.
Very confused: I can't get the tests to pass on CI, but they pass locally. The mock is not getting applied.
If I need to grab this data from a device or, in the case of Hass.io, from the Supervisor, the call runs into the API timeout whenever there is a problem. But you are right, for an integration with a running connection, like the cloud, it works perfectly. That ends up as: if you see the health data, your system works as it should; otherwise you have an issue.
Well, how can we do it otherwise? We give each component up to 5 seconds to get the data.
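A minimal sketch of the "up to 5 seconds per component" idea, using only the standard library. The names `gather_info`, `_fetch_one`, `InfoCallback`, and the error payload format are hypothetical and chosen for illustration; the PR's actual code may differ.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Tuple

# Hypothetical type: each component exposes an async callback returning its info dict.
InfoCallback = Callable[[], Awaitable[Dict[str, Any]]]

INFO_TIMEOUT = 5.0  # seconds; the per-component budget mentioned above


async def _fetch_one(domain: str, callback: InfoCallback,
                     timeout: float) -> Tuple[str, Dict[str, Any]]:
    """Fetch one component's info, converting failures into an error entry."""
    try:
        return domain, await asyncio.wait_for(callback(), timeout)
    except asyncio.TimeoutError:
        return domain, {"error": "Fetching info timed out"}
    except Exception as err:  # one broken component must not break the others
        return domain, {"error": str(err)}


async def gather_info(callbacks: Dict[str, InfoCallback],
                      timeout: float = INFO_TIMEOUT) -> Dict[str, Dict[str, Any]]:
    """Query all components concurrently; total wait is bounded by the timeout."""
    pairs = await asyncio.gather(
        *(_fetch_one(domain, cb, timeout) for domain, cb in callbacks.items())
    )
    return dict(pairs)
```

Because the callbacks run concurrently, a single unresponsive component costs at most one timeout for the whole request rather than one timeout per component.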
I want this to be part of the beta, so will merge it. We can discuss and change things later, as it's an internal implementation.
https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=147781&start=50#p972790 For those running on a Pi, this might be a good check to run and report, as undervoltage can lead to SD card problems and throttling.
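One way such a check might look, assuming it is based on the Raspberry Pi firmware command `vcgencmd get_throttled`, which prints a line like `throttled=0x50005`. The function name `parse_throttled` and the flag labels are hypothetical; the sketch only parses the command's output, so running the command itself (for example via `subprocess`) is left out.

```python
from typing import Dict

# Bit positions in the value reported by `vcgencmd get_throttled`.
# Low bits describe the current state, bits 16+ record that it happened since boot.
THROTTLE_FLAGS = {
    0: "under_voltage_now",
    1: "arm_frequency_capped_now",
    2: "throttled_now",
    16: "under_voltage_occurred",
    17: "arm_frequency_capped_occurred",
    18: "throttled_occurred",
}


def parse_throttled(output: str) -> Dict[str, bool]:
    """Turn 'throttled=0x50005' into a dict of named boolean flags."""
    _, _, value = output.strip().partition("=")
    bits = int(value, 16)  # int() accepts the '0x' prefix when base is 16
    return {name: bool(bits & (1 << bit)) for bit, name in THROTTLE_FLAGS.items()}
```

A value of `0x50005`, for instance, sets bits 0, 2, 16, and 18: under-voltage and throttling both now and earlier since boot.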
Added system_health: in configuration.yaml. In log: Thu Feb 07 2019 13:33:49 GMT-0500 (Eastern Standard Time)
@vlad36N, please open an issue.
Description:
Was talking with Tinkerer, and we came to the conclusion that we should prioritize adding this component as it will help with helping (how meta).
Goal is to get a place in the UI to show the info on the machine. This will help people with diagnosing problems.
Some RFC about this implementation:
UI will look something like this:
Related issue (if applicable): fixes home-assistant/architecture#114
Pull request in home-assistant.io with documentation (if applicable): TODO home-assistant/home-assistant.io#<home-assistant.io PR number goes here>
Example entry for configuration.yaml (if applicable):
system_health:
Checklist:
Local tests pass with tox. Your PR cannot be merged unless tests pass.
If the code does not interact with devices: