Matter Server: All device offline all of a sudden #126136
Comments
Restarting the matter server fixes it (after some time). |
Same problem for me. |
Verify your network settings in Home Assistant. Mine had changed to something completely different. Setting a static address solved the issue. |
No, still the same for me. |
@3oris (and others) when the devices go unavailable, does reloading the integration help? Settings -> Devices & services -> Matter -> Three dot menu -> Reload. What Home Assistant OS and Matter Server add-on version are you using? |
Hey there @home-assistant/matter, mind taking a look at this issue as it has been labeled with an integration (matter) you are listed as a code owner for? (message by CodeOwnersMention) |
@agners : will check as soon as it happens again (probably tomorrow or Saturday). Restarting the Add-On does help to say the least.
|
@agners -- Also, I was wondering if it might be a regression in 6.5.1 home-assistant-libs/python-matter-server#882 , but you probably will know anyways. |
You would have a SEVERE issue with mdns if that cleanup is causing your nodes now to be offline. What is the state of the nodes within the Matter Server's own UI ? |
Well, that is another issue. Maybe you (accidentally) reinstalled the whole Matter integration? |
@agners -- So, it happened again. I restarted the integration, and the devices came back very, very slowly. Only a few minutes after they were all back, they all disappeared again and the Matter server was once again in the state of #124647, which I hadn't seen since the upgrade to 6.5.0b2. Before I restarted the Matter server I took the logs: |
I guess my problem just went away. After I hit this issue 3 times and restarted the Matter server each time, it has now been running for 2 days without problems. |
@marcelveldt -- Will tell next time it happens. |
@marcelveldt yes, I reinstalled the matter integration, but why does a reinstall not create a new node? If so should I create a new github issue? |
Resetting my HomeAssistant VM to a previous state fixed the problem for me. |
If you reinstall the Matter integration, all data gets reset. So you basically destroyed your Matter network by uninstalling Matter from HA. |
If you do a regular update, the nodes should not get lost. Can you try updating the add-on (again)? Worst case you should be able to restore 6.4.1. That said, while the outcome of your issue is similar to the original poster, I don't think you suffer the same problem: In your case the store on the Matter Server lost all devices. If this happens with the second update attempt again, can you open a separate issue for this? This would be some type of add-on update issue 🤔 |
Hm, that sounds like your whole system is completely overwhelmed somehow. I guess the Matter Server doesn't respond in time for the Core, so the Core gives up communicating. I wonder if the Matter Server gets itself into a state where things just go awry. There are some messages I haven't seen so far, which sound as if the message got corrupted 🤔
From what I can tell you run this on a Raspberry Pi 3? 🤔 Maybe this is just a bit too much for it to handle 😢 |
He removed the Matter integration (to reinstall) but that also removed the matter add-on with its configuration. |
@marcelveldt -- they just all show offline in the Matter server add-on UI |
@agners -- no, this is Home Assistant running on HA Green. What I run on the RPi3 is the OTBR, which I run isolated from HA and compile myself in order to have some observability into the thread network via the CLI: channel monitor, TREL connectivity, child node distribution, link quality and so on. This also let me choose a thread channel with literally no wifi interference (as far as I can tell). But also, there is no difference on the matter fabric if I take the OTBR or any of the nest hubs out of the thread network. (I cannot take two or more TBRs out of the network though, because then total coverage is too low and the thread network gets overloaded.) The points I am trying to make here:
|
Hey, I did another upgrade to My issue is fixed. Thank you for the support. |
I have the same issue with my EVE matter devices (motion, door, energy), starting at the exact time I updated my iPhone to iOS 18 and my homepod to the latest version. Matter server is also 6.5.1, I have no pending updates on anything in HA, and HA is also on the latest version. My EVE devices work in the EVE app and in the Home app. I also cannot re-add them
|
I've had this happen, but I concluded that the issue wasn't HA (or at least that the issue also involved other equipment). I found that to bring devices back online, I needed to reboot my Google Wifi Pro 6e WiFi routers (which also include my OTBRs). Also, I have both Nest OTBRs and 3 Apple TV OTBRs and have found that if I leave the Nest enabled (and unplug the Apple TVs), all seems OK and stable, but if I add more than 1 Apple OTBR, it can cause instability. I'm thinking there may be something going on when you have a mix of OTBRs from different vendors; in my case, it particularly seems to happen when Apple OTBRs and Nest OTBRs try to join into a single thread network. But as long as Apple / Google Nest maintain separate thread networks, it's more stable. None of this really makes much sense, but it points to issues that may be beyond HA. Also, the entire setup destabilizes if I use Matter 1.0 devices (hello Eve!). |
Maybe your case is different because, as I mentioned, everything was fine for 1 year until I upgraded to homepod OS 18 and iOS 18. The devices work in all my other apps except Home Assistant. I am also not able to re-add them anymore; it keeps failing. |
@agners @marcelveldt -- an update: I have been running on 6.5.2b0 with way less trouble over the last 1.5 weeks. I also see there is a 6.5.2 release but I don't seem to receive it. Anyhow, with 6.5.2b0 always only a few devices go offline in the HA fabric while still being pingable from the device info page. So, not all devices any more. These devices are then also reported as unavailable on the Matter Add-On UI, and as before I can ping them back online into the HA matter fabric. Also, but this is guessing now, those devices that go offline in chunks seem to be connected to the same TBR (Nest Hub G2 F20) at that time which also is a bit contradictory to the fact that they are pingable. continuing to keep an eye... |
Let's try to prevent duplicate issues. We're tracking the availability issue in this report: In general, using multiple Border routers is simply broken atm. |
This is not an Apple issue, it's all Nest Hubs and one OTBR (on a dedicated RPi3b). Lowering the amount of BRs is not an option, since the node count is already 135 and one Nest Hub BR is only able to handle about 20 nodes max (be it due to hardware capacity or thread channel congestion). So 7 BRs seems to be a reasonable amount of BRs. With the recent update to Fuchsia 20.1 things really started to become more stable. I cannot tell why though. Also, TREL seems to work actually well with Nest Hubs. E.g. it makes a huge difference if I disable TREL in the OTBR. So, in general, I would not follow your statement that using multiple BRs is broken. But I am also fine with closing this issue here since I feel that the same issue is now popping up every other week, and I see that you guys are actually on the topic. |
I had the same issue! I restored my Homeassistant VM with the backup from before the last update, and all Eve / Matter devices are back! So there must be something wrong with the latest update! |
The problem
After about 5 days of operation all matter devices become unavailable. The devices are still online in the other (google home) fabric though.
The devices are still pingable from the device info page, and if I do so the specific device gets back online again.
Doing this manually is not feasible, though, with over 90 matter devices in the system.
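As a rough illustration of what automating that per-device ping could look like, here is a minimal sketch that only builds the request messages for a list of node IDs. It assumes the Matter Server WebSocket API accepts JSON messages of the form `message_id` / `command` / `args` and offers a `ping_node` command taking a `node_id`; both should be checked against the python-matter-server documentation before use. The helper name `build_ping_messages` and the node IDs are hypothetical, and actually sending the messages over a WebSocket connection is left out.

```python
import json


def build_ping_messages(node_ids, start_id=1):
    """Return one JSON-encoded request per node ID.

    ASSUMPTION: the server understands a "ping_node" command with a
    "node_id" argument; verify this against the server's API docs.
    """
    messages = []
    for offset, node_id in enumerate(node_ids):
        messages.append(
            json.dumps(
                {
                    # Each request gets its own ID so replies can be matched up.
                    "message_id": str(start_id + offset),
                    "command": "ping_node",
                    "args": {"node_id": node_id},
                }
            )
        )
    return messages


if __name__ == "__main__":
    # Hypothetical node IDs, for illustration only.
    for message in build_ping_messages([1, 2, 3]):
        print(message)
```

Keeping the message construction separate from the transport makes it easy to check the payloads before pointing anything at a live Matter fabric.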
Matter devices
Border routers
What version of Home Assistant Core has the issue?
core-2024.9.1
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
Matter
Link to integration documentation on our website
No response
Diagnostics information
core_matter_server_2024-09-17T15-42-59.844Z.log
matter-c921cb8346a353e6865401775d822fe4-Essentials GU10-80fecbd596935ee1f84171a5c0aac88b.json
Example YAML snippet
No response
Anything in the logs that might be useful for us?
No response
Additional information
No response