Bug: nextcloud-aio-notify-push stuck waiting as port 9001 on nextcloud-aio-nextcloud becomes unresponsive (AIO Helm Chart)
Area: AIO / Helm Chart / Kubernetes
Describe the bug
When running Nextcloud AIO deployed via the official Helm chart on Kubernetes, the nextcloud-aio-notify-push pod intermittently fails to start or run correctly. It gets stuck logging "Waiting for Nextcloud to start...".
This appears to be caused by the nextcloud-aio-nextcloud pod/service becoming unresponsive on TCP port 9001 after running fine for a period (ranging from days to weeks). The start.sh script within the nextcloud-aio-notify-push container specifically waits for connectivity to $NEXTCLOUD_HOST (which resolves to the nextcloud-aio-nextcloud service) on port 9001 before proceeding:
# From notify-push start.sh
# Only start container if nextcloud is accessible
while ! nc -z "$NEXTCLOUD_HOST" 9001; do
    echo "Waiting for Nextcloud to start..."
    sleep 5
done
While port 9001 is unresponsive, the main Nextcloud interface served by the same nextcloud-aio-nextcloud pod on port 9000 remains accessible and functional. However, features relying on notify-push, such as Talk connections or calls, may fail.
Steps to reproduce
The issue is intermittent, making exact reproduction steps difficult, but the pattern is:
- Deploy Nextcloud AIO using the official Helm chart on a Kubernetes cluster.
- Ensure the notify-push component is enabled and deployed.
- The system runs correctly for an indeterminate amount of time (days or weeks).
- Eventually, the nextcloud-aio-notify-push pod (if restarted, or potentially during operation) starts logging "Waiting for Nextcloud to start...".
- Attempting to connect to the nextcloud-aio-nextcloud service/pod IP on port 9001 from another pod fails (e.g., nc -zv nextcloud-aio-nextcloud 9001 times out or gets connection refused).
- Attempting to connect to the nextcloud-aio-nextcloud service/pod IP on port 9000 succeeds (e.g., nc -zv nextcloud-aio-nextcloud 9000 reports success).
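The two connectivity checks in the last steps can be combined into one probe. The following is a minimal sketch, not part of AIO or the Helm chart: the function names (port_open, summarize, check_nextcloud_ports) are hypothetical, and it assumes that nc with support for the -z and -w options is available in the pod where it runs and that the service name matches the one used above.

```shell
#!/bin/sh
# Sketch: probe both ports of the Nextcloud service from another pod.
# Assumption: the service is reachable under this name (or via $NEXTCLOUD_HOST).
HOST="${NEXTCLOUD_HOST:-nextcloud-aio-nextcloud}"

# Returns 0 if a TCP connection to host:port succeeds within 5 seconds.
port_open() {
    nc -z -w 5 "$1" "$2" >/dev/null 2>&1
}

# Maps the two probe results to a one-line summary.
# Arguments: "open" or "closed" for port 9000, then for port 9001.
summarize() {
    case "$1/$2" in
        open/open)   echo "healthy: 9000 and 9001 both reachable" ;;
        open/closed) echo "issue reproduced: 9000 reachable, 9001 not" ;;
        closed/open) echo "unexpected: 9001 reachable, 9000 not" ;;
        *)           echo "pod or service unreachable on both ports" ;;
    esac
}

# Probes both ports and prints the summary line.
check_nextcloud_ports() {
    local s9000=closed s9001=closed
    port_open "$HOST" 9000 && s9000=open
    port_open "$HOST" 9001 && s9001=open
    summarize "$s9000" "$s9001"
}
```

Calling check_nextcloud_ports periodically from a debug pod and recording the time of the first "issue reproduced" line would narrow down when port 9001 stops responding.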
Expected behavior
The nextcloud-aio-nextcloud pod should consistently listen and respond on port 9001, allowing the nextcloud-aio-notify-push service to connect and function reliably.
Actual behavior
The process listening on port 9001 within the nextcloud-aio-nextcloud container appears to stop or crash intermittently, making the port unreachable and blocking nextcloud-aio-notify-push.
Log entries
- nextcloud-aio-notify-push pod logs:
    Waiting for Nextcloud to start...
    Waiting for Nextcloud to start...
    [...]
- Diagnostic commands (run from another pod in the cluster when the issue occurs):
    # Fails
    $ nc -zv nextcloud-aio-nextcloud 9001
    nc: connect to nextcloud-aio-nextcloud (10.x.x.x) port 9001 (tcp) failed: Connection timed out (or refused)
    # Succeeds
    $ nc -zv nextcloud-aio-nextcloud 9000
    Connection to nextcloud-aio-nextcloud (10.x.x.x) 9000 port [tcp/*] succeeded!
- nextcloud-aio-nextcloud pod logs: Standard container logs (kubectl logs ...) at the time of the failure have not yet revealed a clear cause in the cases observed so far. Further investigation of the internal nextcloud.log within the data volume might be needed when the issue occurs. The startup sequence seems normal otherwise (example startup logs can be provided if needed).
Environment
- Installation method: Official Nextcloud AIO Helm Chart on Kubernetes.
- Nextcloud Server version: 30.0.9 (latest today)
- Relevant AIO Components: nextcloud-aio-nextcloud, nextcloud-aio-notify-push
Workaround
Deleting the nextcloud-aio-nextcloud pod (kubectl delete pod <pod-name> -n <namespace>) forces Kubernetes to recreate it. Upon restart, the process listening on port 9001 becomes available again, and nextcloud-aio-notify-push can successfully connect and start. This workaround is temporary, as the issue tends to reappear later.
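Until the root cause is found, the manual workaround could be scripted. The sketch below is only an illustration of that idea and does not fix the underlying problem. It assumes it runs somewhere that can both reach the service on port 9001 and has permission to delete the pod. The function name restart_if_port_dead and the PROBE and KUBECTL override variables are hypothetical; the pod name and namespace must be supplied by the caller.

```shell
#!/bin/sh
# Sketch: delete the Nextcloud pod only when port 9001 no longer answers.
# PROBE and KUBECTL are hypothetical override hooks, e.g. KUBECTL=echo for a dry run.
PROBE="${PROBE:-nc -z -w 5}"
KUBECTL="${KUBECTL:-kubectl}"

# Usage: restart_if_port_dead <service-host> <pod-name> <namespace>
restart_if_port_dead() {
    local host=$1 pod=$2 namespace=$3
    # $PROBE is intentionally unquoted so "nc -z -w 5" splits into words.
    if $PROBE "$host" 9001 >/dev/null 2>&1; then
        echo "port 9001 reachable, nothing to do"
        return 0
    fi
    echo "port 9001 unreachable, deleting pod $pod"
    # Same command as the manual workaround; Kubernetes recreates the pod.
    $KUBECTL delete pod "$pod" -n "$namespace"
}
```

A dry run such as KUBECTL=echo followed by restart_if_port_dead nextcloud-aio-nextcloud <pod-name> <namespace> prints the delete command instead of executing it.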
Potential Cause / Analysis
It seems likely that an internal process or component within the nextcloud-aio-nextcloud container, specifically responsible for handling connections on port 9001 (required by notify-push), is unstable and crashes or stops running after some time under certain conditions. Identifying this specific process and the reason for its failure is key.
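One way to identify the process, sketched here under assumptions, is to read the kernel's socket table directly, which works even if the container image ships neither ss nor netstat. The helper names listen_inode and pids_for_inode are hypothetical. The sketch assumes a Linux container with awk and readlink available, and that it runs as a user allowed to read other processes' /proc/<pid>/fd entries. IPv6 listeners appear in /proc/net/tcp6 rather than /proc/net/tcp. Because a crashed process no longer holds a socket, the lookup is most useful while the pod is still healthy, so that the owning process can be checked again once port 9001 stops responding.

```shell
#!/bin/sh
# Sketch: find which process owns the listening socket on a given port.

# Prints the inode of the first socket in LISTEN state (st == 0A) on the port.
# Usage: listen_inode <port> [socket-table-file]
listen_inode() {
    local port_hex
    port_hex=$(printf '%04X' "$1")   # 9001 -> 2329
    awk -v p="$port_hex" '
        NR > 1 {
            n = split($2, a, ":")    # local_address is ADDR:PORT in hex
            if (toupper(a[n]) == p && $4 == "0A") { print $10; exit }
        }
    ' "${2:-/proc/net/tcp}"
}

# Prints every PID whose file descriptors include the socket with that inode.
# Usage: pids_for_inode <inode> [proc-root]
pids_for_inode() {
    local inode=$1 proc_root=${2:-/proc} fd pid
    for fd in "$proc_root"/[0-9]*/fd/*; do
        if [ "$(readlink "$fd" 2>/dev/null)" = "socket:[$inode]" ]; then
            pid=${fd#"$proc_root"/}
            echo "${pid%%/*}"
        fi
    done | sort -un
}
```

Inside the container (for example via kubectl exec), running pids_for_inode "$(listen_inode 9001)" and then inspecting /proc/<pid>/cmdline for each result would show which command is serving port 9001. An empty result from listen_inode 9001 while the issue is occurring would suggest that nothing is listening any more, rather than a network-level problem.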
Related Information
This issue appears identical to the one reported in the Nextcloud Help forum:
https://help.nextcloud.com/t/nextcloud-stops-accepting-connections-on-port-9001-after-a-while-helm-chart/218295