We are filing this bug report because Bottlerocket EKS nodes are slow to initialize. Nodes are stuck for ~5 minutes before starting kubelet and joining the cluster. So far we have traced the delay to a slow commit stage in pluto.service (which seems to come from this repo, correct?).
The clusters are using the latest EKS-optimized Bottlerocket image. The problem reproduces consistently on every new node, but not in every cluster.
How can we investigate and fix the cause? We are not sure whether this is a package issue or a configuration issue in the clusters. The clusters have IMDS enabled. We are not sure what else this process requires.
Package I'm using: pluto.service
What I expected to happen:
Startup to take 1-2 minutes and not 5+ minutes.
What actually happened:
Looking at systemd logs, pluto.service took 5 minutes to complete. We extracted logs from it and we observe the Committing settings step taking 5 minutes.
Logs from pluto:
bash-5.0# journalctl -u pluto.service
Nov 14 07:42:00 localhost systemd[1]: Starting Generate additional settings for Kubernetes...
Nov 14 07:42:00 localhost settings-committer[1832]: 07:42:00 [INFO] Checking pending settings.
Nov 14 07:42:00 localhost settings-committer[1832]: 07:42:00 [INFO] Committing settings.
Nov 14 07:47:01 localhost systemd[1]: Finished Generate additional settings for Kubernetes.
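To make the stall easier to see, here is a minimal sketch that computes the gap between consecutive journal lines. It only uses the four lines pasted above; the helper names (`parse_line`, `gaps`) are made up for illustration, and the placeholder year is an assumption because the short journal format does not print one.

```python
from datetime import datetime

# Journal lines copied verbatim from the pluto.service output above.
JOURNAL = """\
Nov 14 07:42:00 localhost systemd[1]: Starting Generate additional settings for Kubernetes...
Nov 14 07:42:00 localhost settings-committer[1832]: 07:42:00 [INFO] Checking pending settings.
Nov 14 07:42:00 localhost settings-committer[1832]: 07:42:00 [INFO] Committing settings.
Nov 14 07:47:01 localhost systemd[1]: Finished Generate additional settings for Kubernetes.
"""


def parse_line(line):
    """Split a short-format journal line into (timestamp, message)."""
    # Fields: month, day, time, hostname, then the rest of the line.
    month, day, clock, _host, rest = line.split(None, 4)
    # The short journal format has no year, so a placeholder year (2000)
    # is used here. Only the differences between timestamps matter.
    stamp = datetime.strptime(f"2000 {month} {day} {clock}", "%Y %b %d %H:%M:%S")
    return stamp, rest


def gaps(journal):
    """Return (seconds until the next line, message) for every line but the last."""
    entries = [parse_line(line) for line in journal.splitlines() if line.strip()]
    return [
        (int((nxt[0] - cur[0]).total_seconds()), cur[1])
        for cur, nxt in zip(entries, entries[1:])
    ]


result = gaps(JOURNAL)
for seconds, message in result:
    print(f"{seconds:4d}s  {message}")
```

On these lines, the only non-zero gap is the 301 seconds that follow the Committing settings message, which matches the ~5 minute delay we see before kubelet starts.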
How to reproduce the problem:
Unclear. We only see this issue in some customer clusters, not on a fresh cluster.
** Extra information **
bash-5.0# apiclient get os
{
  "os": {
    "arch": "x86_64",
    "build_id": "360b7a38",
    "pretty_name": "Bottlerocket OS 1.26.2 (aws-k8s-1.30)",
    "variant_id": "aws-k8s-1.30",
    "version_id": "1.26.2"
  }
}