-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🐛 fix: resolve JSONPath error in caBundle readiness check by adding kubectl_wait_for_query function #1429
base: main
Are you sure you want to change the base?
🐛 fix: resolve JSONPath error in caBundle readiness check by adding kubectl_wait_for_query function #1429
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
514d318
to
d36d7d1
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1429 +/- ##
==========================================
- Coverage 73.42% 73.29% -0.14%
==========================================
Files 42 42
Lines 3063 3063
==========================================
- Hits 2249 2245 -4
- Misses 641 644 +3
- Partials 173 174 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
efd97f8
to
e7036fc
Compare
e7036fc
to
b538210
Compare
b538210
to
ed2af33
Compare
scripts/install.tpl.sh
Outdated
function kubectl_wait_for_query() { | ||
manifest=$1 | ||
query=$2 | ||
timeout=$3 | ||
start_time=$(date +%s) | ||
while true; do | ||
val=$(kubectl get "${manifest}" -o jsonpath="${query}" 2>/dev/null || echo "") | ||
if [[ -n "${val}" ]]; then | ||
echo "${manifest} has ${query}." | ||
break | ||
fi | ||
if [[ $(( $(date +%s) - start_time )) -ge ${timeout} ]]; then | ||
echo "Timed out waiting for ${manifest} to have ${query}." | ||
exit 1 | ||
fi | ||
sleep 5 | ||
done | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this thing be replaced with something along the lines of:
kubectl wait validatingwebhookconfigurations/cert-manager-webhook --for=jsonpath='{.webhooks[0].clientConfig.caBundle}' --timeout=60s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your suggestion is the current approach, but unfortunately, it doesn’t work for our use case. ;-) The kubectl wait command doesn’t handle this situation well.
You can easily reproduce the issue by:
- run
kind create cluster
- Then, run the install script https://operator-framework.github.io/operator-controller/getting-started/olmv1_getting_started/ OR following the steps; 📖 docs: add local build and deploy instructions to CONTRIBUTING.md #1430. You will see the error described in the description.
My Assumption here is: The issue seems to arise because kubectl wait expects JSONPath conditions related to .status
fields (such as readyReplicas
), which are applicable for deployments but not for webhook configurations. Since mutatingwebhookconfigurations
and validatingwebhookconfigurations
lack these .status
fields, kubectl wait may fail when it tries to monitor non-existent replica-related conditions. Additionally, it’s possible this issue is linked to cert-manager’s recommendation to use 3 replicas in production environments since we see error: jsonpath wait format must be --for=jsonpath='{.status.readyReplicas}'=3
, which may indirectly affect how readiness is handled.
scripts/install.tpl.sh
Outdated
echo "Timed out waiting for ${manifest} to have ${query}." | ||
exit 1 | ||
fi | ||
sleep 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good change assuming it helps things be less flaky with timing in practice, but
what's the 5 magic number? I think 5s?
- I think this should be a variable with a good name:
poll_interval_in_seconds
?; - But then you are delaying the polling for 5s? Could be a fair amount of extra wait for no benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Applied your suggestion and set the polling interval to 10 seconds. With a 60-second timeout, it wouldn’t make sense to use a longer interval.
ed2af33
to
bdbef17
Compare
…ectl_wait_for_query function By running `make kind-deploy` against a kind cluster or installing the released script from https://operator-framework.github.io/operator-controller/getting-started/olmv1_getting_started/ the following error is faced: ```sh ... deployment.apps/cert-manager-webhook condition met deployment.apps/cert-manager-cainjector condition met deployment.apps/cert-manager condition met error: jsonpath wait format must be --for=jsonpath='{.status.readyReplicas}'=3 ``` This PR fixes an issue with kubectl wait when used to check the caBundle field in `mutatingwebhookconfigurations` and `validatingwebhookconfigurations`. This PR introduces the `kubectl_wait_for_query` function, which replaces `kubectl wait` for this specific use case. The function repeatedly checks the `caBundle` field by using `kubectl get` in a loop, ensuring that the `caBundle` is populated without relying on status-based conditions. This approach provides a more flexible solution compatible with webhook configurations, bypassing the limitations of `kubectl wait`.
bdbef17
to
b118c39
Compare
BY running
make kind-deploy
against a kind cluster or installing the released script from https://operator-framework.github.io/operator-controller/getting-started/olmv1_getting_started/ the following error is faced:This PR fixes an issue with kubectl wait when used to check the caBundle field in
mutatingwebhookconfigurations
andvalidatingwebhookconfigurations
.Solution
This PR introduces the
kubectl_wait_for_query
function, which replaceskubectl wait
for this specific use case. The function repeatedly checks thecaBundle
field by usingkubectl get
in a loop, ensuring that thecaBundle
is populated without relying on status-based conditions. This approach provides a more flexible solution compatible with webhook configurations, bypassing the limitations ofkubectl wait
.After the fix: