OCPBUGS-76331: fix: truncate job names to respect k8s 63-char limit#1554
OCPBUGS-76331: fix: truncate job names to respect k8s 63-char limit#1554fracappa wants to merge 1 commit intoopenshift:mainfrom
Conversation
|
@fracappa: This pull request references Jira Issue OCPBUGS-76331, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Warning Rate limit exceeded
⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (3)
WalkthroughAdds DNS-safe job naming: new constants and regex, GetJobName now shortens long node names via sanitize and hash-based shortening, plus helper functions shortenNodeName and sanitizeDNSLabel; extensive unit tests added and existing tests updated to use GetJobName for expected names. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@fracappa: This pull request references Jira Issue OCPBUGS-76331, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
56a94c4 to
6b52258
Compare
|
/jira refresh |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
pkg/tnf/pkg/tools/jobs_test.go (1)
233-284: Add a DNS-compliance test case for FQDN-style non-standard node names (dots in the fallback path).All current test node names are either dot-free or match the
nodeRolePatternregex (so dots never reach the fallback). Adding a case like"my-custom-host.long-domain-name.example.com"(norole-Nprefix) with a realisticmaxLengthwould exercise the fallback path and expose thesanitizeDNSLabelgap flagged injobs.go.Suggested additional test case
{ name: "Non-standard FQDN node name with dots", nodeName: "my-custom-host.long-domain-name.example.com", maxLength: 30, },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/tnf/pkg/tools/jobs_test.go` around lines 233 - 284, Add a test that exercises the fallback path where node names contain dots so the code path through shortenNodeName and sanitizeDNSLabel is exercised; specifically add a case in TestShortenNodeNameDNSCompliance with name "Non-standard FQDN node name with dots", nodeName "my-custom-host.long-domain-name.example.com" and a realistic maxLength (e.g. 30) and keep the existing assertions (length <= maxLength and matching dnsLabelRegex) so the test will expose any sanitization issues in sanitizeDNSLabel called by shortenNodeName.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/tnf/pkg/tools/jobs.go`:
- Around line 104-106: shortenNodeName can panic when maxLength > len(shortHash)
(shortHash is exactly 8 chars from "%08x"); update shortenNodeName to bound the
slice index by using the minimum of maxLength and len(shortHash) (or otherwise
cap the index) before returning shortHash[:index], and keep references to
shortHash and maxLength in the change so the fix is applied where
shortenNodeName is defined and used (e.g., by GetJobName).
- Around line 115-118: The current fallback returns a raw slice
nodeName[:maxLength] when match == "" and maxPrefixAllowed ≤ 0, which can
include invalid DNS characters; change the fallback in the same function to
follow the earlier pattern and return the capped hash instead of raw nodeName:
when match == "" and maxPrefixAllowed <= 0 compute the deterministic hash used
elsewhere (same hashing helper or algorithm used above), format/truncate it to
maxLength, and return that hashed string rather than nodeName[:maxLength];
update any references to nodeName, match, maxPrefixAllowed and maxLength to use
the hash-return path.
- Around line 120-123: sanitizeDNSLabel currently only trims leading/trailing
dashes and must be replaced so it removes dots and any invalid DNS characters:
update the sanitizeDNSLabel(name string) function to lowercase the input,
replace any character not in [a-z0-9-] (including dots) with '-', collapse
consecutive '-' runs into one, trim leading/trailing '-' and ensure the result
begins and ends with an alphanumeric character (if empty after sanitization
return a safe fallback or hashed suffix as used by the caller). Reference:
sanitizeDNSLabel and nodeRolePattern (the fallback path that calls this) when
locating where to change behavior.
---
Nitpick comments:
In `@pkg/tnf/pkg/tools/jobs_test.go`:
- Around line 233-284: Add a test that exercises the fallback path where node
names contain dots so the code path through shortenNodeName and sanitizeDNSLabel
is exercised; specifically add a case in TestShortenNodeNameDNSCompliance with
name "Non-standard FQDN node name with dots", nodeName
"my-custom-host.long-domain-name.example.com" and a realistic maxLength (e.g.
30) and keep the existing assertions (length <= maxLength and matching
dnsLabelRegex) so the test will expose any sanitization issues in
sanitizeDNSLabel called by shortenNodeName.
|
@fracappa: This pull request references Jira Issue OCPBUGS-76331, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (dhensel@redhat.com), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@fracappa: This pull request references Jira Issue OCPBUGS-76331, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (dhensel@redhat.com), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/tnf/pkg/tools/jobs.go`:
- Around line 66-68: The direct-append branch that does "if len(*nodeName) <=
maxSuffixLen { name = fmt.Sprintf(...) }" bypasses sanitization and allows dots
in short FQDNs; change the logic to always call shortenNodeName when appending
the node suffix (use shortenNodeName(*nodeName) instead of interpolating raw
*nodeName), remove the separate short-path branch, and update shortenNodeName to
early-return the input if it already conforms to DNS label rules (only
alphanumerics and '-', <= maxSuffixLen) so short names are preserved but still
validated/sanitized before being appended to name.
---
Duplicate comments:
In `@pkg/tnf/pkg/tools/jobs.go`:
- Around line 104-106: The code slices shortHash with shortHash[:maxLength]
which panics when maxLength > len(shortHash) (e.g., maxLength == 9 since
shortHash is 8 chars from "%08x"); update the return to cap the slice index by
using min(maxLength, len(shortHash)) (or compute allowed := maxLength; if
allowed > len(shortHash) { allowed = len(shortHash) } and return
shortHash[:allowed]) so shortHash is never indexed out-of-bounds; locate the
slice use of shortHash and the surrounding logic that computes maxLength in this
function and apply this defensive bound.
- Around line 108-117: The fallback path returns unsanitized slices containing
dots; update this path to use the sanitizer and ensure it replaces invalid
characters (not just trims dashes). Specifically, modify sanitizeDNSLabel to
replace illegal characters (e.g., dots) with valid characters (or remove them)
and then in the non-standard fallback use sanitizedPrefix :=
sanitizeDNSLabel(nodeName[:maxPrefixAllowed]) and return fmt.Sprintf("%s-%s",
sanitizedPrefix, shortHash) when sanitizedPrefix != "" ; likewise replace the
final bare return nodeName[:maxLength] with return
sanitizeDNSLabel(nodeName[:maxLength]) so both branches consistently produce
valid DNS labels (reference sanitizeDNSLabel, the prefix variable, and the
fallback return).
- Around line 120-123: sanitizeDNSLabel currently only trims leading/trailing
dashes; update it to produce a valid Kubernetes DNS label by: convert the input
to lowercase, replace every character not in [a-z0-9-] (including dots and
underscores) with '-', collapse consecutive '-' into a single '-', then trim
leading and trailing '-' so the result starts and ends with an alphanumeric
character and matches ^[a-z0-9]([a-z0-9-]*[a-z0-9])?$; if the result becomes
empty, return a safe fallback (e.g., "a" or "default"). Apply these changes
inside the sanitizeDNSLabel function.
b96a1a0 to
8c0b23f
Compare
|
/retest-required |
2 similar comments
|
/retest-required |
|
/retest-required |
767313e to
0b2e15c
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/tnf/pkg/jobs/tnf_test.go (1)
262-265: Minor inconsistency: hardcoded job name in error message.The job name at lines 262-265 is correctly generated via
GetJobName, but line 282 still uses the hardcoded string"tnf-after-setup-job-master-1"in the error response. Consider usingjobNamefor consistency:♻️ Suggested improvement
client.PrependReactor("delete", "jobs", func(action k8stesting.Action) (handled bool, ret runtime.Object, err error) { - return true, nil, apierrors.NewForbidden(batchv1.Resource("jobs"), "tnf-after-setup-job-master-1", nil) + return true, nil, apierrors.NewForbidden(batchv1.Resource("jobs"), jobName, nil) })Also applies to: 281-283
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/tnf/pkg/jobs/tnf_test.go` around lines 262 - 265, The test constructs the Job name via tools.JobTypeAfterSetup.GetJobName and stores it in jobName but the error message still uses the hardcoded string "tnf-after-setup-job-master-1"; update the error/assertion to reference the jobName variable (and any related uses around the job/Job object) so the test message is consistent with the generated name.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/tnf/pkg/tools/jobs.go`:
- Around line 108-114: The slice nodeName[:maxPrefixAllowed] can panic when
maxPrefixAllowed > len(nodeName); update the fallback branch to guard or use the
minimum of maxPrefixAllowed and len(nodeName) before slicing: compute an
effectivePrefixLen := min(maxPrefixAllowed, len(nodeName)) and call
sanitizeDNSLabel(nodeName[:effectivePrefixLen]) (or add a conditional
maxPrefixAllowed <= len(nodeName)) so sanitizeDNSLabel is only given a valid
substring; ensure you still check prefix != "" and return fmt.Sprintf("%s-%s",
prefix, shortHash) as before.
---
Nitpick comments:
In `@pkg/tnf/pkg/jobs/tnf_test.go`:
- Around line 262-265: The test constructs the Job name via
tools.JobTypeAfterSetup.GetJobName and stores it in jobName but the error
message still uses the hardcoded string "tnf-after-setup-job-master-1"; update
the error/assertion to reference the jobName variable (and any related uses
around the job/Job object) so the test message is consistent with the generated
name.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (3)
pkg/tnf/pkg/jobs/tnf_test.gopkg/tnf/pkg/tools/jobs.gopkg/tnf/pkg/tools/jobs_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/tnf/pkg/tools/jobs_test.go
|
/retest-required |
18120d7 to
49b81e6
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/tnf/pkg/tools/jobs_test.go (1)
143-200: Consider documenting the expected hash values.The exact hash expectations (e.g.,
"worker-0-72f9309a","master-1-64736551") are intentionally hardcoded to verify determinism, which is good. However, these values are derived from FNV-1a and could benefit from a brief comment explaining they are expected outputs from the specific algorithm, making future maintenance easier.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/tnf/pkg/tools/jobs_test.go` around lines 143 - 200, Add a short comment above TestShortenNodeName (and/or above the hardcoded expected values) stating that the expected suffixes like "72f9309a" and "64736551" are deterministic FNV-1a hash outputs produced by shortenNodeName, include the hash algorithm name (FNV-1a), any truncation/hex length rules used, and note that tests rely on that deterministic behavior so future changes to shortenNodeName's hashing or truncation must update these expected values.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/tnf/pkg/jobs/tnf_test.go`:
- Around line 262-265: The delete reactor uses a hardcoded job name in the
NewForbidden error which can diverge from the computed name; replace the literal
"tnf-after-setup-job-master-1" with the computed jobName variable so the error
message matches the actual Job created by
tools.JobTypeAfterSetup.GetJobName(stringPtr("master-1"))—update the
NewForbidden invocation in the delete reactor (where batchv1.NewForbidden is
constructed) to reference jobName instead of the hardcoded string.
---
Nitpick comments:
In `@pkg/tnf/pkg/tools/jobs_test.go`:
- Around line 143-200: Add a short comment above TestShortenNodeName (and/or
above the hardcoded expected values) stating that the expected suffixes like
"72f9309a" and "64736551" are deterministic FNV-1a hash outputs produced by
shortenNodeName, include the hash algorithm name (FNV-1a), any truncation/hex
length rules used, and note that tests rely on that deterministic behavior so
future changes to shortenNodeName's hashing or truncation must update these
expected values.
ℹ️ Review info
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to data retention organization setting
📒 Files selected for processing (3)
pkg/tnf/pkg/jobs/tnf_test.gopkg/tnf/pkg/tools/jobs.gopkg/tnf/pkg/tools/jobs_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/tnf/pkg/tools/jobs.go
62cd2ae to
8d94d49
Compare
8d94d49 to
6cf7813
Compare
|
/approve |
|
Looks good to me but I'd like to see a TNF cluster deploy with this. :) |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-ipv6-techpreview periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-techpreview |
|
@eggfoobar: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8e6427b0-1735-11f1-9ae5-c2ea532d302a-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-ipv6-techpreview periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-techpreview |
|
@eggfoobar: trigger 2 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/737f52f0-1751-11f1-9496-acd8a1abf454-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-4.22-e2e-metal-ovn-two-node-fencing-techpreview Somehow this job picked up an older nightly, retrying |
|
@eggfoobar: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/51722290-176b-11f1-8f1a-3220a7ac526e-0 |
|
/approve Not sure why the bot didn't apply this |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: eggfoobar, fracappa The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/verified by @fracappa locally and CI install validated install
|
|
@fracappa: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest-required |
1 similar comment
|
/retest-required |
|
@fracappa: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
For solving such issues technique like truncating or hashing come into help but using them as stand-alone could be limiting.
worker-01.datacenter-east..andworker-01.datacenter-west..)The best approach would be a hybrid one (
tuncate + short hash suffix).For the hashing computation, we rely on FNV-1a Hash since we need to map a string to a small integer (the suffix used for name uniqueness ).
Summary by CodeRabbit
New Features
Tests