feat(validate): consume cli's structured validation API; plumb structures through machine output#347
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors schema validation in crossplane-diff to consume the Crossplane CLI’s new structured validation API (typed ValidationResult / FieldValidationError) instead of parsing rendered stdout, while preserving prior semantics around missing schemas and defaulting-before-diff.
Changes:
- Switch schema validation to
pkg/validate.SchemaValidate+pkg/validate.ResultError, removing the prior stdout parsing approach. - Apply CRD defaults explicitly via
pkg/xr.ApplyCRDDefaultsbefore diff calculation to preserve defaulted-resource invariants. - Update dependencies to consume the required Crossplane CLI pseudo-version and related
crossplane-runtime/apisRC versions.
Reviewed changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
go.mod |
Bumps github.com/crossplane/cli/v2 (pseudo-version) and updates Crossplane RC dependencies to satisfy new structured validation API constraints. |
go.sum |
Updates module checksums corresponding to the dependency bumps. |
cmd/diff/diffprocessor/schema_validator.go |
Replaces stdout-based validation parsing with structured ValidationResult handling; adds explicit defaulting and debug logging helpers. |
cmd/diff/diffprocessor/schema_validator_test.go |
Replaces parsing tests with formatValidationErrors tests; keeps regression coverage for in-place defaulting behavior. |
cmd/diff/diffprocessor/diff_processor.go |
Updates XR defaulting call site to use pkg/xr.ApplyCRDDefaults (renamed/promoted from render.DefaultValues). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| case pkgvalidate.ValidationStatusInvalid: | ||
| for _, e := range r.Errors { | ||
| if e.Type == pkgvalidate.FieldErrorTypeDefaulting { | ||
| // ResultError treats a defaulting error as a | ||
| // failure only when accompanied by a schema-class | ||
| // error on the same resource. The schema-class | ||
| // error already gets its own entry; emitting the | ||
| // defaulting line too would just be noise. | ||
| continue | ||
| } | ||
|
|
||
| msgs = append(msgs, fmt.Sprintf("schema validation error %s, %s : %s", gvk, r.Name, e.Message)) | ||
| } |
There was a problem hiding this comment.
Good catch — addressed in 630612c.
You're right that the silent-drop is wrong even though it doesn't trip today. Upstream's statusFromErrors guarantees an Invalid resource has at least one non-defaulting error, so the suppression always lands on a useful entry. But if that contract ever shifted, the formatter would fall through to the generic schema validation failed string and the actual cause would disappear.
Now suppression only happens when another actionable (schema / CEL / unknown-field) error is present on the same resource. With only defaulting errors, they get emitted as-is. New InvalidWithOnlyDefaultingErrorsStillEmitted test case covers it.
Address Copilot review feedback on PR #347. formatValidationErrors previously dropped every FieldErrorTypeDefaulting entry on Invalid resources unconditionally. Upstream's statusFromErrors guarantees an Invalid resource carries at least one non-defaulting error today, so in practice this produced the right output — but if that contract ever shifted, an Invalid resource whose Errors are all defaulting would silently produce zero per-resource messages and the formatter would fall through to the generic "schema validation failed" string, hiding the real cause. Suppress defaulting only when there is another actionable (schema / CEL / unknown-field) error on the same resource. With only defaulting errors present, emit them as-is. Tested by the new InvalidWithOnlyDefaultingErrorsStillEmitted case. Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
| func formatErrorLine(e pkgvalidate.FieldValidationError) string { | ||
| msg := e.Message | ||
| if rendered := renderBadValue(e.Value); rendered != "" && !strings.Contains(msg, rendered) { | ||
| msg = fmt.Sprintf("%s (got %s)", msg, rendered) | ||
| } | ||
|
|
||
| return fmt.Sprintf("%s [%s]", msg, e.Type) | ||
| } | ||
|
|
There was a problem hiding this comment.
Good catch — addressed in f9a1ac8. renderBadValue now uses %q for string values for both the display form and the Contains-search, matching how k8s field validators emit them (Invalid value: "five"). Numbers / bools / structs keep %v unchanged. Two new regression cases: BadValueNotSuppressedByIncidentalSubstring (Value="k" vs message containing "kind" — tail now renders correctly) and NumericBadValueRendersUnquoted (lock the unquoted display for non-strings).
…ew feedback Three changes, all in response to review feedback on PR #347: - Adopt the established reason-field test convention (used in diff_integration_test and the client/crossplane test suites) for the new TestFormatValidationErrors, TestNewOutputError, and TestValidationFailuresFromResult cases. Inline `// comment` blocks move into a `reason string` table-struct field that surfaces in failure messages, so a failing case prints both what went wrong and why the case exists. - Match the bad-value duplication check to k8s emit conventions: string values get %q (quoted) for both display and the Contains-search, so an unquoted incidental substring (e.g. Value="k" against message "spec.kind: Required value") no longer suppresses the "(got <value>)" tail. Numbers / bools / structs keep %v unchanged. New regression cases: BadValueNotSuppressedByIncidentalSubstring and NumericBadValueRendersUnquoted. - Drop the misleading preloadedCRDs entry in the TestDefaultSchemaValidator_ValidateResources ValidationError case. The field is dead state — `tt.preloadedCRDs` is never read in any test body in this file; CRDs flow through the mock SchemaClient set up by setupClients. Removing the wider unused field is out-of-scope for this PR (touches many unrelated tests), so just empty this entry to stop implying it influences validation behavior. Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
| result, err := pkgvalidate.SchemaValidate(ctx, resources, v.schemaClient.GetAllCRDs()) | ||
| if err != nil { | ||
| // Parse and extract only the error lines from validation output | ||
| details := extractValidationErrors(validationOutput.String()) | ||
| return NewSchemaValidationError("", details, err) | ||
| return errors.Wrap(err, "schema validation failed") | ||
| } | ||
|
|
||
| v.logResultDetails(result) | ||
|
|
||
| if rerr := pkgvalidate.ResultError(result, true); rerr != nil { | ||
| return NewSchemaValidationError("", formatValidationErrors(result), rerr).WithResult(result) | ||
| } |
There was a problem hiding this comment.
Good catch — addressed in 3b7936b. Pre-refactor, the old validate.SchemaValidation entry point produced a SchemaValidationError on every error path; my structured-API split inadvertently dropped that for setup errors, which would have broken IsSchemaValidationError / exit-code 2. Now wrapped via NewSchemaValidationError with no Result attached.
| func resourceHeader(r pkgvalidate.ResourceValidationResult) string { | ||
| name := r.Name | ||
| if r.Namespace != "" { | ||
| name = r.Namespace + "/" + r.Name | ||
| } | ||
|
|
||
| return strings.Join(validationErrs, "; ") | ||
| if name == "" { | ||
| return fmt.Sprintf("%s/%s", r.APIVersion, r.Kind) | ||
| } | ||
|
|
||
| return fmt.Sprintf("%s/%s %s", r.APIVersion, r.Kind, name) | ||
| } |
There was a problem hiding this comment.
Fixed in 3b7936b. The empty-name guard was checking the computed name string ('ns/') instead of r.Name. Gated on r.Name directly now, so an empty name collapses to GVK-only regardless of namespace. New regression case: NamespaceWithoutNameCollapsesToGVK.
| func formatErrorLine(e pkgvalidate.FieldValidationError) string { | ||
| msg := e.Message | ||
| if rendered, found := renderBadValue(e.Value); rendered != "" && !strings.Contains(msg, found) { | ||
| msg = fmt.Sprintf("%s (got %s)", msg, rendered) | ||
| } | ||
|
|
||
| return fmt.Sprintf("%s [%s]", msg, e.Type) | ||
| } |
There was a problem hiding this comment.
Fixed in 3b7936b. Same class of false positive you caught earlier on the string side, just on the numeric path. Split the duplicate check into valueAlreadyInMessage: strings keep the quoted-form substring search (delimiter-safe via the %q quotes), non-strings use a word-boundary regex (\\b … \\b) so 42 doesn't match inside 420. Two new regression cases: NumericBadValueNotSuppressedByDigitPrefix (Value=42 vs "420" — tail rendered) and NumericBadValueSuppressedAsToken (Value=42 vs "Invalid value: 42:" — tail correctly suppressed).
crossplane/cli#66 lands a structured-result validate API at pkg/validate.SchemaValidate, replacing the previous cmd/crossplane/validate.SchemaValidation entry point that produced its results by writing prefix-marked lines to a writer. Switch schema_validator.go to the new API so that: - Validation errors come from typed FieldValidationError records rather than parsed [x]/[!] lines, removing the extractValidationErrors stdout-parsing helper and its test. - The bytes.Buffer + io.MultiWriter + loggerwriter indirection used to capture rendered output is gone; per-resource debug logging is now explicit via logResultDetails. - Failure detection runs through pkg/validate.ResultError, which preserves the historical error-on-missing-schemas semantic. The previous SchemaValidation fused defaulting with validation, mutating the caller's resources in place. SchemaValidate deep-copies its inputs, so defaulting no longer happens as a side-effect. Apply defaults explicitly via pkg/xr.ApplyCRDDefaults before validation to preserve the downstream invariant that the diff calculator sees fully-defaulted resources (covered by TestDefaultSchemaValidator_ValidateResources_AppliesDefaults). Also rename the call site in diff_processor.go from render.DefaultValues to xr.ApplyCRDDefaults: the helper kept its signature but was promoted out of cmd/crossplane/render into the new pkg/xr library package in the same upstream cleanup wave. go.mod pins crossplane/cli to v2.4.0-rc.0.0.20260615182009-ba59fbfac34b (the merge commit of crossplane/cli#66) and pulls crossplane-runtime/v2 and crossplane/apis/v2 forward to v2.4.0-rc.0 to satisfy the cli's transitive constraints. Bump to a tagged release once one is cut. Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
Address Copilot review feedback on PR #347. formatValidationErrors previously dropped every FieldErrorTypeDefaulting entry on Invalid resources unconditionally. Upstream's statusFromErrors guarantees an Invalid resource carries at least one non-defaulting error today, so in practice this produced the right output — but if that contract ever shifted, an Invalid resource whose Errors are all defaulting would silently produce zero per-resource messages and the formatter would fall through to the generic "schema validation failed" string, hiding the real cause. Suppress defaulting only when there is another actionable (schema / CEL / unknown-field) error on the same resource. With only defaulting errors present, emit them as-is. Tested by the new InvalidWithOnlyDefaultingErrorsStillEmitted case. Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
…rce blocks
The previous formatValidationErrors mirrored the upstream text
renderer's prefix-marked output (`[x] schema validation error <gvk>,
<name> : <msg>`) because the original implementation literally parsed
that text. With structured FieldValidationError records now in hand,
the mirroring is gone: the format is multi-line, grouped per resource,
and surfaces information the line-parser couldn't.
Output shape:
<apiVersion>/<Kind> <ns>/<name>:
<message>[ (got <value>)] [<type>]
...
<apiVersion>/<Kind>: missing schema
Per-resource grouping replaces a flat list with N entries duplicating
the GVK on each line. The error type ([schema] / [cel] /
[unknownField] / [defaulting]) makes the failure class visible
without re-parsing the message. Bad values from
FieldValidationError.Value are appended as "(got <value>)" only when
they aren't already substring-present in the message — k8s-derived
errors typically embed the bad value in the message text, so this
avoids duplication.
The "schema validation failed" fallback is gone: formatValidationErrors
is invoked only after pkgvalidate.ResultError has flagged a failure,
so an empty result is unreachable in production. Returning "" lets
the wrapping SchemaValidationError carry the underlying ResultError
message instead of substituting a misleading generic string.
The integration TestDiffIntegration/SchemaValidationError previously
asserted on `Contains "schema validation"`, which only passed because
the inner formatter happened to repeat that phrase. The wrap prefix
"cannot validate resources" from diff_processor.go is the stable
semantic anchor; assertion updated.
Address Copilot review feedback as part of this redesign:
defaulting-only Invalid resources (currently unreachable per upstream
statusFromErrors, but defensible as a future-proofing measure) emit
their defaulting messages rather than producing an empty block.
Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
Schema validation now surfaces structured per-resource failure detail
through OutputError.ValidationFailures so machine consumers of
crossplane-diff's JSON / YAML output don't have to parse the
human-readable Message string. The string is still emitted alongside,
unchanged, for operators reading the JSON directly.
Wire shape:
{
"errors": [{
"resourceID": "XNopResource/my-xr",
"message": "...",
"validationFailures": [
{
"apiVersion": "ns.example.org/v1alpha1",
"kind": "XNopResource",
"name": "my-xr",
"namespace": "default",
"status": "invalid",
"errors": [
{
"type": "schema",
"field": "spec.coolField",
"message": "spec.coolField: Invalid value: \"number\": ...",
"value": "number"
}
]
}
]
}]
}
ResourceID and ValidationFailures play complementary roles: ResourceID
identifies which user input the diff was processing (one entry per
batched run), while ValidationFailures lists every resource inside
that input's render tree that failed validation, with full GVK,
namespace, and typed errors. They overlap on Kind+Name when the input
itself is among the failing resources — that's intentional: consumers
iterating ValidationFailures should never miss an XR-level rejection.
Wire types live in renderer/types/types.go (ResourceValidationFailure,
FieldValidationError) so crossplane-diff owns its public schema rather
than re-exporting crossplane/cli's pkg/validate types — the upstream
shape can evolve independently of ours.
SchemaValidationError gains an optional Result field carrying the
*pkgvalidate.ValidationResult that produced it; ValidateResources
attaches it. NewOutputError handles the type assertion + conversion in
one place, used at both XR diff and comp diff error-construction
sites. Non-validation paths (scope check, tool errors, IO errors)
leave ValidationFailures nil.
The integration test infra splits stdout/stderr into separate buffers
(previously combined via kong.Writers(&stdout, &stdout), which broke
JSON parsing as soon as the renderer wrote a stderr ERROR line) and
allows expectedStructuredOutput to coexist with expectedError when no
expectedErrorContains is set. ExpectedDiff gains a fluent WithError /
WithValidationFailure / WithFieldError builder chain so tests assert
on validationFailures structurally.
The SchemaValidationError integration test switches from substring
matching the human-readable error string to JSON-mode + structured
assertion: it now pins the failing GVKs, namespaces, statuses, error
types, and field paths for both the failing XR and its composed
resource. The k8s-emitted message wording in between is intentionally
not asserted so apimachinery upgrades can shift the phrasing without
breaking the test.
Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
…ew feedback Three changes, all in response to review feedback on PR #347: - Adopt the established reason-field test convention (used in diff_integration_test and the client/crossplane test suites) for the new TestFormatValidationErrors, TestNewOutputError, and TestValidationFailuresFromResult cases. Inline `// comment` blocks move into a `reason string` table-struct field that surfaces in failure messages, so a failing case prints both what went wrong and why the case exists. - Match the bad-value duplication check to k8s emit conventions: string values get %q (quoted) for both display and the Contains-search, so an unquoted incidental substring (e.g. Value="k" against message "spec.kind: Required value") no longer suppresses the "(got <value>)" tail. Numbers / bools / structs keep %v unchanged. New regression cases: BadValueNotSuppressedByIncidentalSubstring and NumericBadValueRendersUnquoted. - Drop the misleading preloadedCRDs entry in the TestDefaultSchemaValidator_ValidateResources ValidationError case. The field is dead state — `tt.preloadedCRDs` is never read in any test body in this file; CRDs flow through the mock SchemaClient set up by setupClients. Removing the wider unused field is out-of-scope for this PR (touches many unrelated tests), so just empty this entry to stop implying it influences validation behavior. Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
The preloadedCRDs field was historical residue: a previous refactor
moved CRD plumbing through a mock SchemaClient (set up by each test
case's setupClients / setupClient closure) but left the original
preloadedCRDs slice on three test struct types. The field was never
read; every assignment went straight to the GC. Comments like "// No
longer needed" at several call sites confirmed the previous author
knew the data was unused but stopped short of removing the field.
Sweep the field from all three table-driven test functions in
cmd/diff/diffprocessor/schema_validator_test.go:
- TestDefaultSchemaValidator_ValidateResources (4 cases)
- TestDefaultSchemaValidator_LoadCRDs (1 case)
- TestDefaultSchemaValidator_ValidateScopeConstraints (8 cases)
Removing the field exposed two more pieces of dead state in the same
file:
- The ValidationError closure in TestDefaultSchemaValidator_ValidateResources
had a composedCRDUn block that built an unstructured via
runtime.DefaultUnstructuredConverter and discarded the result.
Drop the block; with no remaining caller, MustToUnstructured and
the runtime import go with it.
- The const testComposedResource = "ComposedResource" was defined
but never referenced — every call site uses the literal string
"testComposedResource" (matching the const's name, not its value).
Drop the const.
No behavior change; all unit tests still pass under earthly -P +reviewable.
Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
f303322 to
cb43f61
Compare
| func resourceHeader(r pkgvalidate.ResourceValidationResult) string { | ||
| name := r.Name | ||
| if r.Namespace != "" { | ||
| name = r.Namespace + "/" + r.Name | ||
| } | ||
|
|
||
| return strings.Join(validationErrs, "; ") | ||
| if name == "" { | ||
| return fmt.Sprintf("%s/%s", r.APIVersion, r.Kind) | ||
| } | ||
|
|
||
| return fmt.Sprintf("%s/%s %s", r.APIVersion, r.Kind, name) | ||
| } |
There was a problem hiding this comment.
Same issue as #3421888473, addressed in 3b7936b. Empty-name guard now checks r.Name directly so namespace+no-name collapses cleanly to GVK-only.
Three issues caught by Copilot on the rebased branch:
1. SchemaValidate setup-error path returned a generic errors.Wrap
rather than a SchemaValidationError, so IsSchemaValidationError /
DetermineExitCode would classify it as ExitCodeToolError (1)
instead of ExitCodeSchemaValidation (2). Pre-refactor, the old
validate.SchemaValidation entry point produced a SchemaValidationError
on every error path; the structured-API split inadvertently broke
that contract. Restore it: setup errors are wrapped via
NewSchemaValidationError(...) with no Result attached (the
validator never produced one).
2. resourceHeader produced "<gvk> <ns>/" with a stray trailing slash
when a resource had metadata.namespace set but no metadata.name.
The empty-name guard checked the *computed* name string ("ns/")
rather than r.Name. Gate on r.Name directly so an empty name
collapses to GVK-only regardless of namespace, matching the
intent already documented for the empty-namespace case.
Regression case: NamespaceWithoutNameCollapsesToGVK.
3. Non-string Value duplicate check used a raw strings.Contains, so
Value=42 against message containing "420" would suppress the
"(got 42)" tail incorrectly — same class of false-positive
Copilot caught earlier for unquoted strings, just on the
numeric path. Split the duplicate check out as
valueAlreadyInMessage: strings still use the quoted-form
substring search (delimiter-safe via the surrounding %q quotes),
non-strings use a word-boundary regex so 42 doesn't match inside
420. Two new regression cases: NumericBadValueNotSuppressedByDigitPrefix
(Value=42 vs "420" — tail correctly rendered) and
NumericBadValueSuppressedAsToken (Value=42 vs "Invalid value:
42:" — tail correctly suppressed because 42 appears as a
standalone token).
Signed-off-by: Jonathan Ogilvie <jonathan.ogilvie@sumologic.com>
Description of your changes
Adopts the structured-result validation API that landed in crossplane/cli#66, replacing the previous parse-stdout pattern in
cmd/diff/diffprocessor/schema_validator.go. This is the Part 3 follow-up to the structured-validation work whose Part 1 introducedextractValidationErrorsas an interim line-parsing solution; with upstream now exposing typed results, that interim layer goes away.What changes:
FieldValidationErrorrecords viapkg/validate.SchemaValidateinstead of[x]/[!]prefix-line parsing. TheextractValidationErrorshelper and itsTestExtractValidationErrorsare gone, replaced byformatValidationErrors(covered byTestFormatValidationErrors).bytes.Buffer+io.MultiWriter+loggerwriterindirection used to capture rendered output is gone; per-resource debug logging is now explicit vialogResultDetails.pkg/validate.ResultError, which preserves the historical error-on-missing-schemas semantic.pkg/validate.SchemaValidatedeep-copies its inputs and does not mutate caller-owned resources, so the previous side-effect of in-place defaulting fromSchemaValidationis gone. Defaults are applied explicitly viapkg/xr.ApplyCRDDefaultsbefore validation, preserving the invariant that the diff calculator sees fully-defaulted resources.TestDefaultSchemaValidator_ValidateResources_AppliesDefaultsis the regression guard for this.render.DefaultValuescall site indiff_processor.gotoxr.ApplyCRDDefaults. Same signature, but the helper was promoted out ofcmd/crossplane/renderinto the newpkg/xrlibrary package in the same upstream cleanup wave.go.modpinscrossplane/clitov2.4.0-rc.0.0.20260615182009-ba59fbfac34b(the merge commit of crossplane/cli#66) and pullscrossplane-runtime/v2andcrossplane/apis/v2forward tov2.4.0-rc.0to satisfy the cli's transitive constraints. The cli's current latest tagv2.4.0-rc.0predates this merge by 78 commits, so a pseudo-version is the only way to consume the refactor today; bump to a real tagged release once crossplane/cli cuts one.Fixes #
I have:
earthly -P +reviewableto ensure this PR is ready for review.Added or updated e2e tests.— no behavior change at the e2e surface; the cleanup is internal to schema validation and produces equivalent error strings.Documented this change as needed.— no user-facing behavior change.Followed the API promotion workflow if this PR introduces, removes, or promotes an API.— no public API surface change.Need help with this checklist? See the cheat sheet.