tests: always make sure we return an LLB state #909
Pull request overview
This PR refactors the test execution flow to ensure that an LLB state is always returned, even when tests fail. This enables proper debugging support with docker buildx dap by ensuring breakpoints can attach to the state. Instead of returning errors directly to BuildKit when tests fail, the PR adds a new LLB exec operation that prints test errors to stderr and exits non-zero, maintaining the build failure behavior while preserving the state chain.
Key changes:
- Modified `RunTests` to return `llb.StateOption` instead of `(Reference, error)`, allowing tests to be composed into the state chain
- Introduced `withTestError` to create an error-reporting state that outputs test failures before exiting
- Added `TestErrorCmd` as a new frontend command to handle test error output
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| test/linux_target_test.go | Updates test to consume logs via SolveStatus callback and verify test errors appear in output rather than as direct function errors |
| targets/linux/rpm/distro/pkg.go | Changes RunTests signature from returning (Reference, error) to StateOption, composing test dependencies and execution into the state chain |
| targets/linux/deb/distro/pkg.go | Matches RPM implementation with same RunTests signature change to StateOption pattern |
| targets/linux/distro_handler.go | Updates handlers to use new StateOption-based test execution and adds getRef helper to simplify state marshalling and solving |
| frontend/test_runner.go | Core refactoring: RunTests now returns StateOption, adds withTestError for error reporting, TestErrorCmd for outputting errors, and evalState helper |
| cmd/frontend/main.go | Wires up TestErrorCmd in the frontend command router |
| .github/workflows/ci.yml | Adds Docker system prune before tests to clean up resources |
```go
// RunTests runs the package tests
// The returned reference is the solved container state
func (cfg *Config) RunTests(ctx context.Context, client gwclient.Client, spec *dalec.Spec, sOpt dalec.SourceOpts, ctr llb.State, targetKey string, opts ...llb.ConstraintsOpt) (gwclient.Reference, error) {
	def, err := ctr.Marshal(ctx, opts...)
```
Nice seeing those duplicated bits refactored!
```go
assert.ErrorContains(t, err, v)
dir := t.TempDir()
f, err := os.OpenFile(filepath.Join(dir, "out.txt"), os.O_CREATE|os.O_RDWR, 0o600)
assert.NilError(t, err)
```
I'm not sure this is the right place to discuss it, but generally speaking, assert does not interrupt execution of the function, which often leads to panics and false negatives (e.g. when f is nil), so maybe it would be better to use require instead, which gives clearer failure handling?
assert here is not testify; it's gotest.tools/v3/assert.
assert.NilError does halt the test.
Ah, that's confusing, TIL. Maybe we could remove testify assert completely to avoid this confusion. I'll have a look into that.
Indeed, I opened an issue yesterday to track this, except I suggested switching to testify since that is a more well known library (of course either way seems fine to me).
```go
testEnv.RunTest(ctx, t, func(ctx context.Context, client gwclient.Client) {
	sr := newSolveRequest(withSpec(ctx, t, spec), withBuildTarget(target))
	_, err := client.Solve(ctx, sr)
	assert.Assert(t, err != nil)
```
require.NilError?
This is asserting that we have an error.
test/linux_target_test.go
```go
v := (&dalec.CheckOutputError{Path: "/non-existing-file", Kind: dalec.CheckFileNotExistsKind, Expected: "exists=true", Actual: "exists=false"}).Error()
assert.ErrorContains(t, err, v)
doTest := func(target string) func(t *testing.T) {
```
Maybe?
```diff
-	doTest := func(target string) func(t *testing.T) {
+	testsForTarget := func(target string) func(t *testing.T) {
```
```go
// RunTests runs the tests defined in the spec against the given input state.
// The result of this is either the provided `finalState` or a state that errors when executed with the errors produced by the tests.
// The input state should be a runnable container with all dependencies already installed.
func RunTests(ctx context.Context, client gwclient.Client, spec *dalec.Spec, finalState llb.State, target string, platform *ocispecs.Platform) llb.StateOption {
```
If implementing llb.StateOption, do we actually need to take finalState llb.State as an argument? Wouldn't the `in llb.State` parameter always be the final state?
finalState here is because the input state is not always going to be the final state, e.g. for packaging targets such as jammy/deb and sysext targets.
I think I still don't understand. Could you elaborate to help me grasp how this works? So the input state could be a worker with test dependencies, while the final state would be where the deb/sysext outputs are? Do I get this right?
The input state must always be a state that is ready to have test assertions executed.
deb/rpm/sysext are not that, but we still want to run tests when building those targets (in fact, even the container state may have extra stuff due to test dependencies being installed).
In those cases we make the input state the container (i.e. package installed) + test dependencies.
The final state, which the test runner outputs, needs to be what we plan to output, OR we need to handle that at the call site, which could also be fair (maybe a separate state option called Reset or something).
That said, I agree the code itself is complex.
I factored out a bunch of this in #898 but ran into some issues that need to be dug into further and may require fixes in buildkit (which would also delay actually being able to refactor since the test runner wouldn't be usable on older buildkits).
I factored out a few things to split this loop up into smaller chunks, hope this helps (all in the 2nd commit).
Thanks for the explanation. It makes sense now :)
```go
case testrunner.CheckFilesCmdName:
	args := flag.Args()[2:]
	testrunner.CheckFilesCmd(args)
case frontend.TestErrorCmdName:
```
Why do we need a separate frontend command to handle printing errors? Couldn't this just run after we are finished running tests?
This isn't just printing the errors; it's exiting non-zero.
The reason we have this is so we can return an LLB state to buildkit that will error out and fail the build rather than returning an error to buildkit which messes up the dap debugger.
I actually have an alternative to this where each test assertion is actually run as an exec (state.Run(...)) but that seems to be triggering a bug in buildkit.
So I've settled for this case where we keep running the tests like we used to which requires un-lazying the build but still give buildkit a valid LLB state that produces an error.
Before this, we were returning errors to buildkit when there were test errors, or, for that matter, when there was an error during evaluation of the steps running up to the tests.
This causes issues with the debug workflow ("docker buildx dap") since there is no LLB state to attach to, so it can't set breakpoints properly.
With this change, even when we get a solve error when evaluating state, we return the bad state so that the debugger can attach to it.
When a test fails, we add a new LLB exec op which prints the joined test errors and exits non-zero so that the build will fail.
There are also some minor changes to make it simpler to discard any changes to the filesystem from running tests, or in the case of non-container outputs to return the original state, while keeping the dependency chain intact.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
frontend/test_runner.go
```go
	})
}

err = group.Wait()
```
I don't think it's a task for this PR, but I feel the code in RunTests would benefit from an extensive test suite covering all the different behaviours, since there is really a lot of code and, as a newcomer, it's hard to figure out what it all does.
Alternatively, some refactoring to better organize the levels of abstraction in this function could help make it easier to understand and drive down the complexity.
I think we have a large amount of coverage for the test runner within the integration test suite as it is.
It's possible an audit may come up with some case that's not tested, but we do cover both positive and negative cases in there to validate that the test runner is correct, and we also rely on the test runner for some of the rest of the test suite.
This is still not where I'd want it to be, but it splits things out so logic is not sitting in a massive loop with a bunch of other unrelated things.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This is a simpler alternative to #898.
#898 is hitting some race condition in buildkit (or at least that's what it looks like).
Additionally this can actually be backported to older release branches as the changes are minor.