-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
testing: summary and test output interleaved #40771
Comments
go test -v
can interleave summary and test outputgo test -v
can interleave summary and test output
The
The |
I actually suspect that the Here's another strange interleaving I see, which ultimately leaves certain log lines under the incorrect heading:
Is there a way to disable the streaming interleaved output? If so, then we can probably mitigate many of this issues by simply throwing that in our CI setup. |
go test -v
can interleave summary and test output
@mattmoor, the output in that last snippet does not look like a call to |
Should all be @bcmills @dnephin Is there any way to mitigate this? Ideally we'd want the |
You may be looking at this formatting, but that's going through the logf: https://github.com/knative/pkg/blob/c9f9284521f197a17fcdbaf5c4360bef8ef8570d/test/logstream/kubelogs.go#L185 |
Switching to |
@mattmoor, are these |
No, we stop calling |
Could you please provide a link to an actual test that is failing? I can see how this library is wired to its exported API, but that doesn't really help me understand the sequencing or concurrency properties of the calls in the actual test. |
A lot of tests are failing with this. The ones pertinent to the logs in the linked issue are here: https://github.com/knative/serving/blob/d9d93bd2ce10140af538acaa787aec031c526401/test/conformance/api/v1/errorcondition_test.go#L160 The specific callsite I link above is what initializes the logstream here: |
@mattmoor, could you provide a complete test output demonstrating the problem? I'd like to see which messages (particularly Did this start as of |
I believe the event problem was new in 1.14.6, but we had other issues related to the fix that landed in .6 There is a link in the top comment of gotestyourself/gotestsum#141 with a link to download the full build log. This is interleaved with some other bash output, but when I was playing with test2json locally I'd just snipped the go test output out of one of these. |
@mattmoor, one more question: do the tests in question pass if the |
Looks like this is with |
Thanks, that's helpful. I haven't been able to reproduce this with all logging occurring synchronously within the test, but I can get a very similar symptom by logging after the test has returned (which appears to bypass both the streaming output and synchronization): https://play.golang.org/p/SnpZP6n0lvk func TestWithDelayedOutput(t *testing.T) {
t.Parallel()
subtestsDone := make(chan struct{})
var delayed sync.WaitGroup
for i := 0; i < 3; i++ {
t.Run(fmt.Sprint(i), func(t *testing.T) {
delayed.Add(1)
go func() {
<-subtestsDone
defer delayed.Done()
t.Logf(t.Name())
}()
})
}
close(subtestsDone)
delayed.Wait()
}
I suspect that it won't be hard to extrapolate from that to a racy write. |
I am able to reproduce the interrupting |
@bcmills What if there is logging between the return of the test, but before the This is actually very possible today since the |
@mattmoor, it appears that logging to a test after the test function but before the Logging to a subtest before the main test's https://play.golang.org/p/EaLLhYQejXa
Likewise for logging to the main test function itself: https://play.golang.org/p/sGxwXx_D8KH
On the other hand, logging to a test before that test function's own |
Change https://golang.org/cl/249026 mentions this issue: |
@mattmoor, I still can't reproduce the reported behavior without modifying the code, but I think I understand where it's coming from. Could you try patching in https://golang.org/cl/249026 and see if that fixes the test output in your environment? |
@mattmoor, this should be fixed at head, but since I couldn't reproduce the failure without injecting a |
Change https://golang.org/cl/252638 mentions this issue: |
I'll talk to the folks that operator the Knative CI infrastructure. This hits about once a day in each of our post-submit test legs, so if we could convert even just one of those, it'd be a pretty strong signal IMO. |
Alrighty, I think I have good news, but you should take this with the grain of salt that this thing is obvs non-deterministic 😅 So in our typical "contour-latest" leg, we see this manifesting typically a couple times a day (periodic run every ~2 hours). Our test-infra folks added an experimental leg that's been running with this change, and clicking through each of legs to see their results in "spyglass" (we don't have it in "testgrid") I don't see any manifestations of this in ~31 runs of this so far. So I'm optimistic that this fixes things 🤞 🎉 |
This is still biting us pretty pervasively across our various e2e tests. When do we think we can expect this in a patch release on 1.14 or 1.15? |
@mattmoor, if we don't get any reports of regressions at head, then I would guess that this can make 1.15.3. (It would help to have more data supporting that it fixes your issue — you mentioned ~31 runs earlier; have you had any occurrences of the bug using the patched version since then?) |
I will try to dig the link to those runs back up and see if they are still running, but I haven't looked since posting above. |
Something had broken the test image, but they kicked it today and it's running again. Every build since has been clean, and I'll keep an eye on it over the weekend. For my own sanity: https://prow.knative.dev/?job=ci-knative-serving-contour-latest-beta-prow-tests |
So far another 16 runs without this issue manifesting 🤞 |
Up to 29 runs without this issue. As a parallel datapoint, in our unmodified leg over roughly the same period 6 of the last 35 runs have failed with this. |
Up to 44 runs without this issue manifesting. How much longer should we do this? I think we'll soon start to fall off of our run history and start to fall victim to my horrible mental bookmarking / checkpointing in these numbers. Future self: Green run number 44 was 9/21 at 15:43:56. |
That seems like pretty solid evidence to me. I'll talk with the folks on the release team and see if we can get this into the next minor release. Thanks for following up! |
…cally when streaming output While debugging #40771, I realized that the chatty printer should only ever print to a single io.Writer (normally os.Stdout). The other Writer implementations in the chain write to local buffers, but if we wrote a test's output to a local buffer, then we did *not* write it to stdout and we should not store it as the most recently logged test. Because the chatty printer should only ever print to one place, it shouldn't receive an io.Writer as an argument — rather, it shouldn't be used at all for destinations other than the main output stream. On the other hand, when we flush the output buffer to stdout in the top-level flushToParent call, it is important that we not allow some other test's output to intrude between the test summary header and the remainder of the test's output. cmd/test2json doesn't know how to parse such an intrusion, and it's confusing to humans too. No test because I couldn't reproduce the user-reported error without modifying the testing package. (This behavior seems to be very sensitive to output size and/or goroutine scheduling.) Fixes #40880 Updates #40771 Updates #38458 Change-Id: Ic19bf1d535672b096ba1c8583a3b74aab6d6d766 Reviewed-on: https://go-review.googlesource.com/c/go/+/249026 Run-TryBot: Bryan C. Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> (cherry picked from commit 51c0bdc) Reviewed-on: https://go-review.googlesource.com/c/go/+/252638 TryBot-Result: Go Bot <gobot@golang.org> Trust: Bryan C. Mills <bcmills@google.com>
…cally when streaming output While debugging #40771, I realized that the chatty printer should only ever print to a single io.Writer (normally os.Stdout). The other Writer implementations in the chain write to local buffers, but if we wrote a test's output to a local buffer, then we did *not* write it to stdout and we should not store it as the most recently logged test. Because the chatty printer should only ever print to one place, it shouldn't receive an io.Writer as an argument — rather, it shouldn't be used at all for destinations other than the main output stream. On the other hand, when we flush the output buffer to stdout in the top-level flushToParent call, it is important that we not allow some other test's output to intrude between the test summary header and the remainder of the test's output. cmd/test2json doesn't know how to parse such an intrusion, and it's confusing to humans too. No test because I couldn't reproduce the user-reported error without modifying the testing package. (This behavior seems to be very sensitive to output size and/or goroutine scheduling.) Fixes #40881 Updates #40771 Updates #38458 Change-Id: Ic19bf1d535672b096ba1c8583a3b74aab6d6d766 Reviewed-on: https://go-review.googlesource.com/c/go/+/249026 Run-TryBot: Bryan C. Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> (cherry picked from commit 51c0bdc) Reviewed-on: https://go-review.googlesource.com/c/go/+/252637 TryBot-Result: Go Bot <gobot@golang.org> Trust: Bryan C. Mills <bcmills@google.com>
…cally when streaming output While debugging golang#40771, I realized that the chatty printer should only ever print to a single io.Writer (normally os.Stdout). The other Writer implementations in the chain write to local buffers, but if we wrote a test's output to a local buffer, then we did *not* write it to stdout and we should not store it as the most recently logged test. Because the chatty printer should only ever print to one place, it shouldn't receive an io.Writer as an argument — rather, it shouldn't be used at all for destinations other than the main output stream. On the other hand, when we flush the output buffer to stdout in the top-level flushToParent call, it is important that we not allow some other test's output to intrude between the test summary header and the remainder of the test's output. cmd/test2json doesn't know how to parse such an intrusion, and it's confusing to humans too. No test because I couldn't reproduce the user-reported error without modifying the testing package. (This behavior seems to be very sensitive to output size and/or goroutine scheduling.) Fixes golang#40881 Updates golang#40771 Updates golang#38458 Change-Id: Ic19bf1d535672b096ba1c8583a3b74aab6d6d766 Reviewed-on: https://go-review.googlesource.com/c/go/+/249026 Run-TryBot: Bryan C. Mills <bcmills@google.com> TryBot-Result: Gobot Gobot <gobot@golang.org> Reviewed-by: Jay Conrod <jayconrod@google.com> Reviewed-by: Ian Lance Taylor <iant@golang.org> (cherry picked from commit 51c0bdc) Reviewed-on: https://go-review.googlesource.com/c/go/+/252637 TryBot-Result: Go Bot <gobot@golang.org> Trust: Bryan C. Mills <bcmills@google.com>
Hi. It seems that this change to only flush on completion essentially prevents you from watching tests in the terminal. This is not too much an issue for CI/CD, but it's unsettling when running tests manually. Is there any workaround to still be able to see some progress while tests run, and not just a wall of text after 30min? Am I missing something big here? |
@calexicoz, the change for this issue was not to “to only flush on completion”. Flushing on completion is the default behavior for Go tests. The major change was to add support for streaming output while the test is running, but only if the |
Mmm. Something changed recently though. I am running my tests as:
And it definitely does not stream. It's a long test suite, so it runs for 20min or so then outputs everything. I'm on |
@calexicoz, yes, please open a new issue (ideally with steps to reproduce it). (Is it possible that all of the logging in your test is going to a |
I don't have any logging in the test proper. All I expect is simply to see the |
What version of Go are you using (
go version
)?Does this issue reproduce with the latest release?
We haven't yet picked up 1.15, which cut like yesterday and it's an intermittent flake for us.
What operating system and processor architecture are you using (
go env
)?go env
OutputWhat did you do?
We are running our e2e tests with
t.Parallel()
and-v
thrugotestsum
.What did you expect to see?
gotestsum
usestest2json
to parse the build log output, and I expect to see passing test results for all of the tests run.What did you see instead?
Action: pass
events are missing for typically 2 of the tests.We were originally chasing this here (has links to richer logs): gotestyourself/gotestsum#141
To include some relevant context here, we see several test summaries that are interleaved with other output, which seems to trip up
test2json
. Here are three from a single run (3 diff API versions):Normal (what you'd normally see):
Weird (interleaved, but surfaces events properly):
Broken (interleaved with another test context switch, breaks things):
If I change the ordering of lines in this last section, I get the appropriate set of events.
The text was updated successfully, but these errors were encountered: