Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve GitHub Actions intermittent test timeouts #336

Merged
merged 5 commits into from
Jul 8, 2021

Conversation

Integralist
Copy link
Collaborator

@Integralist Integralist commented Jul 7, 2021

Problem

We've seen regular, but intermittent, test timeouts in the GitHub Actions CI.

The source of the problem appears to be due to exec.Command’s Run method (or at least the command we're passing to it, which I believe for the issues I've seen it occur would be a cargo build command).

github.com/fastly/cli/pkg/exec.Streaming.Exec(0x1ea1c8c, 0x5, 0xc0000ca090, 0x9, 0x9, 0xc000240c00, 0x5d, 0x5d, 0x0, 0xb9e7678, ...)
	/Users/runner/work/cli/cli/pkg/exec/exec.go:53 +0x2de
github.com/fastly/cli/pkg/compute.Rust.Build(0x2038ee0, 0xc000552390, 0xc0001ce000, 0xb9e7678, 0xc002c8b200, 0xc0004a2b00, 0x104dac5, 0x1dc0180)
	/Users/runner/work/cli/cli/pkg/compute/rust.go:356 +0x7d8
github.com/fastly/cli/pkg/compute.(*BuildCommand).Exec(0xc000504000, 0x20388a0, 0xc00011e140, 0x2037c40, 0xc000210708, 0x0, 0x0)
	/Users/runner/work/cli/cli/pkg/compute/build.go:174 +0xace
github.com/fastly/cli/pkg/compute.(*PublishCommand).Exec(0xc0001ce9c0, 0x20388a0, 0xc00011e140, 0x2037c40, 0xc000210708, 0x203d718, 0xc0001ce9c0)
	/Users/runner/work/cli/cli/pkg/compute/publish.go:86 +0x145
github.com/fastly/cli/pkg/compute_test.TestPublish(0xc0000a8300)
	/Users/runner/work/cli/cli/pkg/compute/compute_integration_test.go:1313 +0x170a

Solution

NOTE: These aren't solutions in that they don't resolve the timeout issue. These are attempts at trying to help us better identify the root cause of the timeout when they happen.

  • Update the TestPublish test code to log the stdout (this only prints when the test fails).
    • This is so when it does fail randomly again, we'll be able to see what was happening.
  • Implement exec.CommandContext so we can set a timeout around the subprocess.
    • This helps us to short-circuit the test run (rather than waiting the full 10mins).

Notes

I did some Googling and it seems there’s a few open issues related to this:

One suggestion I stumbled across was to set GO_TEST_TIMEOUT_SCALE as some architectures can have problems with the default value. But we've NOT done that here as just increasing the timeout isn't really a solution and considering the entire test run normally only takes 4mins (which is well under the 10min limit) and the fact it only occurs occasionally suggests a problem we should properly identify.

Another suggestion was to split up code into more granular packages to avoid timeouts, but that's not an appropriate solution because the tests aren't timing out because of a code logic error but because the command itself being executed (cargo build in this case) is taking too long (well, as far as I'm concerned something is stalling randomly at the point of running the command).

@Integralist Integralist added the enhancement New feature or request label Jul 7, 2021
@Integralist Integralist requested review from a team and triblondon and removed request for a team July 7, 2021 17:15
Copy link
Contributor

@triblondon triblondon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved on the basis of changing the units of the timeout from minutes to seconds.

pkg/compute/publish.go Outdated Show resolved Hide resolved
@Integralist Integralist force-pushed the integralist/investigate-shell-timeout branch from e60728f to 05910db Compare July 8, 2021 11:35
@Integralist Integralist force-pushed the integralist/investigate-shell-timeout branch from 05910db to f5a6297 Compare July 8, 2021 12:29
@Integralist Integralist merged commit 76377c3 into main Jul 8, 2021
@Integralist Integralist deleted the integralist/investigate-shell-timeout branch July 8, 2021 12:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants