Similar error trying to upgrade from v6.5.0 to v7.1.0 #21665

Closed
fmeum opened this issue Mar 13, 2024 · 11 comments
Labels: P1, team-Local-Exec, type: bug

Comments

@fmeum
Collaborator

fmeum commented Mar 13, 2024

Similar error trying to upgrade from v6.5.0 to v7.1.0

FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'UnshareableActionLookupData{actionLookupKey=ConfiguredTargetKey{label=//pkg/hubspot:go_default_test, config=BuildConfigurationKey[107978ce4da7f89b23c7ed2d5a8e7aa1fe7225cbd1812a31c7e3b2c5b88f8a03]}, actionIndex=9}' (requested by nodes 'TestCompletionKey{configuredTargetKey=ConfiguredTargetKey{label=//pkg/hubspot:go_default_test, config=BuildConfigurationKey[107978ce4da7f89b23c7ed2d5a8e7aa1fe7225cbd1812a31c7e3b2c5b88f8a03]}, topLevelArtifactContext=com.google.devtools.build.lib.analysis.TopLevelArtifactContext@90904c3b, exclusiveTesting=false}')
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:550)
	at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:414)
	at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: java.lang.NullPointerException
	at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:903)
	at com.google.devtools.build.lib.vfs.Path.getRelative(Path.java:122)
	at com.google.devtools.build.lib.sandbox.SandboxStash.takeStashedSandboxInternal(SandboxStash.java:99)
	at com.google.devtools.build.lib.sandbox.SandboxStash.takeStashedSandbox(SandboxStash.java:73)
	at com.google.devtools.build.lib.sandbox.SymlinkedSandboxedSpawn.filterInputsAndDirsToCreate(SymlinkedSandboxedSpawn.java:79)
	at com.google.devtools.build.lib.sandbox.AbstractContainerizingSandboxedSpawn.createFileSystem(AbstractContainerizingSandboxedSpawn.java:135)
	at com.google.devtools.build.lib.sandbox.AbstractSandboxSpawnRunner.runSpawn(AbstractSandboxSpawnRunner.java:146)
	at com.google.devtools.build.lib.sandbox.AbstractSandboxSpawnRunner.exec(AbstractSandboxSpawnRunner.java:113)
	at com.google.devtools.build.lib.sandbox.SandboxModule$SandboxFallbackSpawnRunner.exec(SandboxModule.java:456)
	at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:159)
	at com.google.devtools.build.lib.exec.AbstractSpawnStrategy.exec(AbstractSpawnStrategy.java:119)
	at com.google.devtools.build.lib.exec.SpawnStrategyResolver.exec(SpawnStrategyResolver.java:45)
	at com.google.devtools.build.lib.exec.StandaloneTestStrategy.runTestAttempt(StandaloneTestStrategy.java:656)
	at com.google.devtools.build.lib.exec.StandaloneTestStrategy.beginTestAttempt(StandaloneTestStrategy.java:315)
	at com.google.devtools.build.lib.exec.StandaloneTestStrategy$StandaloneTestRunnerSpawn.execute(StandaloneTestStrategy.java:581)
	at com.google.devtools.build.lib.analysis.test.TestRunnerAction.executeAllAttempts(TestRunnerAction.java:1163)
	at com.google.devtools.build.lib.analysis.test.TestRunnerAction.execute(TestRunnerAction.java:975)
	at com.google.devtools.build.lib.analysis.test.TestRunnerAction.execute(TestRunnerAction.java:952)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.executeAction(SkyframeActionExecutor.java:1144)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor$ActionRunner.run(SkyframeActionExecutor.java:1061)
	at com.google.devtools.build.lib.skyframe.ActionExecutionState.runStateMachine(ActionExecutionState.java:165)
	at com.google.devtools.build.lib.skyframe.ActionExecutionState.getResultOrDependOnFuture(ActionExecutionState.java:94)
	at com.google.devtools.build.lib.skyframe.SkyframeActionExecutor.executeAction(SkyframeActionExecutor.java:558)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.checkCacheAndExecuteIfNeeded(ActionExecutionFunction.java:859)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.computeInternal(ActionExecutionFunction.java:333)
	at com.google.devtools.build.lib.skyframe.ActionExecutionFunction.compute(ActionExecutionFunction.java:171)
	at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:461)
	... 7 more

Originally posted by @derekperkins in #21632 (comment)

@fmeum
Collaborator Author

fmeum commented Mar 13, 2024

@oquenchil I split this into a separate issue.

@derekperkins Do you have further information that could help us fix this?

@fmeum
Collaborator Author

fmeum commented Mar 13, 2024

@bazel-io fork 7.1.1

@oquenchil
Contributor

Yes, this looks like it's the same as bazelbuild/bazel-skylib#488 (comment). Does this not go away after cleaning up and trying again?

If it doesn't go away, I'd imagine then that it might be related to something in the environment instead of a race condition.

@oquenchil
Contributor

I will add more logging right now, as I promised in the previous issue.

@oquenchil oquenchil added type: bug P1 I'll work on this now. (Assignee required) team-Local-Exec Issues and PRs for the Execution (Local) team awaiting-user-response Awaiting a response from the author labels Mar 13, 2024
@oquenchil oquenchil self-assigned this Mar 13, 2024
@oquenchil oquenchil removed the awaiting-user-response Awaiting a response from the author label Mar 13, 2024
@oquenchil
Contributor

@derekperkins Could you please run `bazel clean` and try again? Does it keep happening?

@oquenchil
Contributor

@fmeum I'm actually thinking of fixing the race you describe in bazelbuild/bazel-skylib#488 (comment) in the same PR where I add the logging.

I think it'd be as easy as moving the `put()` before the `renameTo`.
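
To illustrate why the ordering matters, here is a minimal sketch of the pattern being discussed. It is not Bazel's `SandboxStash` code: the class `StashSketch`, its method names, and the use of `java.nio.file` are all assumptions made for the example. The idea is that a stashed directory becomes visible to other threads as soon as it is renamed into place, so the map entry describing it has to exist before the rename.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a sandbox stash; NOT Bazel's actual SandboxStash. */
class StashSketch {
  // Maps a stashed sandbox directory to the runfiles directory name recorded for it.
  private final Map<Path, String> stashPathToRunfilesDir = new ConcurrentHashMap<>();

  /** Racy order: the directory is visible on disk before its map entry exists. */
  void stashRacy(Path sandbox, Path stashSlot, String runfilesDir) throws IOException {
    Files.move(sandbox, stashSlot);
    // A concurrent take() that finds stashSlot at this point reads null from the map.
    stashPathToRunfilesDir.put(stashSlot, runfilesDir);
  }

  /** Fixed order: record the entry first, then publish the directory. */
  void stashFixed(Path sandbox, Path stashSlot, String runfilesDir) throws IOException {
    stashPathToRunfilesDir.put(stashSlot, runfilesDir);
    try {
      Files.move(sandbox, stashSlot);
    } catch (IOException e) {
      stashPathToRunfilesDir.remove(stashSlot); // do not leave a stale entry behind
      throw e;
    }
  }

  /** Returns the runfiles path inside the stash, or null if nothing is stashed there. */
  Path take(Path stashSlot) {
    if (!Files.isDirectory(stashSlot)) {
      return null;
    }
    String runfilesDir = stashPathToRunfilesDir.remove(stashSlot);
    // Mirrors the null check that fails in the reported stack trace.
    return stashSlot.resolve(Objects.requireNonNull(runfilesDir));
  }
}
```

With `stashRacy`, a thread calling `take` between the move and the `put` sees the directory but no map entry, which matches the shape of the `NullPointerException` in the report. With `stashFixed`, any thread that can see the directory can also see its entry.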

@fmeum
Collaborator Author

fmeum commented Mar 13, 2024

@oquenchil Sounds good, thanks!

copybara-service bot pushed a commit that referenced this issue Mar 14, 2024
Attempts to address NPE reported in: bazelbuild/bazel-skylib#488 (comment) and #21665 (comment)

The `put()` call to the runfiles dir map is placed before the call that stashes the corresponding directory to address the race condition described here: bazelbuild/bazel-skylib#488 (comment).

The exception will now log:
- entries in the runfiles dir map
- environment variables
- stashes on disk

Closes #21668.

PiperOrigin-RevId: 615739651
Change-Id: Ida90e334d1d1f890cf204d272134726bb1f70eb9
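
As a rough illustration of the extra diagnostics listed in the commit message, the sketch below assembles an error message from the three pieces of state it names. The class `StashDiagnostics`, the method names, the parameter types, and the message layout are all hypothetical; the actual Bazel change may collect and format this information differently.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper; names and message layout are illustrative, not Bazel's. */
class StashDiagnostics {
  /** Builds a message with the map entries, the environment, and the stashes on disk. */
  static String describe(
      Map<Path, String> runfilesDirMap, Map<String, String> env, Path stashRoot) {
    StringBuilder sb = new StringBuilder("No runfiles dir recorded for stash.\n");
    // TreeMap gives a stable, sorted rendering that is easier to compare across reports.
    sb.append("runfiles dir map: ").append(new TreeMap<>(runfilesDirMap)).append('\n');
    sb.append("environment: ").append(new TreeMap<>(env)).append('\n');
    sb.append("stashes on disk: ").append(listChildren(stashRoot)).append('\n');
    return sb.toString();
  }

  /** Lists the names directly under dir, or a placeholder if it cannot be read. */
  static List<String> listChildren(Path dir) {
    try (Stream<Path> children = Files.list(dir)) {
      return children
          .map(p -> p.getFileName().toString())
          .sorted()
          .collect(Collectors.toList());
    } catch (IOException e) {
      return List.of("<unreadable: " + e.getMessage() + ">");
    }
  }
}
```

The listing step swallows `IOException` on purpose: a diagnostic path that runs while another failure is being reported should not throw a second, unrelated exception.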
bazel-io pushed a commit to bazel-io/bazel that referenced this issue Mar 14, 2024
github-merge-queue bot pushed a commit that referenced this issue Mar 14, 2024
…message (#21692)

Attempts to address NPE reported in:
bazelbuild/bazel-skylib#488 (comment)
and
#21665 (comment)

The `put()` call to the runfiles dir map is placed before the call that
stashes the corresponding directory to address the race condition
described here:
bazelbuild/bazel-skylib#488 (comment).

The exception will now log:
- entries in the runfiles dir map
- environment variables
- stashes on disk

Closes #21668.

Commit
59dbf7a

PiperOrigin-RevId: 615739651
Change-Id: Ida90e334d1d1f890cf204d272134726bb1f70eb9

Co-authored-by: Pedro <plf@google.com>
@Wyverald
Member

Is this fixed by #21668?

@derekperkins

Sorry, I just saw this issue. I haven't been able to replicate it, so it seems likely that it is related to that race condition. In case it's helpful, it was running in Google Cloud Build on Debian Bookworm, on an 8-core machine.

@Wyverald
Member

OK, closing this then. Thanks!

@iancha1992
Member

A fix for this issue has been included in Bazel 7.1.1 RC1. Please test out the release candidate and report any issues as soon as possible.
If you're using Bazelisk, you can point to the latest RC by setting `USE_BAZEL_VERSION=7.1.1rc1`. Thanks!
