Skip to content

Conversation

@fmeum
Copy link
Collaborator

@fmeum fmeum commented Jan 25, 2026

The call to RemoteActionFileSystem#getInputStream in RemotePathChecker#isRemote could result in a deadlock since it is executed on the same executor that is also used to prefetch the file from the disk cache.

This is fixed by replacing the blocking call with a future. Along the way the method is renamed to avoid confusion due to two different meanings of "remote" (not being available in the disk cache vs. not being materialized at the exec path).

Fixing this bug makes it possible to run all BwoB tests with disk cache.

Related to #28408

@fmeum fmeum force-pushed the 28408-disk-cache-test branch from c65fb50 to 2a83435 Compare January 26, 2026 11:14
@fmeum fmeum changed the title Also run BwoB tests with a combined disk/remote cache Fix a deadlock in RemotePathChecker when using a disk cache Jan 26, 2026
@fmeum fmeum marked this pull request as ready for review January 26, 2026 11:19
@fmeum fmeum requested a review from a team as a code owner January 26, 2026 11:19
@fmeum fmeum requested a review from coeuvre January 26, 2026 11:19
@fmeum
Copy link
Collaborator Author

fmeum commented Jan 26, 2026

@bazel-io fork 9.0.1

@github-actions github-actions bot added team-Remote-Exec Issues and PRs for the Execution (Remote) team awaiting-review PR is awaiting review from an assigned reviewer labels Jan 26, 2026
@fmeum
Copy link
Collaborator Author

fmeum commented Jan 26, 2026

@bazel-io fork 8.6.0

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request effectively addresses a potential deadlock in RemotePathChecker by refactoring blocking calls into a non-blocking, future-based approach. The changes in RemoteActionFileSystem.java and RemoteExecutionCache.java correctly implement this asynchronous pattern, particularly by updating the RemotePathChecker interface and its usage. The added test parameters in BuildWithoutTheBytesIntegrationTest.java ensure comprehensive testing of the fix with and without disk cache. Overall, the changes are well-implemented and directly resolve the identified issue, improving the system's robustness.

@fmeum
Copy link
Collaborator Author

fmeum commented Jan 26, 2026

I will investigate the test failure.

@fmeum fmeum marked this pull request as draft January 26, 2026 12:47
@fmeum fmeum force-pushed the 28408-disk-cache-test branch from 2a83435 to bb27d9c Compare January 26, 2026 14:13
@fmeum fmeum marked this pull request as ready for review January 26, 2026 14:40
@fmeum
Copy link
Collaborator Author

fmeum commented Jan 26, 2026

Fixed!

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request effectively addresses the deadlock issue in RemotePathChecker by transitioning from blocking I/O operations to non-blocking ListenableFutures. The changes in RemoteActionFileSystem and RemoteExecutionCache correctly refactor the downloadIfRemote and isRemote methods, respectively, to align with this asynchronous pattern. The renaming of isRemote to isAvailableLocally in RemotePathChecker also enhances clarity regarding its functionality. Additionally, the updated integration tests for BuildWithoutTheBytesIntegrationTest appropriately incorporate disk cache scenarios, improving test coverage for the fix. The overall approach is sound and resolves the described problem.

@iancha1992
Copy link
Member

@bazel-io fork 9.1.0

@coeuvre coeuvre added awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally and removed awaiting-review PR is awaiting review from an assigned reviewer labels Jan 27, 2026
@github-actions github-actions bot removed the awaiting-PR-merge PR has been approved by a reviewer and is ready to be merge internally label Jan 28, 2026
bazel-io pushed a commit to bazel-io/bazel that referenced this pull request Jan 28, 2026
…uild#28416)

The call to `RemoteActionFileSystem#getInputStream` in `RemotePathChecker#isRemote` could result in a deadlock since it is executed on the same executor that is also used to prefetch the file from the disk cache.

This is fixed by replacing the blocking call with a future. Along the way the method is renamed to avoid confusion due to two different meanings of "remote" (not being available in the disk cache vs. not being materialized at the exec path).

Fixing this bug makes it possible to run all BwoB tests with disk cache.

Related to bazelbuild#28408

Closes bazelbuild#28416.

PiperOrigin-RevId: 862135346
Change-Id: Ic6a0b190be8f4769788078afe481496996f82013
bazel-io pushed a commit to bazel-io/bazel that referenced this pull request Jan 28, 2026
…uild#28416)

The call to `RemoteActionFileSystem#getInputStream` in `RemotePathChecker#isRemote` could result in a deadlock since it is executed on the same executor that is also used to prefetch the file from the disk cache.

This is fixed by replacing the blocking call with a future. Along the way the method is renamed to avoid confusion due to two different meanings of "remote" (not being available in the disk cache vs. not being materialized at the exec path).

Fixing this bug makes it possible to run all BwoB tests with disk cache.

Related to bazelbuild#28408

Closes bazelbuild#28416.

PiperOrigin-RevId: 862135346
Change-Id: Ic6a0b190be8f4769788078afe481496996f82013
iancha1992 pushed a commit that referenced this pull request Feb 2, 2026
…#28416) (#28465)

The call to `RemoteActionFileSystem#getInputStream` in
`RemotePathChecker#isRemote` could result in a deadlock since it is
executed on the same executor that is also used to prefetch the file
from the disk cache.

This is fixed by replacing the blocking call with a future. Along the
way the method is renamed to avoid confusion due to two different
meanings of "remote" (not being available in the disk cache vs. not
being materialized at the exec path).

Fixing this bug makes it possible to run all BwoB tests with disk cache.

Related to #28408

Closes #28416.

PiperOrigin-RevId: 862135346
Change-Id: Ic6a0b190be8f4769788078afe481496996f82013

Commit
55e4aac

Co-authored-by: Fabian Meumertzheim <fabian@meumertzhe.im>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

team-Remote-Exec Issues and PRs for the Execution (Remote) team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants