Managing growth of the filepool for remote output service #32

@adam-azarchs

Description

I'm not sure whether this is a feature request, or simply a request for more documentation in case I'm missing something in my configuration or fundamentally misunderstanding how all of this works.

I've been trying to set up bb-clientd as a remote output service in a CI environment. This should be a big win for us, since it lets multiple concurrently running builds share a local cache. However, I've run into an apparent blocker around the filepool. As far as I can tell, there is no eviction policy for the filepool, nor any deduplication between builds, so it grows without bound until it fills up and things start breaking.

As far as I can tell, here's what's going on:

  1. Every bazel build invocation gets its own virtual output directory in ~/bb_clientd/outputs.
  2. For the most part, that directory is populated with metadata entries pointing at digests for content produced by the remote executors. That content is backed by the CAS, so the entries themselves are relatively small. This still eventually becomes a problem without any cleanup of old output directories: even slow unbounded growth is still unbounded.
  3. Files created by local actions are written into the output tree directly and wind up backed by the filepool. In our build this is mostly a lot of ctx.actions.symlink, which is mostly fine (same caveats as the previous point), but also a lot of ctx.actions.write and ctx.actions.expand_template, which can be significantly less fine. There can be a lot of these, for example the generated stubs for py_test targets.
  4. Most of the content in the filepool is identical from one build to the next, but there isn't any deduplication for it as far as I can tell.
  5. I haven't been able to find any configuration options which would cause old output trees to be cleaned up, so they continue to accumulate until the filepool fills up.
  6. Cleaning them up in a CI environment can be tricky. A build node may have multiple concurrent executors running (sharing caches between them is part of the point of bb_clientd here, after all), so it isn't easy to tell which output trees are still relevant. A build job can include a post-build cleanup step to delete the output tree it just created (see the sketch after this list), but then an interrupted build could still leak its tree.
  7. I'm also not entirely sure whether deleting an output tree actually frees the space from the filepool; as far as I can tell it's using bitmapSectorAllocator, the commentary for which suggests it was designed to be an ephemeral storage arena for remote execution workers, which is not really all that close to the usage pattern we're talking about here.
  8. There isn't an easy way to monitor the utilization of the filepool to figure out if we're close to hitting the limit. Gemini suggested the Prometheus metric bbclientd_filepool_used_bytes, which would be great if it weren't a hallucination.
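
To make point 6 concrete, here is roughly what a post-build cleanup step might look like. This is only a sketch: the tree name and the path layout under ~/bb_clientd/outputs are assumptions on my part, and it also assumes that unlinking a tree through the FUSE mount actually releases the backing filepool sectors (which point 7 questions).

```go
// Post-build cleanup step for a CI job (sketch, not bb_clientd code).
// Usage: cleanup_output_tree <tree-name>
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: cleanup_output_tree <tree-name>")
		os.Exit(2)
	}
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Assumed layout: one virtual output tree per invocation under
	// ~/bb_clientd/outputs/<tree-name>.
	tree := filepath.Join(home, "bb_clientd", "outputs", os.Args[1])
	// Recursively unlink the virtual tree; as far as I can tell this is the
	// only way to drop its metadata (and hopefully filepool) usage.
	if err := os.RemoveAll(tree); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```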

I'm trying to wrap my head around how this is supposed to work: how are stale output trees supposed to be managed? I feel like there really needs to be some way to configure automatic cleanup of old output trees. Examples would include keeping the most recent N invocations' worth, or defining the limit in terms of bytes or age. Recency could be based on when the invocation started, when it was finalized, or when the output tree was last accessed (which the daemon knows about, because it is involved in every access to it). I don't want to bike-shed the precise details of which trees get dropped and when, as long as there's some way to bound the growth.
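
To make that concrete without bike-shedding, here is a rough sketch of the kind of policy I have in mind. None of this corresponds to existing bb_clientd code or configuration; the type names and limits are made up for illustration. The idea is simply: order output trees by last access and evict the oldest ones until both a count limit and a byte budget are satisfied (an age cutoff would just be one more condition in the keep test).

```go
// Hypothetical eviction policy sketch; not existing bb_clientd behaviour.
package eviction

import (
	"sort"
	"time"
)

type OutputTree struct {
	Path       string
	LastAccess time.Time
	SizeBytes  int64
}

// TreesToEvict returns the trees to delete so that at most maxTrees remain
// and their combined filepool usage stays under maxBytes, preferring to keep
// the most recently accessed trees.
func TreesToEvict(trees []OutputTree, maxTrees int, maxBytes int64) []OutputTree {
	sorted := append([]OutputTree(nil), trees...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].LastAccess.After(sorted[j].LastAccess)
	})

	var kept int
	var keptBytes int64
	var evict []OutputTree
	for _, t := range sorted {
		if kept < maxTrees && keptBytes+t.SizeBytes <= maxBytes {
			kept++
			keptBytes += t.SizeBytes
			continue
		}
		evict = append(evict, t)
	}
	return evict
}
```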

Probably a separate feature entirely, but it would also be neat if, when an output tree was finalized, file contents in the filepool were moved into a CAS. It should probably be a separate CAS from the one caching remote outputs, with its own eviction policy, since in this case it wouldn't just be a cache. I understand this would be tricky; you'd at least need to compute digests, copy the content from the filepool into the CAS file, and then atomically replace the metadata entries. If you're careful about how you do the copy, the Go standard library will use copy_file_range, so if the underlying filesystem supports CoW reflinks (e.g. btrfs or XFS) no bytes actually need to be copied until that section of the filepool is overwritten.
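
To show what I mean by the digest-then-copy step, here is a sketch written from the outside; it is not how bb_clientd is structured, and the CAS layout (<casRoot>/<sha256-hex>) is made up. To actually get the reflink benefit, the copy would have to happen inside the daemon between the filepool's backing file and the CAS file on the underlying filesystem, not through the FUSE mount. On Linux, io.Copy into an *os.File uses copy_file_range when the kernel and filesystem support it, which is what makes CoW reflinks possible on btrfs/XFS.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// promoteToCAS copies one filepool-backed file into a local CAS laid out as
// <casRoot>/<sha256-hex>, returning the digest. Illustrative only.
func promoteToCAS(srcPath, casRoot string) (string, error) {
	src, err := os.Open(srcPath)
	if err != nil {
		return "", err
	}
	defer src.Close()

	// First pass: compute the digest that names the CAS object.
	h := sha256.New()
	if _, err := io.Copy(h, src); err != nil {
		return "", err
	}
	digest := hex.EncodeToString(h.Sum(nil))

	dstPath := filepath.Join(casRoot, digest)
	if _, err := os.Stat(dstPath); err == nil {
		return digest, nil // Already present: deduplication for free.
	}

	dst, err := os.CreateTemp(casRoot, "tmp-")
	if err != nil {
		return "", err
	}
	defer os.Remove(dst.Name()) // No-op once the rename below succeeds.
	defer dst.Close()

	// Second pass: copy the bytes. io.Copy into an *os.File uses
	// copy_file_range on Linux where supported, so on btrfs/XFS this can
	// be a CoW reflink rather than a real copy.
	if _, err := src.Seek(0, io.SeekStart); err != nil {
		return "", err
	}
	if _, err := io.Copy(dst, src); err != nil {
		return "", err
	}
	if err := dst.Sync(); err != nil {
		return "", err
	}

	// Atomically publish the CAS entry; the output tree's metadata entry
	// could then be swapped to point at this digest.
	if err := os.Rename(dst.Name(), dstPath); err != nil {
		return "", err
	}
	return digest, nil
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: promote <src-file> <cas-root>")
		os.Exit(2)
	}
	digest, err := promoteToCAS(os.Args[1], os.Args[2])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(digest)
}
```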
