Downloader config rewrite rules should be able to use specified hashes #23502
Labels
help wanted
Someone outside the Bazel team could own this
P2
We'll consider working on this in future. (Assignee optional)
team-ExternalDeps
External dependency handling, remote repositories, WORKSPACE file.
type: feature request
Description of the feature request:
Frustrating as it may be, some upstreams don't use version-based file names for their archives and simply update them in place, changing their hashes and breaking existing builds, so we mirror third-party dependencies using content-based paths, such as third-party/<sha256>/bazel-skylib-1.7.1.tar.gz. We don't use a simple read-through cache that can go fetch new things from the internet because we want to actively break builds which change dependencies so we can review and document those changes.
At present a downloader config file can rewrite URLs using pattern matching, but only some of the information which one has passed to module_ctx.download is actually present in the URL. It is impossible to construct a hash-based path using a rewrite rule because the original URL rarely contains this hash. We would like to be able to request a specific file hash (sha256, etc.) and encoding (hex, base64) from Bazel in a rewrite rule so this is possible to do seamlessly.
Clearly, this will result in failures in cases where the hash the user asks for isn't the one we've been given in the repository rule or module. However, that is a self-inflicted problem that the mirror owner and the downloader config writer are on the hook for resolving themselves.
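To make the request concrete, here is a rough Python sketch of how a hash-aware rewrite rule might be expanded. It is not Bazel's implementation: the `{sha256:hex}` and `{sha256:base64}` placeholders, the `rewrite_url` helper, and the `mirror.example.com` host are all made up for illustration. Only the `$1`-style capture references mirror what rewrite rules already offer today.

```python
import base64
import re


def rewrite_url(url, pattern, template, sha256_hex):
    """Expand one hypothetical hash-aware rewrite rule.

    Returns the rewritten URL, or None if the pattern does not match.
    The {sha256:...} placeholders are invented for this sketch.
    """
    match = re.fullmatch(pattern, url)
    if match is None:
        return None  # rule does not apply to this URL

    digest = bytes.fromhex(sha256_hex)
    placeholders = {
        "{sha256:hex}": digest.hex(),
        "{sha256:base64}": base64.b64encode(digest).decode("ascii"),
    }
    # Substitute the hash placeholders first; neither encoding can
    # contain "$", so this cannot create spurious group references.
    result = template
    for name, value in placeholders.items():
        result = result.replace(name, value)

    # Then expand capture-group references such as $1.
    return re.sub(r"\$(\d+)", lambda m: match.group(int(m.group(1))), result)


# Example: send a GitHub release archive to a content-addressed mirror path.
# The digest below is a dummy value, not the real bazel-skylib hash.
dummy_sha256 = "ab" * 32
print(rewrite_url(
    "github.com/bazelbuild/bazel-skylib/releases/download/1.7.1/bazel-skylib-1.7.1.tar.gz",
    r"github\.com/.*/([^/]+)",
    "mirror.example.com/third-party/{sha256:hex}/$1",
    dummy_sha256,
))
```

If the hash handed to the download call is not the one the mirror was populated with, the rewritten path simply does not exist and the fetch fails, which is the break-on-change behavior we want.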
Which category does this issue belong to?
External Dependency
What underlying problem are you trying to solve with this feature?
In our present, pre-bzlmod-based universe, we use a dedicated macro for third-party archives that takes a SHA256 hash and the original file name and constructs the appropriate mirror URL out of them. Continuing this practice with bzlmod would mean having to host a modified copy of the entire central registry because there is no other way to do that without essentially undoing all of the convenient dependency-resolving benefits that system is supposed to provide. Making all of the information the download function has available to it (or at least the hashes) available to rewrite rules, on the other hand, would make using stock BCR fairly painless because we can handle everything with rewrite rules. All we would have to do is handle mirroring when dependencies change, and our internal registry could then be limited to the things that aren't on BCR.
Which operating system are you running Bazel on?
Ubuntu 20.04
What is the output of bazel info release?
release 7.3.1
If bazel info release returns "development version" or "(@non-git)", tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse HEAD?
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
Our Bazel Slack conversation, in case you're curious: https://bazelbuild.slack.com/archives/C014RARENH0/p1725058457719319