Fix downloading URLs with invalid characters #5921

Merged
merged 2 commits into from
May 13, 2024

Member

@dra27 dra27 commented Apr 11, 2024

OpamDownload.download downloads a given URL using the basename of the URL as the filename. On Windows, where there are many restrictions on valid filenames, this causes a problem if the URL includes any query string. Since OpamDownload passes the name used to the continuation, on Windows the illegal characters are simply replaced with underscores.
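
The replacement described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `is_forbidden_on_windows`, `sanitise_basename` and `download_basename` are hypothetical names, and the character list is the commonly documented set of characters Windows rejects in file names.

```ocaml
(* Characters Windows rejects in file names, plus control characters.
   The list is an assumption for this sketch, not opam's actual table. *)
let is_forbidden_on_windows c =
  match c with
  | '<' | '>' | ':' | '"' | '/' | '\\' | '|' | '?' | '*' -> true
  | c -> Char.code c < 32

(* Hypothetical helper: replace every forbidden character with '_'. *)
let sanitise_basename name =
  String.map (fun c -> if is_forbidden_on_windows c then '_' else c) name

(* Only rewrite on Windows; elsewhere the basename is used unchanged. *)
let download_basename name =
  if Sys.win32 then sanitise_basename name else name

let () =
  let base = "baf65b91c51bb04b09ecc98b94ddd4ba3b446912.patch?full_index=1" in
  print_endline (sanitise_basename base)
```

Because the caller receives whichever name was actually used, it does not need to know whether any rewriting happened.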

Member

@kit-ty-kate kit-ty-kate left a comment

looks fine on its own, however I'm wondering how things interact if two files (e.g. extra-files) are downloaded and rewritten to the same name.

For example file*.txt and file?.txt would both rewrite to file_.txt and would overwrite each other. Is this detected by OpamDownload?

If so it would be nice to have a testcase for this, as well as for the base fix.
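
To make the concern concrete, here is a small sketch of how such a collision could be detected. It assumes a simplified rewrite that only maps `*` and `?` to `_`; `rewrite` and `collisions` are hypothetical names and nothing here is taken from OpamDownload.

```ocaml
(* Hypothetical stand-in for the renaming: '*' and '?' become '_'. *)
let rewrite name =
  String.map (fun c -> if c = '*' || c = '?' then '_' else c) name

(* Return each rewritten name that more than one input maps to,
   together with the original names that collide on it. *)
let collisions names =
  let seen = Hashtbl.create 16 in
  List.iter
    (fun n ->
      let r = rewrite n in
      let prev = Option.value (Hashtbl.find_opt seen r) ~default:[] in
      Hashtbl.replace seen r (n :: prev))
    names;
  Hashtbl.fold
    (fun r srcs acc ->
      if List.length srcs > 1 then (r, List.rev srcs) :: acc else acc)
    seen []

let () =
  collisions [ "file*.txt"; "file?.txt"; "other.txt" ]
  |> List.iter (fun (r, srcs) ->
         Printf.printf "%s <- %s\n" r (String.concat ", " srcs))
```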

Member Author

dra27 commented Apr 23, 2024

It's worth the test case regardless, but this isn't disambiguating the final name - it's the intermediate name which if I traced the code through correctly is always done in an empty temporary directory. For example, in:

patches: ["zstd-detection.patch"]
extra-source "zstd-detection.patch" {
  src: "https://github.com/ocaml/ocaml/commit/baf65b91c51bb04b09ecc98b94ddd4ba3b446912.patch?full_index=1"
  checksum: "sha256=958e061bc3b967e32a5606d5109ed7faacb9b793fe2de0e8f8697c23a178c5cf"
}

the final name (zstd-detection.patch) must be Windows-compatible and there's no automatic renaming (ever) taking place there. The issue is that while downloading, opam downloads https://github.com/ocaml/ocaml/commit/baf65b91c51bb04b09ecc98b94ddd4ba3b446912.patch?full_index=1 to a file called baf65b91c51bb04b09ecc98b94ddd4ba3b446912.patch?full_index=1 before moving it to zstd-detection.patch
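
A rough sketch of where the intermediate name comes from, assuming the basename is simply everything after the last `/` of the URL. `url_basename` is a hypothetical helper for illustration and is not opam's `OpamUrl.basename`.

```ocaml
(* Hypothetical: take everything after the last '/' of a URL string.
   Note that any query string stays attached to the result. *)
let url_basename url =
  match String.rindex_opt url '/' with
  | Some i -> String.sub url (i + 1) (String.length url - i - 1)
  | None -> url

let () =
  let url =
    "https://github.com/ocaml/ocaml/commit/\
     baf65b91c51bb04b09ecc98b94ddd4ba3b446912.patch?full_index=1"
  in
  (* Name used while downloading, inside the temporary directory. *)
  let intermediate = url_basename url in
  (* Name given in the opam file; the download is moved here afterwards. *)
  let final = "zstd-detection.patch" in
  Printf.printf "download to: %s\nthen move to: %s\n" intermediate final
```

Only the intermediate name contains `?`, which is why the rewrite can be confined to the download step.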

@kit-ty-kate
Member

oh i see! Then I'm wondering, does the name have to be similar to the original? Why not simply have one static name that's always going to be the same?

@rjbou rjbou self-requested a review May 2, 2024 13:34
src/core/opamStd.mli (outdated, resolved)
 let dst =
-  OpamFilename.(create dstdir (Base.of_string (OpamUrl.basename url)))
+  OpamFilename.(create dstdir (Base.of_string base))
Collaborator

I'm wondering if we should have that check/change in OpamFilename.Base directly, so that we are sure that every file created is valid. But can we apply that change (forbidden char -> _) to all files?

Member Author

I don't think we can ever automatically change it, but we may want to do things to cause better error messages. For example, if an opam file actually specifies extra-source "zstd-detection?full_index=1.patch" { then I don't think opam should ever do anything to make this work, but it should perhaps fail with a better error message if this is encountered.
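
One possible shape for such a check, as a sketch only: reject the user-specified name with an explicit message rather than rewriting it. `check_user_filename` and the error wording are hypothetical and not part of opam.

```ocaml
(* Printable characters Windows rejects in file names. Control characters
   are left out to keep the sketch short. *)
let forbidden = "<>:\"/\\|?*"

(* Hypothetical check: return the name unchanged if it is acceptable,
   or an error naming the first offending character. *)
let check_user_filename name =
  let bad =
    List.filter
      (fun c -> String.contains forbidden c)
      (List.of_seq (String.to_seq name))
  in
  match bad with
  | [] -> Ok name
  | c :: _ ->
    Error
      (Printf.sprintf "%S is not a valid file name on Windows (contains %C)"
         name c)

let () =
  match check_user_filename "zstd-detection?full_index=1.patch" with
  | Ok n -> Printf.printf "ok: %s\n" n
  | Error msg -> prerr_endline msg
```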

Member Author

(the key point here is that this filename is internal to opam)

 let dst =
-  OpamFilename.(create dstdir (Base.of_string (OpamUrl.basename url)))
+  OpamFilename.(create dstdir (Base.of_string base))
Collaborator

In opam's own code, the _ change cannot cause a collision (as you pointed out), but I'm wondering if library users could run into one.

Member Author

Possibly - on the plus-side, the fallout is limited to Windows, where the code would have been failing before!

Co-authored-by: R. Boujbel <rjbou@ocamlpro.com>
Collaborator

@rjbou rjbou left a comment

thanks!

@kit-ty-kate kit-ty-kate merged commit df4a390 into ocaml:master May 13, 2024
29 checks passed
@dra27 dra27 deleted the fix-download-chars branch May 13, 2024 12:16
@dra27 dra27 mentioned this pull request Jun 13, 2024