-
Notifications
You must be signed in to change notification settings - Fork 691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Base16 hash for script path. #9459
Conversation
42c601e
to
adaf924
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you given some thought to testing? Otherwise, it's okay.
adaf924
to
f9c4641
Compare
I was indeed wondering if I should add some tests, but I guess by reverting this to doing the same as we do with unit-ids this "can't go wrong"? perhaps I'm too new here to know exactly whether this should be tested or it is simple enough to not do so. |
I've been thinking about it and am not sure that it's reasonably testable. I am however also unsure that changing the hash to operate on the file contents instead of its path is a good idea; it seems to me that this could build up quite a lot of cached script builds for which we have no garbage collection mechanism. |
That is true. I mainly followed the indications on this comment #9334 (comment) |
Mm, I should have commented about the path over there, then. I am still worried about excessive cached garbage. |
It's fine, we can continue the discussion here 👍 I think the argument that disk storage will just go up with no garbage collection makes sense, and that is alleviated somehow by using file paths (you usually don't move scripts often). |
Right, I'm expecting it to be more common to edit a script than to move it. And I'm not sure @mpickering's CI and machine arguments make any sense for cabal scripts, as opposed to projects. |
It looks like the validate run contains script hashes and the output needs to be updated. This is usually done with |
If the PR breaks tests, this means the feature is, in a way, tested already. Carefully reviewing the breakage and accepting new test results then counts as testing. Having said that, more tests is more better. E.g., if the spec of base16 did not specify that no |
BTW, isn't #9452 adding a test for that? In that case, adding a test specific to this PR only makes sense if one can devise something extra useful knowing the PR implementation details, as opposed to just the issue it solves. Unlikely. |
The RFC already defines the set of US-ASCII characters used for base16 (hex) https://datatracker.ietf.org/doc/html/rfc4648#section-8 so no % or / can appear. And About #9452 I think it was testing the previous implementation, that replaced / for %s but without fixing it on Windows. |
Oh, great. As @ffaf1 and @ulysses4ever mention, specs should not be trusted, but in this case I'm satisfied; my paranoia doesn't get triggered. :)
I think @mpickering was trying to reproduce the bug (using the buggy implementation, obviously) which your PR fixes, so that's relevant. Am I missing the point? Given a better understanding of the bug and the nature of the fix, can we improve #9452? |
Seems like this should probably have a comment explaining the choice of hash function? |
b0e8435
to
1f734e6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
This is quite urgent, so let's maybe add the merge label? @Kleidukos: any last thoughts? |
We could solve two problems in one go, if we modify the modified Cabal-syntax/src/Distribution/Backpack.hs
@@ -150,7 +150,7 @@ mkDefUnitId :: ComponentId -> Map ModuleName Module -> DefUnitId
mkDefUnitId cid insts =
unsafeMkDefUnitId
( mkUnitId
- (unComponentId cid ++ maybe "" ("+" ++) (hashModuleSubst insts))
+ (unComponentId cid ++ maybe "" ("_" ++) (hashModuleSubst insts))
)
-- impose invariant! |
If @jasagredo is confident about solving the aforementioned problem, let's ship it |
I'd like to first confirm if the |
This seems an unrelated problem to the one solved in this MR, so I think it should be part of a separate MR, ticket etc. |
Sounds reasonable too. We can merge this one and I can open a separate one later. |
Perfect. @jasagredo I'll let you set the merge label :) |
A file path wrapper of some sort that prevents us from generating bogus paths does not sound like a bad idea. Otherwise we will find ourselves having to double check every string we create. |
Issue haskell#9334 shows that `%` characters on Windows result in invalid paths, also `/` characters on Linux create invalid paths. This changes from using base64 to using base16 with the same length we use for unit-ids.
1f734e6
to
97f9917
Compare
Fantastic, thanks again @jasagredo! 🎉 |
@andreabedini: would you open a ticket for that? I remember some old discussions on the channel, but I don't know if a ticket emerged from that. Also, recently @geekosaur looked into this, but I'm not sure how relevant the tickets were. @Kleidukos: I'm assuming we are backporting this? Let me set the label. @jasagredo: does it need any other PRs to backport together with or is it standalone? If the latter, could you comment |
I was looking at the QuickCheck instances |
Confirmed working as expected on cabal-install master in Windows. |
Sorry for the abbreviated message; I was called into the doctor's office early. As I said, I was looking into the QuickCheck instances in Cabal-tests. I still don't know what to do with them without being quite invasive, because most things end up using the |
@mergify backport 3.10 |
✅ Backports have been created
|
Ah sorry @Mikolaj I was busy with some other stuff lately so I forgot about this. Thanks @Kleidukos for taking care of mergify here!! |
Use Base16 hash for script path. (backport #9459)
Issue #9334 shows that
%
characters on Windows result in invalid paths, also/
characters on Linux create invalid paths.This changes from using base64 to using base16 with the same length we use for unit-ids.
This supersedes #9452 or rather makes it not needed.
QA notes
Building scripts will now use base16 hashes. So doing
cabal run <script> --verbose
should output something like the following:The hash is a function of the canonical path, so changing the contents of the file shall not change the hash, but changing the location should.
Template Α: This PR modifies
cabal
behaviourInclude the following checklist in your PR:
Tests have been added.N/A (I think)