Fix binary patching #3449

Merged
merged 2 commits into from
Jul 20, 2018

@dra27 dra27 commented Jul 12, 2018

This is still slightly in progress.

The first commit fixes OpamSystem.translate_patch corrupting binary patches by incorrectly applying CRLF transformations. The revised version detects whether a file has mixed line endings (i.e. some are CRLF and some are LF) and runs the same check on every chunk of the patch. If either the file being patched or the patch chunks themselves have mixed endings, no translation occurs. The edge cases for binary files which happen to be consistently CRLF- or LF-encoded are also sound: either the patch maintains the endings or, if it changes them, the + lines and - lines have different endings, which also disables the translation.
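As a rough illustration of that rule (a Python sketch rather than opam's OCaml code; `Ending`, `classify` and `should_translate` are hypothetical names that do not come from the PR), the decision could look something like this:

```python
from enum import Enum
from typing import Optional


class Ending(Enum):
    LF = "lf"
    CRLF = "crlf"
    MIXED = "mixed"


def classify(data: bytes) -> Optional[Ending]:
    """Classify the line endings in data; None means no newline at all."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf  # newlines not preceded by CR
    if crlf and lf:
        return Ending.MIXED
    if crlf:
        return Ending.CRLF
    if lf:
        return Ending.LF
    return None


def should_translate(target: Optional[Ending], chunk: Optional[Ending]) -> bool:
    """Translate only when both sides are consistent and they disagree."""
    if target in (None, Ending.MIXED) or chunk in (None, Ending.MIXED):
        return False  # mixed endings (e.g. binary data): leave alone
    return target != chunk
```

With this model, a chunk whose - lines are CRLF and whose + lines are LF classifies as mixed, so it is never rewritten.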

However, doing that has made the original function even uglier than it was, so the following commit (which is the work-in-progress part, I just haven't tidied it up) rewrites the function. Before, translate_patch read the entire patch into memory (mostly reasonable) and folded the list of lines once. The problem is that the chunk line-ending detection means that lines now have to be cached in the state machine, which is hideous. The new version accepts that the file is read three times: the first to determine whether GNU patch's CR-stripping will come into effect, the second to work out which lines may or may not require CR characters to be added or removed, and the third to actually write the patch.
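The three-pass shape can be sketched as follows. This is a simplified Python model, not the OCaml implementation in the PR: every name here is hypothetical, the patch is held in memory and walked three times instead of being re-read from disk, and two things are assumptions rather than facts from the PR: that CR-stripping applies exactly when every line of the patch ends in CRLF, and that the output is then written with LF headers.

```python
import re

HUNK = re.compile(rb"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def split_lines(data):
    """Split on LF only, keeping the terminators (lone CRs are data)."""
    parts = data.split(b"\n")
    lines = [p + b"\n" for p in parts[:-1]]
    if parts[-1]:
        lines.append(parts[-1])
    return lines


def set_ending(line, ending):
    """Rewrite the terminator of a line; unterminated lines are kept."""
    if not line.endswith(b"\n"):
        return line
    body = line[:-2] if line.endswith(b"\r\n") else line[:-1]
    return body + (b"\r\n" if ending == "crlf" else b"\n")


def body_lines(lines):
    """Yield (index, filename) for each hunk body line of a unified diff."""
    name, old, new = None, 0, 0
    for i, line in enumerate(lines):
        if old > 0 or new > 0:
            tag = line[:1]
            context = tag in (b" ", b"\n", b"\r")
            if context or tag == b"-":
                old -= 1
            if context or tag == b"+":
                new -= 1
            if name is not None:
                yield i, name
        elif line.startswith(b"+++ "):
            name = line[4:].split(b"\t")[0].strip().decode("utf-8", "replace")
        else:
            m = HUNK.match(line)
            if m:
                old = int(m.group(1) or 1)
                new = int(m.group(2) or 1)


def translate_patch(patch, target_ending):
    """target_ending maps a filename to 'lf', 'crlf' or 'mixed'."""
    lines = split_lines(patch)
    # Pass 1: is the whole patch CRLF (assumed trigger for CR-stripping)?
    if lines and all(l.endswith(b"\r\n") for l in lines):
        lines = [set_ending(l, "lf") for l in lines]
    # Pass 2: per file, collect the endings used by its hunk lines.
    seen = {}
    for i, name in body_lines(lines):
        if lines[i].endswith(b"\n"):
            kind = "crlf" if lines[i].endswith(b"\r\n") else "lf"
            seen.setdefault(name, set()).add(kind)
    wanted = {}
    for name, endings in seen.items():
        target = target_ending(name)
        if target in ("lf", "crlf") and len(endings) == 1 and endings != {target}:
            wanted[name] = target
    # Pass 3: write the patch, adapting only the files selected above.
    out = list(lines)
    for i, name in body_lines(lines):
        if name in wanted:
            out[i] = set_ending(lines[i], wanted[name])
    return b"".join(out)
```

For example, `translate_patch(data, lambda name: "crlf")` adds CRs to the hunk lines of an LF patch, while a patch whose hunk lines mix both endings comes back unchanged. Keeping the passes separate means no lines need to be cached while the decision for a file is still pending.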

It also seems prudent to start testing this, and I've written a rudimentary test harness for that function. There are a couple of commits added which control the debug output a bit further, which allows the log to be used as a reference file.

The logic for the CRLF patch adaptation failed to take into account
mixed CRLF-ending files (i.e. binary files) which should obviously be
left alone.

read_lines updated to return a tri-state indicating if the file is
CRLF-encoded, LF-encoded or mixed. The patching logic is now altered to
accumulate patch lines and the patch is only adapted if both the target
file requires it and also the patch itself is consistent.

Previous implementation held the entire patch in memory and with the
correction for binary patching was doing silly operations over lists of
lines. This version, which is fractionally simpler, accepts scanning the
file three times:
  1. Works out whether the entire patch is CRLF encoded
  2. Works out what to do about each file's chunks
  3. Actually writes the translated patch.

Processing this way removes several minor inconsistencies from the
previous implementation.

AltGr commented Jul 19, 2018

Just tested on an old repo of mine that gave me errors like:

+ /usr/bin/patch "-p1" "-i" "/home/lg/.opam/log/processed-patch-2688-58c514" (CWD=/home/lg/.opam/repo/default-git)
- patching file packages/ocamlspot/ocamlspot.4.02.1.2.3.0/opam
- Hunk #1 FAILED at 1.
- Hunk #2 FAILED at 14 (different line endings).
- 2 out of 2 hunks FAILED -- saving rejects to file packages/ocamlspot/ocamlspot.4.02.1.2.3.0/opam.rej
- patching file packages/ocamlspot/ocamlspot.4.02.1.2.3.0/opam.orig
- Reversed (or previously applied) patch detected!  Assume -R? [n] 
- Apply anyway? [n] 
- Skipping patch.
- 1 out of 1 hunk ignored -- saving rejects to file packages/ocamlspot/ocamlspot.4.02.1.2.3.0/opam.orig.rej
- patching file packages/ocamlspot/ocamlspot.4.02.1.2.3.0/opam.orig.rej

and I can confirm it seems to work fine with this new version!

@AltGr AltGr merged commit f895dd4 into ocaml:master Jul 20, 2018
@dra27 dra27 deleted the fix-binary-patching branch June 20, 2019 09:01
@dra27 dra27 mentioned this pull request Jun 13, 2024