comparison of diff algorithms in Rust
- Myers diff algorithm
- can produce 'slider errors' = wrong grouping of lines / code blocks
- Patience diff algorithm
- implemented in the diffs library, see diffs-patience-token/
- (much) slower than myers algorithm
- can fix 'slider errors' = wrong grouping of lines / code blocks
- implemented in
- Longest Common Subsequence (LCS) algorithm
- TODO better name? strictly speaking, all diff algos are LCS algos
- sometimes fails to find the longest common subsequence, see difference-lcs/
- read input from files, like the diffr tool
- add more tests to compare algos, like diff-slider-tools/corpus
- compare lines / tokens / bytes
- Histogram diff algorithm
- can be faster than myers algorithm
- does not work on all inputs (limited number of different tokens?)
- no rust implementation yet?
- java: jgit/diff/HistogramDiff.java
- C: git/xdiff/xhistogram.c
- include other libraries
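
To illustrate the "compare lines / tokens / bytes" item and the LCS entry above, here is a minimal sketch. It is not taken from any of the benchmarked libraries. It computes the LCS length with the textbook dynamic-programming table and can be run on the same input split three ways. The names `lcs_len`, `by_lines`, `by_tokens` and `by_bytes` are hypothetical, and treating a token as a whitespace-separated word is an assumption; the benchmarked crates may tokenize differently.

```rust
/// Length of the longest common subsequence of `a` and `b`.
/// Textbook O(n*m) dynamic programming, keeping only two rows of the table.
fn lcs_len<T: PartialEq>(a: &[T], b: &[T]) -> usize {
    // prev[j] holds the LCS length of the already processed prefix of `a`
    // and b[..j]; cur is the row being filled for the next element of `a`.
    let mut prev = vec![0usize; b.len() + 1];
    let mut cur = vec![0usize; b.len() + 1];
    for x in a {
        for (j, y) in b.iter().enumerate() {
            cur[j + 1] = if x == y {
                prev[j] + 1 // extend the common subsequence
            } else {
                prev[j + 1].max(cur[j]) // skip an element of `a` or of `b`
            };
        }
        std::mem::swap(&mut prev, &mut cur);
    }
    prev[b.len()]
}

/// Split the input into lines (line terminators are dropped).
fn by_lines(s: &str) -> Vec<&str> {
    s.lines().collect()
}

/// Split the input into tokens.
/// Assumption: a token is a whitespace-separated word.
fn by_tokens(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

/// View the input as raw bytes.
fn by_bytes(s: &str) -> &[u8] {
    s.as_bytes()
}
```

For the inputs "a b c\nd e f\n" and "a b c\nd x f\n", `lcs_len` reports 1 of 2 common lines, 5 of 6 common tokens and 11 of 12 common bytes. Finer granularity gives a smaller diff, but the algorithm has to process a longer sequence.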
$ ./run.sh

file size
 total  extra  file
293024      0  target/release/noop-str
293024      0  target/release/noop-string
305312  12288  target/release/diffs-myers-byte
309408  16384  target/release/diffs-myers-token
313504  20480  target/release/difference-lcs
333984  40960  target/release/diffs-patience-token
333984  40960  target/release/dissimilar-myers

extra = total minus the size of the noop baseline (293024)
https://github.com/johnthagen/min-sized-rust
running strip reduces the file size by about 70%:
$ cargo build --release
$ strip target/release/*
removing the panic and formatting code also reduces file size drastically (this is important for embedded targets), but the libraries must be patched to do so
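
Size-related settings like these are commonly collected in the release profile of Cargo.toml. The fragment below is a sketch using standard Cargo profile options, not this repository's actual configuration. Note that `panic = "abort"` only removes the unwinding code; it does not remove the formatting code, which is the part that needs patched libraries.

```toml
# Hypothetical release profile tuned for small binaries.
[profile.release]
opt-level = "z"     # optimize for size instead of speed
lto = true          # link-time optimization across crates
codegen-units = 1   # slower build, better optimization
panic = "abort"     # abort on panic instead of unwinding
strip = true        # strip symbols, like running strip by hand
```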