Comparing changes
base repository: gitgitgadget/git
base: master
head repository: gitgitgadget/git
compare: en/rust-xdiff
- 17 commits
- 30 files changed
- 3 contributors
Commits on Aug 18, 2025
-
doc: add a policy for using Rust
Git has historically been written primarily in C, with some shell and Perl. However, C is not memory safe, which makes it more likely that security vulnerabilities or other bugs will be introduced, and it is also more verbose and less ergonomic than other, more modern languages.

One of the most common modern compiled languages which is easily interoperable with C is Rust. It is popular (the most admired language on the 2024 Stack Overflow Developer Survey), efficient, portable, and robust.

Introduce a document laying out the incremental introduction of Rust to Git and provide a detailed rationale for doing so, including the points above. Propose a design for this approach that addresses the needs of downstreams and distributors, as well as contributors.

Since we don't want to carry both a C and Rust version of code and want to be able to add new features only in Rust, mention that Rust is a required part of our platform support policy.

It should be noted that a recent discussion at the Berlin Git Merge Contributor Summit found widespread support for the addition of Rust to Git. While of course not all contributors were represented, the proposal appeared to have the support of a majority of active contributors.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit be19483
-
Upcoming patches will accelerate and simplify xdiff, while also porting parts of it to Rust. In preparation, add some stubs and set up the Rust build. For now, it is easier to let cargo build Rust and have make or meson merely link against the static library that cargo builds.

In line with ongoing libification efforts, use multiple crates to allow more modularity on the Rust side. xdiff is the crate that this series will focus on, but we also introduce the interop crate for future patch series.

In order to facilitate interoperability between C and Rust, introduce C definitions for Rust primitive types in git-compat-util.h.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
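As a rough illustration of how C code can call into a statically linked Rust crate, here is a minimal sketch. The function name `example_count_newlines` and its signature are hypothetical and are not taken from this series; the C prototype in the comment only assumes that `u8`, `usize`, and `u64` are available on the C side under those names.

```rust
use std::slice;

/// Hypothetical exported function (not part of the series).
/// A C caller could declare it roughly as:
///     u64 example_count_newlines(u8 const *ptr, usize len);
///
/// Note: Rust edition 2024 spells the attribute `#[unsafe(no_mangle)]`.
///
/// # Safety
/// `ptr` must be valid for reads of `len` bytes, or be null with `len == 0`.
#[no_mangle]
pub unsafe extern "C" fn example_count_newlines(ptr: *const u8, len: usize) -> u64 {
    if ptr.is_null() || len == 0 {
        return 0;
    }
    // SAFETY: the caller guarantees that `ptr..ptr + len` is readable.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    bytes.iter().filter(|&&b| b == b'\n').count() as u64
}
```

Keeping the exported symbol unmangled and using only pointer and integer types at the boundary is what lets make or meson treat the cargo output like any other static library.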
Commit 56843cc
-
xdiff/xprepare: remove superfluous forward declarations
Move xdl_prepare_env() later in the file to avoid the need for forward declarations.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 89fbf31
-
xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized, but never used for anything by the code. Remove them.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 7064862
-
xdiff: make fields of xrecord_t Rust friendly
A few commits ago, we added definitions for Rust primitive types, to facilitate interoperability between C and Rust. Switch a few variables to use these types, which, for now, will require adding some casts.

Also change xdlclass_t::ha to be u64 to match xrecord_t::ha, as pointed out by Johannes.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
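To show why fixed-width field types help at the language boundary, here is a sketch of a Rust struct that could mirror a C record. The struct name `Record` and the fields `ptr` and `size` are assumptions made for illustration; only the `u64` type of `ha` is stated in the commit message above.

```rust
use std::mem::size_of;

/// Hypothetical Rust-side mirror of a C record (layout is an assumption).
/// `#[repr(C)]` makes Rust lay the fields out the way a C compiler would.
#[repr(C)]
pub struct Record {
    pub ptr: *const u8, // start of the line (assumed field)
    pub size: usize,    // length of the line in bytes (assumed field)
    pub ha: u64,        // line hash; u64 per the commit message
}

/// Returns true when the struct occupies exactly the sum of its field sizes,
/// i.e. the compiler inserted no padding between or after the fields.
pub fn record_has_no_padding() -> bool {
    size_of::<Record>() == size_of::<*const u8>() + size_of::<usize>() + size_of::<u64>()
}
```

With types such as `unsigned long`, the width differs between platforms, so a Rust mirror like this could not be written once for all targets.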
Commit 398a826
-
xdiff: separate parsing lines from hashing them
We want to use xxhash for faster hashing. To facilitate that and to simplify the code, separate the concerns of parsing and hashing into discrete steps. This makes swapping the hash function much easier. Since xdl_hash_record() both parses and hashes lines, this requires some slight code restructuring.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
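The separation described above can be sketched as two independent passes. The function names `parse_lines` and `hash_lines` are hypothetical and do not correspond to functions in xdiff; the point is only that the hash function becomes a parameter once parsing no longer computes it.

```rust
/// Step 1 (parsing): split a buffer into lines. Each line keeps its
/// trailing b'\n' if it has one; an empty buffer yields no lines.
pub fn parse_lines(buf: &[u8]) -> Vec<&[u8]> {
    buf.split_inclusive(|&b| b == b'\n').collect()
}

/// Step 2 (hashing): hash lines that were already parsed, using whatever
/// hash function the caller supplies.
pub fn hash_lines<F>(lines: &[&[u8]], hash: F) -> Vec<u64>
where
    F: Fn(&[u8]) -> u64,
{
    lines.iter().map(|&line| hash(line)).collect()
}
```

A caller could run `parse_lines` once and then pass any hash function to `hash_lines`, which is the property that makes swapping the hash cheap.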
Commit 6b00ff8
-
xdiff: conditionally use Rust's implementation of xxhash
When no whitespace flags are present, use xxhash for faster hashing; otherwise use DJB2a (which is what xdiff has been using all along).

The benchmark below compares my series with version v2.49.0 (built in build_release/ and build_v2.49.0/ respectively), running log commands on the Linux kernel repository on 3 different machines.

    $ BASE=/path/to/git/root

    // laptop
    // CPU: 6-core Intel Core i7-8750H (-MT MCP-) speed/min/max: 726/800/4100 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/ezekiel/development/work/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     10.419 s ±  0.166 s    [User: 10.097 s, System: 0.284 s]
      Range (min … max):   10.215 s … 10.680 s    10 runs
    Benchmark 2: /home/ezekiel/development/work/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     10.980 s ±  0.137 s    [User: 10.633 s, System: 0.308 s]
      Range (min … max):   10.791 s … 11.178 s    10 runs
    Summary
      /home/ezekiel/development/work/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.05 ± 0.02 times faster than /home/ezekiel/development/work/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null

    // desktop
    // CPU: 8-core Intel Core i7-9700 (-MCP-) speed/min/max: 800/800/4700 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):      6.823 s ±  0.020 s    [User: 6.624 s, System: 0.180 s]
      Range (min … max):    6.801 s …  6.858 s    10 runs
    Benchmark 2: /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):      8.151 s ±  0.024 s    [User: 7.928 s, System: 0.198 s]
      Range (min … max):    8.105 s …  8.184 s    10 runs
    Summary
      /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.19 ± 0.01 times faster than /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null

    // router
    // CPU: dual core Intel Celeron 3965U (-MCP-) speed/min/max: 1300/400/2200 MHz
    $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
    Benchmark 1: /home/metal/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     21.209 s ±  0.054 s    [User: 20.341 s, System: 0.605 s]
      Range (min … max):   21.135 s … 21.309 s    10 runs
    Benchmark 2: /home/metal/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
      Time (mean ± σ):     23.683 s ±  0.060 s    [User: 22.735 s, System: 0.672 s]
      Range (min … max):   23.566 s … 23.751 s    10 runs
    Summary
      /home/metal/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
        1.12 ± 0.00 times faster than /home/metal/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 5e2ceb6
-
github workflows: install rust
Since we have introduced Rust, it needs to be installed for the continuous integration build targets. Create an install script (build_rust.sh) that needs to be run as the same user that builds git. Because of the limitations of meson, create build_rust.sh which makes it easy to centralize how Rust is built between meson and make.

There are 2 interesting decisions worth calling out in this commit:

* The 'output' field of custom_target() does not allow specifying a file nested inside the build directory. Thus create build_rust.sh to build Rust with all of its parameters and then move libxdiff.a to the root of the build directory.

* Install curl, to facilitate the rustup install script.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 34129ed
-
Do support Windows again after requiring Rust
By default, Rust wants to build MS Visual C-compatible libraries on Windows, because that is _the_ native C compiler.

Git is historically lacking in its MSVC support, and the official Git for Windows versions are built using GCC instead. As a consequence, a (subset of a) GCC toolchain is installed as part of the `windows-build` job of every CI build.

Naturally, this requires adjustments in how Rust is called, most importantly it requires installing support for a GCC-compatible build target.

Let's make the necessary adjustment both in the CI-specific code that installs Rust as well as in the Windows-specific configuration in `config.mak.uname`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[en: Moved lib userenv handling to a later patch]
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 6dfca2e
-
win+Meson: allow for xdiff to be compiled with MSVC
The `build_rust.sh` script is quite opinionated about the naming scheme of the C compiler: it assumes that the xdiff library file will be named `libxdiff.a`.

However, MS Visual C generates `xdiff.lib` files instead; this naming scheme has been in use for a very, very long time.

Let's allow for that.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 75e6360
-
win+Meson: do allow linking with the Rust-built xdiff
When linking against the Rust-built `xdiff`, there is now a new required dependency: Without _also_ linking to the system library `userenv`, the compile would fail with this error message:

    xdiff.lib(std-c85e9beb7923f636.std.df32d1bc89881d89-cgu.0.rcgu.o) : error LNK2019: unresolved external symbol __imp_GetUserProfileDirectoryW referenced in function _ZN3std3env8home_dir17hfd1c3b6676cd78f6E

Therefore, just like we do in case of Makefile-based builds on Windows, we now also link to that library when building with Meson.

Note that if we only have Rust depend upon libuserenv then at link time GCC would complain about:

    undefined reference to `GetUserProfileDirectoryW'

Apparently there is _some_ closure that gets compiled in that requires this function, and that in turn forces Git to link to libuserenv.

This is a new requirement, and therefore has not been made part of the "minimal Git for Windows SDK". In the near future, I intend to include it, but for now let's just ensure that the file is added manually if it is missing.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
[en: Squashed a few of Johannes's patches, and moved lib userenv handling from an earlier patch]
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit f1cf460
-
github workflows: define rust versions and targets in the same place
Consolidate the Rust toolchain definitions in main.yaml. Prefer using actions-rs/toolchain@v1 where possible, but for docker targets use a script to install the Rust toolchain.

Four overrides are used in main.yaml:

* On Windows: Rust didn't resolve the bcrypt library on Windows correctly until version 1.78.0. Also since rustup mis-identifies the Rust toolchain, the Rust target triple must be set to x86_64-pc-windows-gnu.

* On musl: libc differences, such as ftruncate64 vs ftruncate, were not accounted for until Rust version 1.72.0. No older version of Rust will work on musl for our needs.

* In a 32-bit docker container running on a 64-bit host, we need to override the Rust target triple. This is because rustup asks the kernel for the bitness of the system and it says 64, even though the container will only run 32-bit. This also allows us to remove the BITNESS environment variable in ci/lib.sh.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 8642dec
-
github workflows: upload Cargo.lock
Make each CI workflow upload its Cargo.lock file as a build artifact so that we can audit build dependencies.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 992477f
-
xdiff: implement a white space iterator in Rust
Xdiff has traditionally implemented the logic for iterating over whitespace in every location that needed to do so. Create a consolidated iterator in Rust that we can call from each location.

Write Rust unit tests to ensure the correctness of the Rust whitespace iterator and the chunked_iter_equal() function.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
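One way to picture a chunk-yielding whitespace iterator and a comparison over two such iterators is the sketch below. It is not the implementation from this commit: `non_ws_chunks` is a hypothetical helper that models only an "ignore all whitespace" mode, and this `chunked_iter_equal` merely borrows the name from the commit message; its signature is an assumption.

```rust
/// Hypothetical helper: yields the maximal runs of non-whitespace bytes.
/// Whitespace here means ASCII whitespace as defined by the Rust standard
/// library, which may differ from C's isspace().
pub fn non_ws_chunks<'a>(line: &'a [u8]) -> impl Iterator<Item = &'a [u8]> + 'a {
    line.split(|b| b.is_ascii_whitespace())
        .filter(|chunk| !chunk.is_empty())
}

/// Returns true when both iterators yield the same bytes overall, even if
/// the bytes are split into chunks at different positions.
pub fn chunked_iter_equal<'a, 'b, A, B>(mut a: A, mut b: B) -> bool
where
    A: Iterator<Item = &'a [u8]>,
    B: Iterator<Item = &'b [u8]>,
{
    let mut x: &[u8] = &[];
    let mut y: &[u8] = &[];
    loop {
        // Refill each side when its current chunk is used up.
        while x.is_empty() {
            match a.next() {
                Some(chunk) => x = chunk,
                None => break,
            }
        }
        while y.is_empty() {
            match b.next() {
                Some(chunk) => y = chunk,
                None => break,
            }
        }
        if x.is_empty() || y.is_empty() {
            // At least one side is exhausted: equal only if both are.
            return x.is_empty() && y.is_empty();
        }
        // Compare the overlapping prefix, then advance both sides past it.
        let n = x.len().min(y.len());
        if x[..n] != y[..n] {
            return false;
        }
        x = &x[n..];
        y = &y[n..];
    }
}
```

Yielding chunks instead of single bytes lets a chunk-oriented hash consume the same iterator, at the cost of the boundary bookkeeping visible in the comparison loop.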
Commit 4e388da
-
xdiff: create line_hash() and line_equal()
These functions use the whitespace iterator, when applicable, to hash and compare lines.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
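A hash function and an equality function for lines have to agree: two lines that compare equal under a given set of flags must also hash to the same value. The sketch below illustrates that property with a single hypothetical flag, `IGNORE_ALL_SPACE`, and a byte filter instead of the real whitespace iterator. The signatures of `line_hash` and `line_equal` here are assumptions, not the ones from the series.

```rust
/// Hypothetical flag bit (value is made up for this sketch).
pub const IGNORE_ALL_SPACE: u32 = 1;

/// The bytes of `line` that matter under `flags`. Both functions below go
/// through this one helper, which is what keeps them consistent.
fn significant<'a>(line: &'a [u8], flags: u32) -> impl Iterator<Item = u8> + 'a {
    let skip_ws = (flags & IGNORE_ALL_SPACE) != 0;
    line.iter()
        .copied()
        .filter(move |b| !(skip_ws && b.is_ascii_whitespace()))
}

/// DJB2a-style hash over the significant bytes only.
pub fn line_hash(line: &[u8], flags: u32) -> u64 {
    significant(line, flags).fold(5381u64, |h, b| h.wrapping_mul(33) ^ u64::from(b))
}

/// Two lines are equal when their significant bytes match one for one.
pub fn line_equal(a: &[u8], b: &[u8], flags: u32) -> bool {
    significant(a, flags).eq(significant(b, flags))
}
```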
Commit 4d631e3
-
xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag
Currently the whitespace iterator is slower than git's C implementation, so we skip using the whitespace iterator if there are no whitespace flags. Special-case --ignore-cr-at-eol similarly to make it performant.

The rest of the whitespace flags will be slower for now, but as more of Xdiff is translated into Rust it'll be easier to revisit and optimize whitespace processing.

Optimizing the other whitespace flags now would be difficult because:

* Xxhash uses chunk based processing.

* The same iterator is used for hashing and equality, which means the iterator could be optimized for returning large chunks for fast hashing or could return each byte making equality testing faster. I opted for faster hashing. The data structures in C need to be cleaned up before they're interoperable with Rust. Once that's done I believe a faster method of whitespace processing will be possible.

* Trying to make heavily optimized code between 2 languages that aren't easily interoperable in their current state makes the code either fast or easy to maintain. But once enough of Xdiff is written in Rust I believe that a fast and maintainable method can be implemented.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
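A special case for --ignore-cr-at-eol can avoid per-byte iteration because only the end of the line needs inspecting; the remaining content stays one contiguous slice that can be hashed or compared in a single call. The sketch below assumes that a carriage return is ignored only when it directly precedes the newline. The helper names are hypothetical.

```rust
/// Splits a line into (content, had_newline). One b'\r' directly before the
/// newline is dropped; a b'\r' at the end of a line with no newline is kept.
pub fn split_eol(line: &[u8]) -> (&[u8], bool) {
    match line.strip_suffix(b"\n") {
        Some(body) => (body.strip_suffix(b"\r").unwrap_or(body), true),
        None => (line, false),
    }
}

/// Compares two lines while treating "\r\n" and "\n" endings as the same.
pub fn equal_ignoring_cr_at_eol(a: &[u8], b: &[u8]) -> bool {
    split_eol(a) == split_eol(b)
}
```

The content slice returned by `split_eol` could be handed to a chunk-based hash as is, which is why this flag is easier to make fast than flags that remove bytes from the middle of a line.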
Commit 2f2f308
-
xdiff: use rust's version of whitespace processing
Delete xdl_hash_record() and xdl_recmatch() in favor of xdl_line_hash() and xdl_line_equal().

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit d869f26
You can try running this command locally to see the comparison on your machine:
git diff master...en/rust-xdiff