Comparing changes

base repository: gitgitgadget/git
base: master
head repository: gitgitgadget/git
compare: en/rust-xdiff
  • 17 commits
  • 30 files changed
  • 3 contributors

Commits on Aug 18, 2025

  1. doc: add a policy for using Rust

    Git has historically been written primarily in C, with some shell and
    Perl.  However, C is not memory safe, which makes it more likely that
    security vulnerabilities or other bugs will be introduced, and it is
    also more verbose and less ergonomic than other, more modern languages.
    
    One of the most common modern compiled languages which is easily
    interoperable with C is Rust.  It is popular (the most admired language
    on the 2024 Stack Overflow Developer Survey), efficient, portable, and
    robust.
    
    Introduce a document laying out the incremental introduction of Rust to
    Git and provide a detailed rationale for doing so, including the points
    above.  Propose a design for this approach that addresses the needs of
    downstreams and distributors, as well as contributors.
    
    Since we don't want to carry both a C and Rust version of code and want
    to be able to add new features only in Rust, mention that Rust is a
    required part of our platform support policy.
    
    It should be noted that a recent discussion at the Berlin Git Merge
    Contributor Summit found widespread support for the addition of Rust to
    Git.  While of course not all contributors were represented, the
    proposal appeared to have the support of a majority of active
    contributors.
    
    Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    bk2204 authored and gitster committed Aug 18, 2025
    be19483
  2. xdiff: introduce rust

    Upcoming patches will accelerate and simplify xdiff, while also
    porting parts of it to Rust. In preparation, add some stubs and set up
    the Rust build. For now, it is easier to let cargo build Rust and
    have make or meson merely link against the static library that cargo
    builds. In line with ongoing libification efforts, use multiple
    crates to allow more modularity on the Rust side. xdiff is the crate
    that this series will focus on, but we also introduce the interop
    crate for future patch series.
    
    In order to facilitate interoperability between C and Rust, introduce
    C definitions for Rust primitive types in git-compat-util.h.
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    56843cc
  3. xdiff/xprepare: remove superfluous forward declarations

    Move xdl_prepare_env() later in the file to avoid the need
    for forward declarations.
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    89fbf31
  4. xdiff: delete unnecessary fields from xrecord_t and xdfile_t

    xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
    but never used for anything by the code. Remove them.
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    7064862
  5. xdiff: make fields of xrecord_t Rust friendly

    A few commits ago, we added definitions for Rust primitive types,
    to facilitate interoperability between C and Rust. Switch a
    few variables to use these types, which, for now, requires
    adding some casts.
    
    Also change xdlclass_t::ha to be u64 to match xrecord_t::ha, as
    pointed out by Johannes.
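
To illustrate why fixed-width types with Rust names help at the language boundary, here is a minimal sketch of a Rust struct that mirrors a C record. The name `RecordView`, its field layout, and the helper `record_bytes` are assumptions made for illustration; they are not taken from the actual xrecord_t definition.

```rust
use std::slice;

/// Hypothetical mirror of a C struct declared with Rust-style typedefs,
/// e.g. `u8 const *ptr; usize size; u64 ha;`. The layout is an assumption.
#[repr(C)]
pub struct RecordView {
    pub ptr: *const u8,
    pub size: usize,
    pub ha: u64,
}

/// Borrow the bytes that a record points at.
///
/// # Safety
/// `rec.ptr` must point to `rec.size` readable bytes that outlive the borrow.
pub unsafe fn record_bytes(rec: &RecordView) -> &[u8] {
    // SAFETY: the caller guarantees that ptr/size describe valid memory.
    unsafe { slice::from_raw_parts(rec.ptr, rec.size) }
}
```

Because both sides agree on the width of every field, the Rust code needs no casts of its own; the casts stay on the C side wherever older `long` or `unsigned long` values flow into these fields.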
    
    Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    398a826
  6. xdiff: separate parsing lines from hashing them

    We want to use xxhash for faster hashing. To facilitate that
    and to simplify the code, separate the concerns of parsing
    and hashing into discrete steps. This makes swapping the hash
    function much easier. Since xdl_hash_record() both parses and
    hashes lines, this requires some slight code restructuring.
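
The separation can be sketched in Rust as two independent passes. This is an illustration under assumptions, not the code from the patch: the function names `parse_lines` and `hash_lines` are hypothetical, and `djb2a` is a plain rendering of the multiply-by-33-then-XOR hash.

```rust
/// Step 1, parsing: split a buffer into lines, keeping each trailing '\n'.
pub fn parse_lines(buf: &[u8]) -> Vec<&[u8]> {
    buf.split_inclusive(|&b| b == b'\n').collect()
}

/// DJB2a: start at 5381, then multiply by 33 and XOR in each byte.
pub fn djb2a(line: &[u8]) -> u64 {
    line.iter()
        .fold(5381u64, |h, &b| h.wrapping_mul(33) ^ u64::from(b))
}

/// Step 2, hashing: apply any hash function to lines that are already parsed.
pub fn hash_lines(lines: &[&[u8]], hash: fn(&[u8]) -> u64) -> Vec<u64> {
    lines.iter().map(|l| hash(l)).collect()
}
```

With the two steps decoupled, swapping the hash is a one-argument change: `hash_lines(&parse_lines(buf), djb2a)` today, a different function pointer later.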
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    6b00ff8
  7. xdiff: conditionally use Rust's implementation of xxhash

    When no whitespace flags are present, use xxhash for faster
    hashing; otherwise use DJB2a (which is what xdiff has been
    using all along).
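
A minimal sketch of that selection follows. Everything named here is an assumption: `WHITESPACE_FLAGS` is a hypothetical bitmask, and `fast_hash` uses the standard library's `DefaultHasher` purely as a stand-in, because xxhash is not part of std. The whitespace normalization that the DJB2a path performs in xdiff is omitted.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hypothetical bitmask standing in for xdiff's whitespace flags.
pub const WHITESPACE_FLAGS: u64 = 0b1111;

/// DJB2a: start at 5381, then multiply by 33 and XOR in each byte.
fn djb2a(line: &[u8]) -> u64 {
    line.iter()
        .fold(5381u64, |h, &b| h.wrapping_mul(33) ^ u64::from(b))
}

/// Stand-in for the fast path. This is std's DefaultHasher, NOT xxhash.
fn fast_hash(line: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(line);
    h.finish()
}

/// Pick the hash function once, based on the flags, before hashing any line.
pub fn pick_hasher(flags: u64) -> fn(&[u8]) -> u64 {
    if flags & WHITESPACE_FLAGS == 0 {
        fast_hash
    } else {
        djb2a
    }
}
```

Choosing the function up front keeps the per-line loop free of flag checks.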
    
    The benchmark below compares my series with v2.49.0
    (built in build_release/ and build_v2.49.0/ respectively),
    running log commands on the Linux kernel repository on three
    different machines.
    
    $ BASE=/path/to/git/root
    
        // laptop
        // CPU: 6-core Intel Core i7-8750H (-MT MCP-) speed/min/max: 726/800/4100 MHz
        $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
        Benchmark 1: /home/ezekiel/development/work/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
          Time (mean ± σ):     10.419 s ±  0.166 s    [User: 10.097 s, System: 0.284 s]
          Range (min … max):   10.215 s … 10.680 s    10 runs
    
        Benchmark 2: /home/ezekiel/development/work/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
          Time (mean ± σ):     10.980 s ±  0.137 s    [User: 10.633 s, System: 0.308 s]
          Range (min … max):   10.791 s … 11.178 s    10 runs
    
        Summary
          /home/ezekiel/development/work/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
            1.05 ± 0.02 times faster than /home/ezekiel/development/work/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
    
        // desktop
        // CPU: 8-core Intel Core i7-9700 (-MCP-) speed/min/max: 800/800/4700 MHz
        $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
        Benchmark 1: /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
          Time (mean ± σ):      6.823 s ±  0.020 s    [User: 6.624 s, System: 0.180 s]
          Range (min … max):    6.801 s …  6.858 s    10 runs
    
        Benchmark 2: /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
          Time (mean ± σ):      8.151 s ±  0.024 s    [User: 7.928 s, System: 0.198 s]
          Range (min … max):    8.105 s …  8.184 s    10 runs
    
        Summary
          /home/steamuser/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
            1.19 ± 0.01 times faster than /home/steamuser/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
    
        // router
        // CPU: dual core Intel Celeron 3965U (-MCP-) speed/min/max: 1300/400/2200 MHz
        $ hyperfine --warmup 3 -L exe $BASE/build_release/git,$BASE/build_v2.49.0/git '{exe} log --oneline --shortstat v6.8..v6.9 >/dev/null'
        Benchmark 1: /home/metal/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null
          Time (mean ± σ):     21.209 s ±  0.054 s    [User: 20.341 s, System: 0.605 s]
          Range (min … max):   21.135 s … 21.309 s    10 runs
    
        Benchmark 2: /home/metal/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
          Time (mean ± σ):     23.683 s ±  0.060 s    [User: 22.735 s, System: 0.672 s]
          Range (min … max):   23.566 s … 23.751 s    10 runs
    
        Summary
          /home/metal/dev/git/build_release/git log --oneline --shortstat v6.8..v6.9 >/dev/null ran
            1.12 ± 0.00 times faster than /home/metal/dev/git/build_v2.49.0/git log --oneline --shortstat v6.8..v6.9 >/dev/null
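
The "times faster" figures in each summary are the ratio of the two mean times. The short sketch below recomputes them from the means reported above; the function names `speedup` and `report` are hypothetical.

```rust
/// How many times faster the new build is: old mean divided by new mean.
pub fn speedup(old_mean_s: f64, new_mean_s: f64) -> f64 {
    old_mean_s / new_mean_s
}

/// Mean times (in seconds) copied from the hyperfine output above.
pub fn report() -> Vec<(&'static str, f64)> {
    vec![
        ("laptop", speedup(10.980, 10.419)),
        ("desktop", speedup(8.151, 6.823)),
        ("router", speedup(23.683, 21.209)),
    ]
}
```

Rounded to two decimals, the ratios are 1.05, 1.19, and 1.12, matching the hyperfine summaries.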
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    5e2ceb6
  8. github workflows: install rust

    Since we have introduced rust, it needs to be installed for the
    continuous integration build targets. Create an install script
    (build_rust.sh) that needs to be run as the same user that builds git.
    Because of the limitations of meson, create build_rust.sh which makes
    it easy to centralize how rust is built between meson and make.
    
    There are 2 interesting decisions worth calling out in this commit:
    
    * The 'output' field of custom_target() does not allow specifying a
      file nested inside the build directory. Thus create build_rust.sh to
      build Rust with all of its parameters and then move libxdiff.a to
      the root of the build directory.
    
    * Install curl, to facilitate the rustup install script.
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    34129ed
  9. Do support Windows again after requiring Rust

    By default, Rust wants to build MS Visual C-compatible libraries on
    Windows, because that is _the_ native C compiler.
    
    Git is historically lacking in its MSVC support, and the official Git
    for Windows versions are built using GCC instead. As a consequence, a
    (subset of a) GCC toolchain is installed as part of the `windows-build`
    job of every CI build.
    
    Naturally, this requires adjustments in how Rust is called, most
    importantly it requires installing support for a GCC-compatible build
    target.
    
    Let's make the necessary adjustment both in the CI-specific code that
    installs Rust as well as in the Windows-specific configuration in
    `config.mak.uname`.
    
    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    [en: Moved lib userenv handling to a later patch]
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    dscho authored and gitster committed Aug 18, 2025
    6dfca2e
  10. win+Meson: allow for xdiff to be compiled with MSVC

    The `build_rust.sh` script is quite opinionated about the naming scheme
    of the C compiler: It assumes that the xdiff library file will be named
    `libxdiff.a`.
    
    However, MS Visual C generates `xdiff.lib` files instead; this naming
    scheme has been in use for a very, very long time.
    
    Let's allow for that.
    
    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    dscho authored and gitster committed Aug 18, 2025
    75e6360
  11. win+Meson: do allow linking with the Rust-built xdiff

    When linking against the Rust-built `xdiff`, there is now a new required
    dependency: Without _also_ linking to the system library `userenv`, the
    compile would fail with this error message:
    
      xdiff.lib(std-c85e9beb7923f636.std.df32d1bc89881d89-cgu.0.rcgu.o) :
      error LNK2019: unresolved external symbol __imp_GetUserProfileDirectoryW
      referenced in function _ZN3std3env8home_dir17hfd1c3b6676cd78f6E
    
    Therefore, just like we do in case of Makefile-based builds on Windows,
    we now also link to that library when building with Meson.
    
    Note that if we only have Rust depend upon libuserenv then at link time
    GCC would complain about:
    
      undefined reference to `GetUserProfileDirectoryW'
    
    Apparently there is _some_ closure that gets compiled in that requires
    this function, and that in turn forces Git to link to libuserenv.
    
    This is a new requirement, and therefore has not been made part of the
    "minimal Git for Windows SDK".
    
    In the near future, I intend to include it, but for now let's just
    ensure that the file is added manually if it is missing.
    
    Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
    [en: Squashed a few of Johannes's patches, and moved lib userenv
     handling from an earlier patch]
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    dscho authored and gitster committed Aug 18, 2025
    f1cf460
  12. github workflows: define rust versions and targets in the same place

    Consolidate the Rust toolchain definitions in main.yaml. Prefer using
    actions-rs/toolchain@v1 where possible, but for docker targets use
    a script to install the Rust toolchain. Four overrides are used in
    main.yaml:
    
      * On Windows: Rust didn't resolve the bcrypt library on Windows
        correctly until version 1.78.0. Also since rustup mis-identifies
        the Rust toolchain, the Rust target triple must be set to
        x86_64-pc-windows-gnu.
      * On musl: libc differences, such as ftruncate64 vs ftruncate, were
        not accounted for until Rust version 1.72.0. No older version of
        Rust will work on musl for our needs.
      * In a 32-bit docker container running on a 64-bit host, we need to
        override the Rust target triple. This is because rustup asks the
        kernel for the bitness of the system and it says 64, even though
        the container will only run 32-bit. This also allows us to remove
        the BITNESS environment variable in ci/lib.sh.
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    8642dec
  13. github workflows: upload Cargo.lock

    Make each CI workflow upload its Cargo.lock file as a build artifact so
    that we can audit build dependencies.
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    992477f
  14. xdiff: implement a white space iterator in Rust

    Xdiff has traditionally implemented the logic for iterating over
    whitespace in every location that needed to do so. Create a consolidated
    iterator in Rust that we can call from each location. Write Rust unit
    tests to ensure the correctness of the Rust whitespace iterator and the
    chunked_iter_equal() function.
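
As a rough sketch of the idea, the code below yields the non-whitespace runs of a line as chunks and compares two chunk streams as if each were concatenated. The signatures are assumptions: only the name chunked_iter_equal() comes from the commit message, and `ws_chunks` is a hypothetical helper that handles just the "ignore all whitespace" case. Note that Rust's `is_ascii_whitespace()` does not count vertical tab, unlike C's isspace().

```rust
/// Hypothetical helper: yield the non-whitespace runs of a line as chunks.
pub fn ws_chunks<'a>(line: &'a [u8]) -> impl Iterator<Item = &'a [u8]> + 'a {
    line.split(|b: &u8| b.is_ascii_whitespace())
        .filter(|chunk| !chunk.is_empty())
}

/// Compare two chunk streams as if each were concatenated into one buffer.
/// Chunk boundaries do not need to line up between the two sides.
pub fn chunked_iter_equal<'a, 'b>(
    mut a: impl Iterator<Item = &'a [u8]>,
    mut b: impl Iterator<Item = &'b [u8]>,
) -> bool {
    let mut ca: &[u8] = &[];
    let mut cb: &[u8] = &[];
    loop {
        // Refill whichever side has run out of bytes.
        while ca.is_empty() {
            match a.next() {
                Some(chunk) => ca = chunk,
                None => break,
            }
        }
        while cb.is_empty() {
            match b.next() {
                Some(chunk) => cb = chunk,
                None => break,
            }
        }
        // If either side is exhausted, they are equal only if both are.
        if ca.is_empty() || cb.is_empty() {
            return ca.is_empty() && cb.is_empty();
        }
        // Compare the overlapping prefix, then advance both sides past it.
        let n = ca.len().min(cb.len());
        if ca[..n] != cb[..n] {
            return false;
        }
        ca = &ca[n..];
        cb = &cb[n..];
    }
}
```

For example, `chunked_iter_equal(ws_chunks(b"a  b c\n"), ws_chunks(b"ab\tc"))` is true even though the two sides split into different chunks.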
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    4e388da
  15. xdiff: create line_hash() and line_equal()

    These functions use the whitespace iterator, when applicable, to hash
    and compare lines.
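
A simplified sketch of how such a pair could sit on top of a whitespace iterator is shown below. The flag `IGNORE_WHITESPACE`, the helper `chunks`, and both signatures are assumptions for illustration; the sketch also walks the chunks byte by byte for brevity and uses DJB2a for every case, which the real functions need not do.

```rust
/// Hypothetical flag meaning "ignore all whitespace".
pub const IGNORE_WHITESPACE: u64 = 1;

/// Yield the pieces of a line that count: the whole line when no flag is
/// set, otherwise only its non-whitespace runs.
fn chunks<'a>(line: &'a [u8], flags: u64) -> Box<dyn Iterator<Item = &'a [u8]> + 'a> {
    if flags & IGNORE_WHITESPACE != 0 {
        Box::new(
            line.split(|b: &u8| b.is_ascii_whitespace())
                .filter(|chunk| !chunk.is_empty()),
        )
    } else {
        Box::new(std::iter::once(line))
    }
}

/// Hash only the bytes that count under the given flags (DJB2a here).
pub fn line_hash(line: &[u8], flags: u64) -> u64 {
    chunks(line, flags)
        .flatten()
        .fold(5381u64, |h, &b| h.wrapping_mul(33) ^ u64::from(b))
}

/// Two lines are equal when their significant bytes match in order.
pub fn line_equal(a: &[u8], b: &[u8], flags: u64) -> bool {
    chunks(a, flags).flatten().eq(chunks(b, flags).flatten())
}
```

Driving both functions from the same iterator keeps them consistent: lines that compare equal under a flag also hash to the same value under that flag.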
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    4d631e3
  16. xdiff: optimize case where --ignore-cr-at-eol is the only whitespace flag

    Currently the whitespace iterator is slower than Git's C implementation,
    so we skip the whitespace iterator if there are no whitespace
    flags. Special-case --ignore-cr-at-eol similarly to keep it
    performant. The remaining whitespace flags will be slower
    for now, but as more of Xdiff is translated into Rust it will be easier
    to revisit and optimize whitespace processing. Optimizing the other
    whitespace flags now would be difficult because:
    
      * Xxhash uses chunk-based processing.
      * The same iterator is used for hashing and equality, which means the
        iterator could be optimized for returning large chunks for fast
        hashing or could return each byte making equality testing faster.
        I opted for faster hashing. The data structures in C need to be
        cleaned up before they're interoperable with Rust. Once that's done
        I believe a faster method of whitespace processing will be possible.
      * Trying to write heavily optimized code across 2 languages that aren't
        easily interoperable in their current state forces a choice between
        code that is fast and code that is easy to maintain. But once enough
        of Xdiff is written in Rust I believe that a fast and maintainable
        method can be implemented.
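
One way to see why --ignore-cr-at-eol is cheap to special-case: the bytes that matter form one contiguous slice, so no iterator is needed. The sketch below is a simplification under assumptions, not the patch's code; `strip_cr_at_eol` and `equal_ignoring_cr_at_eol` are hypothetical names.

```rust
/// Split a line into (content, has_newline), dropping a single '\r' that
/// sits at the very end of the content.
pub fn strip_cr_at_eol(line: &[u8]) -> (&[u8], bool) {
    let (body, has_newline) = match line.strip_suffix(b"\n") {
        Some(rest) => (rest, true),
        None => (line, false),
    };
    (body.strip_suffix(b"\r").unwrap_or(body), has_newline)
}

/// Lines match when their content and their newline status both match.
pub fn equal_ignoring_cr_at_eol(a: &[u8], b: &[u8]) -> bool {
    strip_cr_at_eol(a) == strip_cr_at_eol(b)
}
```

Since the stripped content is still a single slice, it can be handed to a chunk-oriented hash in one call, which is what keeps this case on the fast path.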
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    2f2f308
  17. xdiff: use rust's version of whitespace processing

    Delete xdl_hash_record() and xdl_recmatch() in favor of xdl_line_hash()
    and xdl_line_equal().
    
    Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
    Signed-off-by: Junio C Hamano <gitster@pobox.com>
    ezekielnewren authored and gitster committed Aug 18, 2025
    d869f26