gix-worktree and gix-index (checkout, status, commit) #301

Open
@Byron

Note that we handle both crates here as they are very much intertwined. gix-index provides the data structure that accelerates the operations gix-worktree performs when manipulating the working copy.

Tasks for checkout

Reset

  • gix reset with soft/mixed/hard/merge/keep semantics, including pathspec support. Submodule support should be possible, too.
  • gix-worktree-state reset to reset a working tree according to an index, with pathspec support.
  • reset the index to match a tree, based on pathspecs.
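
As a rough orientation for the reset modes listed above, here is a minimal sketch of which parts of the repository state each mode touches, following the behaviour documented for `git reset`. The `ResetMode` type and its methods are hypothetical names for illustration only and are not part of any gix crate.

```rust
/// Hypothetical enum mirroring `git reset --soft|--mixed|--hard|--merge|--keep`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ResetMode {
    Soft,
    Mixed,
    Hard,
    Merge,
    Keep,
}

impl ResetMode {
    /// Every mode moves HEAD; all but `Soft` also reset index entries.
    fn resets_index(self) -> bool {
        !matches!(self, ResetMode::Soft)
    }

    /// `Hard`, `Merge` and `Keep` additionally update files in the working tree.
    fn updates_worktree(self) -> bool {
        matches!(self, ResetMode::Hard | ResetMode::Merge | ResetMode::Keep)
    }

    /// `Merge` and `Keep` refuse to proceed when certain local changes would be
    /// overwritten; `Hard` overwrites tracked files without such a check.
    fn can_abort_on_local_changes(self) -> bool {
        matches!(self, ResetMode::Merge | ResetMode::Keep)
    }
}

fn main() {
    for mode in [
        ResetMode::Soft,
        ResetMode::Mixed,
        ResetMode::Hard,
        ResetMode::Merge,
        ResetMode::Keep,
    ] {
        println!(
            "{:?}: index={} worktree={} may_abort={}",
            mode,
            mode.resets_index(),
            mode.updates_worktree(),
            mode.can_abort_on_local_changes()
        );
    }
}
```

Pathspec and submodule handling are deliberately left out of this sketch; they would restrict which entries each step applies to.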

Out of scope

  • hunk support (i.e. git reset -p)

Tasks for add

Add files to the index.

Tasks for commit

  • create tree from index
  • create commit
  • round-trippable reads and writes (write all index extensions so that no information is lost)

Tasks for fetch/clone

  • create index from tree
    • can there be an optimization that keeps what didn't change?

Tasks for status

The difference between an index and the work tree. Analysis TBD.

See this blog post for incredible details on how git does things, related to fs-monitor as well.

There is also an alternative implementation which provides a lot of details on how to be better.
@pascalkuthe did a first analysis and concluded that most of the speedup came from contention-free multi-threading and the use of something like the untracked-cache. On Linux, it's also possible to speed up syscalls by using more specific versions of them, but that should definitely be left as a last resort for performance improvements.

Stages

  • determine unstaged changes (Diff between worktree and index #805)
    • changes between worktree and index
    • needs one stat call per file one way or another.
    • Question: what's faster: walkdir or symlink_metadata per index entry? Note that walkdir doesn't use ``
    • rename/copy tracking - should be based on tree-tree rename tracking, can it be generalized?
  • assure status works with file_size >= u32::MAX
    • currently it's acknowledged in the documentation but there is no test for that, nor is it clear how this works in git.
  • determine staged changes
    • compare tree entries with index entries
    • Question: is there a way to avoid having to traverse a tree recursively? Yes, use the TREE extension to know the dir ids of all entries, which allows us to reproduce the trees and see if they changed; only if they did do we look up the tree itself.
  • find untracked files
    • can use the untracked-cache to be faster. This could come 'for free' if walkdir were used
  • fast is-dirty checks - and wiring that up to describe
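
The "one stat call per file" idea for unstaged changes can be sketched as below, using only the standard library. This is an illustration under assumptions, not how gix-index or gix-worktree implement it: `CachedStat`, `Change`, `truncated_size`, `stat_of` and `quick_check` are hypothetical names, and only size and mtime are compared. The size is truncated to 32 bits as in the git index format, which is why files with file_size >= u32::MAX need special attention.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::SystemTime;

/// Hypothetical, simplified stand-in for the stat data cached in an index entry.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct CachedStat {
    /// On-disk size truncated to 32 bits, as the git index format stores it.
    size: u32,
    mtime: SystemTime,
}

#[derive(Debug, PartialEq, Eq)]
enum Change {
    Unchanged,
    /// The stat data differs; a content comparison is still needed to be sure.
    MaybeModified,
    Removed,
}

/// Truncate a 64-bit file length to 32 bits (wraps for files of 4 GiB and more).
fn truncated_size(len: u64) -> u32 {
    len as u32
}

/// One `symlink_metadata` call per path; symlinks are not followed.
fn stat_of(path: &Path) -> io::Result<CachedStat> {
    let md = fs::symlink_metadata(path)?;
    Ok(CachedStat {
        size: truncated_size(md.len()),
        mtime: md.modified()?,
    })
}

/// Compare the current stat data of `path` against the cached values.
fn quick_check(path: &Path, cached: &CachedStat) -> io::Result<Change> {
    match stat_of(path) {
        Ok(current) if current == *cached => Ok(Change::Unchanged),
        Ok(_) => Ok(Change::MaybeModified),
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(Change::Removed),
        Err(err) => Err(err),
    }
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join(format!("status-sketch-{}", std::process::id()));
    fs::write(&path, b"hello")?;
    let cached = stat_of(&path)?;
    println!("{:?}", quick_check(&path, &cached)?);

    fs::write(&path, b"hello world")?; // the size changes, so the stat data differs
    println!("{:?}", quick_check(&path, &cached)?);

    fs::remove_file(&path)?;
    println!("{:?}", quick_check(&path, &cached)?);
    Ok(())
}
```

Because of the truncation, two files whose lengths differ by a multiple of 2^32 bytes look identical in the size field alone, so a real implementation would need additional checks for such files.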

Checkout Research

Follow Ups

  • symlink wait for 1.0 release with additional fixes (see thread on MR)
    • need to use remove_symlink() from this crate, but can't use it for relative paths due to the filename check

Labels: C-tracking-issue (an issue to track the progress of multiple PRs or issues)