Serialized Commit Graph #2

Closed
wants to merge 21 commits into from
Commits
7d4bebf
Merge branch 'jt/binsearch-with-fanout' into HEAD
gitster Mar 13, 2018
2ee13a7
Merge branch 'jk/cached-commit-buffer' into HEAD
gitster Mar 13, 2018
f2af9f5
csum-file: rename hashclose() to finalize_hashfile()
derrickstolee Apr 2, 2018
cfe8321
csum-file: refactor finalize_hashfile() method
derrickstolee Apr 2, 2018
b84f767
commit-graph: add format document
derrickstolee Apr 2, 2018
ae30d7b
graph: add commit graph design document
derrickstolee Apr 2, 2018
4ce58ee
commit-graph: create git-commit-graph builtin
derrickstolee Apr 2, 2018
08fd81c
commit-graph: implement write_commit_graph()
derrickstolee Apr 2, 2018
f237c8b
commit-graph: implement git-commit-graph write
derrickstolee Apr 2, 2018
2a2e32b
commit-graph: implement git commit-graph read
derrickstolee Apr 10, 2018
1b70dfd
commit-graph: add core.commitGraph setting
derrickstolee Apr 10, 2018
4f2542b
commit-graph: close under reachability
derrickstolee Apr 10, 2018
177722b
commit: integrate commit graph with commit parsing
derrickstolee Apr 10, 2018
049d51a
commit-graph: read only from specific pack-indexes
derrickstolee Apr 10, 2018
3d5df01
commit-graph: build graph from starting commits
derrickstolee Apr 10, 2018
7547b95
commit-graph: implement "--append" option
derrickstolee Apr 10, 2018
2d5792f
Merge branch 'bw/c-plus-plus' into ds/lazy-load-trees
gitster Apr 11, 2018
891435d
treewide: rename tree to maybe_tree
derrickstolee Apr 6, 2018
5bb03de
commit: create get_commit_tree() method
derrickstolee Apr 6, 2018
2e27bd7
treewide: replace maybe_tree with accessor methods
derrickstolee Apr 6, 2018
7b8a21d
commit-graph: lazy-load trees for commits
derrickstolee Apr 6, 2018
1 change: 1 addition & 0 deletions .gitignore
@@ -34,6 +34,7 @@
/git-clone
/git-column
/git-commit
/git-commit-graph
/git-commit-tree
/git-config
/git-count-objects
30 changes: 30 additions & 0 deletions Documentation/RelNotes/2.16.2.txt
@@ -0,0 +1,30 @@
Git v2.16.2 Release Notes
=========================

Fixes since v2.16.1
-------------------

* An old regression in "git describe --all $annotated_tag^0" has been
fixed.

* "git svn dcommit" did not take into account the fact that a
svn+ssh:// URL with a username@ (typically used for pushing) refers
to the same SVN repository without the username@ and failed when
svn.pushmergeinfo option is set.

* "git merge -Xours/-Xtheirs" learned to use our/their version when
resolving conflicting updates to a symbolic link.

* "git clone $there $here" is allowed even when the $here directory
exists, as long as it is an empty directory, but the command
incorrectly removed it upon a failure of the operation.

* "git stash -- <pathspec>" incorrectly blew away untracked files in
the directory that matched the pathspec, which has been corrected.

* "git add -p" was taught to ignore local changes to submodules as
they do not interfere with the partial addition of regular changes
anyway.


Also contains various documentation updates and code clean-ups.
4 changes: 4 additions & 0 deletions Documentation/config.txt
@@ -898,6 +898,10 @@ core.notesRef::
This setting defaults to "refs/notes/commits", and it can be overridden by
the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1].

core.commitGraph::
Enable the Git commit graph feature. Allows Git to read from the
commit-graph file.

core.sparseCheckout::
Enable "sparse checkout" feature. See section "Sparse checkout" in
linkgit:git-read-tree[1] for more information.
94 changes: 94 additions & 0 deletions Documentation/git-commit-graph.txt
@@ -0,0 +1,94 @@
git-commit-graph(1)
===================

NAME
----
git-commit-graph - Write and verify Git commit graph files


SYNOPSIS
--------
[verse]
'git commit-graph read' [--object-dir <dir>]
'git commit-graph write' <options> [--object-dir <dir>]


DESCRIPTION
-----------

Manage the serialized commit graph file.


OPTIONS
-------
--object-dir::
Use given directory for the location of packfiles and commit graph
file. This parameter exists to specify the location of an alternate
that only has the objects directory, not a full .git directory. The
commit graph file is expected to be at <dir>/info/commit-graph and
the packfiles are expected to be in <dir>/pack.


COMMANDS
--------
'write'::

Write a commit graph file based on the commits found in packfiles.
+
With the `--stdin-packs` option, generate the new commit graph by
walking objects only in the specified pack-indexes. (Cannot be combined
with --stdin-commits.)
+
With the `--stdin-commits` option, generate the new commit graph by
walking commits starting at the commits specified in stdin as a list
of OIDs in hex, one OID per line. (Cannot be combined with
--stdin-packs.)
+
With the `--append` option, include all commits that are present in the
existing commit-graph file.

'read'::

Read the commit-graph file and output basic details about it. Used
for debugging purposes.


EXAMPLES
--------

* Write a commit graph file for the packed commits in your local .git folder.
+
------------------------------------------------
$ git commit-graph write
------------------------------------------------

* Write a graph file, extending the current graph file using commits
in <pack-index>.
+
------------------------------------------------
$ echo <pack-index> | git commit-graph write --stdin-packs
------------------------------------------------

* Write a graph file containing all reachable commits.
+
------------------------------------------------
$ git show-ref -s | git commit-graph write --stdin-commits
------------------------------------------------

* Write a graph file containing all commits in the current
commit-graph file along with those reachable from HEAD.
+
------------------------------------------------
$ git rev-parse HEAD | git commit-graph write --stdin-commits --append
------------------------------------------------

* Read basic information from the commit-graph file.
+
------------------------------------------------
$ git commit-graph read
------------------------------------------------


GIT
---
Part of the linkgit:git[1] suite
97 changes: 97 additions & 0 deletions Documentation/technical/commit-graph-format.txt
@@ -0,0 +1,97 @@
Git commit graph format
=======================

The Git commit graph stores a list of commit OIDs and some associated
metadata, including:

- The generation number of the commit. A commit with no parents has
generation number 1; a commit with parents has a generation number
one more than the maximum generation number of its parents. Zero is
reserved as a special value and can be used to mark a generation
number as invalid or "not computed".

- The root tree OID.

- The commit date.

- The parents of the commit, stored using positional references within
the graph file.

These positional references are stored as unsigned 32-bit integers
corresponding to the array position within the list of commit OIDs. We
use the most-significant bit for special purposes, so we can store at most
(1 << 31) - 1 (around 2 billion) commits.
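
The generation number rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `generation_numbers`, the dictionary-based history, and the single-letter commit names are hypothetical and do not come from the Git sources.

```python
def generation_numbers(parents):
    """Compute generation numbers for a history given as {commit: [parents]}.

    Uses an explicit stack instead of recursion so deep histories do not
    hit the interpreter's recursion limit.
    """
    gen = {}
    for start in parents:
        stack = [start]
        while stack:
            commit = stack[-1]
            if commit in gen:
                stack.pop()
                continue
            missing = [p for p in parents[commit] if p not in gen]
            if missing:
                # Compute the parents first, then revisit this commit.
                stack.extend(missing)
            else:
                # Root commits get 1; others get 1 + max over their parents.
                gen[commit] = 1 + max((gen[p] for p in parents[commit]), default=0)
                stack.pop()
    return gen


# Hypothetical toy history: D merges B and C, E merges D and A.
history = {"E": ["D", "A"], "D": ["B", "C"], "C": ["A"], "B": ["A"], "A": []}
gens = generation_numbers(history)

# Largest number of commits addressable by a 31-bit position.
MAX_COMMITS = (1 << 31) - 1
```

Because zero is reserved, a real reader would treat a stored generation of 0 as "not computed" rather than as a root commit.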

== Commit graph files have the following format:

In order to allow extensions that add extra data to the graph, we organize
the body into "chunks" and provide a binary lookup table at the beginning
of the body. The header includes certain values, such as number of chunks
and hash type.

All 4-byte numbers are in network order.

HEADER:

4-byte signature:
The signature is: {'C', 'G', 'P', 'H'}

1-byte version number:
Currently, the only valid version is 1.

1-byte Hash Version (1 = SHA-1)
We infer the hash length (H) from this value.

1-byte number (C) of "chunks"

1-byte (reserved for later use)
Current clients should ignore this value.
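
As a rough sketch of how a reader might decode the 8-byte header described above, the following Python uses the standard `struct` module. The names `parse_header`, `GRAPH_SIGNATURE`, and `HASH_LENGTHS` are hypothetical, and the 20-byte length for SHA-1 is the only hash length assumed here.

```python
import struct

GRAPH_SIGNATURE = b"CGPH"
HASH_LENGTHS = {1: 20}  # hash version 1 = SHA-1, which has 20-byte OIDs


def parse_header(data):
    """Decode signature, version, hash version, chunk count, reserved byte."""
    signature, version, hash_version, num_chunks, _reserved = struct.unpack_from(
        ">4sBBBB", data, 0
    )
    if signature != GRAPH_SIGNATURE:
        raise ValueError("not a commit graph file")
    if version != 1:
        raise ValueError("unsupported graph version %d" % version)
    if hash_version not in HASH_LENGTHS:
        raise ValueError("unsupported hash version %d" % hash_version)
    # The reserved byte is deliberately ignored, as the format asks.
    return {
        "version": version,
        "hash_len": HASH_LENGTHS[hash_version],
        "num_chunks": num_chunks,
    }


header = parse_header(b"CGPH" + bytes([1, 1, 3, 0]))
```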

CHUNK LOOKUP:

(C + 1) * 12 bytes listing the table of contents for the chunks:
The first 4 bytes describe the chunk ID. Value 0 is a terminating label.
The other 8 bytes provide the byte offset in the current file at which
the chunk starts. (Chunks are ordered contiguously in the file, so you
can infer the length using the next chunk position if necessary.) Each
chunk ID appears at most once.

The remaining data in the body is described one chunk at a time, and
these chunks may be given in any order. Chunks are required unless
otherwise specified.
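
The chunk lookup table can be read along the following lines. This is a sketch, not the Git implementation: `parse_chunk_table` is a hypothetical name, the table is assumed to start right after the 8-byte header, and the offset stored with the terminating label is assumed to mark the end of the last chunk.

```python
import struct

TERMINATOR = b"\0\0\0\0"


def parse_chunk_table(data, num_chunks, table_start=8):
    """Return {chunk_id: (offset, length)} from the (C + 1) * 12 byte table."""
    entries = []
    for i in range(num_chunks + 1):
        # Each entry is a 4-byte ID followed by an 8-byte offset, big-endian.
        chunk_id, offset = struct.unpack_from(">4sQ", data, table_start + 12 * i)
        entries.append((chunk_id, offset))

    # Chunks may be listed in any order, so sort by offset before inferring
    # lengths from neighbouring entries.
    ordered = sorted(entries, key=lambda entry: entry[1])
    if ordered[-1][0] != TERMINATOR:
        raise ValueError("terminating label is missing or misplaced")

    chunks = {}
    for (chunk_id, offset), (_, next_offset) in zip(ordered, ordered[1:]):
        if chunk_id in chunks:
            raise ValueError("duplicate chunk ID %r" % chunk_id)
        chunks[chunk_id] = (offset, next_offset - offset)
    return chunks


# Hypothetical table for a graph with two commits and three chunks.
table = b"".join(
    struct.pack(">4sQ", chunk_id, offset)
    for chunk_id, offset in [
        (b"OIDF", 56),
        (b"OIDL", 1080),
        (b"CGET", 1120),
        (TERMINATOR, 1192),
    ]
)
chunks = parse_chunk_table(b"CGPH\x01\x01\x03\x00" + table, 3)
```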

CHUNK DATA:

OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
The ith entry, F[i], stores the number of OIDs with first
byte at most i. Thus F[255] stores the total
number of commits (N).

OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
The OIDs for all commits in the graph, sorted in ascending order.
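
The fanout table narrows a binary search over the OID Lookup chunk to the OIDs sharing a first byte. The sketch below illustrates the idea; `find_commit_position` and the sample OIDs are hypothetical, and 20-byte SHA-1 OIDs are assumed.

```python
import struct


def find_commit_position(fanout, oid_lookup, oid, hash_len=20):
    """Return the array position of oid in the sorted OID list, or None."""
    first = oid[0]
    # F[first - 1] counts OIDs whose first byte is smaller, F[first] counts
    # those up to and including it, so the match lies in [lo, hi).
    lo = fanout[first - 1] if first > 0 else 0
    hi = fanout[first]
    while lo < hi:
        mid = (lo + hi) // 2
        candidate = oid_lookup[mid * hash_len:(mid + 1) * hash_len]
        if candidate < oid:
            lo = mid + 1
        elif candidate > oid:
            hi = mid
        else:
            return mid
    return None


# Build hypothetical chunk contents for three made-up OIDs.
oids = sorted([
    bytes([0x10]) + bytes(19),
    bytes([0x10]) + b"\x01" * 19,
    bytes([0xAB]) + bytes(19),
])
fanout_chunk = struct.pack(
    ">256I", *[sum(1 for o in oids if o[0] <= i) for i in range(256)]
)
fanout = struct.unpack(">256I", fanout_chunk)
oid_lookup = b"".join(oids)

position = find_commit_position(fanout, oid_lookup, oids[2])
missing = find_commit_position(fanout, oid_lookup, bytes([0x20]) + bytes(19))
```

The position returned here is the same array position that parent references in the Commit Data chunk use.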

Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
* The first H bytes are for the OID of the root tree.
* The next 8 bytes are for the positions of the first two parents
of the ith commit. Stores value 0xffffffff if no parent in that
position. If there are more than two parents, the second value
has its most-significant bit on and the other bits store an array
position into the Large Edge List chunk.
* The next 8 bytes store the generation number of the commit and
the commit time in seconds since EPOCH. The generation number
uses the higher 30 bits of the first 4 bytes, while the commit
time uses the 32 bits of the second 4 bytes, along with the lowest
2 bits of the first 4 bytes, which store the 33rd and 34th bits of
the commit time.
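
A single Commit Data entry could be decoded roughly as follows. This sketch assumes that the two high bits of the commit time sit in the lowest 2 bits of the first 4-byte word of the final 8 bytes; `parse_commit_entry` and the constant names are hypothetical.

```python
import struct

NO_PARENT = 0xFFFFFFFF
LARGE_EDGE_FLAG = 0x80000000


def parse_commit_entry(entry, hash_len=20):
    """Decode one (H + 16)-byte entry of the Commit Data chunk."""
    tree_oid = entry[:hash_len]
    parent1, parent2, word1, word2 = struct.unpack_from(">IIII", entry, hash_len)
    result = {
        "tree": tree_oid,
        "parent1": None if parent1 == NO_PARENT else parent1,
        "parent2": None,
        "edge_list_start": None,
        # Upper 30 bits of the first word hold the generation number.
        "generation": word1 >> 2,
        # Lower 2 bits of the first word extend the 32-bit time to 34 bits.
        "commit_time": ((word1 & 0x3) << 32) | word2,
    }
    # 0xffffffff also has its top bit set, so test for "no parent" first.
    if parent2 != NO_PARENT:
        if parent2 & LARGE_EDGE_FLAG:
            result["edge_list_start"] = parent2 & 0x7FFFFFFF
        else:
            result["parent2"] = parent2
    return result


# Hypothetical entry: one parent at position 5, generation 7, and a commit
# time that needs the 33rd bit.
entry = bytes(range(20)) + struct.pack(">IIII", 5, NO_PARENT, (7 << 2) | 1, 100)
info = parse_commit_entry(entry)
```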

Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
This list of 4-byte values stores the second through nth parents for
all octopus merges. The second parent value in the commit data stores
an array position within this list along with the most-significant bit
on. Starting at that array position, iterate through this list of commit
positions for the parents until reaching a value with the most-significant
bit on. The other bits correspond to the position of the last parent.
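
The iteration over the Large Edge List can be sketched like this. The function name `octopus_parents` and the sample chunk contents are hypothetical; the loop follows the description above, stopping at the first value whose most-significant bit is set.

```python
import struct

LAST_EDGE_FLAG = 0x80000000


def octopus_parents(edge_chunk, start):
    """Collect parent positions beginning at array position start."""
    parents = []
    pos = start
    while True:
        (value,) = struct.unpack_from(">I", edge_chunk, 4 * pos)
        # The low 31 bits are always a commit position.
        parents.append(value & 0x7FFFFFFF)
        if value & LAST_EDGE_FLAG:
            break  # the flagged value is the last parent
        pos += 1
    return parents


# Hypothetical chunk holding two runs: positions 2, 3, 4 and positions 9, 10.
edge_chunk = struct.pack(">5I", 2, 3, 0x80000004, 9, 0x8000000A)
first_run = octopus_parents(edge_chunk, 0)
second_run = octopus_parents(edge_chunk, 3)
```

A truncated chunk makes `struct.unpack_from` raise `struct.error`, which a more defensive reader would catch and report as corruption.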

TRAILER:

H-byte HASH-checksum of all of the above.
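
Checking the trailer amounts to hashing everything before it and comparing the result. The sketch below assumes hash version 1 (SHA-1, 20 bytes); `checksum_ok` and the sample bytes are hypothetical.

```python
import hashlib


def checksum_ok(data, hash_len=20):
    """Return True if the trailing hash matches the rest of the file."""
    if len(data) < hash_len:
        return False
    body, trailer = data[:-hash_len], data[-hash_len:]
    return hashlib.sha1(body).digest() == trailer


# Hypothetical file contents: a bare header followed by its checksum.
body = b"CGPH\x01\x01\x00\x00"
good_file = body + hashlib.sha1(body).digest()
bad_file = b"XGPH" + good_file[4:]
```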