Skip to content

The data representation of tree is lossy and prevents round-tripping #1887

Open
@pierrechevalier83

Description

@pierrechevalier83

Current behavior 😯

The data representation of Tree is lossy compared to the git representation, which means that some git trees cannot round-trip through git-oxide while remaining byte-for-byte identical.

Specifically, the entry mode in git is represented as a 6 digit byte string which represents the octal representation of the mode.
For the EntryMode that corresponds to a Tree, the octal value is 0o040000u16, which can be represented as b"40000" or b"040000". Both representations occur in the wild, as shown in the test fixture: special-1.tree where both co-exist in the same tree representation.

In gitoxide, we parse this string as a u16 before getting to the EntryKind enum representation, where we loose information on whether or not the tree EntryMode was represented in git with a leading zero.

Expected behavior 🤔

EntryMode and EntryKind should hold on to at least one more bit of information (did the git byte string representation contain a leading zero?) so we can round-trip from git -> gitoxide -> git while preserving the byte representation.

Git behavior

The entry mode in git is represented as a 6 digit byte string which represents the octal representation of the mode.
For the EntryMode that corresponds to a Tree, the octal value is 0o040000u16, which can be represented as b"40000" or b"040000".

Steps to reproduce 🕹

See this change to the test that verifies we can round-trip.
It currently fails:
From gix-object,

cargo test

fails with:

---- tree::from_bytes::special_trees stdout ----

thread 'tree::from_bytes::special_trees' panicked at gix-object/tests/object/tree/from_bytes.rs:107:9:
assertion `left == right` failed
  left: [52, 48, 48, 48, 48, 32, 99, 104, 114, 111, 109, 101, 0, 126, 87, 230, 62, 203, 152, 244, 156, 233, 186, 220, 52, 58, 28, 34, 169, 230, 175, 227, 116, 52, 48, 48, 48, 48, 32, 101, 100, 105, 116, 111, 114, 115, 0, 87, 15, 194, 205, 208, 108, 98, 188, 56, 180, 7, 251, 45, 138, 241, 73, 89, 164, 134, 70, 49, 48, 48, 54, 52, 52, 32, 106, 115, 98, 105, 110, 46, 106, 115, 0, 126, 117, 18, 40, 234, 142, 9, 4, 82, 165, 64, 249, 109, 30, 33, 21, 76, 250, 170, 246, 52, 48, 48, 48, 48, 32, 114, 101, 110, 100, 101, 114, 0, 190, 206, 230, 168, 101, 95, 18, 40, 56, 73, 21, 0, 149, 163, 200, 188, 95, 208, 160, 82, 52, 48, 48, 48, 48, 32, 118, 101, 110, 100, 111, 114, 0, 97, 13, 187, 16, 54, 17, 224, 60, 151, 2, 230, 118, 36, 31, 214, 237, 68, 155, 190, 96]
 right: [48, 52, 48, 48, 48, 48, 32, 99, 104, 114, 111, 109, 101, 0, 126, 87, 230, 62, 203, 152, 244, 156, 233, 186, 220, 52, 58, 28, 34, 169, 230, 175, 227, 116, 52, 48, 48, 48, 48, 32, 101, 100, 105, 116, 111, 114, 115, 0, 87, 15, 194, 205, 208, 108, 98, 188, 56, 180, 7, 251, 45, 138, 241, 73, 89, 164, 134, 70, 49, 48, 48, 54, 52, 52, 32, 106, 115, 98, 105, 110, 46, 106, 115, 0, 126, 117, 18, 40, 234, 142, 9, 4, 82, 165, 64, 249, 109, 30, 33, 21, 76, 250, 170, 246, 48, 52, 48, 48, 48, 48, 32, 114, 101, 110, 100, 101, 114, 0, 190, 206, 230, 168, 101, 95, 18, 40, 56, 73, 21, 0, 149, 163, 200, 188, 95, 208, 160, 82, 48, 52, 48, 48, 48, 48, 32, 118, 101, 110, 100, 111, 114, 0, 97, 13, 187, 16, 54, 17, 224, 60, 151, 2, 230, 118, 36, 31, 214, 237, 68, 155, 190, 96]
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    tree::from_bytes::special_trees

Metadata

Metadata

Assignees

Labels

acknowledgedan issue is accepted as shortcoming to be fixed

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions