Integration with commit evolution #1

Closed
@arxanas

Hi @quark-zju, this is a super cool project. I would like to integrate it into my project at https://github.com/arxanas/git-branchless, which simulates the workflows at companies like Facebook. I had a few questions I was hoping you could help me with.

Data structures?

What data structures are used to implement the DAG? I found this remark:

See slides/201904-segmented-changelog/segmented-changelog.pdf for pretty graphs about how segments help with ancestry queries.

at https://docs.rs/esl01-dag/0.1.1/esl01_dag/struct.IdDag.html, but I didn't find the associated slides (are they publicly available?). What kind of performance can I expect for various operations?

What kind of correctness guarantees can I expect? If I query the DAG for a node which hasn't yet been observed, what happens? Can I use it in a multi-threaded or multi-process context?

How stable is the DAG API? To what degree can I rely on it?

Performance with reference updates?

Initializing the DAG with git-revs is quite fast on the repository I'm testing with: roughly 30 seconds, versus minutes with git commit-graph (which took so long that I didn't measure it exactly). However, subsequent invocations still take at least two or three seconds.

My guess is that this is caused by crawling all the references here: https://docs.rs/gitdag/0.1.2/src/gitdag/gitdag.rs.html#82. I think you mentioned somewhere in the documentation that it will be slow if there are a lot of references. In the case of git-branchless, we keep track of the commit graph heads ourselves, and we don't care about remote references, so I should be able to speed it up significantly. However, I can't pass my own GitDag in to this library. Should I change the API, and if so, what changes do you recommend?
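To make the question concrete, here is a minimal sketch of the kind of extension point I have in mind. Everything in it (`Ref`, `HeadProvider`, `AllRefs`, `TrackedHeads`, `heads_to_index`) is a hypothetical name for illustration, not part of gitdag or gitrevset, and the "crawl every reference" behavior is only my assumption about what happens today.

```rust
use std::collections::BTreeSet;

/// Hypothetical: a reference name paired with the commit id it points to.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Ref {
    pub name: String,
    pub target: String,
}

/// Hypothetical extension point: anything that can say which heads to index.
pub trait HeadProvider {
    fn heads(&self) -> BTreeSet<String>;
}

/// Assumed current behavior: every reference target becomes a head.
pub struct AllRefs(pub Vec<Ref>);

impl HeadProvider for AllRefs {
    fn heads(&self) -> BTreeSet<String> {
        self.0.iter().map(|r| r.target.clone()).collect()
    }
}

/// What a caller like git-branchless could supply instead:
/// the heads it already tracks, with no reference crawl.
pub struct TrackedHeads(pub BTreeSet<String>);

impl HeadProvider for TrackedHeads {
    fn heads(&self) -> BTreeSet<String> {
        self.0.clone()
    }
}

/// Stand-in for the DAG build step. It only reports, in sorted order,
/// which heads it was asked to index.
pub fn heads_to_index(provider: &dyn HeadProvider) -> Vec<String> {
    provider.heads().into_iter().collect()
}
```

With something of this shape, the caller decides which heads are indexed, and remote references can simply be left out of the set that is passed in.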

Commit evolution?

git-branchless implements its own commit evolution feature, which is not based on the reflog (see https://github.com/arxanas/git-branchless/wiki/Architecture). So I don't want to use the reflog-based commit evolution implementation here: https://github.com/quark-zju/gitrevset/blob/master/src/mutation.rs#L13. As with the question above, if I want to swap out this behavior for my own implementation, do I need to change the gitrevset API, and if so, what changes do you recommend?
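As a rough illustration of what "swapping out the implementation" could mean, here is a sketch of a pluggable commit evolution source. The names `Evolution`, `RewriteLog`, and `latest_versions` are hypothetical and do not exist in gitrevset; the map-backed log is only a stand-in for a non-reflog source of rewrite information.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical extension point: given a commit id, report what it was
/// rewritten into (empty if it was never rewritten).
pub trait Evolution {
    fn successors(&self, commit: &str) -> Vec<String>;
}

/// Stand-in for a non-reflog source: old commit id -> new commit ids.
pub struct RewriteLog(pub HashMap<String, Vec<String>>);

impl Evolution for RewriteLog {
    fn successors(&self, commit: &str) -> Vec<String> {
        self.0.get(commit).cloned().unwrap_or_default()
    }
}

/// Follow rewrites from `start` until reaching commits that have no
/// further successors. The `seen` set guarantees termination even if
/// the recorded rewrites form a cycle (in which case nothing on the
/// cycle is reported as a latest version).
pub fn latest_versions(evolution: &dyn Evolution, start: &str) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut stack = vec![start.to_string()];
    let mut latest = Vec::new();
    while let Some(commit) = stack.pop() {
        if !seen.insert(commit.clone()) {
            continue; // already visited
        }
        let next = evolution.successors(&commit);
        if next.is_empty() {
            latest.push(commit);
        } else {
            stack.extend(next);
        }
    }
    latest.sort();
    latest
}
```

If the revset code depended only on a trait like this, the reflog-based logic and an event-log-based one could both be implementations chosen by the caller.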

add_heads_and_flush

Regarding DagPersistent::add_heads_and_flush (https://docs.rs/esl01-dag/0.2.1/esl01_dag/namedag/struct.NameDag.html#method.add_heads_and_flush): why does it care about the difference between master names and non-master names? git-branchless relies on a main branch, but I don't see why the DAG itself needs to know which branches are "main".
