Problem
If a package has a git dependency with a large submodule, any change to the git repo that updates the submodule causes the entire submodule repo to be re-downloaded from scratch, and an entire separate copy is retained. This can be very expensive for both network download time and disk space.
Steps
- In a blank project, add the dependency: rocksdb = {git = "https://github.com/tikv/rust-rocksdb.git", rev="fe7be35ba191684c989effdc6ee8e39a3978e650"}
- Run cargo fetch.
- Change rev to 3cd18c44d160a3cdba586d6502d51b7cc67efc59.
- Run cargo fetch.
- Notice it downloaded the entirety of the submodule https://github.com/tikv/rocksdb.git, which is about 100MB.
- Change rev to 5adf5b847e13cea2a59a1b4921aa5bf38591d1a3.
- Run cargo fetch.
- Notice it downloaded yet another copy.
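The steps above could be scripted roughly as follows, to make the growth in disk usage easier to see. This is only a sketch: the helper names (set_rev, dir_size_kb, fetch_and_report) are made up for illustration, and it assumes cargo's git checkouts live under the default CARGO_HOME location.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of cargo.

# Assumed location of cargo's per-commit git checkouts (default CARGO_HOME).
CHECKOUTS_DIR="${CARGO_HOME:-$HOME/.cargo}/git/checkouts"

# set_rev MANIFEST REV: replace the rev="..." value in MANIFEST with REV.
set_rev() {
    manifest=$1
    rev=$2
    sed "s/rev *= *\"[0-9a-f]*\"/rev=\"$rev\"/" "$manifest" > "$manifest.tmp" &&
        mv "$manifest.tmp" "$manifest"
}

# dir_size_kb DIR: print the disk usage of DIR in kilobytes (0 if it is missing).
dir_size_kb() {
    if [ -d "$1" ]; then
        du -sk "$1" | cut -f1
    else
        echo 0
    fi
}

# fetch_and_report MANIFEST REV: switch to REV, fetch, and print the checkout size.
fetch_and_report() {
    set_rev "$1" "$2"
    cargo fetch --manifest-path "$1"
    echo "$2: $(dir_size_kb "$CHECKOUTS_DIR") KB in $CHECKOUTS_DIR"
}
```

With the first rev already in Cargo.toml, calling fetch_and_report Cargo.toml 3cd18c44d160a3cdba586d6502d51b7cc67efc59 and then fetch_and_report Cargo.toml 5adf5b847e13cea2a59a1b4921aa5bf38591d1a3 should show the reported size growing by roughly the size of the submodule each time, if the behavior described here is present.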
Possible Solution(s)
The repo in git/db/… should probably contain the submodule. Currently it appears that cargo checks out a fresh copy of the submodule for every commit in git/checkouts/…. I think this is because cargo is using Submodule::open here. I wonder if using Submodule::update would be the solution?