119 changes: 66 additions & 53 deletions src/cargo/sources/git/source.rs
[not specific to this file]

[not hard blockers]

I noticed that the commit history contains Merge branch 'master' of https://…. We encourage rebasing and keeping a well-structured git history for review/debugging/tracing purposes.

Also, chore: … was used in some commits, but a chore commit should not contain production code changes (see https://stackoverflow.com/a/26944812). IMO they should be either refactor or fix.
(Some may be better squashed into the overall fix ffd78e6, as those intermediate commits are of little help for future review and code tracing.)

@@ -6,6 +6,7 @@ use crate::core::global_cache_tracker;
use crate::core::{Dependency, Package, PackageId};
use crate::sources::IndexSummary;
use crate::sources::RecursivePathSource;
use crate::sources::git::utils::GitDatabase;
use crate::sources::git::utils::GitRemote;
use crate::sources::git::utils::rev_to_oid;
use crate::sources::source::MaybePackage;
@@ -155,6 +156,69 @@ impl<'gctx> GitSource<'gctx> {
});
Ok(())
}

pub(crate) fn update_db(&self, is_submodule: bool) -> CargoResult<(GitDatabase, git2::Oid)> {
let db_path = self.gctx.git_db_path().join(&self.ident);
let db_path = db_path.into_path_unlocked();

let db = self.remote.db_at(&db_path).ok();

let (db, actual_rev) = match (&self.locked_rev, db) {
// If we have a locked revision, and we have a preexisting database
// which has that revision, then no update needs to happen.
(Revision::Locked(oid), Some(db)) if db.contains(*oid) => (db, *oid),

// If we're in offline mode, we're not locked, and we have a
// database, then try to resolve our reference with the preexisting
// repository.
(Revision::Deferred(git_ref), Some(db)) if !self.gctx.network_allowed() => {
let offline_flag = self
.gctx
.offline_flag()
.expect("always present when `!network_allowed`");
let rev = db.resolve(&git_ref).with_context(|| {
format!(
"failed to lookup reference in preexisting repository, and \
can't check for updates in offline mode ({offline_flag})"
)
})?;
(db, rev)
}

// ... otherwise we use this state to update the git database. Note
// that we still check for being offline here, for example in the
// situation that we have a locked revision but the database
// doesn't have it.
(locked_rev, db) => {
if let Some(offline_flag) = self.gctx.offline_flag() {
anyhow::bail!(
"can't checkout from '{}': you are in the offline mode ({offline_flag})",
self.remote.url()
);
}

if !self.quiet {
if is_submodule {
self.gctx
.shell()
.status("Updating", format!("git submodule `{}`", self.remote.url()))?;
} else {
self.gctx.shell().status(
"Updating",
format!("git repository `{}`", self.remote.url()),
)?;
}
}

trace!("updating git source `{:?}`", self.remote);

let locked_rev = locked_rev.clone().into();
self.remote.checkout(&db_path, db, &locked_rev, self.gctx)?
}
};

Ok((db, actual_rev))
}
}

/// Indicates a [Git revision] that might be locked or deferred to be resolved.
@@ -286,58 +350,7 @@ impl<'gctx> Source for GitSource<'gctx> {
// exists.
exclude_from_backups_and_indexing(&git_path);

let db_path = self.gctx.git_db_path().join(&self.ident);
let db_path = db_path.into_path_unlocked();

let db = self.remote.db_at(&db_path).ok();

let (db, actual_rev) = match (&self.locked_rev, db) {
// If we have a locked revision, and we have a preexisting database
// which has that revision, then no update needs to happen.
(Revision::Locked(oid), Some(db)) if db.contains(*oid) => (db, *oid),

// If we're in offline mode, we're not locked, and we have a
// database, then try to resolve our reference with the preexisting
// repository.
(Revision::Deferred(git_ref), Some(db)) if !self.gctx.network_allowed() => {
let offline_flag = self
.gctx
.offline_flag()
.expect("always present when `!network_allowed`");
let rev = db.resolve(&git_ref).with_context(|| {
format!(
"failed to lookup reference in preexisting repository, and \
can't check for updates in offline mode ({offline_flag})"
)
})?;
(db, rev)
}

// ... otherwise we use this state to update the git database. Note
// that we still check for being offline here, for example in the
// situation that we have a locked revision but the database
// doesn't have it.
(locked_rev, db) => {
if let Some(offline_flag) = self.gctx.offline_flag() {
anyhow::bail!(
"can't checkout from '{}': you are in the offline mode ({offline_flag})",
self.remote.url()
);
}

if !self.quiet {
self.gctx.shell().status(
"Updating",
format!("git repository `{}`", self.remote.url()),
)?;
}

trace!("updating git source `{:?}`", self.remote);

let locked_rev = locked_rev.clone().into();
self.remote.checkout(&db_path, db, &locked_rev, self.gctx)?
}
};
let (db, actual_rev) = self.update_db(false)?;

// Don't use the full hash, in order to contribute less to reaching the
// path length limit on Windows. See
@@ -353,7 +366,7 @@ impl<'gctx> Source for GitSource<'gctx> {
.join(&self.ident)
.join(short_id.as_str());
let checkout_path = checkout_path.into_path_unlocked();
db.copy_to(actual_rev, &checkout_path, self.gctx)?;
db.copy_to(actual_rev, &checkout_path, self.gctx, self.quiet)?;

let source_id = self
.source_id
59 changes: 37 additions & 22 deletions src/cargo/sources/git/utils.rs
@@ -1,10 +1,12 @@
//! Utilities for handling git repositories, mainly around
//! authentication/cloning.

use crate::core::{GitReference, Verbosity};
use crate::core::{GitReference, SourceId, Verbosity};
use crate::sources::git::fetch::RemoteKind;
use crate::sources::git::oxide;
use crate::sources::git::oxide::cargo_config_to_gitoxide_overrides;
use crate::sources::git::source::GitSource;
use crate::sources::source::Source as _;
use crate::util::HumanBytes;
use crate::util::errors::{CargoResult, GitCliError};
use crate::util::{GlobalContext, IntoUrl, MetricsCounter, Progress, network};
@@ -169,6 +171,7 @@ impl GitDatabase {
rev: git2::Oid,
dest: &Path,
gctx: &GlobalContext,
quiet: bool,
) -> CargoResult<GitCheckout<'_>> {
// If the existing checkout exists, and it is fresh, use it.
// A non-fresh checkout can happen if the checkout operation was
@@ -182,7 +185,7 @@ impl GitDatabase {
Some(co) => co,
None => {
let (checkout, guard) = GitCheckout::clone_into(dest, self, rev, gctx)?;
checkout.update_submodules(gctx)?;
checkout.update_submodules(gctx, quiet)?;
guard.mark_ok()?;
checkout
}
@@ -384,24 +387,27 @@ impl<'a> GitCheckout<'a> {
/// Submodules set to `none` won't be fetched.
///
/// [^1]: <https://git-scm.com/docs/git-submodule#Documentation/git-submodule.txt-none>
fn update_submodules(&self, gctx: &GlobalContext) -> CargoResult<()> {
return update_submodules(&self.repo, gctx, self.remote_url().as_str());
fn update_submodules(&self, gctx: &GlobalContext, quiet: bool) -> CargoResult<()> {
return update_submodules(&self.repo, gctx, quiet, self.remote_url().as_str());

/// Recursive helper for [`GitCheckout::update_submodules`].
fn update_submodules(
repo: &git2::Repository,
gctx: &GlobalContext,
quiet: bool,
parent_remote_url: &str,
) -> CargoResult<()> {
debug!("update submodules for: {:?}", repo.workdir().unwrap());

for mut child in repo.submodules()? {
update_submodule(repo, &mut child, gctx, parent_remote_url).with_context(|| {
format!(
"failed to update submodule `{}`",
child.name().unwrap_or("")
)
})?;
update_submodule(repo, &mut child, gctx, quiet, parent_remote_url).with_context(
|| {
format!(
"failed to update submodule `{}`",
child.name().unwrap_or("")
)
},
)?;
}
Ok(())
}
Expand All @@ -411,6 +417,7 @@ impl<'a> GitCheckout<'a> {
parent: &git2::Repository,
child: &mut git2::Submodule<'_>,
gctx: &GlobalContext,
quiet: bool,
parent_remote_url: &str,
) -> CargoResult<()> {
child.init(false)?;
@@ -447,10 +454,10 @@ impl<'a> GitCheckout<'a> {
let target = repo.head()?.target();
Ok((target, repo))
});
let mut repo = match head_and_repo {
let repo = match head_and_repo {
Ok((head, repo)) => {
if child.head_id() == head {
return update_submodules(&repo, gctx, &child_remote_url);
return update_submodules(&repo, gctx, quiet, &child_remote_url);
}
repo
}
@@ -460,25 +467,33 @@ impl<'a> GitCheckout<'a> {
init(&path, false)?
}
};
// Fetch data from origin and reset to the head commit
let reference = GitReference::Rev(head.to_string());
gctx.shell()
.status("Updating", format!("git submodule `{child_remote_url}`"))?;
fetch(
&mut repo,
&child_remote_url,
&reference,
// Fetch data using git database
let mut source = GitSource::new(
SourceId::from_url(&format!("git+{child_remote_url}#{head}"))?,
Instead of constructing the URL manually, we have SourceId::for_git for this purpose. (A sketch of that follows this call below.)

gctx,
RemoteKind::GitDependency,
)
.with_context(|| {
let name = child.name().unwrap_or("");
format!("failed to fetch submodule `{name}` from {child_remote_url}",)
})?;
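// One possible shape of that suggestion (a hedged sketch, not what this PR
// does): build the SourceId from the parsed URL plus a GitReference instead
// of a hand-assembled `git+…#…` string. It assumes `IntoUrl` (imported in
// this module) applies to `child_remote_url`, and it keeps the surrounding
// `GitSource::new(_, gctx, RemoteKind::GitDependency)` call from this diff
// unchanged.
let child_url = child_remote_url.into_url()?;
let source_id =
    SourceId::for_git(&child_url, GitReference::Rev(head.to_string()))?;
let mut source = GitSource::new(source_id, gctx, RemoteKind::GitDependency)
    .with_context(|| {
        let name = child.name().unwrap_or("");
        format!("failed to fetch submodule `{name}` from {child_remote_url}")
    })?;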
source.set_quiet(quiet);

let (db, actual_rev) = source.update_db(true).with_context(|| {
let name = child.name().unwrap_or("");
format!("failed to fetch submodule `{name}` from {child_remote_url}",)
})?;

// Clone and reset to the head commit
We are open to rebasing/reorganizing your commit history throughout the iterations of a pull request. Personally, I think this kind of commit should be squashed into whichever commit introduced the typo; that would give a more understandable and traceable git commit history.

let (_, guard) = GitCheckout::clone_into(&repo.path(), &db, actual_rev, gctx)
.with_context(|| {
let name = child.name().unwrap_or("");
format!("failed to fetch submodule `{name}` from {child_remote_url}",)
})?;
guard.mark_ok()?;
The checkout is marked ready, and only then do we recurse into submodules. I feel like that means the checkout is not actually ready yet.

Could this situation happen? The user cancels the operation before Cargo recurses into the nested submodules. Later they depend on this submodule directly, and Cargo assumes the checkout is fresh, so it never recurses into the nested submodules.
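// A sketch of the ordering that concern points at (my assumption, not what
// this PR does): defer `mark_ok` until the nested submodules have been
// updated, so an interrupted run is not later mistaken for a fresh checkout.
// Names and calls are taken from the surrounding code.
let (_, guard) = GitCheckout::clone_into(&repo.path(), &db, actual_rev, gctx)?;
let obj = repo.find_object(head, None)?;
reset(&repo, &obj, gctx)?;
update_submodules(&repo, gctx, quiet, &child_remote_url)?;
guard.mark_ok()?;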


let obj = repo.find_object(head, None)?;
reset(&repo, &obj, gctx)?;
Why do we still need this reset, given that GitCheckout::clone_into already resets it?

update_submodules(&repo, gctx, &child_remote_url)
update_submodules(&repo, gctx, quiet, &child_remote_url)
}
}
}
5 changes: 4 additions & 1 deletion tests/testsuite/git.rs
@@ -1060,7 +1060,10 @@ Caused by:
failed to update submodule `src`

Caused by:
object not found - no match for id ([..]); class=Odb (9); code=NotFound (-3)
failed to fetch submodule `src` from [ROOTURL]/dep2

Caused by:
revspec '[..]' not found; class=Reference (4); code=NotFound (-3)

"#]];
