feat(fsck): implement fsck command#371
Conversation
Implement `libra fsck` command for verifying repository integrity: - Object hash verification using Git-style SHA1 (header + content) - Ref consistency validation against SQLite backend - Index file integrity check - Cross-reference validation for trees and commits - Human-readable and JSON output formats - Exit codes: 0 (pass), 1 (corrupted), 2 (broken refs), 3 (index corruption) - Flags: --verbose, --no-cross-ref-check, --no-index-check, --objects-only, --fix Includes 37 unit tests covering: - Object verification (blob, tree, commit) - Missing object detection - Hash mismatch detection - Cross-reference validation - JSON output structure - CLI argument parsing
Add fsck subcommand to CLI enum and dispatch handler.
Add public getter method for base_path field to enable filesystem traversal in fsck command.
Add comprehensive documentation for `libra fsck` command: - Synopsis and description of integrity checks - All options with examples (--verbose, --no-cross-ref-check, etc.) - Human output examples (clean, verbose, with errors) - JSON structured output schema - Design rationale (Git-style hash computation, optional cross-ref validation) - Parameter comparison with Git - Exit code behavior and error handling table - Usage examples for routine checks and CI/CD integration Also update docs/commands/README.md to include fsck in the command index.
- Add .cargo/ and other build artifacts to .gitignore - Update CLAUDE.md project structure comments
- Replace the stub check_index() with a complete implementation - Validates the binary index file (DIRC header, entry modes, stages, cross-references to object storage). - Differentiate exit codes: 1 for object corruption, 2 for broken refs, 3 for index corruption.
…pted index - Delete refs pointing to missing or invalid objects from SQLite - Rebuild corrupted index from HEAD commit's tree structure - Track broken ref names in RefCheckResult for targeted deletion
…sh verification fsck.rs previously hardcoded SHA1_FOR_LEGACY_USE_ONLY in verify_object(), causing all objects in SHA256 repos to be incorrectly flagged as corrupted. Now reads the algorithm from get_hash_kind() (set at init/dispatch time via core.objectformat) and uses the correct ring digest accordingly.
…bleErrorCode Migrate fsck from hardcoded std::process::exit(1/2/3) to a bitmask-based exit code scheme that encodes all failure categories in a single value: - bit 0 (1): object corruption - bit 1 (2): broken refs - bit 2 (4): index corruption
Add 10 integration tests covering fsck behavior under SHA-256 object format: empty repo, committed repo, verbose/json output, corruption detection, missing object detection, single object verification, indexed files, invalid hash handling. Ensures the dynamic SHA1/SHA256 selection works end-to-end through the CLI entrypoint.
- streamline issue report generation - enhance object check handling - remove dead testcase.
…st setup and remove redundant test code
- Refactor check flow into 6 stages: directory, objects, HEAD, reflogs, index, connectivity - Default mode only shows progress; detailed output moved to --verbose flag - Add check_head function to detect unborn branch, avoid false notice - Simplify FsckArgs to only keep object/no_reflogs/verbose options - Output format matches git fsck (e.g., "Checking object directory: 100% (256/256), done.")
Refactor fsck to output the four standard git fsck diagnostic types: - missing <type> <object>: Object referenced but doesn't exist - hash mismatch <type> <object>: Object hash doesn't match content - dangling <type> <object>: Object exists but not referenced - unreachable <type> <object>: Object completely isolated Key changes: - Add object_type field to IssueReport for git-style output - Implement reachability analysis using BFS from refs/reflogs/index - Distinguish dangling (in reflog/index) vs unreachable (isolated) objects - Add print_diagnostic_messages() for git fsck-style output format - Dangling/unreachable objects don't cause non-zero exit code (matches git) - Remove test module temporarily due to major API changes Output format now matches git fsck: dangling blob 8ae04589c000f892abfbe27a7803ac4433846be8 missing tree abc123... hash mismatch commit def456...
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 8 comments.
Comments suppressed due to low confidence (2)
tests/command/fsck_test.rs:304
- This assertion is a tautology —
!x || xis always true for any boolean — so the test never actually verifies anything about the exit status. Either drop the assertion entirely (if the only goal is "must not crash"), or replace it with a concrete expectation, e.g. that the exit code is one of a specific allowed set, similar totest_fsck_with_empty_object_id.
assert!(
!output.status.success() || output.status.success(),
"fsck should handle zero hash without crashing"
);
tests/command/fsck_test.rs:512
- Same tautological assertion as in
test_fsck_with_zero_hash:!output.status.success() || output.status.success()is always true, so this test does not exercise any behavior beyond "the process didn't hang". Replace with a concrete expectation (e.g. an allowed set of exit codes, or assert on stderr containing a specific diagnostic).
assert!(
!output.status.success() || output.status.success(),
"fsck should handle invalid reflog reference"
);
| - **`libra fsck`**: Repository integrity checker analogous to `git fsck`. Verifies | ||
| object hash integrity (SHA1/SHA256), object format validity, ref consistency, | ||
| index integrity, and cross-reference validation (including object type mismatch | ||
| detection for tree entries). Supports `--verbose`, `--json`, `--objects-only`, | ||
| `--no-cross-ref-check`, `--no-index-check`, and `--fix` (auto-repair broken refs | ||
| and rebuild corrupted index). Exit codes use a bitmask scheme: | ||
| bit 0 = object corruption, bit 1 = broken refs, bit 2 = index corruption. | ||
| - **`docs/commands/fsck.md`**: Comprehensive documentation for the `fsck` command | ||
| including parameter comparison with Git, design rationale, and CI/CD examples. |
| pub async fn execute(args: FsckArgs) { | ||
| let storage = ClientStorage::init(path::objects()); | ||
|
|
||
| let result = if let Some(ref object_id) = args.object { | ||
| check_single_object(object_id, &storage).await | ||
| } else { | ||
| check_all_objects(&args, &storage).await | ||
| }; | ||
|
|
||
| match result { | ||
| Ok(fsck_result) => { | ||
| // Exit with failure code only for serious issues (not dangling/unreachable) | ||
| if fsck_result.has_errors { | ||
| std::process::exit(1); | ||
| } | ||
| } | ||
| Err(e) => { | ||
| eprintln!("fatal: {}", e); | ||
| std::process::exit(1); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| pub async fn execute_safe(args: FsckArgs, _output: &OutputConfig) -> CliResult<()> { | ||
| // execute_safe is called from cli.rs but currently just delegates to execute | ||
| // JSON output is not supported for fsck | ||
| execute(args).await; | ||
| Ok(()) | ||
| } |
| fn print_notices(head_is_unborn: bool, _result: &FsckResult) { | ||
| if head_is_unborn { | ||
| eprintln!("notice: HEAD points to an unborn branch (main)"); | ||
| eprintln!("notice: No default references"); | ||
| } | ||
| } |
| let lost_found_dir = storage | ||
| .base_path() | ||
| .parent() | ||
| .expect("storage should have parent") | ||
| .join("lost-found"); |
| // fsck with empty argument should not crash | ||
| let output = run_libra_command(&["fsck", ""], repo.path()); | ||
| // Should return non-zero exit code for invalid input, but not crash | ||
| assert!( | ||
| output.status.code() == Some(1) | ||
| || output.status.code() == Some(128) | ||
| || output.status.code() == Some(0), | ||
| "fsck with empty arg should return valid exit code, got: {:?}", | ||
| output.status.code() |
| /// Find and report tagged commits | ||
| /// Output format matches git fsck --tags: | ||
| /// - For annotated tags: "tagged commit <commit-hash> (<tag-name>) in <tag-object-hash>" | ||
| async fn find_and_report_tags() -> CliResult<()> { | ||
| use sea_orm::EntityTrait; | ||
|
|
||
| use crate::internal::model::reference; | ||
|
|
||
| let db_conn = db::get_db_conn_instance().await; | ||
|
|
||
| // Load all refs that are tags (refs/tags/*) | ||
| let refs = reference::Entity::find() | ||
| .all(&db_conn) | ||
| .await | ||
| .map_err(|e| CliError::fatal(format!("failed to load refs: {}", e)))?; | ||
|
|
||
| for ref_entry in refs { | ||
| let ref_name = match &ref_entry.name { | ||
| Some(name) => name, | ||
| None => continue, | ||
| }; | ||
|
|
||
| // Only process tag refs (refs/tags/*) | ||
| if !ref_name.starts_with("refs/tags/") { | ||
| continue; | ||
| } | ||
|
|
||
| let tag_name = ref_name.strip_prefix("refs/tags/").unwrap(); | ||
| let commit_hash = match &ref_entry.commit { | ||
| Some(hash) => hash, | ||
| None => continue, | ||
| }; | ||
|
|
||
| // Check if this is an annotated tag (tag object exists) | ||
| // For now, just report the tagged commit | ||
| println!("tagged commit {} ({})", commit_hash, tag_name); | ||
| } | ||
|
|
||
| Ok(()) | ||
| } |
| // Add header: "<type> <size>\0" | ||
| let header = format!("{} {}\0", obj_type.to_string().to_lowercase(), size); | ||
| ctx.update(header.as_bytes()); |
| async fn build_object_name_map() -> std::collections::HashMap<String, String> { | ||
| let mut name_map: std::collections::HashMap<String, String> = std::collections::HashMap::new(); | ||
| let db_conn = db::get_db_conn_instance().await; | ||
|
|
||
| // Collect names from refs (e.g., "refs/heads/master") | ||
| let refs = reference::Entity::find() | ||
| .all(&db_conn) | ||
| .await | ||
| .unwrap_or_default(); | ||
|
|
||
| for ref_entry in refs { | ||
| if let Some(commit_hash) = &ref_entry.commit { | ||
| let ref_name = ref_entry.name.clone().unwrap_or_default(); | ||
| // Store ref name (may have multiple names per object) | ||
| name_map | ||
| .entry(commit_hash.clone()) | ||
| .and_modify(|e| *e = format!("{}, {}", e, ref_name)) | ||
| .or_insert(ref_name); | ||
| } | ||
| } | ||
|
|
||
| // Collect names from reflogs (e.g., "HEAD@{1234567890}") | ||
| let reflogs = reflog::Entity::find() | ||
| .all(&db_conn) | ||
| .await | ||
| .unwrap_or_default(); | ||
|
|
Fixes test_add_force_dot_includes_ignored_directory test failure. Changes: - src/command/status.rs: Add list_workdir_files_split_force() and changes_to_be_staged_split_force() functions that recurse into ignored directories to discover files within them - src/command/add.rs: Use force variant when --force flag is set, merging ignored changes into visible changes for staging - src/command/add.rs: Skip directories in stage_a_file() to avoid attempting to read them as files - tests/command/publish_test.rs: Update test to reflect that publish sync is now implemented (no longer expects LBR-UNSUPPORTED-001)
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e3ceb01369
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| let refs_dir = repo.path().join(".libra").join("refs").join("heads"); | ||
| fs::create_dir_all(&refs_dir).unwrap(); | ||
| let broken_ref = refs_dir.join("broken"); | ||
| fs::write(&broken_ref, "invalid-hash-not-exist").unwrap(); |
There was a problem hiding this comment.
Corrupt reflog data instead of creating a ref file in test
This test claims to validate broken reflog handling, but it mutates .libra/refs/heads/* files instead of the reflog storage that fsck actually reads (reflog::Entity in src/command/fsck.rs). As written, regressions in reflog validation can ship undetected because the setup does not exercise the intended code path; please seed an invalid row in the reflog table (or helper API) so the scenario is real.
Useful? React with 👍 / 👎.
| assert!( | ||
| !output.status.success() || output.status.success(), |
There was a problem hiding this comment.
Assert a concrete failure condition in reflog test
The assertion !output.status.success() || output.status.success() is a tautology, so this test cannot fail regardless of fsck behavior. That removes protection for a high-risk integrity path and can let silent regressions in reflog error handling pass CI; assert a specific expected outcome (for example, non-zero exit code and/or a corruption diagnostic) instead.
Useful? React with 👍 / 👎.
…tories" This reverts commit e3ceb01.
Signed-off-by: Haj1M1 <1614778789@qq.com>
| - **`libra fsck`**: Repository integrity checker analogous to `git fsck`. Verifies | ||
| object hash integrity (SHA1/SHA256), object format validity, ref consistency, | ||
| index integrity, and cross-reference validation (including object type mismatch | ||
| detection for tree entries). Supports `--verbose`, `--json`, `--objects-only`, | ||
| `--no-cross-ref-check`, `--no-index-check`, and `--fix` (auto-repair broken refs | ||
| and rebuild corrupted index). Exit codes use a bitmask scheme: | ||
| bit 0 = object corruption, bit 1 = broken refs, bit 2 = index corruption. | ||
| - **`docs/commands/fsck.md`**: Comprehensive documentation for the `fsck` command | ||
| including parameter comparison with Git, design rationale, and CI/CD examples. |
| pub async fn execute(args: FsckArgs) { | ||
| let storage = ClientStorage::init(path::objects()); | ||
|
|
||
| let result = if let Some(ref object_id) = args.object { | ||
| check_single_object(object_id, &storage).await | ||
| } else { | ||
| check_all_objects(&args, &storage).await | ||
| }; | ||
|
|
||
| match result { | ||
| Ok(fsck_result) => { | ||
| // Exit with failure code only for serious issues (not dangling/unreachable) | ||
| if fsck_result.has_errors { | ||
| std::process::exit(1); | ||
| } | ||
| } | ||
| Err(e) => { | ||
| eprintln!("fatal: {}", e); | ||
| std::process::exit(1); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| pub async fn execute_safe(args: FsckArgs, _output: &OutputConfig) -> CliResult<()> { | ||
| // execute_safe is called from cli.rs but currently just delegates to execute | ||
| // JSON output is not supported for fsck | ||
| execute(args).await; |
| let lost_found_dir = storage | ||
| .base_path() | ||
| .parent() | ||
| .expect("storage should have parent") | ||
| .join("lost-found"); |
| fn print_notices(head_is_unborn: bool, _result: &FsckResult) { | ||
| if head_is_unborn { | ||
| eprintln!("notice: HEAD points to an unborn branch (main)"); | ||
| eprintln!("notice: No default references"); | ||
| } | ||
| } |
| // Zero hash should be invalid or not found, but should not crash | ||
| assert!( | ||
| !output.status.success() || output.status.success(), | ||
| "fsck should handle zero hash without crashing" |
| if object_path.exists() { | ||
| fs::remove_file(&object_path).unwrap(); | ||
| let output = run_libra_command(&["fsck"], repo.path()); | ||
| assert!( | ||
| !output.status.success(), | ||
| "fsck should fail on missing object" | ||
| ); | ||
| let combined = format!( | ||
| "{}{}", | ||
| String::from_utf8_lossy(&output.stdout), | ||
| String::from_utf8_lossy(&output.stderr) | ||
| ); | ||
| assert!( | ||
| combined.contains("missing") || combined.contains("not found"), | ||
| "should report missing object, got: {}", | ||
| combined | ||
| ); | ||
| } |
| // Print directory progress | ||
| if verbose { | ||
| // --verbose: match git fsck output | ||
| println!("Checking object directory"); | ||
| } else { | ||
| // default: show progress | ||
| println!("Checking object directories: 100% (256/256), done."); | ||
| } | ||
|
|
| // Verify hash integrity using ring crate. | ||
| // Git/Libra computes hash as: SHAx(type + ' ' + size + '\0' + content) | ||
| // The algorithm is determined by the repo's core.objectformat config. | ||
| let mut ctx = Context::new(match get_hash_kind() { | ||
| HashKind::Sha256 => &SHA256, | ||
| _ => &SHA1_FOR_LEGACY_USE_ONLY, | ||
| }); | ||
|
|
||
| // Add header: "<type> <size>\0" | ||
| let header = format!("{} {}\0", obj_type.to_string().to_lowercase(), size); | ||
| ctx.update(header.as_bytes()); | ||
| ctx.update(&data); | ||
| let computed_hash = ctx.finish(); | ||
| let computed_bytes = computed_hash.as_ref(); |
| object::{ObjectTrait, blob::Blob, commit::Commit, tree::Tree, types::ObjectType}, | ||
| }, | ||
| }; | ||
| use hex; |
| result.broken += 1; | ||
| let ref_name = ref_entry.name.clone().unwrap_or_default(); | ||
| result.broken_ref_names.push(ref_name.clone()); | ||
| report(FsckMsgId::Missing, "commit", commit_hash_str); | ||
| } | ||
| } else { | ||
| result.broken += 1; | ||
| let ref_name = ref_entry.name.clone().unwrap_or_default(); | ||
| result.broken_ref_names.push(ref_name.clone()); | ||
| // Invalid hash format - report as bad ref content | ||
| eprintln!("bad ref content: {}: invalid hash format", ref_name); |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5f79024b0a
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| #[command(about = "Update remote refs along with associated objects")] | ||
| Push(command::push::PushArgs), | ||
| #[command(about = "Download objects and refs from another repository")] | ||
| Fetch(command::fetch::FetchArgs), |
There was a problem hiding this comment.
Deduplicate subcommand variants in
Commands enum
This commit adds Push/Fetch/Pull (and also Diff/Grep/Blame) variants that are already defined elsewhere in the same Commands enum, which makes the enum definition invalid and prevents the CLI from compiling at all. Keeping only one definition per variant is required for the command dispatcher to build.
Useful? React with 👍 / 👎.
| let output = run_libra_command(&["fsck", zero_hash], repo.path()); | ||
| // Zero hash should be invalid or not found, but should not crash | ||
| assert!( | ||
| !output.status.success() || output.status.success(), |
There was a problem hiding this comment.
Replace tautological assertion in zero-hash fsck test
The assertion !output.status.success() || output.status.success() is always true, so this test can never fail and does not validate fsck behavior for the zero-hash edge case. That leaves this changed behavior unprotected against regressions; assert a concrete expected outcome (for example, specific exit code and/or error text) instead.
Useful? React with 👍 / 👎.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
tests/command/fsck_test.rs:512
- This assertion
!output.status.success() || output.status.success()is always true and verifies nothing. The test name suggests it should validate that fsck handles an invalid ref hash without crashing, but as written it would also pass on a crash. Consider asserting that the process did not terminate abnormally (e.g.output.status.code().is_some()) and/or that an expected message is produced.
assert!(
!output.status.success() || output.status.success(),
"fsck should handle invalid reflog reference"
);
| - **`libra fsck`**: Repository integrity checker analogous to `git fsck`. Verifies | ||
| object hash integrity (SHA1/SHA256), object format validity, ref consistency, | ||
| index integrity, and cross-reference validation (including object type mismatch | ||
| detection for tree entries). Supports `--verbose`, `--json`, `--objects-only`, | ||
| `--no-cross-ref-check`, `--no-index-check`, and `--fix` (auto-repair broken refs | ||
| and rebuild corrupted index). Exit codes use a bitmask scheme: | ||
| bit 0 = object corruption, bit 1 = broken refs, bit 2 = index corruption. | ||
| - **`docs/commands/fsck.md`**: Comprehensive documentation for the `fsck` command | ||
| including parameter comparison with Git, design rationale, and CI/CD examples. |
| pub async fn execute(args: FsckArgs) { | ||
| let storage = ClientStorage::init(path::objects()); | ||
|
|
||
| let result = if let Some(ref object_id) = args.object { | ||
| check_single_object(object_id, &storage).await | ||
| } else { | ||
| check_all_objects(&args, &storage).await | ||
| }; | ||
|
|
||
| match result { | ||
| Ok(fsck_result) => { | ||
| // Exit with failure code only for serious issues (not dangling/unreachable) | ||
| if fsck_result.has_errors { | ||
| std::process::exit(1); | ||
| } | ||
| } | ||
| Err(e) => { | ||
| eprintln!("fatal: {}", e); | ||
| std::process::exit(1); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| pub async fn execute_safe(args: FsckArgs, _output: &OutputConfig) -> CliResult<()> { | ||
| // execute_safe is called from cli.rs but currently just delegates to execute | ||
| // JSON output is not supported for fsck | ||
| execute(args).await; | ||
| Ok(()) |
| let lost_found_dir = storage | ||
| .base_path() | ||
| .parent() | ||
| .expect("storage should have parent") | ||
| .join("lost-found"); |
| // Zero hash should be invalid or not found, but should not crash | ||
| assert!( | ||
| !output.status.success() || output.status.success(), | ||
| "fsck should handle zero hash without crashing" |
| if object_path.exists() { | ||
| fs::remove_file(&object_path).unwrap(); | ||
| let output = run_libra_command(&["fsck"], repo.path()); | ||
| assert!( | ||
| !output.status.success(), | ||
| "fsck should fail on missing object" | ||
| ); | ||
| let combined = format!( | ||
| "{}{}", | ||
| String::from_utf8_lossy(&output.stdout), | ||
| String::from_utf8_lossy(&output.stderr) | ||
| ); | ||
| assert!( | ||
| combined.contains("missing") || combined.contains("not found"), | ||
| "should report missing object, got: {}", | ||
| combined | ||
| ); | ||
| } |
| fn print_notices(head_is_unborn: bool, _result: &FsckResult) { | ||
| if head_is_unborn { | ||
| eprintln!("notice: HEAD points to an unborn branch (main)"); | ||
| eprintln!("notice: No default references"); | ||
| } | ||
| } |
| // Stage 7: Check connectivity (re-verify all objects in storage order) | ||
| check_connectivity( | ||
| &all_hashes, | ||
| storage, | ||
| &mut result, | ||
| args.verbose, | ||
| args.name_objects, | ||
| args.connectivity_only, | ||
| ) | ||
| .await?; |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bc881ddee1
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
|
||
| let overall_status = match check_result.status { | ||
| CheckStatus::Ok => { | ||
| println!("Object {} is valid", object_id); |
There was a problem hiding this comment.
Avoid reporting object as valid after emitted fsck errors
check_single_object prints Object <id> is valid whenever check_result.status == Ok, but verify_object can still emit hard-error diagnostics (for example missing author/committer fields) while returning Ok status. In that case fsck exits non-zero yet still prints a success message, which is misleading for operators and machine parsers consuming stdout. Gate the success message on !has_errors (or propagate a non-Ok status when semantic checks fail).
Useful? React with 👍 / 👎.
| Diagnostic messages are printed to stdout and do NOT cause non-zero exit codes: | ||
|
|
||
| ```text | ||
| missing <type> <object-id> | ||
| hash mismatch <type> <object-id> |
There was a problem hiding this comment.
Align fsck docs with actual failure semantics for missing objects
The docs classify missing/hash mismatch as diagnostics that do not cause non-zero exit codes, but the implementation treats them as failure-causing conditions (FsckMsgId::causes_failure), so CI or scripts following this section will mis-handle command outcomes. Update this section to match the shipped behavior so users can rely on exit-code contracts.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: b3bbfeacb6
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if file_abs.is_dir() { | ||
| return Ok(StagedAction::Unchanged); |
There was a problem hiding this comment.
Avoid treating symlinked paths as unstaged directories
The new file_abs.is_dir() guard returns Unchanged before status evaluation, but Path::is_dir() follows symlinks. On Unix, a symlink to a directory is a valid index entry (120000) and should be staged as a symlink; with this change, libra add silently skips it. This creates a real correctness regression for repositories that track directory symlinks (or convert a path into such a symlink), because the change cannot be staged at all.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 76febfd0fb
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if let Some(commit_hash_str) = &ref_entry.commit { | ||
| if let Some(hash) = parse_object_hash(commit_hash_str) { |
There was a problem hiding this comment.
Treat refs with missing target OIDs as corruption
check_refs only validates rows where ref_entry.commit is Some(...); rows with None are silently skipped, so a corrupted branch/tag reference with a missing target OID is counted as checked but never marked broken. In a repo where that is the only corruption, fsck can still report success, which is a false negative for integrity checks. Add an else path that reports bad ref content (at least for non-HEAD refs) and increments the broken count.
Useful? React with 👍 / 👎.
| let tags_dir = repo.path().join(".libra").join("refs").join("tags"); | ||
| fs::create_dir_all(&tags_dir).unwrap(); | ||
| let broken_tag = tags_dir.join("broken-tag"); | ||
| fs::write(&broken_tag, "nonexistent-commit-hash").unwrap(); |
There was a problem hiding this comment.
Seed broken tag refs in DB for fsck --tags coverage
This test writes .libra/refs/tags/broken-tag, but fsck --tags loads tags from the database (reference::Entity) rather than loose ref files, so the fixture does not exercise the code path it claims to validate. As a result, regressions in tag-ref validation/reporting can pass CI undetected. Create the broken tag in the reference table (or through a helper that populates it) so the test actually covers find_and_report_tags behavior.
Useful? React with 👍 / 👎.
|
Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits. |
|
why the checks still pending? |
It's GitHub Action OOM problem, we are working on it. |
- Add list_workdir_files_split_force() to recurse into ignored dirs - Add changes_to_be_staged_split_force() for --force status checks - Skip directories in stage_a_file() to avoid opening dirs as files - Fix cli.rs duplicate command definitions (Diff, Grep, Blame, Fetch, Pull, Push)
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c2af4ce53f
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| match Head::current_result().await { | ||
| Ok(Head::Branch(name)) => { |
There was a problem hiding this comment.
Handle missing repository DB without panicking
check_head() calls Head::current_result() unconditionally, but that API relies on get_db_conn_instance() which panics when .libra/libra.db is missing or unreadable. In those cases (e.g., running libra fsck outside a repo or with a broken DB), fsck crashes the CLI thread and exits via panic instead of returning a user-facing CliError, so callers get a generic failure rather than actionable diagnostics. Please switch this path to a fallible DB access flow and propagate a structured fatal error.
Useful? React with 👍 / 👎.
| if sub_path.is_file() | ||
| && let Some(hash) = try_parse_loose_object(dir_name, &sub_path) | ||
| { |
There was a problem hiding this comment.
Report malformed loose-object filenames as corruption
The object scan only records entries when try_parse_loose_object(...) returns Some, so any non-hex or wrong-length filename under .libra/objects/<2-char>/ is silently ignored. That means fsck can return success while corrupted loose-object directory entries are present, because those files never reach object/ref/connectivity validation. This should emit a corruption diagnostic and mark the run as failed instead of skipping invalid filenames.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.
Comments suppressed due to low confidence (1)
tests/command/fsck_test.rs:512
- Same tautological assertion pattern as in
test_fsck_with_zero_hash:!output.status.success() || output.status.success()is always true regardless of fsck's behavior, so this test only verifies the process exits at all. Please pin the expected exit code/output explicitly.
assert!(
!output.status.success() || output.status.success(),
"fsck should handle invalid reflog reference"
);
| - **`libra fsck`**: Repository integrity checker analogous to `git fsck`. Verifies | ||
| object hash integrity (SHA1/SHA256), object format validity, ref consistency, | ||
| index integrity, and cross-reference validation (including object type mismatch | ||
| detection for tree entries). Supports `--verbose`, `--json`, `--objects-only`, | ||
| `--no-cross-ref-check`, `--no-index-check`, and `--fix` (auto-repair broken refs | ||
| and rebuild corrupted index). Exit codes use a bitmask scheme: | ||
| bit 0 = object corruption, bit 1 = broken refs, bit 2 = index corruption. | ||
| - **`docs/commands/fsck.md`**: Comprehensive documentation for the `fsck` command | ||
| including parameter comparison with Git, design rationale, and CI/CD examples. |
| let lost_found_dir = storage | ||
| .base_path() | ||
| .parent() | ||
| .expect("storage should have parent") | ||
| .join("lost-found"); |
| fn print_notices(head_is_unborn: bool, _result: &FsckResult) { | ||
| if head_is_unborn { | ||
| eprintln!("notice: HEAD points to an unborn branch (main)"); | ||
| eprintln!("notice: No default references"); | ||
| } | ||
| } |
| check_index(storage, &mut result, args.verbose)?; | ||
|
|
||
| // Stage 7: Check connectivity (re-verify all objects in storage order) | ||
| check_connectivity( | ||
| &all_hashes, | ||
| storage, | ||
| &mut result, | ||
| args.verbose, | ||
| args.name_objects, | ||
| args.connectivity_only, | ||
| ) | ||
| .await?; | ||
|
|
||
| // Stage 8: Find dangling and unreachable objects | ||
| find_dangling_unreachable( | ||
| storage, | ||
| &mut result, | ||
| args.unreachable, | ||
| args.no_reflogs, | ||
| args.dangling_enabled(), | ||
| args.lost_found, | ||
| ) | ||
| .await?; |
| // Zero hash should be invalid or not found, but should not crash | ||
| assert!( | ||
| !output.status.success() || output.status.success(), | ||
| "fsck should handle zero hash without crashing" |
| // Should report error or handle gracefully | ||
| let combined = format!( | ||
| "{}{}", | ||
| String::from_utf8_lossy(&output.stdout), | ||
| String::from_utf8_lossy(&output.stderr) | ||
| ); | ||
| // fsck may report "unknown" for invalid commit hashes in tags | ||
| assert!( | ||
| combined.contains("unknown") | ||
| || combined.contains("error") | ||
| || combined.contains("not found") | ||
| || output.status.success(), | ||
| "should handle broken tag reference, got: {}", |
| pub async fn execute(args: FsckArgs) { | ||
| let storage = ClientStorage::init(path::objects()); | ||
|
|
||
| let result = if let Some(ref object_id) = args.object { | ||
| check_single_object(object_id, &storage).await | ||
| } else { | ||
| check_all_objects(&args, &storage).await | ||
| }; | ||
|
|
||
| match result { | ||
| Ok(fsck_result) => { | ||
| // Exit with failure code only for serious issues (not dangling/unreachable) | ||
| if fsck_result.has_errors { | ||
| std::process::exit(1); | ||
| } | ||
| } | ||
| Err(e) => { | ||
| eprintln!("fatal: {}", e); | ||
| std::process::exit(1); | ||
| } | ||
| } | ||
| } | ||
|
|
||
| pub async fn execute_safe(args: FsckArgs, _output: &OutputConfig) -> CliResult<()> { | ||
| // execute_safe is called from cli.rs but currently just delegates to execute | ||
| // JSON output is not supported for fsck | ||
| execute(args).await; | ||
| Ok(()) | ||
| } |
Summary
Implements
fsckcommand for Libra, analogous togit fsck. Verifies repository integrity by checking objects, refs, and index.Features
--verbose,--root,--tags,--dangling,--unreachable,--lost-found,--connectivity-only