Skip to content

Conversation

@rtibbles
Copy link
Member

@rtibbles rtibbles commented Nov 26, 2025

Summary

  • Adds a pre-commit check to ensure files that should be checked in as git lfs files are not accidentally committed.
  • Adds a Python script (.pre-commit-hooks/check_lfs_pointers.py) that uses git's native tools to identify and validate LFS files
  • Clear error messages with step-by-step fix instructions when violations are detected

Implementation details:

  • Updates the global exclude in our pre commit config to allow checking of our font files in the core static folder
  • Uses git check-attr filter to identify files marked for LFS (no manual .gitattributes parsing required)
  • Checks staged content (what will be committed) using git show :filepath to verify it starts with the LFS pointer header
  • Works correctly whether or not Git LFS is installed in the environment running the check
  • Python implementation for cross-platform compatibility
  • Only uses Python stdlib (subprocess, logging, sys)
  • Outputs to stderr for proper pre-commit integration

This will prevent the recurring issue where binary files (fonts, fixtures) were accidentally committed as binary data instead of LFS pointers, which has occurred in multiple PRs (#7092, #6802, #6467).

References

Fixes #7099

Reviewer guidance

  • Run git lfs uninstall
  • Run touch test.woff
  • Add it to git and try to commit it
  • See error!

🤖 This was created by Claude Code. @rtibbles then reviewed the generated output, and did iterative rounds of updates before making it ready for review 🤖

@github-actions github-actions bot added DEV: backend Python, databases, networking, filesystem... SIZE: medium labels Nov 26, 2025
@rtibbles
Copy link
Member Author

@coderabbitai full review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 26, 2025

✅ Actions performed

Full review triggered.

@github-actions github-actions bot added DEV: renderers HTML5 apps, videos, exercises, etc. DEV: frontend labels Nov 26, 2025
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 26, 2025

Walkthrough

This PR adds a Git LFS validation pre-commit hook to detect when binary files are inadvertently committed instead of LFS pointers. It includes a new Python script that checks staged files for LFS attributes and verifies they contain pointer data, plus configuration updates to enable the hook.

Changes

Cohort / File(s) Summary
Pre-commit configuration
\.pre-commit-config\.yaml
Updated exclude regex pattern to explicitly preserve static/assets/fonts directory while excluding other static paths; added check-lfs-pointers hook configuration with entry point to new validation script
LFS validation hook
\.pre-commit-hooks/check_lfs_pointers\.py
New Python script that validates staged files marked with Git LFS filter attribute contain valid LFS pointer headers rather than binary content; includes helper functions for filter detection and pointer verification with subprocess-based git operations and error reporting

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

  • Subprocess operations: The script uses subprocess.run() for git commands; verify proper error handling and argument escaping
  • LFS pointer validation logic: Review the header comparison logic (b"version https://git-lfs.github.com/spec/v1") to ensure it correctly identifies valid pointers
  • Exit code handling: Confirm the aggregation of failures and return logic correctly signals hook success/failure to pre-commit framework

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title 'LFS File pre-commit check' accurately and concisely summarizes the main change: adding a pre-commit hook to validate Git LFS files.
Linked Issues check ✅ Passed The PR fully addresses linked issue #7099 by implementing a linter/pre-commit check that detects files marked for LFS but committed as binary, preventing recurring incidents and providing clear error messages.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing the LFS validation check: updates to .pre-commit-config.yaml and addition of the check_lfs_pointers.py script.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Description check ✅ Passed The pull request description is directly related to the changeset, clearly explaining the addition of an LFS pre-commit check and implementation details that match the file changes.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch claude/investigate-and-plan-01Q7GSNtfpZfq5yGfddewNYH

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Contributor

github-actions bot commented Nov 27, 2025

@rtibbles rtibbles force-pushed the claude/investigate-and-plan-01Q7GSNtfpZfq5yGfddewNYH branch from 712e629 to 08d3850 Compare November 27, 2025 00:13
@rtibbles rtibbles marked this pull request as draft November 27, 2025 00:13
@rtibbles rtibbles removed SIZE: medium SIZE: large DEV: renderers HTML5 apps, videos, exercises, etc. DEV: backend Python, databases, networking, filesystem... DEV: frontend labels Nov 27, 2025
@github-actions github-actions bot added DEV: backend Python, databases, networking, filesystem... SIZE: medium labels Nov 29, 2025
@rtibbles rtibbles force-pushed the claude/investigate-and-plan-01Q7GSNtfpZfq5yGfddewNYH branch from 3d5bbe5 to 08d3850 Compare November 29, 2025 16:41
@rtibbles
Copy link
Member Author

@coderabbitai full review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 29, 2025

✅ Actions performed

Full review triggered.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
.pre-commit-hooks/check_lfs_pointers.py (1)

98-99: Consider cross-platform terminal color support.

The ANSI escape codes (\033[31m) work well on Unix terminals and Git Bash, but may render as garbled text on Windows CMD. Since pre-commit is often run in various environments, consider either removing the color or using a conditional approach.

Option 1 - Remove color for simplicity:

-            logger.error("  \033[31m✗\033[0m %s", filepath)
+            logger.error("  ✗ %s", filepath)

Option 2 - Conditional color based on terminal capability:

import os
use_color = hasattr(sys.stderr, 'isatty') and sys.stderr.isatty() and os.name != 'nt'
marker = "\033[31m✗\033[0m" if use_color else "✗"
logger.error("  %s %s", marker, filepath)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 75be9db and 08d3850.

📒 Files selected for processing (2)
  • .pre-commit-config.yaml (2 hunks)
  • .pre-commit-hooks/check_lfs_pointers.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.6)
.pre-commit-hooks/check_lfs_pointers.py

31-31: subprocess call: check for execution of untrusted input

(S603)


32-32: Starting a process with a partial executable path

(S607)


37-37: Consider moving this statement to an else block

(TRY300)


56-56: subprocess call: check for execution of untrusted input

(S603)


57-57: Starting a process with a partial executable path

(S607)

🔇 Additional comments (5)
.pre-commit-hooks/check_lfs_pointers.py (4)

1-17: Well-documented module with clear purpose.

Good use of stdlib-only imports for portability. The bytes-based constant correctly handles binary content comparison.


30-39: LGTM - Logic is correct.

The static analysis hints (S603, S607, TRY300) are acceptable here: git is expected to be on PATH in any git hook context, and filenames come from the pre-commit framework. The try/except structure is clear enough.


42-64: Staged content inspection is correctly implemented.

Using git show :filepath to access the staged blob rather than the working tree file is the right approach. The bytes-based comparison correctly handles binary content.


120-121: Standard entry point pattern.

Correctly passes command-line arguments to main() and propagates the exit code.

.pre-commit-config.yaml (1)

1-1: Exclude pattern correctly allows font files to be checked.

The negative lookahead (?!assets/fonts) ensures static/assets/fonts/* paths are NOT excluded, allowing the LFS hook to validate font files while other static assets remain excluded from pre-commit checks.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@rtibbles rtibbles marked this pull request as ready for review December 1, 2025 15:30
@rtibbles
Copy link
Member Author

rtibbles commented Dec 1, 2025

(we should squash commit this)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

DEV: backend Python, databases, networking, filesystem... SIZE: medium

Projects

None yet

Development

Successfully merging this pull request may close these issues.

add linter to prevent LFS files from being checked in

2 participants