🛡️ Sentinel: [HIGH] Fix regex injection and data leakage in publish#104
🛡️ Sentinel: [HIGH] Fix regex injection and data leakage in publish#104calionauta wants to merge 1 commit into
Conversation
- Fix regex newline injection in skill name validation by using \Z instead of $. - Fix broken exclusion logic in git publish by using fnmatch for robust pattern matching. - Prevent sensitive data leakage (sessions, cache, models.json) during publish. - Add regression tests in tests/test_security_remediation.py. Co-authored-by: renatocaliari <1719651+renatocaliari@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
There was a problem hiding this comment.
Code Review
This pull request addresses security vulnerabilities and logic bugs by updating regex validators in external_source.py and local_source.py to use \Z instead of $ to prevent trailing newline bypasses. It also replaces error-prone manual pattern matching in git_publish.py with fnmatch.fnmatch to properly exclude sensitive files and directories, and adds corresponding unit tests. The review feedback suggests making the file exclusion logic case-insensitive to prevent sensitive capitalized files (e.g., Models.json) from being published on Unix-like systems, adding capitalized test cases to verify this behavior, and removing an unused fnmatch import in the test file.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| def _ignore_func(*patterns): | ||
| """Create a callable that returns a list of filenames to ignore.""" | ||
| import fnmatch | ||
|
|
||
| def _ignore(path, names): | ||
| ignored = [] | ||
| for name in names: | ||
| for pattern in patterns: | ||
| if pattern.startswith('*.'): | ||
| if name.endswith(pattern[1:]): | ||
| ignored.append(name) | ||
| break | ||
| elif pattern.startswith('.'): | ||
| if name == pattern or name.startswith(pattern.rstrip('/') + '/'): | ||
| ignored.append(name) | ||
| break | ||
| if fnmatch.fnmatch(name, pattern): | ||
| ignored.append(name) | ||
| break | ||
| return ignored | ||
|
|
||
| return _ignore |
There was a problem hiding this comment.
On Unix-like systems (such as Linux, which is standard for CI/CD and many developer environments), fnmatch.fnmatch is case-sensitive. This means that if a user has sensitive files or directories with capitalized names (e.g., Models.json, Sessions, .ENV, Debug.log), they will not match the lowercase patterns in DEFAULT_IGNORE_PATTERNS and will be silently published to the public repository, leading to critical data leakage.
To ensure robust, platform-independent, and case-insensitive exclusion of sensitive files, we should lowercase both the filenames and the patterns before matching.
| def _ignore_func(*patterns): | |
| """Create a callable that returns a list of filenames to ignore.""" | |
| import fnmatch | |
| def _ignore(path, names): | |
| ignored = [] | |
| for name in names: | |
| for pattern in patterns: | |
| if pattern.startswith('*.'): | |
| if name.endswith(pattern[1:]): | |
| ignored.append(name) | |
| break | |
| elif pattern.startswith('.'): | |
| if name == pattern or name.startswith(pattern.rstrip('/') + '/'): | |
| ignored.append(name) | |
| break | |
| if fnmatch.fnmatch(name, pattern): | |
| ignored.append(name) | |
| break | |
| return ignored | |
| return _ignore | |
| def _ignore_func(*patterns): | |
| """Create a callable that returns a list of filenames to ignore.""" | |
| import fnmatch | |
| lowercased_patterns = [p.lower() for p in patterns] | |
| def _ignore(path, names): | |
| ignored = [] | |
| for name in names: | |
| name_lower = name.lower() | |
| for pattern in lowercased_patterns: | |
| if fnmatch.fnmatch(name_lower, pattern): | |
| ignored.append(name) | |
| break | |
| return ignored | |
| return _ignore |
| import fnmatch | ||
| from agent_sync.publish.local_source import _is_valid_skill_name as local_validate | ||
| from agent_sync.publish.external_source import _is_valid_skill_name as external_validate | ||
| from agent_sync.publish.git_publish import _ignore_func, DEFAULT_IGNORE_PATTERNS |
There was a problem hiding this comment.
The fnmatch module is imported but never used in this test file. We should remove it to keep the imports clean.
| import fnmatch | |
| from agent_sync.publish.local_source import _is_valid_skill_name as local_validate | |
| from agent_sync.publish.external_source import _is_valid_skill_name as external_validate | |
| from agent_sync.publish.git_publish import _ignore_func, DEFAULT_IGNORE_PATTERNS | |
| from agent_sync.publish.local_source import _is_valid_skill_name as local_validate | |
| from agent_sync.publish.external_source import _is_valid_skill_name as external_validate | |
| from agent_sync.publish.git_publish import _ignore_func, DEFAULT_IGNORE_PATTERNS |
| def test_ignore_func_correctly_filters_sensitive_files(): | ||
| """Verify that _ignore_func correctly filters sensitive files and directories.""" | ||
| ignore = _ignore_func(*DEFAULT_IGNORE_PATTERNS) | ||
|
|
||
| # Test cases: (name, should_be_ignored) | ||
| test_cases = [ | ||
| ("sessions", True), | ||
| ("cache", True), | ||
| (".cache", True), | ||
| ("models.json", True), | ||
| ("models.yaml", True), | ||
| (".env", True), | ||
| ("my_skill.md", False), | ||
| ("SKILL.md", False), | ||
| (".git", True), | ||
| ("debug.log", True), | ||
| ("data.sqlite", True), | ||
| ("private.key", True), | ||
| ] | ||
|
|
||
| names = [tc[0] for tc in test_cases] | ||
| ignored = ignore(None, names) | ||
|
|
||
| for name, should_ignore in test_cases: | ||
| is_ignored = name in ignored | ||
| assert is_ignored == should_ignore, f"Expected {name} to be ignored={should_ignore}, but got {is_ignored}" | ||
|
|
There was a problem hiding this comment.
To prevent regressions and ensure that capitalized sensitive files/directories (e.g., Sessions, Models.json, .ENV) are also correctly ignored on all platforms (including case-sensitive Unix systems), we should add capitalized test cases to test_ignore_func_correctly_filters_sensitive_files.
def test_ignore_func_correctly_filters_sensitive_files():
"""Verify that _ignore_func correctly filters sensitive files and directories."""
ignore = _ignore_func(*DEFAULT_IGNORE_PATTERNS)
# Test cases: (name, should_be_ignored)
test_cases = [
("sessions", True),
("Sessions", True),
("cache", True),
("Cache", True),
(".cache", True),
("models.json", True),
("Models.json", True),
("models.yaml", True),
(".env", True),
(".ENV", True),
("my_skill.md", False),
("SKILL.md", False),
(".git", True),
("debug.log", True),
("Debug.log", True),
("data.sqlite", True),
("private.key", True),
]
names = [tc[0] for tc in test_cases]
ignored = ignore(None, names)
for name, should_ignore in test_cases:
is_ignored = name in ignored
assert is_ignored == should_ignore, f"Expected {name} to be ignored={should_ignore}, but got {is_ignored}"
🚨 Severity: HIGH
💡 Vulnerability:
_is_valid_skill_namefunction used the$anchor, which allows trailing newlines to bypass validation in Python'sremodule._ignore_funcused duringshutil.copytreein the publish process was incorrectly implemented, failing to ignore most of the sensitive patterns defined inDEFAULT_IGNORE_PATTERNS(e.g.,sessions/,cache/,models.json).🎯 Impact:
🔧 Fix:
$with\Zin_is_valid_skill_nameregex patterns acrosslocal_source.pyandexternal_source.py._ignore_funcingit_publish.pyto usefnmatch.fnmatchfor robust and correct glob pattern matching.✅ Verification:
tests/test_security_remediation.pywhich explicitly tests newline rejection and verifies that all sensitive patterns are correctly ignored by the updated logic.PR created automatically by Jules for task 2764878419301050900 started by @renatocaliari