Skip to content

🛡️ Sentinel: [HIGH] Fix regex injection and data leakage in publish#104

Open
calionauta wants to merge 1 commit into
mainfrom
sentinel/fix-publish-security-issues-2764878419301050900
Open

🛡️ Sentinel: [HIGH] Fix regex injection and data leakage in publish#104
calionauta wants to merge 1 commit into
mainfrom
sentinel/fix-publish-security-issues-2764878419301050900

Conversation

@calionauta

Copy link
Copy Markdown
Owner

🚨 Severity: HIGH

💡 Vulnerability:

  1. Regex Newline Injection: The _is_valid_skill_name function used the $ anchor, which allows trailing newlines to bypass validation in Python's re module.
  2. Broken Exclusion Logic: The _ignore_func used during shutil.copytree in the publish process was incorrectly implemented, failing to ignore most of the sensitive patterns defined in DEFAULT_IGNORE_PATTERNS (e.g., sessions/, cache/, models.json).

🎯 Impact:

  • Potential path traversal or argument injection if skill names with newlines are used in downstream operations.
  • Critical Data Leakage: Sensitive local data, session logs, and configuration files containing personal information could be inadvertently published to a public GitHub repository.

🔧 Fix:

  • Replaced $ with \Z in _is_valid_skill_name regex patterns across local_source.py and external_source.py.
  • Refactored _ignore_func in git_publish.py to use fnmatch.fnmatch for robust and correct glob pattern matching.

Verification:

  • Added tests/test_security_remediation.py which explicitly tests newline rejection and verifies that all sensitive patterns are correctly ignored by the updated logic.
  • Ran existing publish integration tests to ensure no regressions.

PR created automatically by Jules for task 2764878419301050900 started by @renatocaliari

- Fix regex newline injection in skill name validation by using \Z instead of $.
- Fix broken exclusion logic in git publish by using fnmatch for robust pattern matching.
- Prevent sensitive data leakage (sessions, cache, models.json) during publish.
- Add regression tests in tests/test_security_remediation.py.

Co-authored-by: renatocaliari <1719651+renatocaliari@users.noreply.github.com>
@google-labs-jules

Copy link
Copy Markdown
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses security vulnerabilities and logic bugs by updating regex validators in external_source.py and local_source.py to use \Z instead of $ to prevent trailing newline bypasses. It also replaces error-prone manual pattern matching in git_publish.py with fnmatch.fnmatch to properly exclude sensitive files and directories, and adds corresponding unit tests. The review feedback suggests making the file exclusion logic case-insensitive to prevent sensitive capitalized files (e.g., Models.json) from being published on Unix-like systems, adding capitalized test cases to verify this behavior, and removing an unused fnmatch import in the test file.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment on lines 31 to 44
def _ignore_func(*patterns):
"""Create a callable that returns a list of filenames to ignore."""
import fnmatch

def _ignore(path, names):
ignored = []
for name in names:
for pattern in patterns:
if pattern.startswith('*.'):
if name.endswith(pattern[1:]):
ignored.append(name)
break
elif pattern.startswith('.'):
if name == pattern or name.startswith(pattern.rstrip('/') + '/'):
ignored.append(name)
break
if fnmatch.fnmatch(name, pattern):
ignored.append(name)
break
return ignored

return _ignore

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

On Unix-like systems (such as Linux, which is standard for CI/CD and many developer environments), fnmatch.fnmatch is case-sensitive. This means that if a user has sensitive files or directories with capitalized names (e.g., Models.json, Sessions, .ENV, Debug.log), they will not match the lowercase patterns in DEFAULT_IGNORE_PATTERNS and will be silently published to the public repository, leading to critical data leakage.

To ensure robust, platform-independent, and case-insensitive exclusion of sensitive files, we should lowercase both the filenames and the patterns before matching.

Suggested change
def _ignore_func(*patterns):
"""Create a callable that returns a list of filenames to ignore."""
import fnmatch
def _ignore(path, names):
ignored = []
for name in names:
for pattern in patterns:
if pattern.startswith('*.'):
if name.endswith(pattern[1:]):
ignored.append(name)
break
elif pattern.startswith('.'):
if name == pattern or name.startswith(pattern.rstrip('/') + '/'):
ignored.append(name)
break
if fnmatch.fnmatch(name, pattern):
ignored.append(name)
break
return ignored
return _ignore
def _ignore_func(*patterns):
"""Create a callable that returns a list of filenames to ignore."""
import fnmatch
lowercased_patterns = [p.lower() for p in patterns]
def _ignore(path, names):
ignored = []
for name in names:
name_lower = name.lower()
for pattern in lowercased_patterns:
if fnmatch.fnmatch(name_lower, pattern):
ignored.append(name)
break
return ignored
return _ignore

Comment on lines +1 to +4
import fnmatch
from agent_sync.publish.local_source import _is_valid_skill_name as local_validate
from agent_sync.publish.external_source import _is_valid_skill_name as external_validate
from agent_sync.publish.git_publish import _ignore_func, DEFAULT_IGNORE_PATTERNS

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The fnmatch module is imported but never used in this test file. We should remove it to keep the imports clean.

Suggested change
import fnmatch
from agent_sync.publish.local_source import _is_valid_skill_name as local_validate
from agent_sync.publish.external_source import _is_valid_skill_name as external_validate
from agent_sync.publish.git_publish import _ignore_func, DEFAULT_IGNORE_PATTERNS
from agent_sync.publish.local_source import _is_valid_skill_name as local_validate
from agent_sync.publish.external_source import _is_valid_skill_name as external_validate
from agent_sync.publish.git_publish import _ignore_func, DEFAULT_IGNORE_PATTERNS

Comment on lines +15 to +41
def test_ignore_func_correctly_filters_sensitive_files():
"""Verify that _ignore_func correctly filters sensitive files and directories."""
ignore = _ignore_func(*DEFAULT_IGNORE_PATTERNS)

# Test cases: (name, should_be_ignored)
test_cases = [
("sessions", True),
("cache", True),
(".cache", True),
("models.json", True),
("models.yaml", True),
(".env", True),
("my_skill.md", False),
("SKILL.md", False),
(".git", True),
("debug.log", True),
("data.sqlite", True),
("private.key", True),
]

names = [tc[0] for tc in test_cases]
ignored = ignore(None, names)

for name, should_ignore in test_cases:
is_ignored = name in ignored
assert is_ignored == should_ignore, f"Expected {name} to be ignored={should_ignore}, but got {is_ignored}"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent regressions and ensure that capitalized sensitive files/directories (e.g., Sessions, Models.json, .ENV) are also correctly ignored on all platforms (including case-sensitive Unix systems), we should add capitalized test cases to test_ignore_func_correctly_filters_sensitive_files.

def test_ignore_func_correctly_filters_sensitive_files():
    """Verify that _ignore_func correctly filters sensitive files and directories."""
    ignore = _ignore_func(*DEFAULT_IGNORE_PATTERNS)

    # Test cases: (name, should_be_ignored)
    test_cases = [
        ("sessions", True),
        ("Sessions", True),
        ("cache", True),
        ("Cache", True),
        (".cache", True),
        ("models.json", True),
        ("Models.json", True),
        ("models.yaml", True),
        (".env", True),
        (".ENV", True),
        ("my_skill.md", False),
        ("SKILL.md", False),
        (".git", True),
        ("debug.log", True),
        ("Debug.log", True),
        ("data.sqlite", True),
        ("private.key", True),
    ]

    names = [tc[0] for tc in test_cases]
    ignored = ignore(None, names)

    for name, should_ignore in test_cases:
        is_ignored = name in ignored
        assert is_ignored == should_ignore, f"Expected {name} to be ignored={should_ignore}, but got {is_ignored}"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant