🛡️ Sentinel: [HIGH] Harden publish flow against leakage and injection#117
🛡️ Sentinel: [HIGH] Harden publish flow against leakage and injection#117calionauta wants to merge 1 commit into
Conversation
- Fix newline injection vulnerability in skill name validation by using \Z anchor. - Refactor _ignore_func in git_publish.py to use robust fnmatch.filter matching. - Prevent content leakage by disabling symlink following in shutil.copytree and shutil.copy2 during publish. - Add comprehensive security regression tests in tests/test_publish_security.py. Co-authored-by: renatocaliari <1719651+renatocaliari@users.noreply.github.com>
|
👋 Jules, reporting for duty! I'm here to lend a hand with this pull request. When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down. I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job! For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with New to Jules? Learn more at jules.google/docs. For security, I will only act on instructions from the user who triggered this task. |
There was a problem hiding this comment.
Code Review
This pull request hardens the publishing flow against symlink traversal and regex injection by using absolute end-of-string anchors in regex validation, adopting fnmatch for pattern matching, and configuring shutil copy functions to not follow symlinks. The reviewer identified a critical edge case where is_dir() returns True for directory symlinks, which would still trigger shutil.copytree and leak content; they suggested checking is_symlink() first. Additionally, the reviewer recommended using shutil.ignore_patterns for cleaner code and corrected incomplete sentences in the sentinel documentation.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if src_path.is_dir(): | ||
| shutil.copytree(src_path, dest, dirs_exist_ok=True, | ||
| # symlinks=True prevents content leakage by preserving links as links | ||
| shutil.copytree(src_path, dest, dirs_exist_ok=True, symlinks=True, | ||
| ignore=_ignore_func(*DEFAULT_IGNORE_PATTERNS)) | ||
| else: | ||
| dest.parent.mkdir(parents=True, exist_ok=True) | ||
| shutil.copy2(src_path, dest) | ||
| # follow_symlinks=False ensures we don't accidentally copy target content | ||
| shutil.copy2(src_path, dest, follow_symlinks=False) |
There was a problem hiding this comment.
If src_path is a symbolic link pointing to a directory, src_path.is_dir() will return True. Calling shutil.copytree on it will follow the symlink and copy all of its target contents, which leads to the exact content leakage/information disclosure vulnerability this PR aims to prevent. To fix this, explicitly check if src_path is a symlink first and copy it as a symlink using shutil.copy2(..., follow_symlinks=False).
| if src_path.is_dir(): | |
| shutil.copytree(src_path, dest, dirs_exist_ok=True, | |
| # symlinks=True prevents content leakage by preserving links as links | |
| shutil.copytree(src_path, dest, dirs_exist_ok=True, symlinks=True, | |
| ignore=_ignore_func(*DEFAULT_IGNORE_PATTERNS)) | |
| else: | |
| dest.parent.mkdir(parents=True, exist_ok=True) | |
| shutil.copy2(src_path, dest) | |
| # follow_symlinks=False ensures we don't accidentally copy target content | |
| shutil.copy2(src_path, dest, follow_symlinks=False) | |
| if src_path.is_symlink(): | |
| dest.parent.mkdir(parents=True, exist_ok=True) | |
| shutil.copy2(src_path, dest, follow_symlinks=False) | |
| elif src_path.is_dir(): | |
| # symlinks=True prevents content leakage by preserving links as links | |
| shutil.copytree(src_path, dest, dirs_exist_ok=True, symlinks=True, | |
| ignore=_ignore_func(*DEFAULT_IGNORE_PATTERNS)) | |
| else: | |
| dest.parent.mkdir(parents=True, exist_ok=True) | |
| # follow_symlinks=False ensures we don't accidentally copy target content | |
| shutil.copy2(src_path, dest, follow_symlinks=False) |
| if skill.path.is_dir(): | ||
| shutil.copytree(skill.path, dest, dirs_exist_ok=True, | ||
| shutil.copytree(skill.path, dest, dirs_exist_ok=True, symlinks=True, | ||
| ignore=_ignore_func(*DEFAULT_IGNORE_PATTERNS)) | ||
| else: | ||
| dest.parent.mkdir(parents=True, exist_ok=True) | ||
| shutil.copy2(skill.path, dest) | ||
| shutil.copy2(skill.path, dest, follow_symlinks=False) |
There was a problem hiding this comment.
If skill.path is a symbolic link pointing to a directory, skill.path.is_dir() will return True. Calling shutil.copytree on it will follow the symlink and copy all of its target contents, leading to content leakage. Explicitly check if skill.path is a symlink first and copy it as a symlink using shutil.copy2(..., follow_symlinks=False).
| if skill.path.is_dir(): | |
| shutil.copytree(skill.path, dest, dirs_exist_ok=True, | |
| shutil.copytree(skill.path, dest, dirs_exist_ok=True, symlinks=True, | |
| ignore=_ignore_func(*DEFAULT_IGNORE_PATTERNS)) | |
| else: | |
| dest.parent.mkdir(parents=True, exist_ok=True) | |
| shutil.copy2(skill.path, dest) | |
| shutil.copy2(skill.path, dest, follow_symlinks=False) | |
| if skill.path.is_symlink(): | |
| dest.parent.mkdir(parents=True, exist_ok=True) | |
| shutil.copy2(skill.path, dest, follow_symlinks=False) | |
| elif skill.path.is_dir(): | |
| shutil.copytree(skill.path, dest, dirs_exist_ok=True, symlinks=True, | |
| ignore=_ignore_func(*DEFAULT_IGNORE_PATTERNS)) | |
| else: | |
| dest.parent.mkdir(parents=True, exist_ok=True) | |
| shutil.copy2(skill.path, dest, follow_symlinks=False) |
| **Vulnerability:** The public skill publishing flow was vulnerable to information disclosure via symbolic link traversal and brittle file exclusion logic. followed symlinks into sensitive areas, and the hand-rolled ignore function only handled a subset of intended patterns. | ||
| **Learning:** (before 3.12 or if not specified) and follow symlinks by default. Hand-rolling glob matching is error-prone; is the robust standard. |
There was a problem hiding this comment.
The sentences in this section are incomplete and contain missing words/subjects. Please complete them to ensure the documentation is clear and professional.
| **Vulnerability:** The public skill publishing flow was vulnerable to information disclosure via symbolic link traversal and brittle file exclusion logic. followed symlinks into sensitive areas, and the hand-rolled ignore function only handled a subset of intended patterns. | |
| **Learning:** (before 3.12 or if not specified) and follow symlinks by default. Hand-rolling glob matching is error-prone; is the robust standard. | |
| **Vulnerability:** The public skill publishing flow was vulnerable to information disclosure via symbolic link traversal and brittle file exclusion logic. Copy operations followed symlinks into sensitive areas, and the hand-rolled ignore function only handled a subset of intended patterns. | |
| **Learning:** `shutil.copytree` (before 3.12 or if not specified) and `shutil.copy2` follow symlinks by default. Hand-rolling glob matching is error-prone; `fnmatch.filter` is the robust standard. |
| import fnmatch | ||
|
|
||
| def _ignore(path, names): | ||
| ignored = [] | ||
| for name in names: | ||
| for pattern in patterns: | ||
| if pattern.startswith('*.'): | ||
| if name.endswith(pattern[1:]): | ||
| ignored.append(name) | ||
| break | ||
| elif pattern.startswith('.'): | ||
| if name == pattern or name.startswith(pattern.rstrip('/') + '/'): | ||
| ignored.append(name) | ||
| break | ||
| return ignored | ||
| for pattern in patterns: | ||
| # Use fnmatch.filter to robustly match patterns against names | ||
| matches = fnmatch.filter(names, pattern) | ||
| ignored.extend(matches) | ||
| return list(set(ignored)) # Deduplicate | ||
|
|
||
| return _ignore |
There was a problem hiding this comment.
Instead of manually re-implementing pattern filtering using fnmatch.filter, you can use Python's built-in shutil.ignore_patterns which is standard, highly optimized, and returns a set of ignored names directly.
| import fnmatch | |
| def _ignore(path, names): | |
| ignored = [] | |
| for name in names: | |
| for pattern in patterns: | |
| if pattern.startswith('*.'): | |
| if name.endswith(pattern[1:]): | |
| ignored.append(name) | |
| break | |
| elif pattern.startswith('.'): | |
| if name == pattern or name.startswith(pattern.rstrip('/') + '/'): | |
| ignored.append(name) | |
| break | |
| return ignored | |
| for pattern in patterns: | |
| # Use fnmatch.filter to robustly match patterns against names | |
| matches = fnmatch.filter(names, pattern) | |
| ignored.extend(matches) | |
| return list(set(ignored)) # Deduplicate | |
| return _ignore | |
| return shutil.ignore_patterns(*patterns) |
This PR addresses several high-priority security issues in the
publishflow that could lead to information disclosure or command/argument injection._is_valid_skill_nameto use\Zinstead of$, preventing skill names with trailing newlines from bypassing validation.git_publish.pywith a robustfnmatch-based filter that correctly respects allDEFAULT_IGNORE_PATTERNS(e.g.,sessions/,models.json).shutiloperations to preserve symlinks as links rather than following them, preventing the accidental inclusion of sensitive system files in public skill repositories.Verification:
tests/test_publish_security.pyconfirms all fixes.PR created automatically by Jules for task 8613642642801538767 started by @renatocaliari