fix: validate version string in HttpSourceDownloader#418
Merged
Conversation
Add a regex allowlist to getArchiveUrl() that rejects version strings containing characters outside [a-zA-Z0-9._-] before they are interpolated into the GitHub archive URL. This is defense-in-depth input validation, not a security patch. The actual risk is minimal because: - The version string is self-supplied by the user via --version CLI flag - The base URL is hardcoded to a single GitHub repo, so no SSRF to arbitrary hosts is possible - The worst case without this fix is a confusing 404 or downloading a different ref from the same repo (e.g. heads/main instead of a tag) The practical benefit is better error messages: users who mistype a version string now get a clear validation error instead of a cryptic GitHub 404. The guard also provides insurance if the version source ever changes to come from an untrusted input (e.g. a shared workflow file or remote config). Adds 5 test cases covering path traversal (../malicious), nested path traversal (foo/../../heads/main), URL-encoded characters (%2F), query string injection (?ref=main), and fragment injection (#fragment).
There was a problem hiding this comment.
Review Summary
This PR adds input validation for version strings in HttpSourceDownloader.getArchiveUrl() as defense-in-depth protection against URL injection. The implementation is clean and well-tested.
✅ No Blocking Issues
Code Quality
- TypeScript strict mode: ✓ No
anytypes - Named exports: ✓ Correctly uses named exports
- AGPLv3 header: ✓ Present in both files
- CI Status: ✓ Lint, tests, and format checks pass
Domain-Driven Design
HttpSourceDownloaderis correctly placed in the infrastructure layer as an adapter implementing the domain'sSourceDownloaderinterface- Using
UserError(a domain error type) for validation failures is appropriate - it provides clear, actionable feedback to users - The validation is placed at the right abstraction level - in the infrastructure adapter that constructs URLs
Test Coverage
- 5 comprehensive test cases covering:
- Path traversal (
../malicious) - Nested path traversal (
foo/../../heads/main) - URL-encoded characters (
version%2F..) - Query string injection (
v1.0?ref=main) - Fragment injection (
v1.0#fragment)
- Path traversal (
- Tests are co-located with source files ✓
- Tests use proper cleanup in
finallyblocks
Security
- The regex allowlist
/^[a-zA-Z0-9._-]+$/is appropriate for version strings - Defense-in-depth validation is good practice, even when current risk is minimal
- Validation fires before any network request, providing fast feedback
Suggestions (Non-blocking)
- Minor test efficiency: The new tests create temp directories that aren't strictly needed since validation occurs before any filesystem operations. However, this matches the existing test patterns in the file and provides consistency, so it's fine as-is.
LGTM! Well-implemented defense-in-depth fix with proper test coverage.
This was referenced Feb 23, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Closes #413.
Adds a regex allowlist (
/^[a-zA-Z0-9._-]+$/) togetArchiveUrl()inHttpSourceDownloaderthat rejects version strings containing unsafe URL characters (slashes,..,%,?,#) before they are interpolated into the GitHub archive URL.This is defense-in-depth input validation, not a security patch. The actual risk is minimal because:
--versionCLI flag — the "attacker" and "victim" are the same personhttps://github.com/systeminit/swamp/archive/refs— no SSRF to arbitrary hosts is possibleheads/maininstead of a tagged release)The practical benefit is better error messages: users who mistype a version string now get a clear
UserErrorinstead of a cryptic GitHub 404. The guard also provides insurance if the version source ever changes to come from an untrusted input (e.g., a shared workflow file or remote config).Test Plan
http_source_downloader_test.tscovering:../malicious— path traversal with..foo/../../heads/main— nested path traversal escaping/tags/version%2F..— URL-encoded slashv1.0?ref=main— query string injectionv1.0#fragment— fragment injection🤖 Generated with Claude Code