Skip to content

Comments

fix: validate version string in HttpSourceDownloader#418

Merged
stack72 merged 1 commit intomainfrom
fix/validate-version-string-413
Feb 21, 2026
Merged

fix: validate version string in HttpSourceDownloader#418
stack72 merged 1 commit intomainfrom
fix/validate-version-string-413

Conversation

@stack72
Copy link
Contributor

@stack72 stack72 commented Feb 21, 2026

Summary

Closes #413.

Adds a regex allowlist (/^[a-zA-Z0-9._-]+$/) to getArchiveUrl() in HttpSourceDownloader that rejects version strings containing unsafe URL characters (slashes, .., %, ?, #) before they are interpolated into the GitHub archive URL.

This is defense-in-depth input validation, not a security patch. The actual risk is minimal because:

  • The version string is self-supplied by the user via the --version CLI flag — the "attacker" and "victim" are the same person
  • The base URL is hardcoded to https://github.com/systeminit/swamp/archive/refs — no SSRF to arbitrary hosts is possible
  • The worst case without this fix is a confusing GitHub 404 or downloading a different ref from the same repo (e.g., heads/main instead of a tagged release)

The practical benefit is better error messages: users who mistype a version string now get a clear UserError instead of a cryptic GitHub 404. The guard also provides insurance if the version source ever changes to come from an untrusted input (e.g., a shared workflow file or remote config).

Test Plan

  • Added 5 test cases to http_source_downloader_test.ts covering:
    • ../malicious — path traversal with ..
    • foo/../../heads/main — nested path traversal escaping /tags/
    • version%2F.. — URL-encoded slash
    • v1.0?ref=main — query string injection
    • v1.0#fragment — fragment injection
  • All validation fires before any network request is made
  • All 1802 existing tests continue to pass

🤖 Generated with Claude Code

Add a regex allowlist to getArchiveUrl() that rejects version strings
containing characters outside [a-zA-Z0-9._-] before they are
interpolated into the GitHub archive URL.

This is defense-in-depth input validation, not a security patch. The
actual risk is minimal because:

- The version string is self-supplied by the user via --version CLI flag
- The base URL is hardcoded to a single GitHub repo, so no SSRF to
  arbitrary hosts is possible
- The worst case without this fix is a confusing 404 or downloading a
  different ref from the same repo (e.g. heads/main instead of a tag)

The practical benefit is better error messages: users who mistype a
version string now get a clear validation error instead of a cryptic
GitHub 404. The guard also provides insurance if the version source
ever changes to come from an untrusted input (e.g. a shared workflow
file or remote config).

Adds 5 test cases covering path traversal (../malicious), nested path
traversal (foo/../../heads/main), URL-encoded characters (%2F),
query string injection (?ref=main), and fragment injection (#fragment).
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

This PR adds input validation for version strings in HttpSourceDownloader.getArchiveUrl() as defense-in-depth protection against URL injection. The implementation is clean and well-tested.

✅ No Blocking Issues

Code Quality

  • TypeScript strict mode: ✓ No any types
  • Named exports: ✓ Correctly uses named exports
  • AGPLv3 header: ✓ Present in both files
  • CI Status: ✓ Lint, tests, and format checks pass

Domain-Driven Design

  • HttpSourceDownloader is correctly placed in the infrastructure layer as an adapter implementing the domain's SourceDownloader interface
  • Using UserError (a domain error type) for validation failures is appropriate - it provides clear, actionable feedback to users
  • The validation is placed at the right abstraction level - in the infrastructure adapter that constructs URLs

Test Coverage

  • 5 comprehensive test cases covering:
    • Path traversal (../malicious)
    • Nested path traversal (foo/../../heads/main)
    • URL-encoded characters (version%2F..)
    • Query string injection (v1.0?ref=main)
    • Fragment injection (v1.0#fragment)
  • Tests are co-located with source files ✓
  • Tests use proper cleanup in finally blocks

Security

  • The regex allowlist /^[a-zA-Z0-9._-]+$/ is appropriate for version strings
  • Defense-in-depth validation is good practice, even when current risk is minimal
  • Validation fires before any network request, providing fast feedback

Suggestions (Non-blocking)

  1. Minor test efficiency: The new tests create temp directories that aren't strictly needed since validation occurs before any filesystem operations. However, this matches the existing test patterns in the file and provides consistency, so it's fine as-is.

LGTM! Well-implemented defense-in-depth fix with proper test coverage.

@stack72 stack72 merged commit b8cec4e into main Feb 21, 2026
4 checks passed
@stack72 stack72 deleted the fix/validate-version-string-413 branch February 21, 2026 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

security: validate version string in HttpSourceDownloader to prevent unexpected URL construction

1 participant