Skip to content

Harden orchestration and MCP response paths#41

Merged
christopherkarani merged 3 commits intomainfrom
audit/mission-critical-fixes
Feb 20, 2026
Merged

Harden orchestration and MCP response paths#41
christopherkarani merged 3 commits intomainfrom
audit/mission-critical-fixes

Conversation

@christopherkarani
Copy link
Owner

Pull Request

Thank you for contributing to Swarm! πŸŽ‰

We appreciate your effort to improve the framework. This template helps ensure high-quality contributions that align with the project's standards.


Description

What does this PR do?

Motivation and Context

Fixes # (issue)


Type of Change

Please check the relevant option(s):

  • πŸ› Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • πŸ”¨ Refactoring (code improvement without functional changes)
  • πŸ“š Documentation (updates to README, inline docs, or examples)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • πŸ§ͺ Test improvement (adding missing tests or improving existing ones)
  • πŸ”§ Build/CI (changes to build process, dependencies, or CI configuration)

Changes Made

Implementation Details

Files Changed

  • path/to/file.swift - Description of changes
  • path/to/another/file.swift - Description of changes

API Changes (if applicable)


Testing

Test Coverage

  • I have followed Test-Driven Development (TDD) practices (wrote tests first, then implementation)
  • All new code is covered by unit tests
  • I have added tests for edge cases and error conditions
  • I have used mock protocols for external dependencies (LLM providers, network, etc.)

Test Results

# Paste the output of: swift test

Manual Testing


Breaking Changes

Does this PR introduce breaking changes?

  • Yes (please describe below)
  • No

Breaking Change Details


Pre-Submission Checklist

Code Quality

  • My code follows the Swift 6.2 standards and project conventions
  • I have performed a self-review of my own code
  • I have run swift build and ensured no compilation errors
  • I have run swift test and all tests pass
  • I have run SwiftFormat: swift package plugin --allow-writing-to-package-directory swiftformat
  • I have run SwiftLint and fixed any warnings: swiftlint lint

Concurrency & Safety

  • All public types conform to Sendable where appropriate
  • I have used async/await and structured concurrency correctly
  • I have properly applied @MainActor, actor, or nonisolated annotations
  • My code is free from data races and concurrency issues

Documentation

  • I have added/updated documentation comments for public APIs
  • I have updated the README.md if needed
  • I have added examples to Sources/Swarm/Examples/ if introducing new features
  • My code does not use print() statements (uses swift-log instead)

Testing (TDD Required)

  • I wrote tests before writing the implementation (Red phase)
  • I wrote minimal code to make tests pass (Green phase)
  • I refactored while keeping tests green (Refactor phase)
  • Mock protocols are used for external dependencies

Version Compatibility

  • My code supports the minimum requirements: iOS 17+, macOS 14+, Swift 6.2
  • I have not introduced dependencies that break compatibility

Additional Context

Screenshots (if applicable)

Related Issues/PRs

  • Related to #
  • Depends on #
  • Blocks #

Questions or Concerns


Reviewer Checklist (for maintainers)

  • Code follows project architecture and patterns
  • Tests are comprehensive and follow TDD principles
  • Documentation is clear and complete
  • No security vulnerabilities introduced (SQL injection, XSS, command injection, etc.)
  • Performance implications are acceptable
  • Breaking changes are properly documented
  • CI/CD checks pass

Thank you again for your contribution! Your efforts help make Swarm better for the entire Swift community. If you have any questions, feel free to ask in the comments below.

@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@christopherkarani
Copy link
Owner Author

@claude review

@claude
Copy link

claude bot commented Feb 19, 2026

Claude finished @christopherkarani's task in 1m 35s β€”β€” View job


Code Review: Harden orchestration and MCP response paths

  • Gather context and read changed files
  • Review MCPProtocol.swift
  • Review OrchestrationError.swift
  • Review OrchestrationHiveEngine.swift
  • Review ParallelToolExecutor.swift
  • Review OrchestrationTests.swift
  • Review Package.resolved
  • Post final review

Summary

This PR makes three targeted hardening improvements: removes fatalError from MCP factory methods, replaces a precondition crash with a recoverable error in the Hive engine, and eliminates a false fatalError("Unreachable") from the tool executor. The direction is correct β€” these are real reliability improvements β€” but there are a few issues worth addressing before merge.


Issues Found

1. Contradictory contract for empty steps in OrchestrationHiveEngine (Medium)

OrchestrationHiveEngine.swift now has two conflicting behaviors for steps.isEmpty:

  • execute() (line 115): early-returns successfully with input passed through unchanged
  • makeGraph() (line 298): throws OrchestrationError.invalidWorkflow

Since makeGraph is only called after the early-return guard, the guard !steps.isEmpty in makeGraph is now unreachable dead code. More importantly, the design question isn't answered: should an empty steps workflow silently succeed (passthrough) or be a programmer error (throw)?

A silent passthrough could mask bugs at call sites that accidentally pass an empty array. Consider either:

  • Removing the guard in makeGraph (since it's dead code) and document the passthrough behavior as intentional
  • Or having execute() throw invalidWorkflow instead of silently returning

Fix this β†’

2. validatedJsonrpc parameter name is misleading in MCPProtocol.swift (Minor)

Sources/Swarm/MCP/MCPProtocol.swift, line 279:

private init(
    validatedJsonrpc jsonrpc: String,  // name implies validation, but none occurs
    ...
)

The argument label validatedJsonrpc implies that the value has been validated, but the private initializer performs no validation. The string "2.0" is hardcoded at every call site anyway, making the parameter redundant. Consider either:

  • Hardcoding "2.0" in the initializer body (removing the parameter entirely)
  • Or renaming to something honest like jsonrpc: since this is already private

Fix this β†’

3. fatalError removed but invariant is still unenforced in MCPProtocol.swift (Minor)

The old code used fatalError to signal an invariant violation. The new code correctly removes that, but the invariant β€” "exactly one of result or error is set" β€” is now only enforced by convention. Both fields are Optional, so it's possible (via the public init(from decoder:)) to construct an MCPResponse where both are nil or both are set. Consider adding a debug assertion:

private init(validatedJsonrpc jsonrpc: String, ...) {
    assert(
        (result == nil) != (error == nil),
        "MCPResponse invariant: exactly one of result/error must be set"
    )
    ...
}

4. Fragile duration conversion in empty-steps early return (Minor)

OrchestrationHiveEngine.swift, lines 127–130:

"orchestration.total_duration": .double(
    Double(duration.components.seconds) +
        Double(duration.components.attoseconds) / 1e18
)

Check whether the rest of the file already has a helper for this ContinuousClock.Duration β†’ Double conversion. Duplicating this calculation inline is fragile (easy to get the exponent wrong) and inconsistent. If there's an existing helper, use it here too.

5. failFast switch case is now a silent no-op (Minor)

Sources/Swarm/Tools/ParallelToolExecutor.swift, lines 279–281:

case .failFast:
    // Handled above with true cancellation; fall back to results if logic changes.
    break

The comment is helpful, but if the early-exit if case .failFast guard at line 245 is ever removed or bypassed, failFast will silently fall through and return results including failures β€” the opposite of what the strategy promises. The comment "if logic changes" doesn't protect against this. A defensive assertionFailure("Unreachable") (which is a no-op in release, unlike fatalError) would be more appropriate here than the removed fatalError.


What's Done Well

  • precondition β†’ guard/throw: Converting a crash into a recoverable thrown error is the right move. precondition has no place in library code called by users.
  • fatalError removal from MCPResponse: The factory methods were always safe (guaranteed inputs), so the do/catch/fatalError was genuinely misleading. The new approach is simpler and honest.
  • OrchestrationError.invalidWorkflow: Clean addition with all three conformances (Error, LocalizedError, CustomDebugStringConvertible) properly updated.
  • Test gating with #if canImport(HiveCore): Correct conditional compilation for platform-specific tests.
  • Package.resolved pin: Adding the hive package at a fixed revision is correct dependency management.

Missing Test Coverage

  • No test for OrchestrationError.invalidWorkflow being thrown (or for the passthrough behavior β€” whichever is the intended contract)
  • No test for the MCPResponse invariant (both nil, or both set, from a decoded payload)
  • No test for ParallelToolExecutor failFast path to confirm it still cancels correctly after this refactor

@christopherkarani
Copy link
Owner Author

@claude fix all issues

@claude
Copy link

claude bot commented Feb 20, 2026

Claude finished @christopherkarani's task in 4m 7s β€”β€” View job


All review issues fixed


Summary of changes

Sources/Swarm/MCP/MCPProtocol.swift

  • Removed the validatedJsonrpc jsonrpc: String parameter from the private init; "2.0" is now hardcoded directly in the initializer body, which matches the reality at all call sites and eliminates the misleading implication that validation is being delegated to the caller.
  • Added assert((result == nil) != (error == nil), ...) to the private init so that the JSON-RPC 2.0 mutual-exclusivity invariant is actively checked in debug builds, not just documented by convention.

Sources/Swarm/Orchestration/OrchestrationHiveEngine.swift

  • Removed the unreachable guard !steps.isEmpty { throw invalidWorkflow } from makeGraph(). The execute() early-return already handles this before makeGraph is ever called. Replaced it with assert(!steps.isEmpty, ...) so future regressions are caught in debug builds without producing a production crash.
  • Added a doc comment to the steps.isEmpty early-return in execute() making the passthrough behaviour explicit.
  • Extracted a private static func durationAsDouble(_ duration: Duration) -> Double helper to centralise the seconds + attoseconds / 1e18 calculation that was duplicated at two call sites.

Sources/Swarm/Tools/ParallelToolExecutor.swift

  • Replaced the silent break in the case .failFast: branch of the post-execution switch with assertionFailure(...). If the early-exit guard is ever removed or bypassed, the assert fires in debug builds instead of silently returning results that include failures β€” which would contradict the failFast contract.
    Β· Branch

- MCPProtocol: remove misleading `validatedJsonrpc` param from private
  init; hardcode "2.0" directly in init body instead. Add assert to
  enforce the result/error mutual-exclusivity invariant in debug builds.

- OrchestrationHiveEngine: remove dead-code guard in makeGraph() (the
  empty-steps case is handled by execute() before makeGraph is called);
  replace with an assert to catch future regressions. Document the
  intentional passthrough behaviour for empty steps in execute().
  Extract durationAsDouble(_:) helper to centralise the fragile
  seconds + attoseconds/1e18 conversion and eliminate duplication.

- ParallelToolExecutor: replace silent break in failFast switch case
  with assertionFailure to surface future logic regressions in debug
  builds β€” a silent fallthrough would silently return failures,
  violating the failFast contract.

Co-authored-by: Christopher Karani <christopherkarani@users.noreply.github.com>
@claude
Copy link

claude bot commented Feb 20, 2026

Code Review: PR #41 β€” Harden orchestration and MCP response paths

Summary

This PR makes four targeted hardening changes: removes fatalError from MCPResponse factory methods, adds an invalidWorkflow error case, handles empty orchestration step lists gracefully, and fixes a dead-code fatalError in ParallelToolExecutor. The changes are clean and correct.


Build / Test Results

Note: The build failure below is pre-existing and unrelated to this PR.

swift build β†’ FAIL
error: no such module 'CryptoKit'
  Wax/Sources/WaxCore/Checksum/SHA256Checksum.swift:1:8

CryptoKit is Apple-platform-only and unavailable on the Linux CI runner. This issue exists on main before this PR's changes. This PR does not introduce or worsen it.

swiftlint lint β€” unavailable in this environment (SourceKit library missing). Manual style review performed instead.


Change-by-change Analysis

MCPProtocol.swift β€” Remove try/catch/fatalError anti-pattern βœ…

The old pattern was internally inconsistent β€” a throwing initializer was called in a context where failure is impossible, requiring a misleading catch { fatalError(...) }:

// OLD
do {
    return try MCPResponse(id: id, result: result, error: nil)
} catch {
    fatalError("MCPResponse.success: unexpected validation failure - \(error)")
}

The new private init(validatedJsonrpc:id:result:error:) makes the invariant explicit at the call site. Clear improvement.

Minor: The validatedJsonrpc label is slightly unconventional (embedding validation semantics in the label). A plain jsonrpc with a doc comment on the initializer stating the invariant is an alternative, but the current naming is acceptable.


OrchestrationError.swift β€” Add invalidWorkflow case βœ…

Clean addition following the existing pattern exactly. All three conformances (Error, LocalizedError, CustomDebugStringConvertible) are updated correctly. No issues.


OrchestrationHiveEngine.swift β€” Handle empty steps ⚠️ Minor inconsistency

Two locations now handle empty steps with different semantics:

  1. execute() β€” returns success (input passed through, total_steps = 0)
  2. makeGraph() β€” throws invalidWorkflow

Since execute() returns early before calling makeGraph(), the guard in makeGraph is defensive dead code on the current call path. This is fine for a private helper, but the two locations disagree: is empty steps a valid no-op or a programming error?

Recommendation: Add a clarifying comment to the makeGraph guard:

// Defensive: `execute` already guards this path. Retained for direct-call safety.
guard !steps.isEmpty else {
    throw OrchestrationError.invalidWorkflow(reason: "Hive orchestration requires at least one step.")
}

The duration metadata computation is correct (seconds + attoseconds / 1e18).


ParallelToolExecutor.swift β€” Remove unreachable fatalError βœ…

Replacing fatalError("Unreachable: failFast handled above") with break is correct. The failFast case is dispatched before the switch statement via executeWithFailFast, so the switch arm is unreachable. The comment // Handled above with true cancellation; fall back to results if logic changes. adequately documents the intent. Production code should not crash on a theoretically unreachable branch β€” good fix.


OrchestrationTests.swift β€” New test for empty steps βœ…

The #if canImport(HiveCore) guard is appropriate for platform-conditional tests. Assertions cover the key properties:

  • Output is passed through unchanged
  • orchestration.engine is "hive"
  • orchestration.total_steps is 0

Minor: Assertions for result.iterationCount == 0 and result.toolCalls.isEmpty would give fuller coverage of the early-return path. Not a blocker.


PR Description

The PR description template was not filled in. Please add at minimum:

  • What does this PR do / motivation
  • Type of change (bug fix / refactor)

Verdict

Approve with minor suggestions. The changes are correct, improve robustness, and follow project conventions. No security issues. The asymmetry between execute() silently succeeding vs makeGraph() throwing for empty steps warrants a clarifying comment before merge.

Review performed by Claude Code (claude-sonnet-4-6)

@christopherkarani christopherkarani merged commit 24edfb8 into main Feb 20, 2026
0 of 4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant