Fix simulation infrastructure bitrot in fdev testing framework #1611

Open
@sanity

Problem

The simulation infrastructure appears to be suffering from bitrot and is currently non-functional. All network simulation tests fail with connection errors, preventing large-scale algorithmic testing.

Current Behavior

When running simulation tests, all nodes fail to connect to gateways:

cargo run --bin fdev -- test --nodes 2 --gateways 1 --events 1 single-process

Error Output:

ERROR freenet::operations::connect: Failed while attempting connection to gateway, error: failed notifying, channel closed

Results:

  • 0% of nodes successfully connect to gateways
  • Tests report "success" despite complete connection failure
  • Network connectivity validation fails for any node count (2, 10, 50, 1000)

Root Cause Analysis

The simulation infrastructure in crates/core/src/node/testing_impl.rs and crates/fdev/src/testing/ has several issues:

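  1. Channel Communication Failures: the "failed notifying, channel closed" error means the receiving end of a channel was gone when the sender tried to notify it, which suggests the async channel wiring has drifted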
  2. Configuration Mismatches: Gateway configuration inconsistencies between setup and connection phases
  3. API Drift: The MemoryConnManager and SimNetwork interfaces may be out of sync with current networking code
  4. No CI Coverage: Simulation tests aren't run in CI, allowing bitrot to accumulate undetected
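To illustrate the first item, here is a minimal sketch of how a "channel closed" failure arises when a receiver is dropped before the sender notifies it. It uses std::sync::mpsc as a stand-in for whatever async channel the simulation code actually uses, and the names Notify, notify, and the two helper functions are hypothetical, not taken from the Freenet codebase.

```rust
use std::sync::mpsc;

/// Hypothetical stand-in for a notification sent to a node's event loop.
#[derive(Debug, PartialEq)]
pub struct Notify(pub u32);

/// Try to notify a listener; map a closed channel to a descriptive error.
pub fn notify(tx: &mpsc::Sender<Notify>, msg: Notify) -> Result<(), String> {
    // `send` only fails when the receiving half has been dropped.
    tx.send(msg)
        .map_err(|_| "failed notifying, channel closed".to_string())
}

/// Reproduce the failure mode: the receiver is dropped before the send.
pub fn notify_after_receiver_dropped() -> Result<(), String> {
    let (tx, rx) = mpsc::channel();
    drop(rx); // simulates a listener that exited or was never kept alive
    notify(&tx, Notify(1))
}

/// The healthy path: the receiver outlives the send and gets the message.
pub fn notify_with_live_receiver() -> Option<Notify> {
    let (tx, rx) = mpsc::channel();
    notify(&tx, Notify(7)).ok()?;
    rx.recv().ok()
}
```

If the real failure follows this pattern, the audit would look for the place where the receiving half is dropped or never handed to a running task, rather than at the send site that logs the error.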

Expected Behavior

The simulation infrastructure should:

  • Successfully establish connections between nodes and gateways
  • Validate network connectivity before running events
  • Execute real Freenet algorithms (routing, contract propagation) at scale
  • Support testing with tens to thousands of nodes using in-memory transport
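The connectivity validation step could be sketched roughly as follows. This is an illustration only: ConnectivityReport, validate_connectivity, and the min_ratio threshold are hypothetical names, and how the simulation would actually count connected nodes is not specified in this issue.

```rust
/// Hypothetical summary of a simulation's connection phase.
#[derive(Debug, Clone, Copy)]
pub struct ConnectivityReport {
    pub total_nodes: usize,
    pub connected_nodes: usize,
}

impl ConnectivityReport {
    /// Fraction of nodes that reached a gateway; 0.0 for an empty network.
    pub fn ratio(&self) -> f64 {
        if self.total_nodes == 0 {
            0.0
        } else {
            self.connected_nodes as f64 / self.total_nodes as f64
        }
    }
}

/// Fail the run before any events execute if too few nodes are connected.
pub fn validate_connectivity(
    report: ConnectivityReport,
    min_ratio: f64,
) -> Result<(), String> {
    let ratio = report.ratio();
    if ratio >= min_ratio {
        Ok(())
    } else {
        Err(format!(
            "only {}/{} nodes connected ({:.0}%), required at least {:.0}%",
            report.connected_nodes,
            report.total_nodes,
            ratio * 100.0,
            min_ratio * 100.0
        ))
    }
}
```

Returning an error here, instead of only logging, would stop a run with 0% connectivity from being reported as a success.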

Impact

This prevents:

  • Large-scale algorithmic validation (testing small-world routing at scale)
  • Performance characterization (network behavior with 100+ nodes)
  • Topology testing (partial connectivity, network partition scenarios)
  • Regression testing for core network algorithms

Affected Components

  • crates/fdev/src/testing/single_process.rs - Main simulation entry point
  • crates/core/src/node/testing_impl.rs - Core simulation infrastructure
  • crates/core/src/node/network_bridge/in_memory.rs - In-memory transport
  • SimNetwork, MemoryConnManager, EventChain - Key simulation types

Proposed Solution

  1. Audit and fix async channel usage in simulation infrastructure
  2. Align configuration APIs between simulation and production code
  3. Add basic simulation tests to CI to prevent future bitrot
  4. Create integration tests that validate end-to-end simulation functionality
  5. Document simulation architecture and maintenance requirements
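For step 3, one possible shape for a CI guard is a check that treats a run as failed whenever the gateway connection error appears in its log, regardless of the reported exit status. The sketch below is an assumption about how such a check might look; count_gateway_failures and check_run are hypothetical helpers, and the only string taken from this issue is the error text quoted above.

```rust
/// Error text taken from the log line quoted in this issue.
pub const GATEWAY_FAILURE: &str = "Failed while attempting connection to gateway";

/// Count log lines that report a failed gateway connection.
pub fn count_gateway_failures(log: &str) -> usize {
    log.lines()
        .filter(|line| line.contains(GATEWAY_FAILURE))
        .count()
}

/// Decide whether a simulation run should pass in CI.
/// `exit_ok` is whether the process itself reported success.
pub fn check_run(exit_ok: bool, log: &str) -> Result<(), String> {
    if !exit_ok {
        return Err("simulation process exited with a failure status".to_string());
    }
    let failures = count_gateway_failures(log);
    if failures > 0 {
        // The run claimed success but nodes could not reach a gateway.
        return Err(format!("{failures} gateway connection failure(s) in log"));
    }
    Ok(())
}
```

Scanning the log is a stopgap. Once the simulation itself returns an error on failed connections, the exit status alone should be enough.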

Testing Strategy

Once fixed, the simulation infrastructure should support:

  • Small tests (2-10 nodes) for basic connectivity validation
  • Medium tests (50-100 nodes) for algorithmic behavior
  • Large tests (1000+ nodes) for scaling characterization
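The three tiers could be mapped onto fdev invocations along these lines. The flags (--nodes, --gateways, --events, single-process) are the ones used in the command shown earlier in this issue. The Tier type is hypothetical, and the gateway and event counts per tier are assumed values chosen purely for illustration.

```rust
/// Hypothetical test tiers matching the strategy in this issue.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tier {
    Small,
    Medium,
    Large,
}

impl Tier {
    /// Returns (nodes, gateways, events). Node counts follow the ranges in
    /// the issue; gateway and event counts are assumptions.
    pub fn params(self) -> (u32, u32, u32) {
        match self {
            Tier::Small => (10, 1, 10),
            Tier::Medium => (100, 2, 100),
            Tier::Large => (1000, 10, 1000),
        }
    }

    /// Arguments to pass after `cargo run --bin fdev --`.
    pub fn fdev_args(self) -> Vec<String> {
        let (nodes, gateways, events) = self.params();
        vec![
            "test".to_string(),
            "--nodes".to_string(),
            nodes.to_string(),
            "--gateways".to_string(),
            gateways.to_string(),
            "--events".to_string(),
            events.to_string(),
            "single-process".to_string(),
        ]
    }
}
```

Keeping the tiers in one place would let CI run only the small tier on every change, with the larger tiers reserved for scheduled or manual runs.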

Priority

Medium-High - This infrastructure could provide valuable testing capabilities for Freenet's core algorithms, but currently no tests depend on it working.

Related Work

  • The production network tests (ping tests) work correctly and test real network behavior
  • GitHub Actions CI currently only runs production network tests, not simulation tests
