
fix(cli): Add retry logic for executable copy to handle EPERM race condition #5278

Open
ThomasSteinbach wants to merge 1 commit into DioxusLabs:main from ThomasSteinbach:fix/eperm-race-condition

ThomasSteinbach commented Feb 1, 2026

Problem

Fixes #5275
Fixes #5256

When running dx serve or dx build, the CLI intermittently fails with:

Error: Operation not permitted (os error 1)

This occurs at the file copy step after cargo completes compilation. The root cause is a race condition where cargo has finished building but hasn't fully released OS-level file handles to the executable yet.

Failure rate: 20-90% depending on system load and build frequency
Affected platforms: Primarily macOS, potentially other Unix systems
Impact: Blocks development workflow, requires repeated manual retries

Root Cause

The error occurs in write_executable() at the std::fs::copy() call:

std::fs::copy(exe, self.main_exe())?;

Timing window: When this copy happens 0-200ms after cargo exits, the OS may still be releasing file locks/handles. This produces EPERM (error code 1).
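A minimal sketch of how the EPERM case might be told apart from other I/O errors. The helper name `is_eperm` is hypothetical and not taken from the PR. It assumes a Unix target where EPERM is errno 1, as stated above. Checking the raw OS code rather than `io::ErrorKind::PermissionDenied` matters because, on Unix, Rust's standard library maps both EPERM and EACCES to that kind.

```rust
use std::io;

/// Hypothetical helper (not from the PR): true only for EPERM.
/// Assumes a Unix target where EPERM is errno 1 (Linux, macOS).
fn is_eperm(err: &io::Error) -> bool {
    // raw_os_error() is None for errors that did not come from the OS,
    // so synthetic errors are never treated as EPERM.
    err.raw_os_error() == Some(1)
}
```

With this check, an EACCES failure (errno 13 on Linux and macOS) still surfaces immediately instead of being retried.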

Solution

Add retry logic with exponential backoff specifically for EPERM errors:

  • Retry policy: Up to 5 attempts with exponential backoff (20ms, 40ms, 80ms, 160ms, 320ms)
  • Scope: Only retries on EPERM (errno 1), other errors fail immediately
  • Logging: Warns on retry attempts, logs success after retries
  • Performance: No added latency when the first copy succeeds; a 20-320ms delay per retry only when the race occurs
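The retry policy above could be sketched roughly as follows. This is not the code from the PR: `retry_on_eperm` and `copy_with_retry` are hypothetical names, logging uses `eprintln!` instead of the CLI's own logging, and the sketch assumes "5 attempts" means up to five retries after the initial try, so that all five listed delays are used.

```rust
use std::io;
use std::path::Path;
use std::thread;
use std::time::Duration;

/// Assumption: EPERM is errno 1, as on Linux and macOS.
const EPERM: i32 = 1;

/// Hypothetical helper: run `op`, retrying only on EPERM and doubling
/// the delay after each failed attempt. Any other error fails immediately.
fn retry_on_eperm<T>(
    max_retries: u32,
    initial_delay: Duration,
    mut op: impl FnMut() -> io::Result<T>,
) -> io::Result<T> {
    let mut delay = initial_delay;
    let mut retries = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) if err.raw_os_error() == Some(EPERM) && retries < max_retries => {
                retries += 1;
                eprintln!("EPERM during copy, retry {retries}/{max_retries} in {delay:?}");
                thread::sleep(delay);
                delay *= 2; // 20ms, 40ms, 80ms, 160ms, 320ms for a 20ms start
            }
            // Non-EPERM errors, or EPERM after the last retry, are returned as-is.
            Err(err) => return Err(err),
        }
    }
}

/// Hypothetical wrapper around the copy step, using the delays listed above.
fn copy_with_retry(from: &Path, to: &Path) -> io::Result<u64> {
    retry_on_eperm(5, Duration::from_millis(20), || std::fs::copy(from, to))
}
```

Passing the operation as a closure keeps the retry logic testable without having to provoke a real EPERM from the filesystem.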

Testing

Before fix: 20-90% failure rate in stress testing
After fix: 0% failure rate in 20+ consecutive builds

The fix has been verified with:

  1. Repeated clean builds (rm -rf target/dx && dx build)
  2. Rapid rebuild cycles (triggering race condition frequently)
  3. Normal development workflow (dx serve)
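For anyone wanting to reproduce the clean-build stress test, a small harness along these lines might work. It is a sketch, not the script used for the numbers above: `count_failures` and `clean_dx_build` are hypothetical names, and it assumes `dx` is on the `PATH` and that the harness runs from the project root.

```rust
use std::fs;
use std::io;
use std::process::Command;

/// Run `run_once` `runs` times and return how many runs failed.
fn count_failures(runs: u32, mut run_once: impl FnMut() -> bool) -> u32 {
    (0..runs).filter(|_| !run_once()).count() as u32
}

/// One clean build, mirroring `rm -rf target/dx && dx build`.
/// Returns true when `dx build` exits successfully.
fn clean_dx_build() -> bool {
    match fs::remove_dir_all("target/dx") {
        Ok(()) => {}
        // Like `rm -rf`, a missing directory is not an error.
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(_) => return false,
    }
    Command::new("dx")
        .arg("build")
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

Calling `count_failures(20, clean_dx_build)` would then run 20 consecutive clean builds and report how many of them failed.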

Changes

  • Add Duration to imports in packages/cli/src/build/request.rs
  • Replace single std::fs::copy() call with retry loop handling EPERM
  • Add informative logging for debugging when retries occur

Alternative Approaches Considered

  1. Longer fixed delay: Not adaptive to system load, adds unnecessary latency
  2. Wait for cargo process exit: Already happens, race is at OS file handle level
  3. Poll file readiness: More complex, not portable across platforms
  4. Ignore EPERM: Would hide real permission errors

The retry approach is simple, robust, and has minimal performance impact.

…ndition

Fixes DioxusLabs#5275

When dx serve copies the compiled executable, cargo may not have fully
released file handles yet. This causes intermittent EPERM errors with
20-90% failure rate depending on system load.

Solution: Retry with exponential backoff (20ms, 40ms, 80ms, 160ms, 320ms)
- Only retries on EPERM (errno 1)
- Max 5 attempts
- Logs warnings when retries occur
- Verified 0% failure rate in stress testing (20/20 builds)
@ThomasSteinbach ThomasSteinbach requested a review from a team as a code owner February 1, 2026 16:33
