Skip to content

Fix Bad Gateway and optimize PHP workers#3219

Merged
adamziel merged 5 commits intotrunkfrom
fix-bad-gateway
Jan 29, 2026
Merged

Fix Bad Gateway and optimize PHP workers#3219
adamziel merged 5 commits intotrunkfrom
fix-bad-gateway

Conversation

@JanJakes
Copy link
Member

@JanJakes JanJakes commented Jan 29, 2026

Motivation for the change, related issues

This PR fixes Bad Gateway (502) errors that occur when multiple concurrent requests are made to the PHP request handler.

Implementation details

  1. The primary PHP instance was not used for enqueued requests. Instead, all enqueued requests needed to wait for the secondary instances, and the primary instance only served new requests directly, never from the queue.

    The fix refactors PHPProcessManager to a pool-based design where all spawned instances (including primary) are added to an idle pool, acquirePHPInstance() takes from the pool or spawns new ones if needed, and instances are returned to the pool for reuse. All PHP instances should be automatically rotated every 400 requests.

  2. Increased instance wait timeout from 5s to 30s: When all PHP instances are busy, new requests wait for one to become available. The previous 5s timeout was likely a bit too short for some scenarios. This timeout limits wait time for an available instance, not PHP execution time.

  3. Semaphore stale resolver bug: When a semaphore acquire timed out, its resolver remained in the queue. Later releases would notify these stale resolvers instead of actual waiting requests. The fix removes timed-out resolvers from the queue.

Testing Instructions (or ideally a Blueprint)

The fix can be verified by running concurrent requests in the browser console:

const iframe = document.querySelector('iframe');
const playground = iframe.contentWindow.playground;
const start = Date.now();
const requests = [
  playground.run({ code: '<?php sleep(4); echo "done1";' }),
  playground.run({ code: '<?php sleep(4); echo "done2";' }),
  playground.run({ code: '<?php sleep(4); echo "done3";' }),
  playground.run({ code: '<?php sleep(4); echo "done4";' }),
  playground.run({ code: '<?php sleep(4); echo "done5";' }),
  playground.run({ code: '<?php sleep(4); echo "done6";' }),
  // Add more and modify sleep time as needed for testing.
];
const results = await Promise.allSettled(requests);
console.log('Total time:', Date.now() - start, 'ms');
console.log(results);

On trunk, some of these promises would be rejected. In this branch, all should succeed.

@JanJakes JanJakes force-pushed the fix-bad-gateway branch 2 times, most recently from f8d7d38 to 2a7f9b3 Compare January 29, 2026 16:37
@adamziel
Copy link
Collaborator

Woah all tests are green ❤️

@JanJakes JanJakes force-pushed the fix-bad-gateway branch 3 times, most recently from 5536236 to efba470 Compare January 29, 2026 19:12
expect(() => semaphore.acquire()).rejects.toThrow(AcquireTimeoutError);
});

it('should not leave stale resolvers in queue after timeout', async () => {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch!

@JanJakes JanJakes marked this pull request as ready for review January 29, 2026 19:48
@JanJakes JanJakes requested a review from a team January 29, 2026 19:49
@adamziel adamziel merged commit c7bbcd7 into trunk Jan 29, 2026
35 checks passed
@adamziel adamziel deleted the fix-bad-gateway branch January 29, 2026 22:26
JanJakes added a commit that referenced this pull request Feb 2, 2026
This test covers the Bad Gateway issue that was addressed in
#3219.
JanJakes added a commit that referenced this pull request Feb 5, 2026
)

## Motivation for the change, related issues

This PR adds a regression test for the Bad Gateway fix implemented in
#3219. The test ensures that the `PHPProcessManager` correctly queues
requests when all PHP instances are busy and properly reuses instances
once they become available.

## Implementation details

The test simulates 6 concurrent requests with only 2 PHP instances
available, each request taking 50ms to complete. With proper queuing and
instance reuse, all 6 requests should complete within the 200ms timeout
(3 batches × 50ms = 150ms).

The test pre-boots both PHP instances before starting the concurrent
requests to eliminate instance spawning time from the timing
calculations.

## Testing Instructions (or ideally a Blueprint)

CI should be green. The new test can be run using:

```
npx nx run php-wasm-node:test-group-4-jspi --testNamePattern="should correctly queue requests and reuse PHP instances"
```

---------

Co-authored-by: Ashish Kumar <ashfame@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants