Better sandbox lifecycle management#95

Merged
ftes merged 3 commits into ftes:main from nathanl:sandbox_owner
Nov 14, 2025
Conversation

@nathanl (Contributor) commented Nov 6, 2025

Previously, I needed to add Process.sleep/1 calls at the ends of some of my tests to allow any LiveViews that were still querying via the sandbox to shut down before the test process ended; otherwise I would see connection ownership errors in the logs. This PR makes two changes to remove that need.

  1. Use `Ecto.Adapters.SQL.Sandbox.start_owner!/2` so the sandbox owner is a separate process rather than the test process, and use an `on_exit/1` callback to terminate the sandbox after the test process terminates, as shown here.
  2. Since that only adds a small delay before the sandbox is terminated, add a `sandbox_owner_shutdown_delay` option so users can specify a number of additional milliseconds to wait before shutting the sandbox down.
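
The two changes together might look something like this, as a minimal sketch: the module name `MyAppWeb.SandboxCase` and the plumbing are assumptions for illustration, not the actual PR diff, but `start_owner!/2`, `stop_owner/1`, and `on_exit/1` are the standard Ecto/ExUnit APIs.

```elixir
# Sketch only: hypothetical case template showing the approach described above.
defmodule MyAppWeb.SandboxCase do
  use ExUnit.CaseTemplate

  alias Ecto.Adapters.SQL.Sandbox

  setup context do
    # 1. A separate process (not the test process) owns the connection,
    #    so it survives the test process's termination.
    pid = Sandbox.start_owner!(MyApp.Repo, shared: not context.async)

    on_exit(fn ->
      # 2. Optionally wait so dangling LiveViews can finish their queries
      #    before the owner goes away.
      delay = context[:sandbox_owner_shutdown_delay] || 0
      Process.sleep(delay)
      Sandbox.stop_owner(pid)
    end)

    :ok
  end
end
```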

Item 2 feels a bit smelly TBH, because it's a global (not per-test) option, and only guesswork can find the right value. But given that Playwright tests can't manage the LV processes in any way, I don't know a better solution, and in my case a 100 ms delay seems fine.

@nathanl (Contributor, Author) commented Nov 6, 2025

@ftes what do you think? I don't love it, but I love my Process.sleep/1 per test even less. All I want is stable, clean test runs, even if they're slower.

@ftes (Owner) commented Nov 7, 2025

Slack discussion

```elixir
  doc:
    "Define custom Playwright [selector engines](https://playwright.dev/docs/extensibility#custom-selector-engines)."
],
sandbox_owner_shutdown_delay: [
```
@ftes (Owner) Nov 7, 2025:

If we add this key to `setup_keys` below (line 130), this could be overridden per test/describe/suite via ExUnit's `@tag sandbox_owner_shutdown_delay: x`.

Wdyt?
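
Per-test usage would then look something like this (a sketch; the case module name is hypothetical, the tag name follows the suggestion above):

```elixir
defmodule MyAppWeb.SlowPageTest do
  # Hypothetical case module providing the Playwright + sandbox setup.
  use MyAppWeb.PlaywrightCase, async: true

  # This test's LiveViews take longer to wind down, so wait 250 ms
  # before terminating the sandbox owner.
  @tag sandbox_owner_shutdown_delay: 250
  test "renders the dashboard", %{conn: conn} do
    # ...
  end
end
```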

@nathanl (Contributor, Author):

I love the idea. Will try it out.

@nathanl (Contributor, Author) Nov 7, 2025:

This works well and makes using this much nicer, since we don't have to keep increasing the global value for the test that has the longest-running dangling processes. 😌

@ftes (Owner) commented Nov 7, 2025

Hey Nathan, this is a highly welcome PR!

Can we avoid the slowdown by stopping the sandbox owner in a separate, unlinked process, so it runs asynchronously instead of blocking the next test run?

I guess the overall question is: do we care if stopping the sandbox fails?
Or more specifically: how likely is it to fail, and what happens to the rest of the test suite run if it does?

@ftes (Owner) commented Nov 7, 2025

The existing implementation was copied from Wallaby.

Given that Wallaby has been around a lot longer, and the standard Wallaby.Feature implementation suffers from the same problems, it might be worthwhile to look for existing solutions/workarounds for Wallaby.

@nathanl (Contributor, Author) commented Nov 7, 2025

> Can we avoid the slowdown by stopping the sandbox owner in a separate, unlinked, process?

I did some tinkering, but every way I tried this created errors, and I think I finally realized why. The `on_exit/2` docs say:

> If on_exit/2 is called inside setup/1 or inside a test, it's executed in a blocking fashion after the test exits and before running the next test. This means that no other test from the same test case will be running while the on_exit/2 callback for a previous test is running.

If we use a spawn/1 inside our on_exit/1 code, that guarantee is broken; the on_exit/1 finishes, the next test starts, and the sandbox owner process is still running, maybe in "shared" mode if we're not doing async tests. So the processes that are spun up for the next test start using that old sandbox connection, only to have it die soon after.

Bottom line: the sandbox must be terminated before the `on_exit/1` callback finishes. But having the sleep time configurable per test makes that less painful than a single global value.
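
The broken pattern described above looks roughly like this (a sketch, with `delay` and `owner_pid` assumed to be bound earlier in setup):

```elixir
# Anti-pattern: spawning inside on_exit/1 defeats its blocking guarantee.
on_exit(fn ->
  # This fn returns immediately, so the next test can start while the
  # old sandbox owner is still alive (possibly in :shared mode). The
  # next test's processes may then check out the dying connection.
  spawn(fn ->
    Process.sleep(delay)
    Ecto.Adapters.SQL.Sandbox.stop_owner(owner_pid)
  end)
end)
```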

@nathanl (Contributor, Author) commented Nov 7, 2025

Regarding Wallaby: we had lots of flakiness with our Wallaby tests in the past and eventually abandoned them. (I say "we", but it was mostly another team, so I can't speak to the details.) So maybe a different approach here is good.

@nathanl (Contributor, Author) commented Nov 7, 2025

Since we seem to be on the same page and I'm up to 7 commits, I'm going to squash it down to one clean one.

```elixir
defp maybe_start_sandbox_owner(repo, context, config) do
  case start_sandbox_owner(repo, context) do
    {:ok, pid} ->
      delay = context[:sandbox_shutdown_delay] || config[:sandbox_shutdown_delay]
```
@nathanl (Contributor, Author):

It will be `nil` in the context if not specified via tag.

@ftes (Owner):

`config[:sandbox_owner_shutdown_delay]` should be sufficient.
It is filled with the tag value from the ExUnit context, and falls back to the global config otherwise (see `Case.do_setup`).

@nathanl (Contributor, Author):

oh, nice, thanks

> Delay in milliseconds before shutting down the Ecto sandbox owner after a test ends. Use this to allow LiveViews and other processes in your app time to stop using database connections before the sandbox owner is terminated. Default is 0 (immediate shutdown).
@nathanl (Contributor, Author):

This renders fine in ExDoc and is easier than using `<>`.

[screenshot of the rendered docs]


In that case, you may encounter ownership errors like:

```
** (DBConnection.OwnershipError) cannot find owner for ...
```
@nathanl (Contributor, Author):

I love that this renders like an error in the docs. 😄

[screenshot of the rendered error]

@ftes (Owner):

That is cool!

@ftes (Owner) commented Nov 10, 2025

> If we use a spawn/1 inside our on_exit/1 code, that guarantee is broken; the on_exit/1 finishes, the next test starts, and the sandbox owner process is still running, maybe in "shared" mode if we're not doing async tests. So the processes that are spun up for the next test start using that old sandbox connection, only to have it die soon after.

Is this only an issue in shared mode?
If so: Can't we just use the existing (no-op) code for shared mode - since we don't need sandboxing there at all?

I'll try to find some time to tinker a bit myself.

@ftes (Owner) commented Nov 10, 2025

Can you reproduce the original error and validate the fix in an example repo (e.g. mix phx.new + mix phx.gen.auth)?

@nathanl (Contributor, Author) commented Nov 10, 2025

> Is this only an issue in shared mode?

The issue of having a subsequent test re-use the old test's sandbox is, yes. So in async tests we can have on_exit spawn a process to shut down the sandbox. I pushed that change.
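
The async-only spawn might look something like this (a sketch of the change described, not the exact diff; `pid`, `delay`, and `context` are assumed to be bound in setup):

```elixir
on_exit(fn ->
  stop = fn ->
    Process.sleep(delay)
    Ecto.Adapters.SQL.Sandbox.stop_owner(pid)
  end

  if context.async do
    # Safe: async tests never share this owner, so the next test can
    # start while this one's sandbox winds down in the background.
    spawn(stop)
  else
    # Shared mode: must block, or the next test's processes could pick
    # up the old shared owner's connection just before it dies.
    stop.()
  end
end)
```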

> Can't we just use the existing (no-op) code for shared mode - since we don't need sandboxing there at all?

I'm not sure what you mean here. We do need sandboxing both in shared mode and outside it, it's just a difference in how the sandbox is shared between the test and other processes.
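
That difference can be sketched with the standard Ecto sandbox API (this is generic Ecto usage, not this PR's code; `MyApp.Repo` is a placeholder):

```elixir
alias Ecto.Adapters.SQL.Sandbox

# Async tests: the owner's connection is private; only processes
# explicitly allowed (e.g. via Sandbox.allow/3) may use it.
pid = Sandbox.start_owner!(MyApp.Repo, shared: false)

# Non-async tests: shared mode lets any process use this owner's
# connection, which is why a lingering owner can leak into the next test.
pid = Sandbox.start_owner!(MyApp.Repo, shared: true)
```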

@nathanl (Contributor, Author) commented Nov 10, 2025

> Can you reproduce the original error and validate the fix in an example repo (e.g. mix phx.new + mix phx.gen.auth)?

I'll see what I can do.

@nathanl (Contributor, Author) commented Nov 10, 2025

@ftes see Query page for a demo

Previously, the test process itself was made the owner of the sandbox
connection, which meant the sandbox would be terminated when the test
terminated.

Unlike some other kinds of tests, Playwright tests are not linked to the
LiveViews and other processes that are being tested, so those processes
don't terminate with the test and may go on trying to use the sandbox
connection after the test terminates. If they do, they will raise
ownership errors.

Switch to `Ecto.Adapters.SQL.Sandbox.start_owner!/2`, creating a
separate process to own the sandbox connection, which is shut down after
the test process terminates using `on_exit/1`. This reduces ownership
errors that may occur after the test terminates.

By default, this happens with no delay, but `sandbox_shutdown_delay` can
be set in config or via tag to wait that many milliseconds before
terminating the sandbox during `on_exit/1`.
@ftes (Owner) left a review comment:

I've added some Ecto tests.
Maybe we can improve those as a follow-up.

Either way: Thanks so much Nathan!

@ftes ftes merged commit 3b54699 into ftes:main Nov 14, 2025
1 check passed
@ftes (Owner) commented Nov 14, 2025

I've added some Ecto-based tests based on your example (ftes/phoenix_test_playwright_example#5).

If you have time to look at them, that would be great.

@nathanl nathanl deleted the sandbox_owner branch November 14, 2025 14:42
@nathanl (Contributor, Author) commented Nov 14, 2025

Thank you!

```elixir
pid = Sandbox.start_owner!(repo, shared: !context.async)
{:ok, pid}
rescue
  _ -> {:error, :probably_already_started}
```
@ftes (Owner):

@nathanl Do you remember if this blanket rescue clause was just a guess, or handling actual errors you saw?

I've got a report of this rescue masking checkout errors.

@ftes (Owner):

FYI removing the blanket rescue for now in 0a8538c
