Skip to content

Conversation

jsantell
Copy link

This implements the pierce/ selector to select elements within shadow roots for all relevant APIs (waitForSelector, $, $$, etc. for Page and ElementHandle).

<div>
  #shadow-root (open)
    <x-button test-id="registration"></x-button>
</div>
  const handle = await page.waitForSelector(
    'pierce/[test-id="registration"]',
  );

Some implementation notes:

  • query.ts uses Element type for functions executed in content, and would then introduce the first type directive in non-test source (lib=dom): Any issues with doing so?
  • pierce/ was chosen due to its simplicity as a piercing selector, though there are other forms that have been used in past tools and browsers (>>>, >>>>, /deep/, etc.).

Happy to address any feedback needed if this is of interest upstream. Thank you for the project!

@jsantell jsantell force-pushed the pierce-selector branch 2 times, most recently from 97facbc to f35dea4 Compare July 23, 2025 16:10
@jsantell
Copy link
Author

jsantell commented Jul 23, 2025

  • Updated permission type error in debug.ts, required in deno 2.4+ (was previously valid in at least deno 2.3.5)

@lino-levan
Copy link
Owner

lino-levan commented Jul 23, 2025

One of the big mistakes in puppeteer in my opinion is the selector system. I would strongly prefer

  const handle = await page.waitForSelector(
    '[test-id="registration"]',
    { type: "css" (or xpath or ..., css by default) }
  );

In terms of API. In this case though, is there a reason pierce even needs to be a selector type? What I would do is assume the user meant to pierce by default, and have a { pierce: false } for all of these APIs – if a user cares about query performance then they should use pierce false if they don't need it.

query.ts uses Element type for functions executed in content, and would then introduce the first type directive in non-test source (lib=dom): Any issues with doing so?

I personally don't want to do this, it forces the end user into including types that they might not necessarily want – would rather have unknown or any with a comment explaining why it's necessary.

Edit: Seems like we really can't use the triple slash directive – it prevents publishing

@jsantell
Copy link
Author

jsantell commented Jul 23, 2025

Thanks for the response!

One of the big mistakes in puppeteer in my opinion is the selector system. I would strongly prefer

  const handle = await page.waitForSelector(
    '[test-id="registration"]',
    { type: "css" (or xpath or ..., css by default) }
  );

In terms of API. In this case though, is there a reason pierce even needs to be a selector type? What I would do is assume the user meant to pierce by default, and have a { pierce: false } for all of these APIs – if a user cares about query performance then they should use pierce false if they don't need it.

No attachments to the selector syntax, just used due to being prior art, and specifically the simplicity of the pierce/ selector -- a SelectorOptions sounds like a good fit for the selector-based methods:

interface SelectorOptions {
  pierce?: boolean;
}

// Extend WaitForSelectorOptions
interface WaitForSelectorOptions = SelectorOptions & {
  timeout?: number
}

A note, the puppeteer-style pierce/$SELECTOR by definition matches elements that match a shallow (e.g. no combinators) $SELECTOR within shadow roots -- pierce/div will not select divs outside of shadow roots.

This specific implementation of "piercing" may not directly translate to a { pierce?: boolean } flag, which 1) I'd guess that would both select elements in light DOM, and 2) support more layers of combinators e.g. $("div a", { pierce: true }) looks like it could select an anchor element in a shadow root, descending from a div in (possibly) light DOM (let's call this implementation "full transparent piercing"). FWIW, implementing full transparent piercing in content (e.g. div a[:last-child] > span) could be a good chunk of work (could be overestimating), and probably the rationale for other "piercing selector" alternatives like ">>>>".

Given the current pierce/$SELECTOR semantics, what would be most preferable functionality of a {pierce: boolean} flag? {pierce:true} as a default seems like it'd only work if "full transparent piercing" implemented. In lieu of full transparent piercing, is there another implementation that could work? The shortest path forward without custom selector strings I think would look like:

interface SelectorOptions {
  pierce?: boolean;
}

class ElementHandle {
  $(selector: string, opts:SelectorOptions = { pierce: false }) {
    if (opts.pierce) contentPiercingQuerySelector(..)
    else CDTQuerySelector(..)
  });
}

query.ts uses Element type for functions executed in content, and would then introduce the first type directive in non-test source (lib=dom): Any issues with doing so?

I personally don't want to do this, it forces the end user into including types that they might not necessarily want – would rather have unknown or any with a comment explaining why it's necessary.

Edit: Seems like we really can't use the triple slash directive – it prevents publishing

Spotted that in CI, good catch -- unknown or a minimal type Element shim should work just fine. Related, a handful of tests fail intermittently on the main branch on multiple local machines (possibly with resolution of example.com) -- any tips here for fully testing?

@lino-levan
Copy link
Owner

lino-levan commented Jul 23, 2025

A note, the puppeteer-style pierce/$SELECTOR by definition matches elements that match a shallow (e.g. no combinators) $SELECTOR within shadow roots -- pierce/div will not select divs outside of shadow roots.

Ah okay, I missed that. If that's the case, then let's just do pierce: false by default (and have that maintain current behavior). Would the piercing setup work if we decided to add something like xpath? Would strategy: "xpath", pierce: true be feasible to implement?

Spotted that in CI, good catch -- unknown or a minimal type Element shim should work just fine. Related, a handful of tests fail intermittently on the main branch on multiple local machines (possibly with resolution of example.com) -- any tips here for fully testing?

Not aware of any intermittent test failures, what does it look like?

@jsantell
Copy link
Author

A note, the puppeteer-style pierce/$SELECTOR by definition matches elements that match a shallow (e.g. no combinators) $SELECTOR within shadow roots -- pierce/div will not select divs outside of shadow roots.

Ah okay, I missed that. If that's the case, then let's just do pierce: false by default (and have that maintain current behavior). Would the piercing setup work if we decided to add something like xpath? Would strategy: "xpath", pierce: true be feasible to implement?

For future strategies, I think of pierce as another strategy, rather than a general modifier to pierce shadow roots for other strategies, as it's (currently) the Puppeteer Pierce Selector with the above rules. What do you think of:

type SelectorStrategy = "native" | "pierce"; // Could be "xpath" in the future
interface SelectorOptions {
  strategy: SelectorStrategy;
}

// Same with $, $$, waitForSelector, etc.
function $(selector: string, opts: SelectorOptions = { strategy: "native" }): Promise<ElementHandle | null>

Spotted that in CI, good catch -- unknown or a minimal type Element shim should work just fine. Related, a handful of tests fail intermittently on the main branch on multiple local machines (possibly with resolution of example.com) -- any tips here for fully testing?

Not aware of any intermittent test failures, what does it look like?

All on linux, some failed tests (that sometimes work), I'll look more into it:

running 1 test from ./tests/evaluate_test.ts
Testing evaluate ... FAILED (55s)
running 3 tests from ./tests/extract_test.ts
Extracting page content ... FAILED (58s)
running 2 tests from ./tests/install_lock_file_test.ts
Test concurrent getBinary calls ...'Test concurrent getBinary calls' has been running for over (1m0s)
 FAILED (1m10s)
running 2 tests from ./tests/resource_test.ts
Browser - close with 'using' keyword ... FAILED (54s)
Fail wait for selector ... FAILED (51s)

It may take a few days for me to get back to this, but interested in following through here, thanks for the feedback!

@lino-levan
Copy link
Owner

For future strategies, I think of pierce as another strategy, rather than a general modifier to pierce shadow roots for other strategies, as it's (currently) the Puppeteer Pierce Selector with the above rules. What do you think of:

Looks good to me, that would be excellent

All on linux, some failed tests (that sometimes work), I'll look more into it:

Haven't seen that before – thanks for noting.

It may take a few days for me to get back to this, but interested in following through here, thanks for the feedback!

Thank you for the contribution! This is great work.

@jsantell jsantell force-pushed the pierce-selector branch 2 times, most recently from 937343e to ddf8165 Compare July 24, 2025 02:25
… supporting an optional shadow root piercing strategy.
@jsantell
Copy link
Author

Getting back to this, just updated the formatting

@jsantell
Copy link
Author

jsantell commented Jul 31, 2025

The linux CI failures are similar to what I see locally on this branch (and main): intermittent timeouts in tests that use http://example.com -- sometimes even curl http://example.com hangs

If it's of interest, I could add a simple web server to run alongside the tests (either here, or in a new PR), rough idea:

  • Some simple web server at ./tests/fixtures/server.ts that serves some simple HTML on localhost:8000 (could copy example.com, not sure how dependent tests are on content yet)
  • Update tests to use const serverUrl = Deno.env.get("TEST_SERVER_URL") || "http://example.com"
  • In CI, run the dev server and invoke CI via TEST_SERVER_URL=http://localhost:8000 deno task test

@lino-levan
Copy link
Owner

If the test server would fix test timeouts – that would be great. Makes a ton of sense. I would slightly prefer if that was in a seperate PR to make it easier to review. Really good idea!

@jsantell
Copy link
Author

If the test server would fix test timeouts – that would be great. Makes a ton of sense. I would slightly prefer if that was in a seperate PR to make it easier to review. Really good idea!

Excellent, this went pretty smooth: #167

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants