Missing text/selection manipulation primitives #388

Open
@andreastt

https://www.w3.org/Bugs/Public/show_bug.cgi?id=29135

w3c.org@masklinn.net:

As far as I could see, the WebDriver spec currently provides very little in terms of emulating textual manipulations.

NOTE: I will use the term "insertion point" to refer to the textual cursor within e.g. a text box, to differentiate it from the "pointer" cursor.

Current Provisions

  • the entire textual content of an element can be retrieved
  • it is possible to [clear] an element or [sendKeys] to it (emulating keyboard input)
  • implicitly, the insertion point and selection can be manipulated using actions (click and pointerDown/pointerMove/pointerUp).

Primary Issues

Pointer actions work in terms of offsets, but as far as I could tell:

  • the specification provides no way to perform textual matching and transform that into bounding boxes, thus no way to easily position the insertion point or draw selections
  • the specification provides no way to query the insertion point or selection for its position or bounding box, thus no way to get simple feedback while probing blindly

Use case

Testing or demonstrating rich text editors (RTEs) and other contenteditable elements, and allowing cross-platform text insertion within existing text nodes rather than just around them.

Possible solutions?

Rect textRect(needle[, element][, skip])

  • would return the same thing as Element Rect ({x, y, width, height} relative to the document element).
    • would only match visible text (so text contained in a visible element)
    • would generate an error if no matching visible text is found?
  • needle would be the text to look for, possibly a regex? The specification does not currently use regex anywhere so that might be a bit much.
  • skip would probably be necessary as the reference text could occur multiple times in the source.
  • a WebElement "root reference" would probably allow easier precise matching and less skipping.
  • In my testing of Chrome, Firefox and Safari on OS X, selecting a glyph requires dragging across the majority of that glyph, so starting a selection from a textual boundary won't risk selecting the preceding glyph.
  • It's somewhat inconvenient for single-letter boundary selections though, as there might be a need for lots of skipping.
  • It doesn't try to count characters/glyphs and thus might help avoid possible confusion issues with respect to code units, normalisation (maybe?), codepoints and glyphs at the interface-level (these concerns may have to be handled at the spec level though).
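
To illustrate how a rect from the proposed textRect could feed into the existing pointer actions, here is a minimal sketch in Python. textRect itself is hypothetical, and so is the helper name `selection_actions`; the payload follows the pointer input source shape of the WebDriver Perform Actions command. It assumes the page is not scrolled, so the document-relative rect can be used directly as viewport coordinates.

```python
from typing import Any, Dict


def selection_actions(rect: Dict[str, float]) -> Dict[str, Any]:
    """Build a Perform Actions payload that drags horizontally across `rect`.

    `rect` has the Element Rect shape: {"x", "y", "width", "height"}.
    Assumption: no scrolling, so document coordinates equal viewport
    coordinates (pointerMove defaults to the "viewport" origin).
    """
    mid_y = round(rect["y"] + rect["height"] / 2)  # vertical middle of the text
    start_x = round(rect["x"])                     # left edge of the match
    end_x = round(rect["x"] + rect["width"])       # right edge of the match
    return {
        "actions": [
            {
                "type": "pointer",
                "id": "mouse",
                "parameters": {"pointerType": "mouse"},
                "actions": [
                    # Move to the start of the text, press, drag to the end, release.
                    {"type": "pointerMove", "duration": 0, "x": start_x, "y": mid_y},
                    {"type": "pointerDown", "button": 0},
                    {"type": "pointerMove", "duration": 100, "x": end_x, "y": mid_y},
                    {"type": "pointerUp", "button": 0},
                ],
            }
        ]
    }
```

Replacing the drag with a single move to the left or right edge followed by a click would position the insertion point at a text boundary instead of drawing a selection.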

Unknowns for this possible solution

  • would/should it be possible to match text across multiple elements? This is possible for users, e.g. my browser's in-page search will find a match for "requests | preferences" on the current page even though that spans two links and a span in two separate list elements.
  • would/should the rect be augmented with the text's container element(s) in the style of a DOM Range? It doesn't seem to make much sense from a user-interaction perspective.
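
Both questions can be made concrete with a small model. The sketch below (Python, not part of any specification) treats the visible text as an ordered list of text-node strings, finds the occurrence selected by `skip`, and reports it as a pair of (node index, offset) boundaries in the style of a DOM Range. The names `find_text_range` and `_boundary` are made up for illustration, and the model assumes any separator between elements is itself present as text.

```python
from typing import List, Optional, Tuple

Boundary = Tuple[int, int]  # (index of text node, character offset inside it)


def _boundary(nodes: List[str], offset: int, is_end: bool) -> Boundary:
    """Map an offset in the concatenated text to (node index, offset in node)."""
    start = 0
    for i, node in enumerate(nodes):
        end = start + len(node)
        # On a node border, a start boundary belongs to the following node
        # and an end boundary belongs to the preceding one.
        if offset < end or (is_end and offset == end):
            return i, offset - start
        start = end
    raise ValueError("offset outside text")


def find_text_range(
    nodes: List[str], needle: str, skip: int = 0
) -> Optional[Tuple[Boundary, Boundary]]:
    """Locate the (skip+1)-th occurrence of `needle`, even across node borders.

    Returns None when there are not enough occurrences. Overlapping
    occurrences are counted separately.
    """
    if not needle:
        return None
    text = "".join(nodes)
    pos = -1
    for _ in range(skip + 1):
        pos = text.find(needle, pos + 1)
        if pos == -1:
            return None
    return (
        _boundary(nodes, pos, is_end=False),
        _boundary(nodes, pos + len(needle), is_end=True),
    )
```

With `nodes = ["requests", " | ", "preferences"]`, searching for "requests | preferences" yields a range that starts in the first node and ends in the third, which is the cross-element case described above. A real implementation would still have to turn such a range into one or more rects.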

Unsolved

Should it be possible to query the current selection's span/rect as well, independently from arbitrary text? I don't have a use-case for that right now but a "living" user would see the current text selection displayed in the UI so it could make sense.
