Description
https://www.w3.org/Bugs/Public/show_bug.cgi?id=29135
As far as I could see, the WebDriver spec currently provides very little in terms of emulating textual manipulations.
NOTE: I will use the term "insertion point" to refer to the textual cursor within e.g. a text box, to differentiate it from the "pointer" cursor.
Current Provisions
- the entire textual content of an element can be retrieved
- it is possible to [clear] an element or [sendKeys] to it (emulating keyboard input)
- implicitly, the insertion point and selection can be manipulated using actions (click and pointerDown/pointerMove/pointerUp).
Primary Issues
Pointer actions work in terms of offsets, but as far as I can tell:
- the specification provides no way to perform textual matching and transform that into bounding boxes, thus no way to easily position the insertion point or draw selections
- the specification provides no way to query the insertion point or selection for its position or bounding box, thus no way to get simple feedback while probing blindly
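
To make the problem concrete, here is a minimal sketch of what a client has to send today to drag out a selection. It assumes the pointer action sequence shape of the WebDriver Actions API (pointerMove/pointerDown/pointerUp); the helper name `drag_select_actions` and the coordinates are made up for illustration. The point is that both endpoints are guesses, since nothing tells the client where a given piece of text is rendered.

```python
import json


def drag_select_actions(start, end, pointer_id="mouse"):
    """Build a pointer action sequence that drags from start to end.

    start and end are (x, y) viewport coordinates. The caller has to
    guess them: there is no way to ask where some text is rendered.
    (Hypothetical helper, not part of any WebDriver client.)
    """
    (x1, y1), (x2, y2) = start, end
    return {
        "actions": [
            {
                "type": "pointer",
                "id": pointer_id,
                "parameters": {"pointerType": "mouse"},
                "actions": [
                    # Move to the guessed start of the text, press,
                    # drag to the guessed end, release.
                    {"type": "pointerMove", "duration": 0, "x": x1, "y": y1},
                    {"type": "pointerDown", "button": 0},
                    {"type": "pointerMove", "duration": 100, "x": x2, "y": y2},
                    {"type": "pointerUp", "button": 0},
                ],
            }
        ]
    }


# Made-up coordinates: the client is probing blindly here.
payload = drag_select_actions((120, 48), (260, 48))
print(json.dumps(payload, indent=2))
```

After sending this there is also no way to check what actually got selected, which is the second issue above.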
Use case
Test/demonstrate RTEs (rich text editors) or other contenteditable elements, and allow cross-platform text insertion within existing textual nodes rather than just around them.
Possible solutions?
Rect textRect(needle[, element][, skip])
- would return the same thing as Element Rect ({x, y, width, height} relative to the document element).
- would only match visible text (so text contained in a visible element)
- would generate an error if no matching visible text is found?
- needle would be the text to look for, possibly a regex? The specification does not currently use regex anywhere so that might be a bit much.
- skip would probably be necessary as the reference text could occur multiple times in the source.
- a WebElement "root reference" would probably allow easier precise matching and less skipping.
- Testing Chrome, Firefox and Safari on OSX, selecting a glyph requires dragging across the majority of the glyph, so starting a selection from a textual boundary won't risk selecting the preceding glyph.
- It's somewhat inconvenient for single-letter boundary selections though as there might be need for lots of skipping.
- It doesn't try to count characters/glyphs and thus might help avoid possible confusion issues with respect to code units, normalisation (maybe?), codepoints and glyphs at the interface-level (these concerns may have to be handled at the spec level though).
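
As a rough illustration of the needle/skip semantics described above, here is a sketch that works on a plain string instead of rendered text. `find_match` is a hypothetical name, and the offsets it returns are internal to the sketch only: the proposed command would hand back a rect, not character positions. Whether overlapping occurrences should count separately is an assumption made here, not something settled.

```python
def find_match(text, needle, skip=0):
    """Return (start, end) of the (skip + 1)-th occurrence of needle.

    Raises LookupError when there are not enough occurrences, mirroring
    the "generate an error if no matching visible text is found" idea.
    Overlapping occurrences are counted separately (an assumption).
    """
    if not needle:
        raise ValueError("needle must not be empty")
    start = -1
    for _ in range(skip + 1):
        # Resume the search one position past the previous match.
        start = text.find(needle, start + 1)
        if start == -1:
            raise LookupError(f"no match for {needle!r} with skip={skip}")
    return start, start + len(needle)


visible_text = "a cat, a hat, a bat"
print(find_match(visible_text, "at"))          # first occurrence
print(find_match(visible_text, "at", skip=1))  # second occurrence
```

Passing an element as the root reference would simply shrink `visible_text` to that element's text, which is why it should reduce the amount of skipping needed.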
Unknowns for this possible solution
- would/should it be possible to match text across multiple elements? This is possible for users, e.g. my browser's in-page search will find a match for "requests | preferences" on the current page even though that spans two links and a span in two separate list elements.
- would/should the rect be augmented with the text's container element(s) in the style of a DOM Range? It doesn't seem to make much sense from a user-interaction perspective.
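
To show what both unknowns would mean in practice, here is a sketch of matching across element boundaries while also tracking which containers the match touches. The segment labels and the function name `match_across_segments` are made up for illustration, and it assumes the visible text of each element is already available in document order.

```python
def match_across_segments(segments, needle):
    """Find needle in the concatenated text of several elements.

    segments is a list of (container_label, text) pairs in document
    order. Returns (start, end, containers), where containers lists
    every segment the match overlaps.
    """
    joined = "".join(text for _, text in segments)
    start = joined.find(needle)
    if start == -1:
        raise LookupError(f"no match for {needle!r}")
    end = start + len(needle)

    containers = []
    pos = 0
    for label, text in segments:
        seg_start, seg_end = pos, pos + len(text)
        # Keep the segment if its range overlaps the match range.
        if seg_start < end and seg_end > start:
            containers.append(label)
        pos = seg_end
    return start, end, containers


# Hypothetical stand-in for two links separated by a span.
segments = [
    ("link 1", "requests"),
    ("span", " | "),
    ("link 2", "preferences"),
]
print(match_across_segments(segments, "requests | preferences"))
print(match_across_segments(segments, "| pref"))
```

A match that spans several containers would probably also produce several rects once line wrapping is involved, so a single {x, y, width, height} might not be enough in that case.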
Unsolved
Should it be possible to query the current selection's span/rect as well, independently from arbitrary text? I don't have a use-case for that right now but a "living" user would see the current text selection displayed in the UI so it could make sense.