Skip to content

Editable: Treat value as HTML string, eliminate children source #2750

Closed

Description

Previously: #421
Related: #2463, #2418

The children source matcher was introduced as a means of extracting and representing the value of rich text in content. It was not expected that a developer should need to interact with it, thus we were comfortable with it being an "opaque" object shape: in the original implementation, an array of React elements. Primarily, this helped avoid the need for dangerouslySetInnerHTML when implementing the save representation of a block, allowing the implementer to instead use the attribute value directly.

Example:

// Before:
<div dangerouslySetInnerHTML={ { __html: attributes.content } } />

// After
<div>{ attributes.content }</div>

There were a few downsides to this approach however:

  • When it was the case that the value would need to be manipulated, it was difficult to work with ([1], [2], [3])
  • It required us to convert to an element structure in both parsing and Editable content getters. In the latter case, the overhead in converting the DOM structure deters us from being more aggressive/immediate with onChange callbacks.
  • The presence of a separate children source increases knowledge necessary to become productive with blocks, as block implementers now need to understand difference between: text, html, children, node

Proposal:

To address the original need to avoid dangerouslySetInnerHTML, we could instead create a separate component which abstracts this implementation detail. Example:

<Editable.Value>{ attributes.content }</Editable.Value>

In this case, the underlying implementation may be to merely assign dangerouslySetInnerHTML, but it is not surfaced to the block implementer, thereby avoiding some verbosity and worries around its application.

This could then allow us to eliminate or alter the value of rich text to that of our choosing. In #2463 (more accurately the update/children-value branch), I tried an array structure.

After further thought, if possible, it would be desirable to eliminate the separate children source altogether, if it is possible to represent the value in raw HTML (leverage html source instead). This has the added benefit of requiring no post-processing in the Editable change handlers, instead merely passing the value of this.editor.getBody().innerHTML (or this.editor.getContent( { format: 'raw' } )). Finally, as a string, it can be easily concatenated, but may be difficult to work with if further manipulation of the DOM structure is desirable.

Open questions:

  • The default behavior for TinyMCE's getContent is to run a serializer over the body contents. We should consult with the Ephox team to determine what of this serialization behavior we might need, whether it can be avoided, or if there is a performance concern in frequently calling to the non-raw format content getter (particularly given that we can expect single TinyMCE instances not to be very large or complex documents)
  • How much are we opening the potential for self-XSS? The value of an HTML string should only ever be derived from either the saved post content, or TinyMCE, where we should like to assume that the content is already sanitized.
  • Where are we manipulating the value of an Editable currently, and in those cases would representing the value as a simple string be problematic? I might imagine that iterating over or extracting from the DOM structure would become more difficult, albeit not impossible (assigning into throwaway DOM wrapper, for instance).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Labels

[Feature] Rich TextRelated to the Rich Text component that allows developers to render a contenteditable[Type] TaskIssues or PRs that have been broken down into an individual action to take

Type

No type

Projects

No projects

Milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions