Description
openedon Sep 20, 2017
Previously: #421
Related: #2463, #2418
The children
source matcher was introduced as a means of extracting and representing the value of rich text in content. It was not expected that a developer should need to interact with it, thus we were comfortable with it being an "opaque" object shape: in the original implementation, an array of React elements. Primarily, this helped avoid the need for dangerouslySetInnerHTML
when implementing the save
representation of a block, allowing the implementer to instead use the attribute value directly.
Example:
// Before:
<div dangerouslySetInnerHTML={ { __html: attributes.content } } />
// After
<div>{ attributes.content }</div>
There were a few downsides to this approach however:
- When it was the case that the value would need to be manipulated, it was difficult to work with ([1], [2], [3])
- It required us to convert to an element structure in both parsing and Editable content getters. In the latter case, the overhead in converting the DOM structure deters us from being more aggressive/immediate with
onChange
callbacks. - The presence of a separate
children
source increases knowledge necessary to become productive with blocks, as block implementers now need to understand difference between:text
,html
,children
,node
Proposal:
To address the original need to avoid dangerouslySetInnerHTML
, we could instead create a separate component which abstracts this implementation detail. Example:
<Editable.Value>{ attributes.content }</Editable.Value>
In this case, the underlying implementation may be to merely assign dangerouslySetInnerHTML
, but it is not surfaced to the block implementer, thereby avoiding some verbosity and worries around its application.
This could then allow us to eliminate or alter the value of rich text to that of our choosing. In #2463 (more accurately the update/children-value
branch), I tried an array structure.
After further thought, if possible, it would be desirable to eliminate the separate children
source altogether, if it is possible to represent the value in raw HTML (leverage html
source instead). This has the added benefit of requiring no post-processing in the Editable change handlers, instead merely passing the value of this.editor.getBody().innerHTML
(or this.editor.getContent( { format: 'raw' } )
). Finally, as a string, it can be easily concatenated, but may be difficult to work with if further manipulation of the DOM structure is desirable.
Open questions:
- The default behavior for TinyMCE's
getContent
is to run a serializer over the body contents. We should consult with the Ephox team to determine what of this serialization behavior we might need, whether it can be avoided, or if there is a performance concern in frequently calling to the non-raw format content getter (particularly given that we can expect single TinyMCE instances not to be very large or complex documents) - How much are we opening the potential for self-XSS? The value of an HTML string should only ever be derived from either the saved post content, or TinyMCE, where we should like to assume that the content is already sanitized.
- Where are we manipulating the value of an Editable currently, and in those cases would representing the value as a simple string be problematic? I might imagine that iterating over or extracting from the DOM structure would become more difficult, albeit not impossible (assigning into throwaway DOM wrapper, for instance).