Description
Issue Overview
The Paragraph block, as well as other textual blocks, allows <abbr>, <b>, <code>, <i>, <kbd>, <mark>, <span>, <time>, and various other inline formatting and semantic tags to be added using the "Edit as HTML" option; these are considered valid and are not removed on save.
However, when converting a Classic block to standard blocks using the "Convert to Blocks" option, the paragraphs, lists, and blockquotes are stripped of some inline formatting tags, and a few are converted to tags that have a different semantic meaning.
I think inline formatting that is considered valid by the standard blocks should not be removed when converting a Classic block to standard blocks. Otherwise, you are stripping out formatting from the original post when you do not need to.
Steps to Reproduce (for bugs)
- Create a post using the Classic Editor or insert a Classic block in the Gutenberg editor.
- Insert the following:
<abbr>abbr</abbr> <b>b</b> <br>br <bdi>bdi</bdi> <bdo dir="rtl">bdo</bdo> <cite>cite</cite> <code>code</code> <data value="value">data</data> <dfn>dfn</dfn> <em>em</em> <i>i</i> <kbd>kbd</kbd> <mark>mark</mark> <q>q</q> <ruby>ruby <rb>rb</rb> <rp>rp</rp> <rt>rt</rt> <rtc>rtc</rtc> <rp>rp</rp></ruby> <s>s</s> <samp>samp</samp> <small>small</small> <span style="color:red">span</span> <strong>strong</strong> <sub>sub</sub> <sup>sup</sup> <time datetime="2018">time</time> <u>u</u> <var>var</var> <wbr>wbr
- Save the post.
- Open the post in the Gutenberg editor.
- Convert the Classic block to standard blocks using the "Convert to Blocks" option.
- Notice how some of the inline formatting has been removed, and some of it has been converted to other elements with different semantic meaning, e.g. <b> tags are converted to <strong> tags.
Expected Behavior
Converting a Classic block to standard blocks using the "Convert to Blocks" option should preserve all inline formatting tags that are considered valid by the resulting blocks.
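To make the expected behavior concrete, here is a minimal sketch of an allowlist-based filter: tags on the list pass through untouched (so <b> stays <b> and a styled <span> stays a <span>), while anything else is unwrapped to its text. This is not how Gutenberg's conversion is implemented; the function name stripDisallowedTags and the ALLOWED_INLINE list are hypothetical, and the regex only handles simple, flat markup like the sample in this issue.

```typescript
// Hypothetical allowlist for illustration: the tags named in the overview as
// valid in textual blocks, plus <em> and <strong>, which survive conversion today.
const ALLOWED_INLINE: string[] = [
  "abbr", "b", "code", "em", "i", "kbd", "mark", "span", "strong", "time",
];

// Keep allowed tags exactly as written (no <b> -> <strong> rewriting) and
// drop the tags, but not the text, of everything else.
// Assumption: regex matching is sufficient for simple inline markup;
// it is not a general HTML parser.
function stripDisallowedTags(html: string, allowed: string[]): string {
  const anyTag = /<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  return html.replace(anyTag, (tag: string, name: string): string =>
    allowed.indexOf(name.toLowerCase()) !== -1 ? tag : ""
  );
}

const input = '<b>b</b> <blink>x</blink> <span style="font-weight:bold">s</span>';
console.log(stripDisallowedTags(input, ALLOWED_INLINE));
// <b>b</b> x <span style="font-weight:bold">s</span>
```

The point of the sketch is the policy rather than the parsing: the decision is "is this tag valid in the target block?", not "which tag is the closest semantic match?".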
Current Behavior
Converting a Classic block to standard blocks using the "Convert to Blocks" option removes several valid HTML5 tags, and converts some tags to other tags with different semantic meanings, e.g. <b> tags are converted to <strong> tags. The sample input given above is transformed into the following:
<p><abbr>abbr</abbr> <strong>b</strong> <br/>br bdi bdo cite <code>code</code> data dfn <em>em</em> <em>i</em> kbd mark q ruby rb rp rt rtc rp s samp small span <strong>strong</strong> <sub>sub</sub> <sup>sup</sup> time u var wbr</p>
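One way to check exactly which tags disappear is to compare the set of tag names before and after conversion. The sketch below does that with a simple regex; the helper names tagNames and lostTags are made up for this example, and the short strings are a trimmed-down version of the sample above rather than the full input.

```typescript
// Collect the lowercase names of all opening tags in an HTML string,
// in order of first appearance and without duplicates.
// Assumption: regex matching is good enough for flat inline markup like
// the sample in this issue; it is not a general HTML parser.
function tagNames(html: string): string[] {
  const names: string[] = [];
  const openTag = /<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g;
  let match: RegExpExecArray | null;
  while ((match = openTag.exec(html)) !== null) {
    const name = match[1].toLowerCase();
    if (names.indexOf(name) === -1) {
      names.push(name);
    }
  }
  return names;
}

// Tag names that occur before conversion but not afterwards.
function lostTags(before: string, after: string): string[] {
  const kept = tagNames(after);
  return tagNames(before).filter((name) => kept.indexOf(name) === -1);
}

const sampleBefore = "<b>b</b> <i>i</i> <kbd>kbd</kbd> <em>em</em>";
const sampleAfter = "<p><strong>b</strong> <em>i</em> kbd <em>em</em></p>";
console.log(lostTags(sampleBefore, sampleAfter).join(", ")); // b, i, kbd
```

Note that a converted tag such as <b> shows up as lost here, because the check compares tag names only and does not know that <strong> replaced it.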
Other Notes
If a Classic block contains obsolete HTML tags like <color>, then replacing them with <span> tags during conversion to standard blocks seems like a good idea.
<b> and <i> tags are converted to <strong> and <em> respectively. However, there are valid cases where <b> or <i> is used semantically, as per the HTML5 specification. Additionally, <span> tags containing font-weight:bold or font-style:italic are converted to <strong> or <em> tags respectively, and this seems like a bad idea: most of the time, when someone makes text bold or italic using a <span> tag, they are doing it for purely stylistic reasons, not semantic ones.
Related Issues and/or PRs
- Document known changes in editor behavior #4186
- Converting To Blocks Looses Existing Anchor Tag Parameters #4498
- Allowable tags gettings stripped from Header blocks when drafts are auto-saved #5876
- Converting a table in a classic block to a table block removes all inline styles #6096
- Raw Handling: Distinguish between Paste and Conversion #6878
- Block Validation, Deprecation and Migration Experience #7604