Classic + Custom HTML blocks: Convert to Blocks removes valid inline formatting #6102

Closed

Description

Issue Overview

The Paragraph block and other text blocks allow <abbr>, <b>, <code>, <i>, <kbd>, <mark>, <span>, <time>, and various other inline formatting and semantic tags to be added using the "Edit as HTML" option. These tags are considered valid and are not removed on save.

However, when converting a Classic block to standard blocks using the "Convert to Blocks" option, the paragraphs, lists, and blockquotes are stripped of some inline formatting tags, and a few are converted to tags that have a different semantic meaning.

I think inline formatting that is considered valid by the standard blocks should not be removed when converting a Classic block to standard blocks. Otherwise, you are stripping out formatting from the original post when you do not need to.

Steps to Reproduce (for bugs)

  1. Create a post using the Classic Editor or insert a Classic block in the Gutenberg editor.
  2. Insert the following:
<abbr>abbr</abbr> <b>b</b> <br>br <bdi>bdi</bdi> <bdo dir="rtl">bdo</bdo> <cite>cite</cite> <code>code</code> <data value="value">data</data> <dfn>dfn</dfn> <em>em</em> <i>i</i> <kbd>kbd</kbd> <mark>mark</mark> <q>q</q> <ruby>ruby <rb>rb</rb> <rp>rp</rp> <rt>rt</rt> <rtc>rtc</rtc> <rp>rp</rp></ruby> <s>s</s> <samp>samp</samp> <small>small</small> <span style="color:red">span</span> <strong>strong</strong> <sub>sub</sub> <sup>sup</sup> <time datetime="2018">time</time> <u>u</u> <var>var</var> <wbr>wbr
  3. Save the post.
  4. Open the post in the Gutenberg editor.
  5. Convert the Classic block to standard blocks using the Convert to Blocks option.
  6. Notice how some of the inline formatting has been removed, and some of it has been converted to other elements with different semantic meaning, e.g. <b> tags are converted to <strong> tags.

Expected Behavior

Converting a Classic block to standard blocks using the "Convert to Blocks" option should preserve all inline formatting tags that are considered valid by the resulting blocks.
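The expected behavior can be sketched as an allow-list filter: tags on the list are kept as written (including <b> and <i>), and any other tag is unwrapped so that only its text remains. This is only an illustration using Python's standard html.parser. The names ALLOWED_INLINE, InlineFilter, and filter_inline are hypothetical, and the allow-list is simply the set of tags from the sample input in this issue, not Gutenberg's actual schema.

```python
from html import escape
from html.parser import HTMLParser

# Assumption: the allow-list is just the tags from this issue's sample input.
ALLOWED_INLINE = {
    "abbr", "b", "bdi", "bdo", "br", "cite", "code", "data", "dfn", "em",
    "i", "kbd", "mark", "q", "rb", "rp", "rt", "rtc", "ruby", "s", "samp",
    "small", "span", "strong", "sub", "sup", "time", "u", "var", "wbr",
}
VOID = {"br", "wbr"}  # elements that never take a closing tag


class InlineFilter(HTMLParser):
    """Keeps allow-listed inline tags unchanged and unwraps everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_INLINE:
            rendered = "".join(
                f' {name}="{escape(value or "", quote=True)}"'
                for name, value in attrs
            )
            self.out.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in ALLOWED_INLINE and tag not in VOID:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is always kept, whether or not its enclosing tag was allowed.
        self.out.append(escape(data, quote=False))


def filter_inline(html):
    parser = InlineFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(filter_inline('<b>b</b> <font color="red">x</font> <kbd>k</kbd>'))
```

Note that this sketch never rewrites one tag into another, so <b> stays <b> and a styled <span> stays a <span>.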

Current Behavior

Converting a Classic block to standard blocks using the "Convert to Blocks" option removes several valid HTML5 tags and converts some tags to other tags with different semantic meanings, e.g. <b> tags are converted to <strong> tags. The sample input from the steps above is transformed into the following:

<p><abbr>abbr</abbr> <strong>b</strong> <br/>br bdi bdo cite <code>code</code> data dfn <em>em</em> <em>i</em> kbd mark q ruby rb rp rt rtc rp s samp small span <strong>strong</strong> <sub>sub</sub> <sup>sup</sup> time u var wbr</p>
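To make the difference easier to see, the sample input and the converted output can be compared tag by tag. The following is a minimal sketch using Python's standard html.parser. The helper names TagCounter and count_tags are made up for illustration and are not part of Gutenberg.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts the opening tags seen in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <br/>.
        self.tags[tag] += 1


def count_tags(html):
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.tags


# Sample input from the steps to reproduce.
BEFORE = (
    '<abbr>abbr</abbr> <b>b</b> <br>br <bdi>bdi</bdi> <bdo dir="rtl">bdo</bdo> '
    '<cite>cite</cite> <code>code</code> <data value="value">data</data> '
    '<dfn>dfn</dfn> <em>em</em> <i>i</i> <kbd>kbd</kbd> <mark>mark</mark> <q>q</q> '
    '<ruby>ruby <rb>rb</rb> <rp>rp</rp> <rt>rt</rt> <rtc>rtc</rtc> <rp>rp</rp></ruby> '
    '<s>s</s> <samp>samp</samp> <small>small</small> '
    '<span style="color:red">span</span> <strong>strong</strong> <sub>sub</sub> '
    '<sup>sup</sup> <time datetime="2018">time</time> <u>u</u> <var>var</var> <wbr>wbr'
)
# Output reported after "Convert to Blocks".
AFTER = (
    '<p><abbr>abbr</abbr> <strong>b</strong> <br/>br bdi bdo cite <code>code</code> '
    'data dfn <em>em</em> <em>i</em> kbd mark q ruby rb rp rt rtc rp s samp small span '
    '<strong>strong</strong> <sub>sub</sub> <sup>sup</sup> time u var wbr</p>'
)

before = count_tags(BEFORE)
after = count_tags(AFTER)
missing = sorted(set(before) - set(after))  # tag names that no longer appear
kept = sorted(set(before) & set(after))     # tag names present on both sides

print("missing:", ", ".join(missing))
print("kept:", ", ".join(kept))
```

For this sample, 23 of the 30 distinct input tag names no longer appear in the output. Two of those, <b> and <i>, were rewritten to <strong> and <em> rather than dropped, which is why the output contains two of each.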

Other Notes

If a Classic block contains obsolete HTML tags like <color>, then replacing them with <span> tags during conversion to standard blocks seems like a good idea.

<b> and <i> tags are converted to <strong> and <em> respectively. However, there are valid cases where <b> or <i> is used semantically as per the HTML5 specification. Additionally, <span> tags containing font-weight:bold or font-style:italic are converted to <strong> or <em> tags respectively, which seems like a bad idea: most of the time, when someone makes text bold or italic using a <span> tag, they are doing it for purely stylistic reasons, not semantic ones.

Labels

Backwards Compatibility: Issues or PRs that impact backwards compatibility
[Feature] Block Transforms: Block transforms from one block to another
[Feature] Blocks: Overall functionality of blocks
[Feature] Paste
[Feature] Rich Text: Related to the Rich Text component that allows developers to render a contenteditable
