HTML API: Make WP_HTML_Processor::get_tag() namespace aware.
The HTML specification indicates that an HTML tag with the name "IMAGE"
should be renamed as "IMG" and handled as if it were an "IMG", but this
only applies to elements in the HTML namespace.

In this patch the HTML Processor is updated to ensure that it doesn't
remap the tag name when processing foreign content, such as SVG and
MathML markup.
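
The rule described above can be sketched in isolation. This is only an illustration, not the patched method itself: `sketch_remap_tag_name()` is a hypothetical standalone helper, and it assumes namespaces are reported as the lowercase strings 'html', 'svg', and 'math'.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress) that mirrors the namespace
 * check: "IMAGE" is renamed to "IMG" only inside the HTML namespace.
 *
 * @param string $tag_name  Uppercase tag name as reported by the parser.
 * @param string $namespace Assumed to be one of 'html', 'svg', or 'math'.
 * @return string Tag name after the optional remapping.
 */
function sketch_remap_tag_name( string $tag_name, string $namespace ): string {
	return ( 'IMAGE' === $tag_name && 'html' === $namespace )
		? 'IMG'
		: $tag_name;
}

// HTML namespace: the legacy "IMAGE" name is rewritten.
echo sketch_remap_tag_name( 'IMAGE', 'html' ), "\n"; // IMG

// Foreign content (SVG, MathML): the name is left untouched.
echo sketch_remap_tag_name( 'IMAGE', 'svg' ), "\n";  // IMAGE
echo sketch_remap_tag_name( 'IMAGE', 'math' ), "\n"; // IMAGE

// Other tag names are never affected.
echo sketch_remap_tag_name( 'DIV', 'html' ), "\n";   // DIV
```

Passing the namespace in as an argument keeps the sketch free of any parser state, which is why it can run without loading WordPress.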

Developed in https://github.com/wordpress/wordpress-develop/7330
Discussed in https://core.trac.wordpress.org/ticket/61656

Props dmsnell, jonsurrell.
See #61576.



git-svn-id: https://develop.svn.wordpress.org/trunk@59014 602fd350-edb4-49c9-b593-d223f7449a82
dmsnell committed Sep 11, 2024
1 parent 4dc4907 commit f7d9b1b
Showing 2 changed files with 58 additions and 11 deletions.
18 changes: 7 additions & 11 deletions src/wp-includes/html-api/class-wp-html-processor.php
@@ -4727,17 +4727,13 @@ public function get_tag(): ?string {

 		$tag_name = parent::get_tag();
 
-		switch ( $tag_name ) {
-			case 'IMAGE':
-				/*
-				 * > A start tag whose tag name is "image"
-				 * > Change the token's tag name to "img" and reprocess it. (Don't ask.)
-				 */
-				return 'IMG';
-
-			default:
-				return $tag_name;
-		}
+		/*
+		 * > A start tag whose tag name is "image"
+		 * > Change the token's tag name to "img" and reprocess it. (Don't ask.)
+		 */
+		return ( 'IMAGE' === $tag_name && 'html' === $this->get_namespace() )
+			? 'IMG'
+			: $tag_name;
 	}

/**
51 changes: 51 additions & 0 deletions tests/phpunit/tests/html-api/wpHtmlProcessor.php
@@ -54,6 +54,57 @@ public function test_get_tag_is_null_once_document_is_finished() {
 		$this->assertNull( $processor->get_tag() );
 	}
 
+	/**
+	 * Ensures that the proper tag-name remapping happens for the `IMAGE` tag.
+	 *
+	 * An HTML parser should treat an IMAGE tag as if it were an IMG tag, but
+	 * only when found in the HTML namespace. As part of this rule, IMAGE tags
+	 * in the HTML namespace are also void elements, while those in foreign
+	 * content are not, making the self-closing flag significant.
+	 *
+	 * Example:
+	 *
+	 *     // This input...
+	 *     <image/><svg><image/></svg>
+	 *
+	 *     // ...is equivalent to this normative HTML.
+	 *     <img><svg><image/></svg>
+	 *
+	 * @ticket 61576
+	 *
+	 * @covers WP_HTML_Processor::get_tag
+	 */
+	public function test_get_tag_replaces_image_with_namespace_awareness() {
+		$processor = WP_HTML_Processor::create_fragment( '<image/><svg><image/></svg>' );
+
+		$this->assertTrue(
+			$processor->next_tag(),
+			'Could not find initial "<image/>" tag: check test setup.'
+		);
+
+		$this->assertSame(
+			'IMG',
+			$processor->get_tag(),
+			'HTML tags with the name "IMAGE" should be remapped to "IMG"'
+		);
+
+		$this->assertTrue(
+			$processor->next_tag(),
+			'Could not find "<svg>" tag: check test setup.'
+		);
+
+		$this->assertTrue(
+			$processor->next_tag(),
+			'Could not find SVG "<image/>" tag: check test setup.'
+		);
+
+		$this->assertSame(
+			'IMAGE',
+			$processor->get_tag(),
+			'Should not remap "IMAGE" to "IMG" for foreign elements.'
+		);
+	}
+
 	/**
 	 * Ensures that the HTML Processor maintains its internal state through seek calls.
 	 *