Bug: Dom\HTMLDocument corrupts closing tags within scripts. #21041

@nickchomey

Description

I noticed something peculiar about how Dom\HTMLDocument handles HTML closing tags inside script tags. My expectation was that it wouldn't touch them at all and would treat everything between <script> and </script> as an opaque string, but it modifies the closing tags of heading elements.

Here's a small sample PHP script:

<?php
/**
 * Compare old DOMDocument vs new Dom\HTMLDocument for script content handling.
 * Both parsers are called without any libxml option constants.
 */

$html = <<<'HTML'
<!DOCTYPE html>
<html>
    <head></head>
    <body>
        <h3>body heading</h3>
        <script type="text/html" id="tmpl-test">
            <h3>template heading</h3>
        </script>
        <script type="text/javascript">
            var a = "<h1>asdf</h1>";
            var b = `<h1>asdf</h1>`;
            <h1>asdf</h1>
        </script>
    </body>
</html>
HTML;

echo "\n\n=== Old DOMDocument ===\n";
$oldDom = new DOMDocument();
@$oldDom->loadHTML( $html );
echo $oldDom->saveHTML();

echo "\n\n=== New Dom\\HTMLDocument ===\n";
$newDom = \Dom\HTMLDocument::createFromString( $html );
echo $newDom->saveHTML();

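It outputs the following. In the Dom\HTMLDocument output, the closing heading tags inside the script elements lose their "h" and come out as </3> and </1>, while the <h3> in the body is unaffected: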

=== Old DOMDocument ===
<!DOCTYPE html>
<html>
    <head></head>
    <body>
        <h3>body heading</h3>
        <script type="text/html" id="tmpl-test">
            <h3>template heading</h3>
        </script>
        <script type="text/javascript">
            var a = "<h1>asdf</h1>";
            var b = `<h1>asdf</h1>`;
            <h1>asdf</h1>
        </script>
    </body>
</html>


=== New Dom\HTMLDocument ===
<!DOCTYPE html><html><head></head>
    <body>
        <h3>body heading</h3>
        <script type="text/html" id="tmpl-test">
            <h3>template heading</3>
        </script>
        <script type="text/javascript">
            var a = "<h1>asdf</1>";
            var b = `<h1>asdf</1>`;
            <h1>asdf</1>
        </script>
    
</body></html>

It's strange because DOMDocument was known to have issues with closing tags inside script tags, and Dom\HTMLDocument was apparently supposed to fix this. In this case it's the opposite: DOMDocument preserves the script content and Dom\HTMLDocument corrupts it.
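To narrow down whether the corruption happens while parsing or while serializing, a check along these lines might help. It is only a sketch: checkScriptRoundTrip() is a hypothetical helper name, not part of PHP, and passing LIBXML_NOERROR is my assumption to keep parser warnings out of the output. It compares the script body against both the parsed textContent and the saveHtml() result.

<?php
// Hypothetical helper (not part of PHP): wraps a script body in a minimal
// document and reports whether it survives parsing and serialization.
function checkScriptRoundTrip(string $scriptBody): array
{
    $html = '<!DOCTYPE html><html><head></head><body>'
          . '<script type="text/html">' . $scriptBody . '</script>'
          . '</body></html>';

    // LIBXML_NOERROR is an assumption here; it only silences parser warnings.
    $doc = \Dom\HTMLDocument::createFromString($html, LIBXML_NOERROR);
    $script = $doc->querySelector('script');

    return [
        // true if the DOM tree still holds the original script text
        'parsed_ok' => $script?->textContent === $scriptBody,
        // true if the serialized document still contains the original script text
        'serialized_ok' => str_contains($doc->saveHtml(), $scriptBody),
    ];
}

var_dump(checkScriptRoundTrip('<h3>template heading</h3>'));

If parsed_ok is true and serialized_ok is false, the tree is intact and the problem is in saveHtml(); if both are false, the text is already wrong after parsing.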

PHP Version

PHP 8.4.17 (cli) (built: Jan 16 2026 02:36:09) (ZTS gcc 10.2.1 x86_64)
Copyright (c) The PHP Group
Built by Static PHP <https://static-php.dev> #StandWithUkraine
Zend Engine v4.4.17, Copyright (c) Zend Technologies
    with Zend OPcache v8.4.17, Copyright (c), by Zend Technologies

Operating System

Ubuntu 24.04
