Can't Parse HTML from HTMLDocument #2780

Open
@liamh101

Describe the bug and add attachments

When creating a document from HTML via the static call Html::addHtml, the call fails if the content contains an img tag inside a p tag. It throws the DOMDocument exception DOMDocument::loadXML(): Opening and ending tag mismatch.

The HTML was generated via PHP 8.4's Dom\HTMLDocument class.

Expected behavior

The HTML is accepted as valid HTML.

Is there an easy way to mitigate this?

Steps to reproduce

<?php

$dom = Dom\HTMLDocument::createFromString(<<<'HTML'
<!DOCTYPE html>
<html>
<body>
   <p><img style="aspect-ratio:12/13;" src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAwAAAANCAYAAACdKY9CAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAAEnQAABJ0Ad5mH3gAAAJuaVRYdFNuaXBNZXRhZGF0YQAAAAAAeyJjbGlwUG9pbnRzIjpbeyJ4IjoxLCJ5IjozfSx7IngiOjIsInkiOjR9LHsieCI6MiwieSI6NX0seyJ4IjozLCJ5Ijo2fSx7IngiOjQsInkiOjd9LHsieCI6NCwieSI6OH0seyJ4Ijo0LCJ5Ijo5fSx7IngiOjUsInkiOjEwfSx7IngiOjUsInkiOjExfSx7IngiOjYsInkiOjEyfSx7IngiOjYsInkiOjEzfSx7IngiOjcsInkiOjEzfSx7IngiOjcsInkiOjE0fSx7IngiOjgsInkiOjE0fSx7IngiOjksInkiOjE0fSx7IngiOjEwLCJ5IjoxNH0seyJ4IjoxMSwieSI6MTR9LHsieCI6MTEsInkiOjEzfSx7IngiOjEyLCJ5IjoxMn0seyJ4IjoxMiwieSI6MTF9LHsieCI6MTIsInkiOjEwfSx7IngiOjEyLCJ5Ijo5fSx7IngiOjEyLCJ5Ijo4fSx7IngiOjEyLCJ5Ijo3fSx7IngiOjEyLCJ5Ijo1fSx7IngiOjEyLCJ5Ijo0fSx7IngiOjEyLCJ5IjozfSx7IngiOjEyLCJ5IjoyfSx7IngiOjExLCJ5IjoxfSx7IngiOjEwLCJ5IjowfSx7IngiOjksInkiOjB9LHsieCI6OCwieSI6MH0seyJ4Ijo3LCJ5IjowfSx7IngiOjYsInkiOjB9LHsieCI6NSwieSI6MH0seyJ4Ijo0LCJ5IjowfSx7IngiOjMsInkiOjB9LHsieCI6MiwieSI6MH0seyJ4IjoxLCJ5IjowfSx7IngiOjAsInkiOjB9XX0Gg0zKAAAAg0lEQVQoU42PwQ2AIAxFW3ULEoi6iGzkJo6EI7gCB&#43;8OAEEbGxIjobwLtMnPf0Wtxw0QVxCIIdrz9DsaMy8JkuN9FQp1AOHgWaQfetd57y9K8k4E&#43;QVtpsTfKo/SS2tLbiCUMgt58ljkEyAktazUyi8g3fJTImpaRaVaS7GBKLXEEO0NwE0ruorm1rsAAAAASUVORK5CYII&#61;" width="12" height="13" /></p>
</body>
</html>
HTML);

$doc = new PhpOffice\PhpWord\PhpWord();
$section = $doc->addSection([
    'headerHeight' => PhpOffice\PhpWord\Shared\Converter::cmToTwip(1.54),
]);

$html = $dom->saveHtml();

PhpOffice\PhpWord\Shared\Html::addHtml($section, $html);
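
A possible mitigation, sketched under assumptions rather than confirmed against PHPWord's internals: the error message comes from DOMDocument::loadXML(), which suggests the input is parsed as XML, while HTML serialization writes void elements such as img without a closing slash. The hypothetical helper below (htmlBodyToXml is not part of PHPWord or PHP) re-serializes the body content with DOMDocument::saveXML() so that every element is closed. It assumes the input is ASCII or declares its charset, since DOMDocument::loadHTML() otherwise guesses the encoding.

```php
<?php

/**
 * Hypothetical helper: re-serializes the body of an HTML string as
 * well-formed XML markup, so void elements such as <img> are self-closed.
 */
function htmlBodyToXml(string $html): string
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $dom->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $xml = '';
    foreach ($body->childNodes as $child) {
        // saveXML() on a single node emits XML syntax for that node only.
        $xml .= $dom->saveXML($child);
    }

    return $xml;
}

// Shortened stand-in for the markup in the reproduction above.
$html = '<!DOCTYPE html><html><body><p><img src="a.png" width="12" height="13"></p></body></html>';
$fragment = htmlBodyToXml($html);

// Check that the fragment now parses as XML once wrapped in a root element.
$check = new DOMDocument();
$parsed = $check->loadXML('<body>' . $fragment . '</body>');

// With PHPWord loaded, the fragment would then be passed on unchanged:
// PhpOffice\PhpWord\Shared\Html::addHtml($section, $fragment);
```

In the reproduction this would mean calling htmlBodyToXml($html) on the result of $dom->saveHtml() before handing it to Html::addHtml. Whether PHPWord then handles the rest of the markup as expected has not been verified here.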

PHPWord version(s) where the bug happened

1.3.0

PHP version(s) where the bug happened

8.4

Priority

  • I want to crowdfund the bug fix (with @algora-io) and fund a community developer.
  • I want to pay the bug fix and fund a maintainer for that. (Contact @Progi1984)
