Skip to content

Fix mutable nodeList reference during traversal #50

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 26, 2023
Merged

Conversation

veewee
Copy link
Owner

@veewee veewee commented Jan 26, 2023

Q A
Type bug
BC Break no
Fixed issues #49

Summary

Closes #49

childNodes is a mutable nodeList of which the count gets altered when one of its nodes gets removed whilst iterating.
When one of the linked nodes in the nodeList is removed, the iterator stops.
This breaks the foreach:

https://3v4l.org/FksZo

<?php

$doc = new DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadXML(
    <<<EOXML
    <x>
        <item1 />
        <item2 />
        <item3 />
        <item4 />
    </x>
EOXML
);

foreach ($doc->documentElement->childNodes as $i => $node) {
    echo $node->localName . PHP_EOL;
    if ($i === 2) {
        $node->remove();
    }
}

Output:

item1
item2
item3

To make sure we keep track of all original nodes, this PR uses our handy built-in immutable NodeList instead inside the traverser whilst going over the nodes.
Making it possible to make visitors like this:

class RemoveEmptyElementsVisitor extends AbstractVisitor {
    public function onNodeLeave(\DOMNode $node): Action
    {
        if (!is_element($node)) {
            return new Action\Noop();
        }

        if (trim($node->textContent) !== '') {
            return new Action\Noop();
        }

        return new Action\RemoveNode();
    }
}

Those can strip out all empty tags:
From

<movies>
    <movie>
        <name>Terminator</name>
        <genre>
            <action />
            <romance />
        </genre>
        <prices>
            <oscar amount="4" />
        </prices>
    </movie>
</movies>

To

<movies>
    <movie>
        <name>Terminator</name>
    </movie>
</movies>

@veewee veewee force-pushed the immutable-traversal branch from 3bda501 to 2aa54e0 Compare January 26, 2023 17:59
@veewee veewee merged commit b34b0d6 into main Jan 26, 2023
@veewee veewee deleted the immutable-traversal branch January 26, 2023 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Document traverse works only once
1 participant