Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecation of libxml_disable_entity_loader() should warn about constants which will override the new default behaviour #1408

Closed
andrewnicols opened this issue Feb 14, 2022 · 2 comments · Fixed by #4036

Comments

@andrewnicols
Copy link
Contributor

From manual page: https://php.net/function.libxml-disable-entity-loader


libxml_disable_entity_loader() was deprecated in php/php-src@e0fa48f with the note:

Since PHP now requires libxml >= 2.9.0 external entity loading no longer
needs to be disabled to prevent these attacks. It is disabled by default.

The documentation for libxml_disable_entity_loader also notes:

However, as of libxml 2.9.0 entity substitution is disabled by default, so there is no need to disable the loading of external entities, unless there is the need to resolve internal entity references with LIBXML_NOENT.

However, there is no mention of the many ways it can be enabled, or how to make it safe where it has been enabled.

Previously you could call code such as:

libxml_disable_entity_loader(true);
$document = new DOMDocument();
$document->loadXML($xml, LIBXML_DTDLOAD | LIBXML_DTDATTR);

The presence of either the DTDLOAD or DTDATTR constants would normally cause the entity to be fetched, but the presence of libxml_disable_entity_loader overrdies and prevents this.

Now, after removing the deprecated call, the entity is fetched.

Whilst the honus is on the developer to understand the change to libxml, the attack vector of an XXE is not sufficiently clear and could be significantly improved in this documentation. The list of libxml constants does not adequately explain the risk of fetching the DTD, or explain that the behaviour of these flags has changed with the deprecation and removal of libxml_disable_entity_loader.

I would recommend that:

  • the libxml_disable_entity_loader docs be updated to note that presence of the relevant LIBXML_ flags will override the default behaviour to not load the entity with advice on how to mitigate this; and
  • the Predefined Constants docs be updated to provide more information on the meaning of the various flags, and their impact.
@andrewnicols
Copy link
Contributor Author

andrewnicols commented Feb 14, 2022

From a quick test, the following constants cause the dtd to be loaded:

  • LIBXML_DTDLOAD
  • LIBXML_DTDATTR
  • LIBXML_DTDVALID

I believe (and testing confirms) that the following are safe unless LIBXML_NOENT is specified:

  • LIBXML_DTDLOAD
  • LIBXML_DTDATTR

The following is not safe:

  • LIBXML_DTDVALID
  • LIBXML_DTDLOAD | LIBXML_NOENT
  • LIBXML_DTDATTR | LIBXML_NOENT

This is further backed up by the libxml2 source at https://gitlab.gnome.org/GNOME/libxml2/-/blob/9edc20c154234ede9cde4002367a7239bde680ce/parser.c#L2768-2772

Testing with the following excerpt demonstrates this:

<?php

$xml = <<<EOF
<?xml version="1.0" ?>
<!DOCTYPE r [
<!ELEMENT r ANY >
<!ENTITY sp SYSTEM "http://localhost/test.txt">
]>
<r>&sp;</r>
EOF;
$document = new DOMDocument();

// Cause load from an external entity, but not exploitable:
//$document->loadXML($xml, LIBXML_DTDLOAD);
//$document->loadXML($xml, LIBXML_DTDATTR);

// Cause load and are exploitable
//$document->loadXML($xml, LIBXML_DTDLOAD | LIBXML_NOENT);
//$document->loadXML($xml, LIBXML_DTDATTR | LIBXML_NOENT);
//$document->loadXML($xml, LIBXML_DTDVALID);
//$document->loadXML($xml, LIBXML_DTDVALID | LIBXML_NOENT);
//$document->loadXML($xml, LIBXML_NOENT);

// LIBXML_NONET seems to restore the previous behaviour
//$document->loadXML($xml, LIBXML_DTDLOAD | LIBXML_NOENT | LIBXML_NONET);
//$document->loadXML($xml, LIBXML_DTDATTR | LIBXML_NOENT | LIBXML_NONET);
//$document->loadXML($xml, LIBXML_DTDVALID | LIBXML_NONET);
//$document->loadXML($xml, LIBXML_DTDVALID | LIBXML_NOENT | LIBXML_NONET);
//$document->loadXML($xml, LIBXML_NOENT | LIBXML_NONET);

(example exploit courtesy of https://gist.github.com/staaldraad/01415b990939494879b4)

With the previous behaviour, simply specifying libxml_disable_entity_loader() prevented the exploit regardless of the flags provided.

@andrewnicols andrewnicols changed the title Deprecation of libxml_disable_entity_loader() should note use of LIBXML_NONET constant Deprecation of libxml_disable_entity_loader() should warn about constants which will override the new default behaviour Feb 14, 2022
@andrewnicols
Copy link
Contributor Author

It's also worth noting that removing the call to libxml_disable_entity_loader() will cause php to hang waiting for a timeout:

<?php

$xml = <<<EOF
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html></html>

$document->loadXML($xml, LIBXML_DTDLOAD);
EOF;

This is because the entity was hosted at http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd which seems to hang on certain user agents (https://www.w3.org/blog/2008/02/w3c_s_excessive_dtd_traffic/ seems relevant).

andrewnicols added a commit to andrewnicols/doc-en that referenced this issue Feb 14, 2022
This commit adds warning that the `LIBXML_DTD_VALID` constant is also
susceptible to XXE exploit, and that the use of `LIBXML_DTDATTR` or
`LIBXML_DTDLOAD` constants will cause external entity fetching.

Also adds a note to the deprecated libxml_disable_entity_loader function
to note that LIBXML_NONET will prevent loading of external entities.

Prior to deprecation of the `libxml_disable_entity_loader` function, the
use of this function prevented both XXE via DTD parsing, and fetching of
external entities as specified by any of the above constants.

Fixes phpGH-1408.
@php php deleted a comment Mar 19, 2022
nielsdos added a commit to nielsdos/doc-en that referenced this issue Nov 11, 2024
…warn about constants which will override the new default behaviour

Based on stale PR phpGH-1409.
Closes phpGH-1409.
Closes phpGH-1408.

Co-authored-by: Andrew Nicols <andrew@nicols.co.uk>
nielsdos added a commit to nielsdos/doc-en that referenced this issue Nov 11, 2024
…warn about constants which will override the new default behaviour

Based on stale PR phpGH-1409.
Closes phpGH-1409.
Closes phpGH-1408.

Co-authored-by: Andrew Nicols <andrew@nicols.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment