After using remove_namespaces!, to_xhtml does not add the XHTML xmlns to a document, but only if a DOCTYPE is present. Without a DOCTYPE, the XHTML xmlns is added (as expected).
Without DOCTYPE: "<html xmlns=\"http://www.w3.org/1999/xhtml\"></html>\n"
With DOCTYPE: "<!DOCTYPE html>\n<html></html>\n"
NOTE: This is a regression: version 1.11.3 does not exhibit this behavior; versions >= 1.11.4 have this bug.
Help us reproduce what you're seeing
Reproduction script:
    #!/usr/bin/env ruby
    require "nokogiri"

    p Nokogiri::VERSION

    h1 = "<html xmlns=\"http://www.w3.org/1999/xhtml\"></html>\n"
    doc1 = Nokogiri::XML(h1)
    doc1.remove_namespaces!
    p h1
    p doc1.to_xhtml
    raise "different serialization" if h1 != doc1.to_xhtml

    h2 = "<!DOCTYPE html>\n<html xmlns=\"http://www.w3.org/1999/xhtml\"></html>\n"
    doc2 = Nokogiri::XML(h2)
    doc2.remove_namespaces! # <<< the bug disappears if this line is commented out
    p h2
    p doc2.to_xhtml
    raise "different serialization" if h2 != doc2.to_xhtml
Expected behavior / Actual behavior
Regardless of whether a DOCTYPE is present or remove_namespaces! has been used, to_xhtml should always produce conformant XHTML files with the required xmlns.
This is the output produced by the reproduction script with Nokogiri 1.11.3 (expected):
I'm going to give an answer similar to the one I gave at #2265, which is that remove_namespaces! shouldn't be used if you want standards-compliant behavior.
In particular we disagree about this statement:
> Regardless of whether ... remove_namespaces! has been used, to_xhtml should always produce conformant XHTML files with the required xmlns.
This is likely happening in Nokogiri v1.11.4 and later because that's the version that upgraded libxml2 to v2.9.12, which made many changes to namespace handling behavior, particularly in HTML/XHTML contexts.
Can I ask why you're removing namespaces from the document if you expect namespaces to be handled correctly?
This is the output produced with Nokogiri 1.11.4 and 1.11.7 (broken):
Environment