Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix for parsing xml that starts with whitespace #7

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

rasmuserik
Copy link

Rhino has problems when parsing xml that starts with whitespace (and XML.ignoreWhitespace === true), ie:

js> (new XML(" <a/>")).toSource()
&lt;a/&gt;

This patch fixes this issues such that:

js> (new XML(" <a/>")).toSource()
<a/>

Unit test also added.

This is implemented by removing the special case for strings not starting with <. According to ECMA-357 10.3, numbers and booleans are passed to the same xml-conversion as strings, so as far as I can see the special case is not needed.

Rasmus Erik Voel Jensen added 2 commits August 12, 2011 13:13
Ie. without the patch:

js> (new XML(" <a/>")).toSource()
&lt;a/&gt;

and with the patch:

js> (new XML(" <a/>")).toSource()
<a/>

- unit test also added.

In the same time it changes the way
xml for booleans/numbers are made to be actually parsed
rather than built as a text-node with a special case.
@rasmuserik
Copy link
Author

trim leading/trailing space within xml-textnodes as in spidermonkey and ecma-357

Added another patch+unittest on whitespace and e4x: Rhino treats leading/trailing whitespaces in e4x differently from SpiderMonkey, with XML.ignoreWhitespace and XML.prettyPrinting. ie. Rhino does:

js> XML.ignoreWhitespace = true; XML.prettyPrinting = true; uneval(<a> b </a>)
<a>b</a>
js> XML.ignoreWhitespace = true; XML.prettyPrinting = false; uneval(<a> b </a>)
<a>b</a>
js> XML.ignoreWhitespace = false; XML.prettyPrinting = true; uneval(<a> b </a>)
<a> b </a>
js> XML.ignoreWhitespace = false; XML.prettyPrinting = false; uneval(<a> b </a>)
<a> b </a>

where the latest SpiderMonkey does:

js> XML.ignoreWhitespace = true; XML.prettyPrinting = true; uneval(<a> b </a>)
"<a>b</a>"
js> XML.ignoreWhitespace = true; XML.prettyPrinting = false; uneval(<a> b </a>)
"<a> b </a>"
js> XML.ignoreWhitespace = false; XML.prettyPrinting = true; uneval(<a> b </a>)
"<a>b</a>"
js> XML.ignoreWhitespace = false; XML.prettyPrinting = false; uneval(<a> b </a>)
"<a> b </a>"

The second commit changes Rhinos behaviour, such that leading/trailing spaces are not trimmed when building/trimming the xml-tree, but is trimmed when prettyprinting, - which seems to be same behaviour as SpiderMonkey and in the e4x-standard.

@mbd-dbc-dk
Copy link
Contributor

Would be great to have this patch merged - we do a lot of E4X manipulations using Rhino, and could use this patch (disclaimer: from same org. as Rasmus).

@rasmuserik
Copy link
Author

What is the status of this,
ie. what needs to be done
to get this one pulled?

@p-bakker p-bakker added the E4X Issues related to the (deprecated) ECMAScript for XML (E4X/ECMA-357) standard label Jun 30, 2021
@p-bakker p-bakker marked this pull request as draft September 8, 2024 07:10
@p-bakker
Copy link
Collaborator

p-bakker commented Sep 8, 2024

Converted to draft as this has been lying around since forever and needs someone to go through to determine whether we want/needs this PR not

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
E4X Issues related to the (deprecated) ECMAScript for XML (E4X/ECMA-357) standard Help wanted
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants