Handling namespaces when parsing XML #1395

Closed
Jefidev opened this issue Mar 19, 2020 · 2 comments

Comments

@Jefidev

Jefidev commented Mar 19, 2020

Currently the namespace of an XML attribute is discarded.
That can lead to undesirable behavior. Parsing the following XML:
readXML("\<Data xmlns:ns=\"http://trivial\"\>\<Field name=\"A\" ns:name=\"a\"\>\</Field\>\</Data\>");

leads to this:
node: "Data"(["Field"( [], name="a")])

Only one of the two "name" attributes was kept.
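To make the collision concrete outside Rascal, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. It is only an illustration of the XML semantics, not of how the Rascal reader is implemented; the variable names `qualified` and `collapsed` are made up for this example.

```python
import xml.etree.ElementTree as ET

doc = '<Data xmlns:ns="http://trivial"><Field name="A" ns:name="a"></Field></Data>'
field = ET.fromstring(doc).find("Field")

# ElementTree keeps both attributes, keyed by "{namespace-uri}local-name".
qualified = dict(field.attrib)

# Dropping the namespace part makes both keys collapse onto "name",
# so one of the two values is silently overwritten.
collapsed = {}
for key, value in qualified.items():
    local_name = key.rsplit("}", 1)[-1]
    collapsed[local_name] = value

print(qualified)
print(collapsed)
```

With the namespace retained there are two distinct keys; once it is stripped only a single "name" entry remains, which is the behavior reported above.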

Proposed solution
The namespace could be handled like this:
node: "Data"(["Field"( [], default = [(name, "a")], ns = [(name, "A")])])

Or like this:
node: "Data"(["Field"( [], ("", "name") = "a", ("ns", "name") = "A")])

But there may be a more idiomatic Rascal way.
The "get" method provided by the XML module should also accept an optional parameter to specify the required namespace:
get(node, "name") => "a"
get(node, "name", "ns") => "A"

This lets us look up the attribute associated with the correct namespace without breaking backward compatibility of the get function.
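As a rough sketch of that lookup, written in Python rather than Rascal: the attribute store below is a hypothetical dictionary keyed by (prefix, local name), mirroring the second representation proposed above, and `get` is an illustrative function, not the XML module's actual API. The values are the ones used in the proposal.

```python
from typing import Dict, Tuple

# Hypothetical attribute store keyed by (namespace prefix, local name).
Attributes = Dict[Tuple[str, str], str]


def get(attributes: Attributes, name: str, ns: str = "") -> str:
    """Look up an attribute by local name, optionally within a namespace prefix."""
    return attributes[(ns, name)]


field_attributes: Attributes = {("", "name"): "a", ("ns", "name"): "A"}

print(get(field_attributes, "name"))
print(get(field_attributes, "name", "ns"))
```

Because the namespace argument defaults to the empty prefix, existing two-argument calls keep their meaning, which is the backward-compatibility point made above.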

@jurgenvinju
Member

jurgenvinju commented Mar 19, 2020

The XML reader has a fullyQualify option which seems to work, but it has an issue:

rascal>n = readXML("\<Field xmlns:ns=\"http://trivial\" name=\"A\" ns:name=\"a\"\>\</Field\>",fullyQualify=true);
node: "Field"(
  [],
  ns:name="a",
  name="A",
  xmlns=("ns":"http://trivial"))

The name ns:name is not a syntactically valid Rascal identifier, so we cannot address it in Rascal code.

@DavyLandman we might:

  1. allow : in escaped qualified names, like so: \ns:name
  2. substitute the : for - like so: \ns-name

I prefer the latter since it does not require adapting the Rascal syntax, but it's not as nice as the : solution. @PaulKlint @tvdstorm should have an opinion about this too.

@jurgenvinju
Member

Ok:

  1. adding : to escaped names would be ambiguous; see a::qualified::Name
  2. a complex solution would add only : and reject :: from escaped names; in my view this has too much impact on the grammar
  3. adding . to the escaped names would be an option, but still a substitution is required in the XML reader.
  4. so for now it seems substituting : for - will do the trick.
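The substitution in point 4 could look roughly like the following Python sketch. The helper name `to_rascal_name` and the sample attribute map are assumptions made for illustration; the actual change would live in the Rascal XML reader.

```python
def to_rascal_name(qualified_name: str) -> str:
    """Replace the namespace separator ':' with '-' (hypothetical helper)."""
    return qualified_name.replace(":", "-")


# Attribute names as reported by the reader with fullyQualify=true.
attributes = {"ns:name": "a", "name": "A"}

# Rename every key so it can be written as an escaped name such as \ns-name.
renamed = {to_rascal_name(key): value for key, value in attributes.items()}

print(renamed)
```

One thing the sketch makes visible: since `-` is itself legal inside XML names, an attribute literally called `ns-name` would map to the same key as `ns:name`, so the substitution is not guaranteed to be collision-free.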


3 participants