Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explore replacements for Xalan-J #1829

Open
flavorjones opened this issue Dec 3, 2018 · 11 comments
Open

Explore replacements for Xalan-J #1829

flavorjones opened this issue Dec 3, 2018 · 11 comments

Comments

@flavorjones
Copy link
Member

See #1760 for more context.

The JRuby implementation of Nokogiri uses Xalan-J for XSLT support, but this thread reveals that development (and likely support) has stalled on that project.

A potential replacement, noted in the thread linked above, is Saxon. There may be other potential replacements as well.

The outcome intended for this issue is to do some exploration of replacing Xalan-J:

  • is there a replacement that will functionally support what Xalan-J supports today?
  • how difficult will a transition be?

Tagging @jvshahid and @kares for visibility

@headius
Copy link
Contributor

headius commented Aug 22, 2022

I don't believe there's been much forward momentum on this, but another motivator to make a move came up recently in the form of a CVE for Xalan:

GHSA-qwq9-89rg-ww72

The comment here asks whether we might be able to transition (at least) to the shipping Xalan in OpenJDK, to which the answer is "probably": https://github.com/sparklemotion/nokogiri/security/advisories/GHSA-qwq9-89rg-ww72#advisory-comment-74545

In any case this should probably be addressed sooner than later since Xalan-J has been EOL for many years now, and is only being maintained in JDK as part of JDK. Saxon may be the best path forward, but using the JDK Xalan may be a short-term way to avoid the super-stale external version.

flavorjones added a commit that referenced this issue Aug 22, 2022
@flavorjones
Copy link
Member Author

@headius Thanks for commenting. I may need help with the mechanics of switching to the OpenJDK version of Xalan. I've got a branch where I've updated the package paths, but there are a few places where the API has changed that I need to work through. I'll push it shortly.

@flavorjones
Copy link
Member Author

I've created a draft PR migrating to the JDK Xalan that needs help to get unstuck: #2632

@kares
Copy link
Contributor

kares commented Aug 24, 2022

Saxon may be the best path forward, but using the JDK Xalan may be a short-term way to avoid the super-stale external version.

@headius so the only downside is we will need to add-opens the internal com.sun.org.apache package?
not sure if this is smt for JDK 17+, whether Nokogiri would be able to rely on those in the long-term ...

@flavorjones
Copy link
Member Author

#2632 is a PR that @kares and I collaborated on to try to move Nokogiri away from upstream Xalan to using the JDK internal packages.

Unfortunately, relying on JDK internals doesn't seem to work well (at least not without a bit more work and potentially breaking changes). Java 17 disallows usage of internals with this error:

Java::JavaLang::IllegalAccessError: superclass access check failed: class nokogiri.internals.XalanDTMManagerPatch (in unnamed module @0x41d20f06) cannot access class com.sun.org.apache.xml.internal.dtm.ref.DTMManagerDefault (in module java.xml) because module java.xml does not export com.sun.org.apache.xml.internal.dtm.ref to unnamed module @0x41d20f06

In @kares's words:

all in all the change to the internal xalan would need more work but at the same time I am not sure it's worth the effort, given the reliance on internals ... [and the] need to adjust JRUBY_OPTS flags to get it working

The only other option that's been suggested is porting to Saxon. I'm not comfortable enough with these libraries to estimate how much effort that would be. Does anybody here want to hazard a guesstimate?

Are there any other options (to Xalan-J) we haven't considered?

@kares
Copy link
Contributor

kares commented Sep 11, 2022

👍 and just to clarify: Java::JavaLang::IllegalAccessError means users that would use Nokogiri with JDK Xalan would likely have to edit their JRUBY/JAVA_OPTS to --add-ones for the internals to be accessible (assuming we can still access them despite not being exported).

The only other option that's been suggested is porting to Saxon. I'm not comfortable enough with these libraries to estimate how much effort that would be. Does anybody here want to hazard a guesstimate?

I guess ~ 80% should be easy but than there's a lot of internals used which might be impossible -> rewriting those parts with Saxon specific APIs. My raw guess would be 10 work days.

@headius
Copy link
Contributor

headius commented Sep 13, 2022

Are these illegal accesses from using reflection to dig inside Xalan? If so we might be able to reduce these by being less invasive, but perhaps that would hinder compatibility.

Moving to a supported external library like Saxon would definitely be the best option long term, but I agree with @kares it would involve at least a week or two of work.

@kares
Copy link
Contributor

kares commented Sep 14, 2022

Are these illegal accesses from using reflection to dig inside Xalan?

From what we've bumped into so far it was not reflection - simply using Xalan's classes (from the com.sun.org.apache.xml.internal package) - on 11 it warns on 17 errors. As noted fiddling with --add-opens might do some damage control but it feels like a breaking change for JRuby users ...

@headius
Copy link
Contributor

headius commented Sep 14, 2022

I think the flag we'd want is --add-exports so that the java.xml module exports the internal com.sun.org.apache.xml.internal package to external consumers, but you're right... it's still a breaking change.

I am not familiar with the Xalan-related code in Nokogiri... is there a reason we can't just use the published XSLT APIs? Can one of you point out some example code that requires direct access to Xalan?

@headius
Copy link
Contributor

headius commented Sep 14, 2022

Poking around outside of Xalan, it really seems like Saxon is the only game in town, and for sure if you want XSL 2.0 or 3.0 support. Given the lifecycle of Xalan and the inaccessibility of the JDK version, I think we will need to bite the bullet and move to Saxon sooner or later.

@chadlwilson
Copy link
Contributor

FWIW, seems our ASF friends did cut a Xalan-J 2.7.3 release after 9 long years last week, PR at #2873

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants