-
-
Notifications
You must be signed in to change notification settings - Fork 897
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
XSLT segfault #2800
Comments
@Roguelazer Thank you so much for reporting this! I'll take a look this week. |
Running this under valgrind shows the following memory access problem:
So: consider this reproduced! ☑️ I'll dig in on root causes next. |
OK, I've got a handle on this. Here's a minimal repro: #! /usr/bin/env ruby
require "bundler/inline"
gemfile do
source "https://rubygems.org"
gem "nokogiri", path: "."
end
require "nokogiri"
doc = Nokogiri::XML(<<XML)
<root> </root>
XML
stylesheet = Nokogiri::XSLT(<<~STYLE)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:strip-space elements="*"/>
</xsl:stylesheet>
STYLE
doc.xpath("//root/text()") # wrap a Ruby object around the whitespace text node
stylesheet.transform(doc)
GC.start What happens is the following:
This is the first destructive operation I've seen from libxslt, and is making me reconsider a lot of assumptions we have made about libxml2/libxslt's memory model ... I need to think a bit about the solution space here. 🙏 |
In any case: the workaround you found makes sense: when you |
Here's an even better demonstration of what's going on, which is that libxslt is modifying the document we pass in for transformation. This is surprising and probably should be considered a bug in libxslt. #! /usr/bin/env ruby
require "bundler/inline"
gemfile do
source "https://rubygems.org"
gem "nokogiri", path: "."
end
require "nokogiri"
doc = Nokogiri::XML(<<~XML)
<catalog>
<entry>
<title> </title>
</entry>
</catalog>
XML
stylesheet = Nokogiri::XSLT(<<~XSL)
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:strip-space elements="title" />
</xsl:stylesheet>
XSL
doc.to_xml
# => "<?xml version=\"1.0\"?>\n" +
# "<catalog>\n" +
# " <entry>\n" +
# " <title> </title>\n" +
# " </entry>\n" +
# "</catalog>\n"
stylesheet.transform(doc)
doc.to_xml
# => "<?xml version=\"1.0\"?>\n" +
# "<catalog>\n" +
# " <entry>\n" +
# " <title/>\n" +
# " </entry>\n" +
# "</catalog>\n" Specifically, the blank text node is being removed from the original document which I think is surprising. @tenderlove What do you think about doing a defensive Copying the doc would obviously incur performance and memory overhead but may be worth it to avoid subtle bugs and to guarantee the document won't be modified in surprising ways. |
LOL this has already been filed upstream multiple times: |
OK, I've got a proof-of-concept that will raise an exception if both of these things are true:
This is similar to the workaround @tenderlove wrote for #2001. I'll polish it up and make a PR shortly. |
this approach makes a defensive copy of the doc if there's a chance the original may be modified in an unsafe way: - if any spaces will be stripped - if there are blank nodes Fixes #2800
which it does if xsl:strip-space is used this approach makes a defensive copy of the doc if there's a chance the original may be modified in an unsafe way: - if any spaces will be stripped - if there are blank nodes Fixes #2800
which it does if xsl:strip-space is used this approach makes a defensive copy of the doc if there's a chance the original may be modified in an unsafe way: - if any spaces will be stripped - if there are blank nodes Fixes #2800
which it does be default when xsl:strip-space is used this approach makes a defensive copy of the doc if there's a chance the original may be modified in an unsafe way: - if any spaces will be stripped - and there are blank node objects that might be removed Fixes #2800
This will be fixed in v1.15 when it's released, please watch the milestone for an idea of when that might happen (ideally within the month). |
Please describe the bug
I've found a reproducible XSLT-related crash; it happens in both macOS and Linux and with both Ruby 3.1 and 3.2, but is rather much easier to trigger on macOS (not sure why). I have not been able to reproduce it with my minimal reproducer below on Ruby 2.7, but my full (much larger) application also crashes on Ruby 2.7, so I think it's broken there, too.
It can be fixed by
dup
ing the document passed toXSLT#transform
, or by disabling garbage-collection process-wide.Reproduction Script
Run this as "ruby script.rb" and it should crash (most of the time; it depends on exactly what the GC feels like doing), run it as "ruby script.rb dup" and it does not crash ever.
Backtrace
Environment
The text was updated successfully, but these errors were encountered: