[NO-TICKET] Fix rpath for linking to libdatadog when loading from extension dir

**What does this PR do?**

This PR is a follow-up to #3582.

In that PR, we fixed loading the profiling native extension so that
it could be loaded from the Ruby extensions directory (see the original
PR for more details).

It turns out this was not enough! Specifically, the customer reported
that they saw the following error:

> Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling
> native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux
> due to libdatadog_profiling.so: cannot open shared object file: No such file or directory

What this message tells us is that we're finding the
profiling native extension BUT it's failing to load BECAUSE the dynamic
loader is not able to find its `libdatadog_profiling.so` dependency.

From debugging the issue with the customer, I suspect that what
we're seeing here is a repeat of #2067 / #2125, that is, the
paths where the profiler is compiled are changed at deployment, and
so we also need to adjust the relative rpath to account for this.
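
To illustrate why a relative rpath survives this kind of move, here is a minimal sketch. The helper name `relative_libdatadog_path` and the two root folders are assumptions made up for this example (they are not part of the gem); the sub-paths mirror the layout seen in the test steps in this PR.

```ruby
require 'pathname'

# Hypothetical helper (not part of the gem): relative path from the folder that
# holds the native extension to the folder that holds libdatadog_profiling.so.
def relative_libdatadog_path(extension_dir, libdatadog_lib_dir)
  Pathname.new(libdatadog_lib_dir).relative_path_from(Pathname.new(extension_dir)).to_s
end

# Illustrative sub-paths, both relative to the bundle root.
EXTENSION_SUBDIR = 'ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1'
LIBDATADOG_SUBDIR = 'ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/' \
  'libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib'

# Assumed roots: where the gem was built, and where it ends up after deployment.
build_root = '/working/rpathtest/vendor/bundle'
deploy_root = '/working/rpathtest/vendor2/bundle'

at_build = relative_libdatadog_path("#{build_root}/#{EXTENSION_SUBDIR}", "#{build_root}/#{LIBDATADOG_SUBDIR}")
at_deploy = relative_libdatadog_path("#{deploy_root}/#{EXTENSION_SUBDIR}", "#{deploy_root}/#{LIBDATADOG_SUBDIR}")

puts at_build              # four levels up, then down into gems/libdatadog-...
puts at_build == at_deploy # the relative path does not depend on the root
```

An absolute rpath recorded at build time would still point at `vendor/`, whereas the relative one only depends on how the two folders sit with respect to each other.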

I haven't yet confirmed with the customer that this is their issue,
BUT I was able to reproduce the exact problem if I moved the
installation of the library in the way I mention above (see "how to test
the change", below).

**Motivation:**

Fix this weird corner case that made the profiler not load.

**Additional Notes:**

This is a really really weird corner case, so I'm happy to further
describe what the issue is if my description above + the comments in the
code are still too cryptic to understand.

**How to test the change?**

I've added test code for the helper, but actually validating the whole
rpath thing is a bit annoying.

Here's how I triggered the issue myself, and then used it to validate
the fix:

```
 # Build fixed gem into folder, will be used later
$ bundle exec rake build
datadog 2.0.0.rc1 built to pkg/datadog-2.0.0.rc1.gem.

 # Open a clean Ruby docker installation
$ docker run --network=host -ti -v `pwd`:/working ruby:3.2.2-bookworm /bin/bash

 # I've created a minimal test gemfile ahead of time
/working/rpathtest# cat gems.rb
source 'https://rubygems.org'

gem 'datadog'
 # Tell bundler to install the gem into a folder
/working/rpathtest# bundle config set --local path 'vendor/bundle'
/working/rpathtest# bundle install

 # Confirm profiler works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Now let's simulate the native extension being loaded from the
 # extensions directory:
/working/rpathtest# find | grep \.so$ | grep datadog
./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_loader.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so
./vendor/bundle/ruby/3.2.0/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux-musl/libdatadog-x86_64-alpine-linux-musl/lib/libdatadog_profiling.so
./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so
/working/rpathtest# rm ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so  ./vendor/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so

 # Confirm profiler still works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Now let's simulate the folders being moved (the issue being fixed):
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor/bundle"
 # Update this to vendor2...
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor2/bundle"
 # and move the folder
/working/rpathtest# mv vendor/ vendor2

 # Now we've triggered the exact same error message as reported by the
 # customer
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
W, [2024-06-05T15:51:12.488843 #517]  WARN -- datadog: [datadog] Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux due to libdatadog_profiling.so: cannot open shared object file: No such file or directory' at '/working/rpathtest/vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog/profiling/load_native_extension.rb:41:in `<top (required)>''

 # Now let's test the fix. Let's start by recreating the issue:
 # Put the fixed version into the bundler cache...
/working/rpathtest# cp /working/pkg/datadog-2.0.0.rc1.gem vendor2/bundle/ruby/3.2.0/cache/datadog-2.0.0.rc1.gem
 # force bundler to reinstall...
/working/rpathtest# rm -rf vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/
/working/rpathtest# bundle install
 # Force gem to be loaded from extension directory
/working/rpathtest# rm ./vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_native_extension.3.2.2_x86_64-linux.so  ./vendor2/bundle/ruby/3.2.0/gems/datadog-2.0.0.rc1/lib/datadog_profiling_loader.3.2.2_x86_64-linux.so
 # Confirm it works:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # Let's now change the vendor folder again:
/working/rpathtest# cat /usr/local/bundle/config
---
BUNDLE_PATH: "vendor3/bundle"
/working/rpathtest# mv vendor2/ vendor3

 # And it now doesn't fail:
/working/rpathtest# DD_PROFILING_ENABLED=true bundle exec ddprofrb exec ruby -e "sleep 1"
 # ... No errors loading profiler ...

 # And extra confirmation that the relative paths are working:
/working/rpathtest# ldd ./vendor3/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/datadog_profiling_native_extension.3.2.2_x86_64-linux.so
	libdatadog_profiling.so => /working/rpathtest/./vendor3/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0.rc1/../../../../gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib/libdatadog_profiling.so (0x00007ff127c00000)
```
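
The `ldd` check above can also be approximated from Ruby. The sketch below is an illustration only: `resolve_origin_rpath` and `dependency_found?` are hypothetical helpers, and the lexical `cleanpath` expansion is a simplification of what the dynamic loader does (for instance, it ignores symlinks and other search paths).

```ruby
require 'pathname'

# Hypothetical helper: roughly mimics how the dynamic loader expands a
# `$ORIGIN`-relative rpath entry. `$ORIGIN` stands for the folder that contains
# the shared object being loaded.
def resolve_origin_rpath(shared_object_path, rpath_entry)
  origin = File.dirname(File.expand_path(shared_object_path))
  # Block form avoids any special handling of backslashes in the replacement.
  Pathname.new(rpath_entry.sub('$ORIGIN') { origin }).cleanpath.to_s
end

# Hypothetical helper: true if any rpath entry leads to a folder containing
# the dependency.
def dependency_found?(shared_object_path, rpath_entries, dependency)
  rpath_entries.any? do |entry|
    File.exist?(File.join(resolve_origin_rpath(shared_object_path, entry), dependency))
  end
end
```

For example, calling `dependency_found?` with the path of the extension `.so` in the extensions directory, the rpath entries it was linked with, and `'libdatadog_profiling.so'` should return `true` both before and after the vendor folder is renamed, provided the entries are relative.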
ivoanjo committed Jun 12, 2024
1 parent bd08222 commit efe87a6
Showing 3 changed files with 92 additions and 4 deletions.
10 changes: 6 additions & 4 deletions ext/datadog_profiling_native_extension/extconf.rb
@@ -211,12 +211,14 @@ def add_compiler_flag(flag)
skip_building_extension!(Datadog::Profiling::NativeExtensionHelpers::Supported::COMPILER_ATOMIC_MISSING)
end

-# See comments on the helper method being used for why we need to additionally set this.
+# See comments on the helper methods being used for why we need to additionally set this.
# The extremely excessive escaping around ORIGIN below seems to be correct and was determined after a lot of
# experimentation. We need to get these special characters across a lot of tools untouched...
-$LDFLAGS += \
-' -Wl,-rpath,$$$\\\\{ORIGIN\\}/' \
-"#{Datadog::Profiling::NativeExtensionHelpers.libdatadog_folder_relative_to_native_lib_folder}"
+extra_relative_rpaths = [
+Datadog::Profiling::NativeExtensionHelpers.libdatadog_folder_relative_to_native_lib_folder,
+*Datadog::Profiling::NativeExtensionHelpers.libdatadog_folder_relative_to_ruby_extensions_folders,
+]
+extra_relative_rpaths.each { |folder| $LDFLAGS += " -Wl,-rpath,$$$\\\\{ORIGIN\\}/#{folder.to_str}" }
Logging.message("[datadog] After pkg-config $LDFLAGS were set to: #{$LDFLAGS.inspect}\n")

# Tag the native extension library with the Ruby version and Ruby platform.
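
The change above moves from a single rpath flag to one flag per candidate folder. A simplified sketch of that idea follows. `relative_rpath_flags` is a hypothetical helper, not code from the gem, and it uses a plain `$ORIGIN` token for readability, whereas the real `extconf.rb` needs the much heavier escaping shown in the diff so the token survives the build tooling. Skipping `nil` entries is also an assumption of this sketch.

```ruby
# Hypothetical helper (not in the gem): turns a list of folders, each relative
# to the native extension's own location, into one `-rpath` linker flag per
# folder. Assumption: nil entries are simply skipped.
def relative_rpath_flags(folders, origin_token: '$ORIGIN')
  folders.compact.map { |folder| " -Wl,-rpath,#{origin_token}/#{folder}" }.join
end

# Stand-in for the real $LDFLAGS global that mkmf manages.
ldflags = '-L/some/lib/folder'
ldflags += relative_rpath_flags(['../lib', '../../../../gems/libdatadog/lib'])
puts ldflags
```

Because the dynamic loader tries each rpath entry in turn, listing several candidates is harmless: entries that do not exist at runtime are skipped.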
46 changes: 46 additions & 0 deletions ext/datadog_profiling_native_extension/native_extension_helpers.rb
@@ -67,6 +67,52 @@ def self.libdatadog_folder_relative_to_native_lib_folder(
Pathname.new(libdatadog_lib_folder).relative_path_from(Pathname.new(profiling_native_lib_folder)).to_s
end

# In https://github.com/DataDog/dd-trace-rb/pull/3582 we got a report of a customer for which the native extension
# only got installed into the extensions folder.
#
# But then this fix was not enough to fully get them moving because then they started to see the issue from
# https://github.com/DataDog/dd-trace-rb/issues/2067 / https://github.com/DataDog/dd-trace-rb/pull/2125 :
#
# > Profiling was requested but is not supported, profiling disabled: There was an error loading the profiling
# > native extension due to 'RuntimeError Failure to load datadog_profiling_native_extension.3.2.2_x86_64-linux
# > due to libdatadog_profiling.so: cannot open shared object file: No such file or directory
#
# The problem is that when loading the native extension from the extensions directory, the relative rpath we add
# with the #libdatadog_folder_relative_to_native_lib_folder helper above is not correct, we need to add a relative
# rpath to the extensions directory.
#
# So how do we find the full path where the native extension is placed?
# * From https://github.com/ruby/ruby/blob/83f02d42e0a3c39661dc99c049ab9a70ff227d5b/lib/bundler/runtime.rb#L166
# `extension_dirs = Dir["#{Gem.dir}/extensions/*/*/*"] + Dir["#{Gem.dir}/bundler/gems/extensions/*/*/*"]`
# we get that's in one of two fixed subdirectories of `Gem.dir`
# * From https://github.com/ruby/ruby/blob/83f02d42e0a3c39661dc99c049ab9a70ff227d5b/lib/rubygems/basic_specification.rb#L111-L115
# we get the structure of the subdirectory (platform/extension_api_version/gem_and_version)
#
# Thus, `Gem.dir` of `/var/app/current/vendor/bundle/ruby/3.2.0` becomes (for instance)
# `/var/app/current/vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/datadog-2.0.0/` or
# `/var/app/current/vendor/bundle/ruby/3.2.0/bundler/gems/extensions/x86_64-linux/3.2.0/datadog-2.0.0/`
#
# We then compute the relative path between these folders and the libdatadog folder, and use that as a relative path.
def self.libdatadog_folder_relative_to_ruby_extensions_folders(
  gem_dir: Gem.dir,
  libdatadog_pkgconfig_folder: Libdatadog.pkgconfig_folder
)
  return unless libdatadog_pkgconfig_folder

  # For the purposes of calculating a folder relative to the other, we don't actually NEED to fill in the
  # platform, extension_api_version and gem version. We're basically just after how many folders it is deep from
  # the Gem.dir.
  expected_ruby_extensions_folders = [
    "#{gem_dir}/extensions/platform/extension_api_version/datadog_version/",
    "#{gem_dir}/bundler/gems/extensions/platform/extension_api_version/datadog_version/",
  ]
  libdatadog_lib_folder = "#{libdatadog_pkgconfig_folder}/../"

  expected_ruby_extensions_folders.map do |folder|
    Pathname.new(libdatadog_lib_folder).relative_path_from(Pathname.new(folder)).to_s
  end
end

# Used to check if profiler is supported, including user-visible clear messages explaining why their
# system may not be supported.
module Supported
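
The helper's comment notes that only the folder depth matters, not the actual platform, API version, or gem version names. The standalone sketch below checks that claim. The method name `relative_to_extensions_folders` and every path in it are illustrative assumptions, not the gem's API.

```ruby
require 'pathname'

# Standalone sketch of the depth-only calculation. `segments` holds the three
# folder names that sit below `extensions/` (platform, API version, gem version).
def relative_to_extensions_folders(gem_dir, libdatadog_lib_folder, segments)
  [
    "#{gem_dir}/extensions/#{segments.join('/')}/",
    "#{gem_dir}/bundler/gems/extensions/#{segments.join('/')}/",
  ].map do |folder|
    Pathname.new(libdatadog_lib_folder).relative_path_from(Pathname.new(folder)).to_s
  end
end

# Illustrative paths only.
gem_dir = '/var/app/current/vendor/bundle/ruby/3.2.0'
lib_folder = "#{gem_dir}/gems/libdatadog-9.0.0.1.0-x86_64-linux/vendor/libdatadog-9.0.0/" \
  'x86_64-linux/libdatadog-x86_64-unknown-linux-gnu/lib'

with_placeholders =
  relative_to_extensions_folders(gem_dir, lib_folder, %w[platform extension_api_version datadog_version])
with_real_names =
  relative_to_extensions_folders(gem_dir, lib_folder, %w[x86_64-linux 3.2.0 datadog-2.0.0])

puts with_placeholders                    # 4 levels up for extensions/, 6 for bundler/gems/extensions/
puts with_placeholders == with_real_names # placeholder names give the same result
```

Since the placeholder folders are only ever walked back out of with `..`, their names never appear in the result, which is why the helper can hardcode them.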
40 changes: 40 additions & 0 deletions spec/datadog/profiling/native_extension_helpers_spec.rb
@@ -33,6 +33,46 @@
end
end

describe '.libdatadog_folder_relative_to_ruby_extensions_folders' do
  context 'when libdatadog is available' do
    before do
      skip_if_profiling_not_supported(self)
      if PlatformHelpers.mac? && Libdatadog.pkgconfig_folder.nil? && ENV['LIBDATADOG_VENDOR_OVERRIDE'].nil?
        raise 'You have a libdatadog setup without macOS support. Did you forget to set LIBDATADOG_VENDOR_OVERRIDE?'
      end
    end

    it 'returns a relative path to libdatadog folder from the ruby extensions folders' do
      extensions_relative, bundler_extensions_relative =
        described_class.libdatadog_folder_relative_to_ruby_extensions_folders

      libdatadog_extension = RbConfig::CONFIG['SOEXT'] || raise('Missing SOEXT for current platform')
      libdatadog = "libdatadog_profiling.#{libdatadog_extension}"

      expect(extensions_relative).to start_with('../')
      expect(bundler_extensions_relative).to start_with('../')

      extensions_full =
        "#{Gem.dir}/extensions/platform/extension_api_version/datadog_version/#{extensions_relative}/#{libdatadog}"
      bundler_extensions_full =
        "#{Gem.dir}/bundler/gems/extensions/platform/extension_api_version/datadog_version/" \
        "#{bundler_extensions_relative}/#{libdatadog}"

      expect(File.exist?(Pathname.new(extensions_full).cleanpath.to_s))
        .to be(true), "Libdatadog not available in expected path: #{extensions_full.inspect}"
      expect(File.exist?(Pathname.new(bundler_extensions_full).cleanpath.to_s))
        .to be(true), "Libdatadog not available in expected path: #{bundler_extensions_full.inspect}"
    end
  end

  context 'when libdatadog is unsupported' do
    it do
      expect(described_class.libdatadog_folder_relative_to_ruby_extensions_folders(libdatadog_pkgconfig_folder: nil)).to be nil
    end
  end
end


describe '::LIBDATADOG_VERSION' do
it 'must match the version restriction set on the gemspec' do
# This test is expected to break when the libdatadog version on the .gemspec is updated but we forget to update