Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PROF-4198] Gather CPU time in profiler without monkey patching Thread for all supported Ruby versions (2.1 to 3.x) #1740

Merged
merged 14 commits into from
Nov 5, 2021
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .github/workflows/test-cross-platform.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -26,7 +26,7 @@ jobs:
fail-fast: false
matrix:
os: [windows-latest]
ruby: [2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, '3.0', head]
ruby: [2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, '3.0']
runs-on: ${{ matrix.os }}
env:
SKIP_SIMPLECOV: 1
Expand Down
3 changes: 3 additions & 0 deletions .rubocop.yml
Original file line number Diff line number Diff line change
Expand Up @@ -375,3 +375,6 @@ Style/TrailingCommaInHashLiteral:
Style/GlobalVars:
Exclude:
- 'ext/ddtrace_profiling_native_extension/extconf.rb'

Naming/VariableNumber:
Enabled: false
2 changes: 1 addition & 1 deletion ddtrace.gemspec
Original file line number Diff line number Diff line change
Expand Up @@ -47,7 +47,7 @@ Gem::Specification.new do |spec|
end

# Used by the profiler
spec.add_dependency 'ffi', '~> 1.0'
spec.add_dependency 'debase-ruby_core_source', '>= 0.10.12'
Copy link
Member

@marcotc marcotc Oct 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens for users with versions of Ruby not in the list, for example 2.5.6p201? https://github.com/ruby-debug/debase-ruby_core_source/tree/master/lib/debase/ruby_core_source

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It falls back to the closest lower version:

$ ruby -v
ruby 2.5.9p229 (2021-04-05 revision 67939) [x86_64-darwin19]
$ bundle exec rake clean compile
(...)
creating extconf.h
checking for vm_core.h... no
checking for vm_core.h... no
**************************************************************************
No source for ruby-2.5.9-p229 (revision 67939) provided with
debase-ruby_core_source gem. Falling back to ruby-2.5.1-p57.
**************************************************************************
checking for vm_core.h... yes

Currently they have the first version of every Ruby release (all the .0). For some versions, they're missing the latest point release (they have 2.2.9 but not 2.2.10), but these core VM apis seem to rarely shift, and we also test with the latest versions in our CI.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Major concern here is would this dependency in anyway prevent users who use tracing on supported versions, platforms and engines from installing & using tracing, if they were to upgrade ddtrace and use this dependency?

Sounds like there's some kind of fallback mechanism; is this a sufficient guarantee against the prior question?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've replied in context in #1740 (comment) since I was already touching on this subject.


spec.extensions = ['ext/ddtrace_profiling_native_extension/extconf.rb']
end
36 changes: 27 additions & 9 deletions ext/ddtrace_profiling_native_extension/NativeExtensionDesign.md
Original file line number Diff line number Diff line change
Expand Up @@ -21,8 +21,8 @@ in future releases -- e.g. disabling the extension will disable profiling entire
The profiling native extension is (and must always be) designed to **not cause failures** during gem installation, even
if some features, Ruby versions, or operating systems are not supported.

E.g. the extension must cleanly build on Ruby 2.1 on Windows, even if at run time it will effectively do nothing for
such a setup.
E.g. the extension must cleanly build on Ruby 2.1 (or the oldest Ruby version we support at the time) on Windows,
even if at run time it will effectively do nothing for such a setup.

We have a CI setup to help validate this, but this is really important to keep in mind when adding to or changing the
existing codebase.
Expand All @@ -32,17 +32,35 @@ existing codebase.
To implement some of the features below, we sometimes require access to private Ruby header files (that describe VM
internal types, structures and functions).

Because these private header files are not included in regular Ruby installations, our current workaround is to piggy
back on a special header that Ruby includes that is only intended for use by the Ruby MJIT compiler. This header is
placed inside the `include/` directory in a Ruby installation, and is named for that specific Ruby version. e.g.
`rb_mjit_min_header-2.7.4.h`.
Because these private header files are not included in regular Ruby installations, we have two different workarounds:

This header was introduced by the first Ruby version that added MJIT, which is 2.6+.

For older Ruby versions (see safety section above), this header is not available, and this must be handled gracefully.
1. for Ruby versions >= 2.6 we make use use the Ruby private MJIT header
2. for Ruby versions < 2.6 (legacy Rubies) we make use of the `debase-ruby_core_source` gem

Functions which make use of these headers are defined in the <private_vm_api_acccess.c> file.

**Important Note**: Our medium/long-term plan is to stop relying on all private Ruby headers, and instead request and
contribute upstream changes so that they become official public VM APIs.

### Approach 1: Using the Ruby private MJIT header

Ruby versions >= 2.6 introduced a JIT compiler called MJIT. This compiler does not directly generate machine code;
instead it generates C code and uses the system C compiler to turn it into machine code.

The generated C code `#include`s a private header -- which we reference as "the MJIT header" everywhere.
The MJIT header gets shipped with all MJIT-enabled Rubies and includes the layout of many internal VM structures;
and of course the intention is that it is only used by the Ruby MJIT compiler.

This header is placed inside the `include/` directory in a Ruby installation, and is named for that specific Ruby
version. e.g. `rb_mjit_min_header-2.7.4.h`.

### Approach 2: Using the `debase-ruby_core_source` gem

The [`debase-ruby_core_source`](https://github.com/ruby-debug/debase-ruby_core_source) contains almost no code;
instead, it just contains per-Ruby-version folders with the private VM headers (`.h`) files for that version.

Thus, even though a regular Ruby installation does not include these files, we can access the copy inside this gem.

## Feature: Getting thread CPU-time clock_ids

* **OS support**: Linux
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,19 @@

// This file is only compiled on systems where pthread_getcpuclockid() is available;
// Otherwise we compile clock_id_noop.c
#if defined(HAVE_PTHREAD_GETCPUCLOCKID) && defined(USE_MJIT_HEADER)
#ifdef HAVE_PTHREAD_GETCPUCLOCKID

#include <pthread.h>
#include <time.h>
#include <errno.h>

#include <ruby.h>

#ifdef RUBY_2_1_WORKAROUND
#include <thread_native.h>
#else
#include <ruby/thread_native.h>
#endif

#include "private_vm_api_access.h"

Expand Down
2 changes: 1 addition & 1 deletion ext/ddtrace_profiling_native_extension/clock_id_noop.c
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

// This file is the dual of clock_id_from_pthread.c for systems where that info
// is not available.
#if !(defined(HAVE_PTHREAD_GETCPUCLOCKID) && defined(USE_MJIT_HEADER))
#ifndef HAVE_PTHREAD_GETCPUCLOCKID

#include <ruby.h>

Expand Down
84 changes: 61 additions & 23 deletions ext/ddtrace_profiling_native_extension/extconf.rb
Original file line number Diff line number Diff line change
@@ -1,16 +1,20 @@
# typed: false
# typed: ignore

def skip_building_extension?
# We don't support JRuby for profiling, and JRuby doesn't support native extensions, so let's just skip this entire
# thing so that JRuby users of dd-trace-rb aren't impacted.
on_jruby = RUBY_ENGINE == 'jruby'

# We don't officially support TruffleRuby for dd-trace-rb at all BUT let's not break adventurous customers that
# want to give it a try.
on_truffleruby = RUBY_ENGINE == 'truffleruby'

# Experimental toggle to disable building the extension.
# Disabling the extension will lead to the profiler not working in future releases.
# If you needed to use this, please tell us why on <https://github.com/DataDog/dd-trace-rb/issues/new>.
disabled_via_env = ENV['DD_PROFILING_NO_EXTENSION'].to_s.downcase == 'true'

on_jruby || disabled_via_env
on_jruby || on_truffleruby || disabled_via_env
end

# IMPORTANT: When adding flags, remember that our customers compile with a wide range of gcc/clang versions, so
Expand All @@ -20,11 +24,19 @@ def add_compiler_flag(flag)
end

if skip_building_extension?
# rubocop:disable Style/StderrPuts
$stderr.puts(%(
+------------------------------------------------------------------------------+
| Skipping build of profiling native extension and replacing it with a no-op |
| Makefile |
+------------------------------------------------------------------------------+

))

File.write('Makefile', 'all install clean: # dummy makefile that does nothing')
return
end

# rubocop:disable Style/StderrPuts
$stderr.puts(%(
+------------------------------------------------------------------------------+
| **Preparing to build the ddtrace native extension...** |
Expand All @@ -40,6 +52,7 @@ def add_compiler_flag(flag)
| |
| Thanks for using ddtrace! You rock! |
+------------------------------------------------------------------------------+

))
# rubocop:enable Style/StderrPuts

Expand All @@ -57,10 +70,6 @@ def add_compiler_flag(flag)
# cause a segfault later. Let's ensure that never happens.
add_compiler_flag '-Werror-implicit-function-declaration'

# Older Rubies don't have the MJIT header (used by the JIT compiler, and we piggy back on it)
# TODO: Development builds of Ruby 3.1 seem to be failing on Windows; to be revisited once 3.1.0 stable is out
$defs << '-DUSE_MJIT_HEADER' unless RUBY_VERSION < '2.6' || (RUBY_VERSION >= '3.1' && Gem.win_platform?)

if RUBY_PLATFORM.include?('linux')
# Supposedly, the correct way to do this is
# ```
Expand All @@ -72,24 +81,53 @@ def add_compiler_flag(flag)
$defs << '-DHAVE_PTHREAD_GETCPUCLOCKID'
end

create_header

# The MJIT header is always (afaik?) suffixed with the exact Ruby VM version,
# including patch (e.g. 2.7.2). Thus, we add to the header file a definition
# containing the exact file, so that it can be used in a #include in the C code.
header_contents =
File.read($extconf_h)
.sub('#endif',
<<-EXTCONF_H.strip
#define RUBY_MJIT_HEADER "rb_mjit_min_header-#{RUBY_VERSION}.h"

#endif
EXTCONF_H
)
File.open($extconf_h, 'w') { |file| file.puts(header_contents) }
# Older Rubies don't have the MJIT header, used by the JIT compiler, so we need to use a different approach
CAN_USE_MJIT_HEADER = RUBY_VERSION >= '2.6'

# Tag the native extension library with the Ruby version and Ruby platform.
# This makes it easier for development (avoids "oops I forgot to rebuild when I switched my Ruby") and ensures that
# the wrong library is never loaded.
# When requiring, we need to use the exact same string, including the version and the platform.
create_makefile "ddtrace_profiling_native_extension.#{RUBY_VERSION}_#{RUBY_PLATFORM}"
EXTENSION_NAME = "ddtrace_profiling_native_extension.#{RUBY_VERSION}_#{RUBY_PLATFORM}".freeze

if CAN_USE_MJIT_HEADER
$defs << '-DUSE_MJIT_HEADER'

# NOTE: This needs to come after all changes to $defs
create_header

# The MJIT header is always (afaik?) suffixed with the exact Ruby VM version,
# including patch (e.g. 2.7.2). Thus, we add to the header file a definition
# containing the exact file, so that it can be used in a #include in the C code.
header_contents =
File.read($extconf_h)
.sub('#endif',
<<-EXTCONF_H.strip
#define RUBY_MJIT_HEADER "rb_mjit_min_header-#{RUBY_VERSION}.h"

#endif
EXTCONF_H
)
File.open($extconf_h, 'w') { |file| file.puts(header_contents) }

create_makefile EXTENSION_NAME
else
# On older Rubies, we use the debase-ruby_core_source gem to get access to private VM headers.
# This gem ships source code copies of these VM headers for the different Ruby VM versions;
# see https://github.com/ruby-debug/debase-ruby_core_source for details

thread_native_for_ruby_2_1 = proc { true }
if RUBY_VERSION < '2.2'
# This header became public in Ruby 2.2, but we need to pull it from the private headers folder for 2.1
thread_native_for_ruby_2_1 = proc { have_header('thread_native.h') }
$defs << '-DRUBY_2_1_WORKAROUND'
end

create_header

require 'debase/ruby_core_source'
dir_config('ruby') # allow user to pass in non-standard core include directory

Debase::RubyCoreSource
.create_makefile_with_core(proc { have_header('vm_core.h') && thread_native_for_ruby_2_1.call }, EXTENSION_NAME)
end
13 changes: 11 additions & 2 deletions ext/ddtrace_profiling_native_extension/private_vm_api_access.c
Original file line number Diff line number Diff line change
Expand Up @@ -2,9 +2,19 @@

// This file exports functions used to access private Ruby VM APIs and internals.
// To do this, it imports a few VM internal (private) headers.
// Be very careful when changing things here :)
//
// **Important Note**: Our medium/long-term plan is to stop relying on all private Ruby headers, and instead request and
// contribute upstream changes so that they become official public VM APIs.
//
// In the meanwhile, be very careful when changing things here :)

#ifdef USE_MJIT_HEADER
// Pick up internal structures from the private Ruby MJIT header file
#include RUBY_MJIT_HEADER
#else
// On older Rubies, use a copy of the VM internal headers shipped in the debase-ruby_core_source gem
#include <vm_core.h>
#endif

// MRI has a similar rb_thread_ptr() function which we can't call it directly
// because Ruby does not expose the thread_data_type publicly.
Expand All @@ -23,4 +33,3 @@ static inline struct rb_thread_struct *thread_struct_from_object(VALUE thread) {
rb_nativethread_id_t pthread_id_for(VALUE thread) {
return thread_struct_from_object(thread)->thread_id;
}
#endif
2 changes: 0 additions & 2 deletions lib/ddtrace/profiling.rb
Original file line number Diff line number Diff line change
Expand Up @@ -128,9 +128,7 @@ def self.unsupported_reason
private_class_method def self.load_profiling
return false unless supported?

require 'ddtrace/profiling/ext/cpu'
require 'ddtrace/profiling/ext/forking'

require 'ddtrace/profiling/collectors/stack'
require 'ddtrace/profiling/exporter'
require 'ddtrace/profiling/recorder'
Expand Down
63 changes: 21 additions & 42 deletions lib/ddtrace/profiling/collectors/stack.rb
Original file line number Diff line number Diff line change
@@ -1,4 +1,6 @@
# typed: true

require 'ddtrace/profiling/native_extension'
require 'ddtrace/profiling/backtrace_location'
require 'ddtrace/profiling/events/stack'
require 'ddtrace/utils/only_once'
Expand Down Expand Up @@ -32,7 +34,8 @@ class Stack < Worker # rubocop:disable Metrics/ClassLength
:trace_identifiers_helper,
:ignore_thread,
:max_time_usage_pct,
:thread_api
:thread_api,
:cpu_time_provider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this abstraction.


def initialize(
recorder,
Expand All @@ -42,6 +45,7 @@ def initialize(
max_time_usage_pct: DEFAULT_MAX_TIME_USAGE_PCT,
max_threads_sampled: DEFAULT_MAX_THREADS_SAMPLED,
thread_api: Thread,
cpu_time_provider: Datadog::Profiling::NativeExtension,
fork_policy: Workers::Async::Thread::FORK_POLICY_RESTART, # Restart in forks by default
interval: MIN_INTERVAL,
enabled: true
Expand All @@ -53,6 +57,8 @@ def initialize(
@max_time_usage_pct = max_time_usage_pct
@max_threads_sampled = max_threads_sampled
@thread_api = thread_api
# Only set the provider if it's able to work in the current Ruby/OS combo
@cpu_time_provider = cpu_time_provider unless cpu_time_provider.cpu_time_ns_for(thread_api.current).nil?

# Workers::Async::Thread settings
self.fork_policy = fork_policy
Expand All @@ -63,8 +69,6 @@ def initialize(
# Workers::Polling settings
self.enabled = enabled

@warn_about_missing_cpu_time_instrumentation_only_once = Datadog::Utils::OnlyOnce.new

# Cache this proc, since it's pretty expensive to keep recreating it
@build_backtrace_location = method(:build_backtrace_location).to_proc
# Cache this buffer, since it's pretty expensive to keep accessing it
Expand Down Expand Up @@ -167,17 +171,10 @@ def collect_thread_event(thread, current_wall_time_ns)
end

def get_cpu_time_interval!(thread)
# Return if we can't get the current CPU time
unless thread.respond_to?(:cpu_time_instrumentation_installed?) && thread.cpu_time_instrumentation_installed?
warn_about_missing_cpu_time_instrumentation(thread)
return
end
return unless cpu_time_provider

current_cpu_time_ns = thread.cpu_time(:nanosecond)
current_cpu_time_ns = cpu_time_provider.cpu_time_ns_for(thread)

# NOTE: This can still be nil even when all of the checks above passed because of a race: there's a bit of
# initialization that needs to be done by the thread itself, and it's possible for us to try to sample
# *before* the thread had time to finish the initialization
return unless current_cpu_time_ns

get_elapsed_since_last_sample_and_set_value(thread, THREAD_LAST_CPU_TIME_KEY, current_cpu_time_ns)
Expand Down Expand Up @@ -226,33 +223,6 @@ def build_backtrace_location(_id, base_label, lineno, path)

private

def warn_about_missing_cpu_time_instrumentation(thread)
@warn_about_missing_cpu_time_instrumentation_only_once.run do
# Is the profiler thread instrumented? If it is, then we know instrumentation is available, but seems to be
# missing on this thread we just found.
#
# As far as we know, it can be missing due to one the following:
#
# a) The thread was started before we installed our instrumentation.
# In this case, the fix is to make sure ddtrace gets loaded before any other parts of the application.
#
# b) The thread was started using the Ruby native APIs (e.g. from a C extension such as ffi).
# Known cases right now that trigger this are the ethon/typhoeus gems.
# We currently have no solution for this case; these threads will always be missing our CPU instrumentation.
#
# c) The thread was started with `Thread.start`/`Thread.fork` and hasn't yet enabled the instrumentation.
# When threads are started using these APIs, there's a small time window during which the thread has started
# but our code to apply the instrumentation hasn't run yet; in these cases it's just a matter of allowing
# it to run and our instrumentation to be applied.
#
if thread_api.current.respond_to?(:cpu_time) && thread_api.current.cpu_time
Datadog.logger.debug(
"Thread ('#{thread}') is missing profiling instrumentation; other threads should be unaffected"
)
end
end
end

# If the profiler is started for a while, stopped and then restarted OR whenever the process forks, we need to
# clean up any leftover per-thread counters, so that the first sample after starting doesn't end up with:
#
Expand All @@ -274,9 +244,18 @@ def reset_cpu_time_tracking
end

def get_elapsed_since_last_sample_and_set_value(thread, key, current_value)
# See cthread.rb for more details, but this is a workaround for https://bugs.ruby-lang.org/issues/17807 ;
# using all thread_variable related methods on these instances also triggers a crash and for now we just
# skip it for the affected Rubies
# Process::Waiter crash workaround:
#
# This is a workaround for a Ruby VM segfault (usually something like
# "[BUG] Segmentation fault at 0x0000000000000008") in the affected Ruby versions.
# See https://bugs.ruby-lang.org/issues/17807 for details.
#
# In those Ruby versions, there's a very special subclass of `Thread` called `Process::Waiter` that causes VM
# crashes whenever something tries to read its instance or thread variables. This subclass of thread only
# shows up when the `Process.detach` API gets used.
# In the specs you'll find crash regression tests that include a way of reproducing it.
#
# As workaround for now we just skip it for the affected Rubies
return 0 if @needs_process_waiter_workaround && thread.is_a?(::Process::Waiter)

last_value = thread.thread_variable_get(key) || current_value
Expand Down
Loading