DEBUG-2334 Dynamic Instrumentation code tracker component #3942

p-datadog · 2024-09-24T13:51:06Z

What does this PR do?

Adds the code tracker component. It tracks the mapping from source file paths to the RubyVM::InstructionSequence objects that are used for setting targeted trace points.
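As a rough illustration of the idea described above, here is a minimal sketch of a path-to-iseq registry fed by a :script_compiled trace point (available on Ruby 2.6 and later). The class name SketchCodeTracker and its methods are hypothetical and are not the code added in this PR; only the Hash-plus-Mutex registry and the use of tp.instruction_sequence.path follow what the diff shows.

```ruby
require "tempfile"

# Hypothetical, simplified tracker. Not the class added in this PR.
class SketchCodeTracker
  def initialize
    @registry = {}
    @registry_lock = Mutex.new
    @trace_point = nil
    @trace_point_lock = Mutex.new
  end

  # Start recording every script compiled from now on.
  def start
    @trace_point_lock.synchronize do
      @trace_point ||= TracePoint.trace(:script_compiled) do |tp|
        iseq = tp.instruction_sequence
        # Map the path of the compiled code to its instruction sequence.
        @registry_lock.synchronize { @registry[iseq.path] = iseq }
      end
    end
  end

  # Stop recording; already registered entries are kept.
  def stop
    @trace_point_lock.synchronize do
      @trace_point&.disable
      @trace_point = nil
    end
  end

  def iseq_for(path)
    @registry_lock.synchronize { @registry[path] }
  end

  def paths
    @registry_lock.synchronize { @registry.keys }
  end
end

tracker = SketchCodeTracker.new
tracker.start
Tempfile.create(["sketch_tracked", ".rb"]) do |file|
  file.write("def sketch_tracked_method\n  :ok\nend\n")
  file.flush
  load file.path # compiled while the trace point is active
end
tracker.stop

tracked = tracker.paths.find { |p| File.basename(p).start_with?("sketch_tracked") }
puts tracked
```

Only code compiled after start is seen by this sketch; code loaded earlier would need a separate mechanism.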

Motivation:
Efficient instrumentation of lines.

Additional Notes:
There will be further functionality added to CodeTracker later to instrument loaded code (requires the instrumentation component that hasn't been PR'ed yet).

How to test the change?
Unit tests at this time.

datadog-datadog-prod-us1 · 2024-09-24T13:51:43Z

lib/datadog/di/code_tracker.rb

+          # path separator to use.
+          if path.length > suffix.length && (
+            path[path.length - suffix.length - 1] == "/" ||
+            suffix[0] == "/"


⚪ Code Quality Violation

Suggested change

- suffix[0] == "/"
+ suffix.first == "/"

Improve readability with first

This rule encourages the use of first and last methods over array indexing to access the first and last elements of an array, respectively. The primary reason behind this rule is to improve code readability. Using first and last makes it immediately clear that you are accessing the first or last element of the array, which might not be immediately obvious with array indexing, especially for developers who are new to Ruby.

The use of these methods also helps to make your code more idiomatic, which is a crucial aspect of writing effective Ruby code. Idiomatic code is easier to read, understand, and maintain. It also tends to be more efficient, as idioms often reflect patterns that are optimized for the language.

To adhere to this rule, replace the use of array indexing with first or last methods when you want to access the first and last elements of an array. For instance, instead of arr[0] use arr.first and instead of arr[-1] use arr.last. However, note that this rule should be applied only when reading values. When modifying the first or last elements, array indexing should still be used. For example, arr[0] = 'new_value' and arr[-1] = 'new_value'.
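A small sketch of what the rule describes, using a made-up array named parts. Note that the rule is stated for arrays: core Ruby's String does not define first (ActiveSupport adds one), so the example below sticks to an Array.

```ruby
parts = ["lib", "datadog", "di"]

# Reading: first/last state the intent directly.
first_part = parts.first
last_part = parts.last

# Writing: index assignment is still the way to replace an element.
parts[0] = "spec"
parts[-1] = "tracing"

p first_part, last_part, parts
```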

datadog-datadog-prod-us1 · 2024-09-24T13:51:43Z

spec/datadog/di/code_tracker_spec.rb

+
+      path = tracker.send(:registry).each.to_a.first.first
+      # The path in the registry should be absolute.
+      expect(path[0]).to eq "/"


⚪ Code Quality Violation

Suggested change

- expect(path[0]).to eq "/"
+ expect(path.first).to eq "/"

Improve readability with first

pr-commenter · 2024-09-24T14:28:14Z

Benchmarks

Benchmark execution time: 2024-09-27 19:09:03

Comparing candidate commit ff0d80c in PR branch di-code-tracker with baseline commit 69ac3b8 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 23 metrics, 2 unstable metrics.

datadog-datadog-prod-us1 · 2024-09-24T19:37:50Z

lib/datadog/di/code_tracker.rb

+            # path separator to use.
+            if path.length > suffix.length && (
+              path[path.length - suffix.length - 1] == "/" ||
+              suffix[0] == "/"


⚪ Code Quality Violation

Suggested change

- suffix[0] == "/"
+ suffix.first == "/"

Improve readability with first

marcotc · 2024-09-24T20:04:25Z

lib/datadog/di/code_tracker.rb

+        @registry_lock = Hash.new
+      end
+
+      def start


If this method is not called at a high frequency (and it doesn't seem to be), I believe that to make it truly thread-safe we have to wrap the whole execution in a mutex.

I already have a lock taken for all reads and writes of instance variables and I believe I account for e.g. state changing during constructor execution. Can you elaborate on what situation you think is not properly covered by existing locking?

marcotc · 2024-09-24T20:05:01Z

lib/datadog/di/code_tracker.rb

+        @trace_point_lock = Mutex.new
+        @registry_lock = Hash.new
+      end
+


I suggest adding method-level documentation to the start method, even if it is very simple.

Yes, I will type up a comment.

Strech

I have some minor feedback, but nothing critical.

Strech · 2024-09-25T12:44:18Z

spec/datadog/di/code_tracker_spec.rb

+  describe ".new" do
+    it "creates an instance" do
+      expect(tracker).to be_a(described_class)
+    end
+  end


Sorry for asking, but what is the point of this test? Technically, we are testing here that Ruby instantiates a class when we call new, but do we really need to challenge that?

This test only exercises the constructor. If you object to it I can remove it.

ivoanjo · 2024-09-25T12:54:54Z

lib/datadog/di/code_tracker.rb

+        @registry_lock = Hash.new
+      end
+
+      def start


I think Marco has a great point: the current code is racy but seems correct. For example, multiple threads may see active? == false and then create the tracepoint, but then "walk it back" inside the lock.

But initialization is probably not performance-sensitive, so rather than creating a bunch of code that needs to be reasoned about every time we need to touch it, initializing the whole thing under a lock seems to be both safer and not particularly slower?
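The "initialize the whole thing under a lock" approach could look roughly like the sketch below. The class LockedStartSketch and its created_count counter are hypothetical and exist only to make the behavior observable; the point is that the check and the assignment happen under the same mutex, so there is nothing to walk back.

```ruby
# Hypothetical sketch of a start method that runs entirely under one lock.
class LockedStartSketch
  attr_reader :created_count

  def initialize
    @trace_point_lock = Mutex.new
    @trace_point = nil
    @created_count = 0
  end

  def start
    @trace_point_lock.synchronize do
      # Check and assignment share the lock, so only one thread
      # ever reaches the creation below.
      return if @trace_point
      @created_count += 1
      @trace_point = TracePoint.trace(:script_compiled) { |tp| }
    end
  end

  def stop
    @trace_point_lock.synchronize do
      @trace_point&.disable
      @trace_point = nil
    end
  end
end

sketch = LockedStartSketch.new
8.times.map { Thread.new { sketch.start } }.each(&:join)
puts sketch.created_count # => 1
sketch.stop
```

The cost is that concurrent callers of start briefly wait on each other, which matters little if start is only called during initialization.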

ivoanjo · 2024-09-25T12:55:42Z

lib/datadog/di/code_tracker.rb

+        compiled_trace_point = TracePoint.trace(:script_compiled) do |tp|
+          # Useful attributes of the trace point object here:
+          # .instruction_sequence
+          # .method_id
+          # .path (refers to the code location that called the require/eval/etc.,
+          #   not where the loaded code is; use .path on the instruction sequence
+          #   to obtain the location of the compiled code)
+          # .eval_script
+          #
+          # For now just map the path to the instruction sequence.
+          path = tp.instruction_sequence.path
+          registry_lock.synchronize do
+            registry[path] = tp.instruction_sequence
+          end
+        end


Does script_compiled get emitted for each individual method in a file?

No, it should be emitted once per file.

Wait, in that case, how do we know the correct iseq to target, if 1 file has N iseqs? (Or I may be misunderstanding how this works?)

There is one iseq per file.
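One way to check the "once per file" behavior is to load a file that defines two methods and count the :script_compiled events for it. This is a sketch under the assumption of Ruby 2.6 or later; the file contents and variable names are made up for illustration.

```ruby
require "tempfile"

source = <<~RUBY
  def sketch_first_method; 1; end
  def sketch_second_method; 2; end
RUBY

events = []
trace_point = TracePoint.new(:script_compiled) do |tp|
  # Record the path of the code that was just compiled.
  events << tp.instruction_sequence.path
end

Tempfile.create(["sketch_two_methods", ".rb"]) do |file|
  file.write(source)
  file.flush
  # The trace point is enabled only for the duration of the block.
  trace_point.enable { load file.path }
end

count = events.count { |path| File.basename(path).start_with?("sketch_two_methods") }
puts count
```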

ivoanjo · 2024-09-25T12:59:38Z

lib/datadog/di/code_tracker.rb

+          # disable our trace point and do nothing.
+          if @compiled_trace_point
+            # Disable the local variable, leave instance variable as it is.
+            compiled_trace_point.disable


Is it just me, or was the tracepoint not enabled before we disable it?

TracePoint.trace enables the trace point. .new does not enable. I agree it can be confusing at first glance.
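The difference between the two constructors can be seen in a few lines; the variable names are illustrative only.

```ruby
# TracePoint.new only creates the trace point.
created = TracePoint.new(:script_compiled) { |tp| }
created_enabled = created.enabled?

# TracePoint.trace creates it and enables it right away.
traced = TracePoint.trace(:script_compiled) { |tp| }
traced_enabled = traced.enabled?

# So disabling it afterwards is meaningful.
traced.disable

p created_enabled  # false
p traced_enabled   # true
p traced.enabled?  # false
```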

Strech · 2024-09-25T13:06:43Z

lib/datadog/di/code_tracker.rb

+            if path.length > suffix.length && (
+              path[path.length - suffix.length - 1] == "/" ||
+              suffix[0] == "/"


May I ask: can we split the if to improve readability, and maybe use a variable to give a name to our intent? We are inside the mutex section anyway, and performance is probably not highly impacted by extra branching. WDYT?

I am not sure what you are asking. The logic of this code fragment is explained in the comment above. What are you suggesting to split exactly?

Strech · 2024-09-26T09:04:35Z

lib/datadog/di/code_tracker.rb

+            if path.length > suffix.length && (
+              path[path.length - suffix.length - 1] == "/" ||
+              suffix[0] == "/"
+            ) && path.end_with?(suffix)
+              inexact << iseq
+            end


I think something like this. I'm not sure where we stand on "comments vs. code", but maybe this is it?

Suggested change

- if path.length > suffix.length && (
-   path[path.length - suffix.length - 1] == "/" ||
-   suffix[0] == "/"
- ) && path.end_with?(suffix)
-   inexact << iseq
- end
+ if path.length > suffix.length
+   previous_char = path[path.length - suffix.length - 1]
+   inexact << iseq if previous_char == "/" && path.end_with?(suffix)
+ end

It looks to me like you lost the suffix[0] == "/" condition along the way. I personally find your proposed version more difficult to follow: it lacks symmetry where the logic is symmetric, and the postfix if inverts cause and effect, which also puts the action in the middle of the conditions. That said, I'm happy to commit a rendering you are happy with, as long as it is functionally equivalent to what is currently in the diff.
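To make the difference between the two versions concrete, the sketch below wraps each condition in a helper and runs them on a sample path. The helper names and the sample path are hypothetical; the conditions themselves are copied from the diff and from the suggested change.

```ruby
# Condition from the diff, wrapped in a hypothetical helper.
def inexact_suffix_match?(path, suffix)
  path.length > suffix.length && (
    path[path.length - suffix.length - 1] == "/" ||
    suffix[0] == "/"
  ) && path.end_with?(suffix)
end

# Condition from the suggested change, which has no suffix[0] == "/" branch.
def proposed_match?(path, suffix)
  return false unless path.length > suffix.length
  previous_char = path[path.length - suffix.length - 1]
  previous_char == "/" && path.end_with?(suffix)
end

path = "/app/lib/foo.rb"

p inexact_suffix_match?(path, "foo.rb")       # true: preceded by "/"
p inexact_suffix_match?(path, "oo.rb")        # false: would split a file name
p inexact_suffix_match?(path, "/lib/foo.rb")  # true: suffix starts with "/"
p proposed_match?(path, "/lib/foo.rb")        # false: the two versions differ here
```

A suffix that begins with a separator is the case where the versions diverge, which is the condition the reply says was lost.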

codecov-commenter · 2024-09-27T18:48:10Z

Codecov Report

Attention: Patch coverage is 97.14286% with 3 lines in your changes missing coverage. Please review.

Project coverage is 97.87%. Comparing base (74363cc) to head (ff0d80c).
Report is 2 commits behind head on master.

Files with missing lines	Patch %	Lines
lib/datadog/di/code_tracker.rb	91.89%	3 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3942      +/-   ##
==========================================
+ Coverage   97.85%   97.87%   +0.01%     
==========================================
  Files        1305     1311       +6     
  Lines       78224    78324     +100     
  Branches     3887     3893       +6     
==========================================
+ Hits        76549    76658     +109     
+ Misses       1675     1666       -9

Flag	Coverage Δ
	`97.87% <97.14%> (+0.01%)`	⬆️

DEBUG-2334 Dynamic Instrumentation code tracker component

bcf3991

p-datadog requested a review from a team as a code owner September 24, 2024 13:51

datadog-datadog-prod-us1 bot reviewed Sep 24, 2024

Remove Concurrent::Map usage to avoid dependency on concurrent-ruby

12b3d63

use Hash and Mutex instead

datadog-datadog-prod-us1 bot reviewed Sep 24, 2024

marcotc reviewed Sep 24, 2024

p-datadog mentioned this pull request Sep 25, 2024

DEBUG-2334 upgrade steep & rbs #3950

Merged

y9v approved these changes Sep 25, 2024

Strech reviewed Sep 25, 2024

ivoanjo reviewed Sep 25, 2024

Strech reviewed Sep 25, 2024

p-datadog and others added 10 commits September 25, 2024 10:56

Update spec/datadog/di/code_tracker_spec.rb

657f8bf

Co-authored-by: Sergey Fedorov <oni.strech@gmail.com>

Update lib/datadog/di/code_tracker.rb

ed08f40

Co-authored-by: Sergey Fedorov <oni.strech@gmail.com>

Update lib/datadog/di/code_tracker.rb

5e25e1f

Co-authored-by: datadog-datadog-prod-us1[bot] <88084959+datadog-datadog-prod-us1[bot]@users.noreply.github.com>

Update lib/datadog/di/code_tracker.rb

7e5066d

Co-authored-by: datadog-datadog-prod-us1[bot] <88084959+datadog-datadog-prod-us1[bot]@users.noreply.github.com>

Update spec/datadog/di/code_tracker_spec.rb

125d6e4

Co-authored-by: Sergey Fedorov <oni.strech@gmail.com>

fix registry lock type

2e1de86

start docstring

66be4eb

put entire start method under trace point lock

dd15c2e

Merge branch 'master' into di-code-tracker

cd766bb

use patched rbs to get RubyVM::InstructionSequence method type definitions

17952c4

p-datadog force-pushed the di-code-tracker branch from f65506a to 17952c4 Compare September 25, 2024 17:00

p added 2 commits September 25, 2024 13:08

mark tests as di tests because needed trace points do not exist on ruby 2.5

3b39340

add spec helper require

ba20d12

Strech reviewed Sep 26, 2024

p-datadog and others added 3 commits September 26, 2024 09:44

Merge branch 'master' into di-code-tracker

5b1f118

skip DI tests on ruby 2.5

ec3426e

Merge branch 'master' into di-code-tracker

f120f8f

standard

ff0d80c
