Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto Enable All Instrumentation Feature #1260

Merged
merged 49 commits into from
Jan 21, 2021
Merged

Conversation

ericmustin
Copy link
Contributor

Summary

This PR adds functionality to automatically enable all available integrations without needing to add a Datadog.configure block.

This works by adding a script that can be require'd in an application startup lifecyce, which adds a railtie which iterates over all available integrations (except for those blacklisted), and enabled them automatically, silencing any logs that might be normally emitted from a failed patching attempt.

In non rails environments, the feature does not use a railtie, and is still functional but requires users to require their integration's gems first.

The docs are also updated to include usage instructions.

Notes

  • There's still some tests here I'd like to add, But i was hoping to have the bulk of this work reviewed first for sanity. the current tests pass fine on it's own, but there appears to be leakage of config between rails tests. If you notice, you can intentionally remove the require 'ddtrace/auto_instrument' line from the support/rails[3|4|5|6] specs and the tests will still pass. Isolating the test via fit makes them pass/fail as expected, so I believe the rails application initialisation is getting leaked between spec tests. I tried wrapping the auto_instrument tests in a forked process but that didn't appear to help. Still investigating this

  • cleaned up some odd behavior in our Rake integration where we were not being defensive enough

@ericmustin ericmustin requested a review from a team November 23, 2020 14:16
docs/GettingStarted.md Outdated Show resolved Hide resolved
docs/GettingStarted.md Outdated Show resolved Hide resolved
@marcotc
Copy link
Member

marcotc commented Nov 24, 2020

@ericmustin if you need to require 'ddtrace/auto_instrument', but then "unrequire" it in a subsequent test, you best option is to isolate the tests that permanently pollute the workspace in their own set of Rake tasks.

We do that for bundle exec appraisal rails6-postgres rake spec:railsdisableenv for example, because we need a completely clean execution (no Rails ever initalized) for these tests.

I recommend you separate tests into batch sets that can be run on the same "dirty" environment, and create isolated tests for the ones that can't mix up.

target.integrations_pending_activation.each do |integration|
integration.patch if integration.respond_to?(:patch)
integration.patch(silence_logs) if integration.respond_to?(:patch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very tricky, as silencing all logs will also hide important information. For example, the instrumentation might not be compatible with the gem version installed (we support >= 1.0, but 0.9 is installed) (Patchable.compatible?), or the gem might not have been required yet (Patchable.loaded?).

Both of these cases currently output a warning today.

Maybe we could have Patchable#patch return true is patching was successful, or a Hash if something failed:

{
  available: self.class.available?,
  loaded: self.class.loaded?,
  compatible: self.class.compatible?,
  patchable: self.class.patchable?,
}

And let the caller handle the error reporting.

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, updated. I still think the logging is a little clunky but it's more flexible than previously.

@@ -86,6 +87,14 @@ def fetch_integration(name)
registry[name] ||
raise(InvalidIntegrationError, "'#{name}' is not a valid integration.")
end

def silence_logs?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think verbose? seems like more straightforward name for this option (defaulting to true, and set to false when patching-all).

@marcotc marcotc added core Involves Datadog core libraries feature Involves a product feature labels Nov 30, 2020
docs/GettingStarted.md Outdated Show resolved Hide resolved
docs/GettingStarted.md Outdated Show resolved Hide resolved
docs/GettingStarted.md Outdated Show resolved Hide resolved
Comment on lines 6 to 7
EXCLUDED_RAILS_INTEGRATIONS = [:rack, :action_cable, :active_support, :action_pack, :action_view, :active_record].freeze
EXCLUDED_TEST_INTEGRATIONS = [:cucumber, :rspec].freeze
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we add a new Rails integration or new test integration, and forget to add them here, I don't think any tests would fail today.

I don't have a strong preference on what approach to take, but I think we should ensure that, when a new integration is added, either by us or community, there should be no surprises with AutoInstrument with the integration addition.

I can see this being done in different ways:

  • Having an automatic way to skip Rails sub-integrations and skip test integrations.
  • Having integrations declare that they are allowed to be automatically activated (probably through a new field in Datadog::Contrib::Patchable::ClassMethods or similar).
  • Having a whitelist of instrumentations allowed to be automatically enabled. This one is a bit worse, as it required the knowledge of a separate file than the contrib files added in an integration PR, but still a valid solution.

Like I said, I have no strong preference here, but I think we should address this concern.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, makes sense. I tend to think Having integrations declare that they are allowed to be automatically activated (probably through a new field in Datadog::Contrib::Patchable::ClassMethods or similar). is the right approach here, as the rails + test libraries non-auto-instrumentation logic is pretty arbitrary, and i'd rather this just be declared on a per integration basis explicitly. i'll add this

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a auto_instrument? method to Patchable::InstanceMethods that accomplishes this. It's a bit tricky to define the logic for the rails sub-modules (or those auto-enabled by rails), since they behave differently depending on whether rails/railtie is present. But i've added some helpers here to get this to work and think it' much cleaner now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one tradeoff here is that this PR is now quite a bit bigger but mostly it's adding boilerplate/repetitive methods to integration / integration_spec files. for your own sanity:

  • looking at the auto_instrument? related changes to integration/integration_spec.rb files for a "rails-y" integration (like, active_record) is the same as all other "rails-y" integrations,
  • cucumber / rspec are changes identical,
  • and then all the other integrations inherit the behavior and have the same base test added to their integration_spec.rb

delner
delner previously requested changes Dec 17, 2020
Copy link
Contributor

@delner delner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea here is good and workable, but I don't think the implementation is right.

There's a deliberate design decision to keep integrations (Rails) separate from the core (AutoInstrument), and this creates coupling between the two. This means we always have to load Rails to do any kind of Ruby instrumentation, something we want to avoid if we want to move contributions eventually to their own separate package.

I would suggest instead implementing some kind of callback pattern: have the auto-instrumentation code emit an event in which the loaded Rails integration can hook into, receive, and act accordingly. Perhaps something else, as long as it avoids coupling the core to integration. It may not be totally that simple to do, but its definitely worthwhile avoiding this coupling.

Comment on lines 29 to 36
# Railtie to include AutoInstrumentation in rails loading
class Railtie < Rails::Railtie
# we want to load before config initializers so that any user supplied config
# in config/initializers/datadog.rb will take precedence
initializer 'datadog.start_tracer', before: :load_config_initializers do
AutoInstrument.patch_all
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Following up on @delner's suggestion, about separating core from contrib.

We have 3 parts to this file which I think we can break apart:

  1. This Rails-specific snippet. Can be extracted to the Rails sub-directory under contrib.
  2. The patch_all implementation. This parts knows how to patch all instrumentations, but does not trigger itself. I don't know 100% where this would go, but Datadog::Contrib::Extensions looks very similar to where it could land. Datadog::Contrib::Extensions might be too overloaded for it, but I think it give us a good idea to where "integration core" features live today.
    To avoid coupling today Datadog::Contrib::Extensions injects itself into the global settings object: Datadog::Configuration::Settings.send(:include, Configuration::Settings). This is a valid technique to prevent coupling.
  3. A "product" part, where we activate it all. This is the code that is ultimately called when we require ddtrace/auto_instrument.

To make 1 and 2 work, one possible suggestion is to introduce another functionality to integrations: similarly to how they support #patch today, they could support something like auto_patch, and we could call that on all integrations that support it. This way new integration can choose to provide custom auto instrumentation or not.

@codecov-io
Copy link

codecov-io commented Jan 6, 2021

Codecov Report

Merging #1260 (013f1dd) into master (e4c430a) will decrease coverage by 0.06%.
The diff coverage is 91.61%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1260      +/-   ##
==========================================
- Coverage   98.13%   98.07%   -0.07%     
==========================================
  Files         752      758       +6     
  Lines       35740    36027     +287     
==========================================
+ Hits        35074    35333     +259     
- Misses        666      694      +28     
Impacted Files Coverage Δ
spec/ddtrace/contrib/sinatra/activerecord_spec.rb 77.77% <ø> (ø)
...dtrace/contrib/rails/rails_auto_instrument_spec.rb 64.00% <64.00%> (ø)
spec/ddtrace/auto_instrument_spec.rb 75.43% <75.43%> (ø)
spec/ddtrace/contrib/rails/support/rails3.rb 94.18% <83.33%> (-1.06%) ⬇️
spec/ddtrace/contrib/rails/support/rails4.rb 90.14% <83.33%> (-1.17%) ⬇️
spec/ddtrace/contrib/rails/support/rails5.rb 91.93% <83.33%> (-1.40%) ⬇️
spec/ddtrace/contrib/rails/support/rails6.rb 93.75% <83.33%> (-1.13%) ⬇️
lib/ddtrace.rb 100.00% <100.00%> (ø)
lib/ddtrace/auto_instrument.rb 100.00% <100.00%> (ø)
lib/ddtrace/auto_instrument_base.rb 100.00% <100.00%> (ø)
... and 62 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e4c430a...013f1dd. Read the comment docs.

Comment on lines 8 to 21
def self.extended(base)
base.send(:extend, Patch)
end

# Patch adds method for invoking auto_instrumentation
module Patch
def add_auto_instrument
if Datadog::Contrib::Rails::Utils.railtie_supported?
require 'ddtrace/contrib/rails/auto_instrument_railtie'
else
AutoInstrument.patch_all
end
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the way we are composing add_auto_instrument doesn't seem like it would work nicely if another implementation tries to add new behaviour that also happen during add_auto_instrument.

I suggest we call super (if defined) in add_auto_instrument to ensure that the parent implementation is also called.

@@ -0,0 +1,3 @@
require 'ddtrace'

Datadog.add_auto_instrument if Datadog.respond_to?(:add_auto_instrument)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Datadog.add_auto_instrument is always defined now, as of latest changes, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yup, can remove the defensive check here now

Comment on lines 5 to 8
def self.included(base)
base.send(:extend, InstanceMethods)
base.send(:include, InstanceMethods)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any particular reason to indirectly extend InstanceMethods, instead of defining AutoInstrumentBase as:

module Datadog
  # base methods stubbed for adding auto instrument extensions
  module AutoInstrumentBase
    # stubbed methods for adding auto instrument
    def add_auto_instrument; end
  end
end

and then extending it on the base module:

module Datadog
  extend AutoInstrumentBase

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no not really, i was following a pattern i'd seen with some of the config modules, but this is far simpler so can update

class DatadogAutoInstrumentRailtie < Rails::Railtie
# we want to load before config initializers so that any user supplied config
# in config/initializers/datadog.rb will take precedence
initializer 'datadog.start_tracer', before: :load_config_initializers do
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noting that a Railtie is necessary, as otherwise load order is done by rails alphabetically and we will not patch any gems loaded after ddtrace. this has been tested without a railtie and patching was unsuccesful on gems like faraday, http (anything alphabetically after dd-trace)

Copy link
Member

@marcotc marcotc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉 thank you for working on the feedback provided, looks great!

@ericmustin ericmustin requested a review from delner January 8, 2021 14:57
@marcotc marcotc dismissed delner’s stale review January 20, 2021 21:33

I believe this feedback has been addressed with recent changes.

@marcotc marcotc merged commit 9663661 into master Jan 21, 2021
@marcotc marcotc deleted the autoload_instrumentation branch January 21, 2021 16:44
@github-actions github-actions bot added this to the 0.45.0 milestone Jan 21, 2021
@@ -165,6 +180,25 @@ Install and configure the Datadog Agent to receive traces from your now instrume

### Quickstart for Ruby applications

#### Ruby Auto Instrument all Integrations

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not say here "Auto Instrument All Integrations". That's a bit wordy and inconsistent with how we describe in other languages. Maybe just "Automatic Instrumentation" and "Manual Instrumentation"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It reads much better with your suggestion, probably an oversight on our side here. I'll make these changes. 🙇 @andrewsouthard1

marcotc added a commit that referenced this pull request Jan 27, 2021
As suggested in #1260 (comment), this PR cleans up our section titles, removing repetitive language.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Involves Datadog core libraries feature Involves a product feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants