Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reuse HTTP connection between trace flushes #2227

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from
Draft

Conversation

marcotc
Copy link
Member

@marcotc marcotc commented Aug 18, 2022

I think it would be a bit safer to implement a retry logic first, as unforeseen persistent connections issues would be mitigated by a retry mechanism

This PR changes the Tracing HTTP transport adapter (lib/ddtrace/transport/http/adapters/net.rb) to reuse its HTTP connection internally.

Previously, we created a new connection and closed it on every trace batch flush, which can happens as frequently as once per second.

This PR does not close the connection, and instead reuses it for successive flushes.

If the remote connection errs out (e.g. other side closes the connection), the connection will be retried internally by Net::HTTP.

Benchmarking

Invoke the tracer transport pipeline, flushing a single trace containing 50 spans. The tail end of the pipeline will invoke the HTTP client modified in this PR.

Results in "transport flushes per second".

Before

     http_transport     441.739  (±15.2%) i/s -     29.486k in  70.012534s

After

     http_transport     774.359  (± 5.3%) i/s -     54.072k in  70.056594s

Reusing the HTTP connection is 1.75x faster then creating a brand new connection every time.

@marcotc marcotc force-pushed the reuse-connection branch 7 times, most recently from 3646aa4 to 7d8b375 Compare October 3, 2022 19:25
@github-actions github-actions bot added core Involves Datadog core libraries profiling Involves Datadog profiling tracing labels Oct 6, 2022
@github-actions github-actions bot removed core Involves Datadog core libraries profiling Involves Datadog profiling labels Oct 6, 2022
@marcotc marcotc force-pushed the reuse-connection branch 5 times, most recently from 1747a53 to 354311a Compare October 7, 2022 19:12
Comment on lines -45 to -55
def open_file_descriptors
# Unix-specific way to get the current process' open file descriptors and the files (if any) they correspond to
Dir['/dev/fd/*'].each_with_object({}) do |fd, hash|
hash[fd] =
begin
File.realpath(fd)
rescue SystemCallError # This can fail due to... reasons, and we only want it for debugging so let's ignore
nil
end
end
end
Copy link
Member Author

@marcotc marcotc Oct 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to a shared spec helper, now that checking for leaked FDs is needed in multiple places.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided not to use this approach for HTTP testing, but I feel like this refactor is still good.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should move this out of this PR.

@marcotc marcotc force-pushed the reuse-connection branch 2 times, most recently from 8248b34 to da5c144 Compare October 7, 2022 19:37
@@ -1,30 +0,0 @@
# typed: true
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this quite clever monkey patch to expose private members for testing, but is actually never used.

This kind of test-specific, internal access to the tracer (specifically monitoring if the AsyncTransport has finished flushing) was the best way to write reliable tests around HTTP connection reuse.

But polluting the main class space with methods that won't actually be present when users use ddtrace in production is scary, since we could write features that use methods that are not available in production, but will always succeed in our test suite.

Because of this, I've moved this monkey patching to a refinement, so that it only applies to the current spec file under test, never leaking to lib/ code.

Because the file changed so much, git thinks I completely deleted it (which I kind of did, for most of the unused features). spec/support/test_access_refinement.rb contains these new refinements.

@marcotc marcotc force-pushed the reuse-connection branch 2 times, most recently from ba77e58 to 403da37 Compare October 7, 2022 22:14
@marcotc marcotc force-pushed the reuse-connection branch 3 times, most recently from 5559f7b to 919b126 Compare October 20, 2022 22:50
@codecov-commenter
Copy link

codecov-commenter commented Oct 20, 2022

Codecov Report

Merging #2227 (6fe546d) into master (9dbd6d5) will decrease coverage by 0.00%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2227      +/-   ##
==========================================
- Coverage   98.32%   98.32%   -0.01%     
==========================================
  Files        1120     1125       +5     
  Lines       60401    60703     +302     
==========================================
+ Hits        59389    59685     +296     
- Misses       1012     1018       +6     
Impacted Files Coverage Δ
lib/datadog/appsec/extensions.rb 85.91% <0.00%> (-2.15%) ⬇️
lib/datadog/appsec/response.rb 86.66% <0.00%> (-1.80%) ⬇️
lib/datadog/appsec/processor.rb 91.75% <0.00%> (-1.32%) ⬇️
...dog/profiling/collectors/cpu_and_wall_time_spec.rb 97.48% <0.00%> (-0.42%) ⬇️
spec/datadog/integration_spec.rb 97.40% <0.00%> (-0.13%) ⬇️
lib/datadog/core/configuration/settings.rb 99.18% <0.00%> (-0.01%) ⬇️
spec/support/faux_transport.rb 100.00% <0.00%> (ø)
lib/datadog/core/metrics/ext.rb 100.00% <0.00%> (ø)
spec/ddtrace/release_gem_spec.rb 100.00% <0.00%> (ø)
lib/datadog/appsec/configuration.rb 100.00% <0.00%> (ø)
... and 32 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@marcotc marcotc force-pushed the reuse-connection branch 2 times, most recently from 6fe546d to 12fd980 Compare January 4, 2023 00:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants