Very high performance overhead on JRuby #640

Closed

Description

Hi Datadog,

I'm migrating a Rails (4.2.10) app from MRI Ruby 2.3.8 and Unicorn to JRuby 9.1.17.0 and Puma.
The dd-trace-rb version is 0.17.2.

While doing some performance tests, I saw a big difference between my benchmarks in dev mode and prod mode with JRuby:

  • dev mode with JRuby/Puma: ~280 req/s
  • prod mode with JRuby/Puma: ~100 req/s

I found that disabling dd-trace-rb (by wrapping the Datadog.configure do |c| ... code in the Rails initializer in an if false) restored prod mode performance (~320 req/s)!

I also compared prod mode on MRI Ruby and Unicorn with dd-trace-rb enabled vs. disabled.
There is a difference, but it is not as large as with JRuby/Puma:

  • prod mode, Datadog enabled: ~100 req/s
  • prod mode, Datadog disabled: ~150 req/s

So, let's summarise these numbers:

| dd-trace-rb | JRuby / Puma (prod mode) | MRI Ruby / Unicorn (prod mode) |
| ----------- | ------------------------ | ------------------------------ |
| enabled     | ~100 req/s               | ~100 req/s                     |
| disabled    | ~320 req/s               | ~150 req/s                     |
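
To put the gap in relative terms, here is a small sketch that computes the share of throughput lost with tracing enabled, using only the approximate figures from the summary above. The constant and method names are made up for this illustration. It works out to roughly 69% on JRuby/Puma versus roughly 33% on MRI/Unicorn.

```ruby
# Throughput figures (req/s) copied from the summary above; these are approximate.
RESULTS = {
  "JRuby / Puma"  => { enabled: 100.0, disabled: 320.0 },
  "MRI / Unicorn" => { enabled: 100.0, disabled: 150.0 }
}.freeze

# Percentage of throughput lost when tracing is enabled,
# relative to the run with tracing disabled.
def throughput_loss_pct(enabled:, disabled:)
  ((disabled - enabled) / disabled * 100).round(1)
end

RESULTS.each do |stack, rates|
  loss = throughput_loss_pct(**rates)
  puts format("%-14s %5.1f%% lower with tracing enabled", stack, loss)
end
```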

Another interesting observation: CPU usage with JRuby/Puma stays below 100% when dd-trace-rb is disabled, but is pegged at 100% when it is enabled. 🤔


I should point out that the Datadog agent is not installed on my machine, so every call from dd-trace-rb in my app to the local agent fails.
This may be the cause of the problem. I don't know yet; I still have to test that.
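
A quick way to confirm that nothing is listening where the tracer would send its data is a plain TCP probe. This is only a sketch: it assumes the agent would be on localhost at port 8126 (the usual default for the trace agent, and an assumption about this setup), and agent_reachable? is a hypothetical helper, not part of dd-trace-rb.

```ruby
require "socket"

# Hypothetical helper: returns true if something accepts TCP connections
# on host:port within the timeout, false otherwise.
# 127.0.0.1:8126 is assumed here; adjust it if the agent is configured differently.
def agent_reachable?(host = "127.0.0.1", port = 8126, timeout: 1)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, timed out, or the host could not be resolved.
  false
end

puts "Datadog agent reachable: #{agent_reachable?}"
```

If this prints false while tracing is enabled, every flush from the tracer is hitting a closed port, which would be consistent with the failed calls described above. It does not by itself show that those failures explain the throughput drop.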


Here's how I configure dd-trace-rb:

  • Gemfile
gem 'ddtrace'
  • myapp/config/initializers/datadog-tracer.rb
if ! (Rails.env.test? || Rails.env.development?)
  Datadog.configure do |c|
    c.tracer enabled: Rails.env.production? || Rails.env.staging? || Rails.env.testing?, env: Rails.env

    c.use :rails, service_name: "#{Settings.datadog.service_name}-rails", distributed_tracing: true

    c.use :mysql2, service_name: "#{Settings.datadog.service_name}-mysql2"
    c.use :aws, service_name: "#{Settings.datadog.service_name}-aws"
    c.use :grape, service_name: "#{Settings.datadog.service_name}-grape"
    c.use :redis, service_name: "#{Settings.datadog.service_name}-redis"
    c.use :resque, service_name: "#{Settings.datadog.service_name}-resque", workers: [ ... ]
  end
end
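
For A/B benchmarking, editing the initializer to add an if false can be replaced by a switch read from the environment. The sketch below is an assumption-laden illustration: MYAPP_TRACING is a made-up variable name (not a ddtrace setting) and tracing_enabled? is a hypothetical helper. Its result would be passed to the enabled: option already used in the initializer above.

```ruby
# Environments in which tracing is wanted by default (taken from the initializer above).
TRACED_ENVS = %w[production staging].freeze

# Hypothetical helper: decides whether tracing should be switched on.
# MYAPP_TRACING is a made-up variable for this sketch; setting it to "off"
# disables tracing without touching the initializer.
def tracing_enabled?(rails_env, env = ENV)
  return false if env["MYAPP_TRACING"] == "off"

  TRACED_ENVS.include?(rails_env.to_s)
end

# In the initializer this would feed the existing option, e.g.:
#   c.tracer enabled: tracing_enabled?(Rails.env), env: Rails.env
puts tracing_enabled?("production", {})
puts tracing_enabled?("production", { "MYAPP_TRACING" => "off" })
```

Each benchmark run can then be started with or without MYAPP_TRACING=off, so both measurements come from the same code.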

Here's the script I use to benchmark.
It uses this tool https://github.com/giltene/wrk2

#!/usr/bin/env bash

trap "exit" INT # https://stackoverflow.com/a/32146079/2431728

echo "WARM UP"

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json
wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "========="
echo "BENCH 1"
echo "========="
echo

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "========="
echo "BENCH 2"
echo "========="
echo

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

echo
echo "========="
echo "BENCH 3"
echo "========="
echo

wrk2 -t2 -c100 -d1m -R2000 -L http://localhost:3000/route/endpoint.json

Raw benchmark results for JRuby/Puma with dd-trace-rb disabled (sorry it's small)

[3 screenshots, 2018-11-27: JRuby 9.1.17.0, prod mode, Datadog disabled, runs 1 to 3]

Raw benchmark results for JRuby/Puma with dd-trace-rb enabled (sorry it's small)

[3 screenshots, 2018-11-27: JRuby 9.1.17.0, prod mode, Datadog enabled, runs 1 to 3]

Raw benchmark results for MRI/Unicorn with dd-trace-rb disabled (sorry it's small)

[3 screenshots, 2018-11-27: MRI 2.3.8, prod mode, Datadog disabled, runs 1 to 3]

Raw benchmark results for MRI/Unicorn with dd-trace-rb enabled (sorry it's small)

[3 screenshots, 2018-11-27: MRI 2.3.8, prod mode, Datadog enabled, runs 1 to 3]

Labels: community, core
