Description
openedon Nov 5, 2020
Hello, we've been encountering segmentation faults that seem related to ddtrace and were hoping you could help with debugging.
Here is a minimal reproduction app: https://github.com/cjford/ddtrace-repro
It assumes mongodb reachable at mongodb://0.0.0.0:27017/development
and datadog-agent
running with default configuration. You can run it with bundle exec ruby app.rb
On Catalina 10.15.6, I get the following output (no backtrace or crash reports present):
[2020-11-04T18:56:00.134787 #43967] INFO -- ddtrace: [ddtrace] DATADOG TRACER CONFIGURATION - {"date":"2020-11-04T18:56:00-07:00","os_name":"x86_64-apple-darwin19.2.0","version":"0.41.0","lang":"ruby","lang_version":"2.6.5","enabled":true,"agent_url":"http://127.0.0.1:8126?timeout=1","debug":false,"analytics_enabled":false,"runtime_metrics_enabled":false,"integrations_loaded":"mongo@2.13.1","vm":"ruby-2.6.5","partial_flushing_enabled":false,"priority_sampling_enabled":false,"health_metrics_enabled":false,"integration_mongo_analytics_enabled":"false","integration_mongo_analytics_sample_rate":"1.0","integration_mongo_enabled":"true","integration_mongo_service_name":"mongo","integration_mongo_quantize":"{:show=\u003e[:collection, :database, :operation]}"}
[BUG] Segmentation fault at 0x0000000000000040
ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-darwin19]
-- Crash Report log information --------------------------------------------
See Crash Report log file under the one of following:
* ~/Library/Logs/DiagnosticReports
* /Library/Logs/DiagnosticReports
for more details.
Don't forget to include the above Crash Report log file in bug reports.
-- Machine register context ------------------------------------------------
rax: 0x0000000000000000 rbx: 0x00007fba4a240003 rcx: 0x0000000000000000
rdx: 0x000000010d7f60f0 rdi: 0x00007fba4a240003 rsi: 0x0000000000000000
rbp: 0x000070000955cd20 rsp: 0x000070000955cd20 r8: 0x0000000000000000
r9: 0x00007fba51ef6578 r10: 0x00007fba51ef6570 r11: 0x00007fba51ef6574
r12: 0x00007fba51ef8cb0 r13: 0x000070000955cf28 r14: 0x000070000955cd30
r15: 0x00007fba4e3f4000 rip: 0x000000010d5dec38 rfl: 0x0000000000010206
-- C level backtrace information -------------------------------------------
We've also observed this in live environments, here's a log that includes a stack trace (Debian 9, kernel v4.19.132-1):
PROD-read_only [37] pry(main)> exit
/bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:149: [BUG] Segmentation fault at 0x0000000000000bb0
ruby 2.6.5p114 (2019-10-01 revision 67812) [x86_64-linux]
-- Control frame information -----------------------------------------------
c:0021 p:0010 s:0109 e:000105 METHOD /bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:149
c:0020 p:0033 s:0101 e:000100 METHOD /bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:98
c:0019 p:0025 s:0092 e:000091 METHOD /bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:669
c:0018 p:0290 s:0087 e:000086 METHOD /bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:664
c:0017 p:0048 s:0076 e:000075 METHOD /bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:395
c:0016 p:0076 s:0069 e:000068 METHOD /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/metrics.rb:93
c:0015 p:0009 s:0060 e:000059 BLOCK /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:57 [FINISH]
c:0014 p:---- s:0055 e:000054 CFUNC :each
c:0013 p:0021 s:0051 e:000050 BLOCK /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:57
c:0012 p:0002 s:0048 e:000047 METHOD /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:69
c:0011 p:0027 s:0043 e:000042 METHOD /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:57
c:0010 p:0007 s:0039 e:000038 METHOD /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/runtime_metrics.rb:33
c:0009 p:0010 s:0035 e:000034 BLOCK /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:17
c:0008 p:0009 s:0032 e:000031 BLOCK /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:83 [FINISH]
c:0007 p:---- s:0029 e:000028 CFUNC :loop
c:0006 p:0010 s:0025 e:000024 METHOD /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:80
c:0005 p:0005 s:0021 e:000020 METHOD /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:17
c:0004 p:0012 s:0016 e:000013 BLOCK /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/async.rb:21
c:0003 p:0003 s:0011 e:000010 BLOCK /bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/async.rb:128
c:0002 p:0041 s:0007 e:000006 BLOCK /bundle/ruby/2.6.0/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]
-- Ruby level backtrace information ----------------------------------------
/bundle/ruby/2.6.0/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/async.rb:128:in `block in start_worker'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/async.rb:21:in `block in perform'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:17:in `perform'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:80:in `perform_loop'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:80:in `loop'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:83:in `block in perform_loop'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/loop.rb:17:in `block in perform'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/workers/runtime_metrics.rb:33:in `perform'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:57:in `flush'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:69:in `try_flush'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:57:in `block in flush'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:57:in `each'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/runtime/metrics.rb:57:in `block (2 levels) in flush'
/bundle/ruby/2.6.0/gems/ddtrace-0.41.0/lib/ddtrace/metrics.rb:93:in `gauge'
/bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:395:in `gauge'
/bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:664:in `send_stats'
/bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:669:in `send_stat'
/bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:98:in `write'
/bundle/ruby/2.6.0/gems/dogstatsd-ruby-4.6.0/lib/datadog/statsd.rb:149:in `send_message'
-- Machine register context ------------------------------------------------
RIP: 0x00007f75a840ad81 RBP: 0x00007f75a052d9c0 RSP: 0x00007f75a052d9b8
RAX: 0x0000000000000bb0 RBX: 0x0000000000000ba1 RCX: 0x00000000000000ba
RDX: 0x00007f75a052d9c0 RDI: 0x00007f75820a5500 RSI: 0x00000000000000ba
R8: 0x00000000000003ff R9: 0x0000000000000010 R10: 0x0000000000000000
R11: 0x00007f7590e21eb0 R12: 0x00000feeb1c1802e R13: 0x00007f758e0c5c78
R14: 0x00007f75a052da18 R15: 0x00007f758e0a6710 EFL: 0x0000000000010202
-- C level backtrace information -------------------------------------------
The above examples are both triggered when calling exit
from a ruby repl, after making a mongo query with dd tracing enabled. We've also observed it from our web server processes when they are shutting down. In these cases the backtrace points to puma, but I suspect it's the same root cause since the application code also traces mongo queries and triggers an exit. I'll leave out that log to reduce clutter, but let me know if you think it would be helpful.
We've isolated the cause of the error to version 0.38.0
of ddtrace, specifically this code change: #1091. By starting with 0.41.0
and manually reverting only the above diff, we can no longer produce the error.
Thanks for reading, let me know if you have any ideas on causes/fixes or need any other info.