Description
Issue Description
After upgrading to sentry 5.3 we ran into a situation where the session tracking feature caused problems for the puma server.
In our stack, we integrate the rack_timeout
gem (https://github.com/zombocom/rack-timeout) to prevent requests from running longer than a specific period of time. If the request should be killed, rack_timeout
sends SIGTERM
to the puma
worker and puma
starts the shutdown process.
This leads to puma threads being in an aborting
state, but current requests are still being finished. Later then Sentry::SessionFlusher
logic causes freeze of aborting
process and puma
s worker_shutdown_timeout
is reached. By default, if hangs puma worker for 30 seconds waiting to shutdown.
I doubted this is a sentry-related issue, but it seems that modification of SessionFlusher
solves the whole thing. If it is needed to provide more details which would help, please guide me.
Reproduction Steps
Currently was not possible to reproduce on blank rails application. Probably dependent on stack complexity.
Expected Behavior
Immediate shutdown puma worker, skip session tracking when it is not possible to finish.
Actual Behavior
SessionFlusher
attempts to spawn a new thread, even the current thread is in an aborting
state. This leads to raise of ThreadError (can't alloc thread)
(https://github.com/ruby/ruby/blob/5ae83151b16858083e6443fadc76e7fde0ff1199/thread.c#L895)
From some unclear reason for me, when ThreadError
is raised, it also causes a hangup of the whole shutdown process of the puma worker.
If we skip whole rack_timeout
logic and cause kill of puma worker from controller action, we get following log
# frozen_string_literal: true
class BrokenController < ActionController::API
def fail
Thread.new do
raise StandardError
end.abort_on_exception = true
end
end
source=rack-timeout id=9f46256b-7617-4a36-918c-ed89ad27c189 timeout=60000ms term_on_timeout=1 state=ready
Started GET "/fail" for 127.0.0.1 at 2022-07-20 10:56:47 +0200
source=rack-timeout id=9f46256b-7617-4a36-918c-ed89ad27c189 timeout=60000ms service=1ms term_on_timeout=1 state=active
[Tracing] Discarding <rails.request> transaction </fail> because it's not included in the random sample (sampling rate = 0.2)
(0.7ms) SELECT "schema_migrations"."version" FROM "schema_migrations" ORDER BY "schema_migrations"."version" ASC
[Tracing] Discarding <rack.request> transaction </fail> because it's not included in the random sample (sampling rate = 0.2)
Processing by BrokenController#fail as HTML
#<Thread:0x000000000e60e6a0 /home/khanus/git_old/c2e_backend/app/controllers/broken_controller.rb:6 run> terminated with exception (report_on_exception is true):
/home/khanus/git_old/c2e_backend/app/controllers/broken_controller.rb:7:in `block in fail': StandardError (StandardError)
Killing session flusher
Completed 406 Not Acceptable in 2ms (ActiveRecord: 0.0ms | Allocations: 983)
Shutting down background worker
User excluded error: #<ActionController::MissingExactTemplate: BrokenController#fail is missing a template for request formats: text/html>
ThreadError (can't alloc thread):
sentry-ruby-core (5.3.0) lib/sentry/session_flusher.rb:71:in `new'
sentry-ruby-core (5.3.0) lib/sentry/session_flusher.rb:71:in `ensure_thread'
sentry-ruby-core (5.3.0) lib/sentry/session_flusher.rb:34:in `add_session'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:192:in `end_session'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:201:in `ensure in with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:201:in `with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry-ruby.rb:351:in `with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:17:in `block in call'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:59:in `with_scope'
sentry-ruby-core (5.3.0) lib/sentry-ruby.rb:331:in `with_scope'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:16:in `call'
bullet (7.0.2) lib/bullet/rack.rb:12:in `call'
apollo_upload_server (2.0.5) lib/apollo_upload_server/middleware.rb:11:in `call'
rack (2.2.4) lib/rack/etag.rb:27:in `call'
rack (2.2.4) lib/rack/conditional_get.rb:27:in `call'
rack (2.2.4) lib/rack/head.rb:12:in `call'
activerecord (6.1.6.1) lib/active_record/migration.rb:601:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/callbacks.rb:27:in `block in call'
activesupport (6.1.6.1) lib/active_support/callbacks.rb:98:in `run_callbacks'
actionpack (6.1.6.1) lib/action_dispatch/middleware/callbacks.rb:26:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/actionable_exceptions.rb:18:in `call'
sentry-rails (5.3.0) lib/sentry/rails/rescued_exception_interceptor.rb:12:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/debug_exceptions.rb:29:in `call'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:26:in `block (2 levels) in call'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:199:in `with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry-ruby.rb:351:in `with_session_tracking'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:17:in `block in call'
sentry-ruby-core (5.3.0) lib/sentry/hub.rb:59:in `with_scope'
sentry-ruby-core (5.3.0) lib/sentry-ruby.rb:331:in `with_scope'
sentry-ruby-core (5.3.0) lib/sentry/rack/capture_exceptions.rb:16:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'
railties (6.1.6.1) lib/rails/rack/logger.rb:37:in `call_app'
railties (6.1.6.1) lib/rails/rack/logger.rb:26:in `block in call'
activesupport (6.1.6.1) lib/active_support/tagged_logging.rb:99:in `block in tagged'
activesupport (6.1.6.1) lib/active_support/tagged_logging.rb:37:in `tagged'
activesupport (6.1.6.1) lib/active_support/tagged_logging.rb:99:in `tagged'
railties (6.1.6.1) lib/rails/rack/logger.rb:26:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/remote_ip.rb:81:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/request_id.rb:26:in `call'
rack (2.2.4) lib/rack/runtime.rb:22:in `call'
rack-timeout (0.6.0) lib/rack/timeout/core.rb:151:in `block in call'
rack-timeout (0.6.0) lib/rack/timeout/support/timeout.rb:19:in `timeout'
rack-timeout (0.6.0) lib/rack/timeout/core.rb:150:in `call'
activesupport (6.1.6.1) lib/active_support/cache/strategy/local_cache_middleware.rb:29:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/executor.rb:14:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/static.rb:24:in `call'
rack (2.2.4) lib/rack/sendfile.rb:110:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/debug_locks.rb:41:in `call'
actionpack (6.1.6.1) lib/action_dispatch/middleware/host_authorization.rb:148:in `call'
rack-cors (1.1.1) lib/rack/cors.rb:100:in `call'
railties (6.1.6.1) lib/rails/engine.rb:539:in `call'
puma (5.6.4) lib/puma/configuration.rb:252:in `call'
puma (5.6.4) lib/puma/request.rb:77:in `block in handle_request'
puma (5.6.4) lib/puma/thread_pool.rb:340:in `with_force_shutdown'
puma (5.6.4) lib/puma/request.rb:76:in `handle_request'
puma (5.6.4) lib/puma/server.rb:441:in `process_client'
puma (5.6.4) lib/puma/thread_pool.rb:147:in `block in spawn_thread'
source=rack-timeout id=9f46256b-7617-4a36-918c-ed89ad27c189 timeout=60000ms service=80ms term_on_timeout=1 state=completed
At the moment we disabled automatic session tracking. However, following change to Sentry::SessionFlusher
eliminates unexpected behavior.
def ensure_thread
return if @thread&.alive?
return if Thread.current.status == "aborting"
@thread = Thread.new do
loop do
sleep(FLUSH_INTERVAL)
flush
end
end
rescue ThreadError => e # capture exception due to potential possibility of terminating current thread between state check and child thread spawn.
nil
end
Ruby Version
3.0.2
SDK Version
5.3.0
Integration and Its Version
Rails 6.1.4, Rack 2.2.4
Sentry Config
Sentry.init do |config|
config.dsn = "dummy://12345:67890@sentry.localdomain/sentry/42"
config.background_worker_threads = Rails.env.test? ? 0 : 5
config.sample_rate = 1
config.traces_sample_rate = 0.2
config.auto_session_tracking = true
config.send_default_pii = true
filter = ActiveSupport::ParameterFilter.new(Rails.application.config.filter_parameters)
config.before_send = ->(event, _hint) { filter.filter(event.to_hash) }
end
Activity