
Revert "Guard against subscriber exceptions in ActiveSupport::Notifications" #43561


Conversation

casperisfine
Contributor

Reverts #43282

@theojulienne @matthewd @rafaelfranca @jhawthorn I'm afraid this PR breaks a ton of code and, more importantly, breaks the legitimate use case of raising from a subscriber to interrupt an action.

Could we somehow split the finish in two, so that we first complete all the events, and then call all the subscribers?

Alternatively as a short term solution I suppose we can re-raise the first exception we got? It's a bit dirty, but would work as a stopgap?

@jhawthorn
Member

I'm afraid this PR breaks a ton of code and, more importantly, breaks the legitimate use case of raising from a subscriber to interrupt an action.

Do you have an example of these cases? I'd think raising to interrupt an action should still work even if the exception is wrapped.

@casperisfine
Contributor Author

Yeah, we have some test helpers like:

  def assert_no_redis_commands(&block)
    callback = -> (_name, _start, _finish, _message_id, values) do
      raise(Minitest::Assertion, "Unexpected call to a Redis method: \n #{values[:commands]}")
    end
    ActiveSupport::Notifications.subscribed(callback, "command.shopify_metrics_redis", &block)
  end

This is just one example, but the exception class changing breaks various things.
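To make the breakage concrete, here is a minimal sketch in plain Ruby (no Rails required) of why wrapping changes behavior. `InstrumentationError` stands in for the wrapper class introduced by rails#43282, and the method names are illustrative, not the ActiveSupport internals:

```ruby
# Hypothetical wrapper class standing in for the one added in rails#43282.
class InstrumentationError < StandardError; end
# Stands in for an exception a subscriber raises deliberately (e.g. a
# Minitest::Assertion from the helper above).
class UnexpectedCommandError < StandardError; end

# Before #43282: the subscriber's exception reaches the caller untouched.
def instrument_unwrapped
  raise UnexpectedCommandError, "unexpected Redis call"
end

# After #43282: the notifier rescues and re-raises under its own class.
def instrument_wrapped
  instrument_unwrapped
rescue => e
  raise InstrumentationError, "subscriber raised: #{e.message}"
end

# Simulates a caller whose flow control depends on rescuing the
# subscriber's original exception class.
def caught_by_caller?(method_name)
  send(method_name)
rescue UnexpectedCommandError
  true                      # the rescue the calling code relies on
rescue InstrumentationError
  false                     # wrapped: the caller's rescue no longer matches
end
```

With wrapping in place, `caught_by_caller?(:instrument_wrapped)` falls through to the second rescue: the original `rescue UnexpectedCommandError` never fires.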

@matthewd
Member

Yeah, I've incidentally encountered this exception-wrapping and found it made things more difficult to reason about.

Calling all subscribers, then re-raising the first exception, seems like it might be the least-bad option to me. We'll go from aborting iteration immediately to at least giving each subscriber a chance to handle it... but would swallow any exceptions arising from the calls we historically would've skipped.

The only alternative that comes to mind would be to re-raise a single exception untouched, but raise a new MultipleErrors exception (with cause correctly connected to the first exception, as well as an accessor to get the full set) when more than one thing has failed.

Advantage: we don't change the raised exception for otherwise-working subscribers with one [deliberate or otherwise] exception; we still communicate every failure to the caller, and if they have a workflow that depends upon raise-and-rescue, they can "upgrade" to rescue both MyException and .any?(MyException).

Disadvantage: sometimes the raised exception will still hide the full backtrace, and a failed assertion that also caused some other callback to fail would still raise the wrong thing; a failure in callback B can change what exception the caller sees from a raise in A (though I guess that's already differently-true if B runs before A).
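A hedged plain-Ruby sketch of that idea: run every subscriber, re-raise a single failure untouched, and otherwise raise a combined exception whose `cause` is the first failure. `MultipleErrors` and `run_subscribers` are illustrative names, not the eventual Rails API:

```ruby
# Hypothetical combined exception; exposes the full set of failures.
class MultipleErrors < StandardError
  attr_reader :errors

  def initialize(errors)
    @errors = errors
    super("#{errors.size} subscribers raised")
  end
end

# Call each subscriber, collecting rather than propagating failures,
# so every subscriber gets a chance to run.
def run_subscribers(subscribers)
  errors = []
  subscribers.each do |s|
    begin
      s.call
    rescue => e
      errors << e
    end
  end
  return if errors.empty?
  raise errors.first if errors.size == 1  # class unchanged in the common case
  begin
    raise errors.first                    # re-raise so `cause` links to the
  rescue                                  # first failure on the combined error
    raise MultipleErrors.new(errors)
  end
end
```

The `raise`/`rescue`/`raise` dance at the end is just one way to get Ruby to wire `Exception#cause` to the first failure automatically.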

@casperisfine
Contributor Author

Calling all subscribers, then re-raising the first exception, seems like it might be the least-bad option to me

As a quick fix yes, but I think ultimately the proper behavior would be to finish all events first, and then call the callbacks in order without rescuing anything.
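That two-phase behavior could be sketched like this in plain Ruby; `Event` and `finish_all` are illustrative names, not the ActiveSupport internals:

```ruby
# Toy stand-in for an in-flight instrumentation event.
Event = Struct.new(:name, :finished) do
  def finish!
    self.finished = true
  end
end

# Phase 1: complete every event, so completions stay balanced no matter
# what the subscribers do. Phase 2: call subscribers in order without
# rescuing, so a deliberate raise propagates with its original class.
def finish_all(events, subscribers)
  events.each(&:finish!)
  subscribers.each(&:call)
end
```

A raising subscriber still interrupts the action, but can no longer leave an event unfinished.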

@casperisfine
Contributor Author

What about Shopify@e053a86? It's simple and solves the immediate problem.

@matthewd
Member

matthewd commented Nov 1, 2021

While that ensures we internally keep completions balanced, I think it still leaves subscribers potentially seeing a start without a finish (right?), which seems generally undesirable.

To me it makes sense that we'd stop processing immediately if a start raises, but later, it feels reasonable to expect that anything which got started will also be finished -- you're past the point of legitimately raising to stop processing. (Does this imply we should finish anything that's already been started if we do hit an otherwise-aborting raise during the start phase? 🤔)

@jhawthorn
Member

The only alternative that comes to mind would be to re-raise a single exception untouched, but raise a new MultipleErrors exception (with cause correctly connected to the first exception, as well as an accessor to get the full set) when more than one thing has failed.

I think I like this one. Any rescues or similar not matching seems like a feature if there are multiple exceptions (because otherwise it's arbitrary).

Side-note: I'd love to see us move away from the start/finish pattern and towards the publish_event API Jean introduced for background jobs. In addition to avoiding this problem it's consistent with background jobs and I believe (eventually, with improvements) it will be faster since it only has to iterate through subscribers once. (This is the path I've started down in #43390)

@rafaelfranca
Member

What is the difference between publish_event and publish?

@jhawthorn
Member

jhawthorn commented Nov 1, 2021

What is the difference between publish_event and publish?

They're similar, but publish behaves inconsistently (events are dropped for subscribers which don't support publish, like EventObject; it technically supports arbitrary arguments) and is limited to passing all the arguments. I like publish_event (introduced in #41590) because we deal with an EventObject, so we have monotonic time, allocations, and any other data we want in that structure, and passing around the same object should be fast.
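The single-pass idea can be sketched in plain Ruby: build one event object carrying all the data up front, then hand that same object to every subscriber in one iteration, instead of separate start/finish passes. `ToyEvent` and its fields are illustrative, not the `ActiveSupport::Notifications::Event` API:

```ruby
# Toy event object: one structure holding everything subscribers need.
ToyEvent = Struct.new(:name, :payload, :duration_ms, :allocations)

# One pass over the subscribers, passing the same object each time;
# no per-subscriber state to pair up between a start and a finish.
def publish_event(event, subscribers)
  subscribers.each { |s| s.call(event) }
end
```

Because there is only one iteration, the "start raised but finish never ran" class of problem discussed above cannot arise.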

@rafaelfranca
Member

Should we remove publish then? Their very similar names don't help users know when to use one over the other, and everything you described as different is mostly implementation detail. Is there any reason publish can't be removed, or made to work the same way as publish_event, other than backward compatibility?

@casperisfine
Contributor Author

I think it still leaves subscribers potentially seeing a start without a finish (right?), which seems generally undesirable.

I guess some subscribers might leak state or things like that if their finish is never called.

alternative that comes to mind would be to re-raise a single exception untouched, but raise a new MultipleErrors exception (with cause correctly connected to the first exception, as well as an accessor to get the full set) when more than one thing has failed.

I suppose the multiple exception scenario is rare enough that we can indeed "optimize" it out.

I'll open a PR for that.

@casperisfine
Contributor Author

Closing in favor of #43581

@casperisfine casperisfine deleted the revert-43282-guard-against-instrumentation-exceptions branch November 2, 2021 11:21
casperisfine pushed a commit to Shopify/rails that referenced this pull request Nov 2, 2021
Ref: rails#43282
Ref: rails#43561

It can be legitimate for a subscriber to want to bubble up an exception
to the caller, so wrapping it changes the exception class, which might
break calling code that expects a specific error.

We can minimize this by only using InstrumentationSubscriberError
when more than one subscriber raised.