Suppressing errors when handling retries #3835

Open
liaden opened this issue Aug 8, 2024 · 4 comments
Labels
community Was opened by a community member feature-request A request for a new feature or change to an existing one

Comments

liaden commented Aug 8, 2024

Problem:

We are using RestClient to make a request to an external API which is raising RestClient::ServerBrokeConnection, which we plan on addressing using Ruby's retry since the majority of requests are succeeding; however, I do not see

Given that Datadog's RestClient patch is going to associate the error with the span and finish the span before my app code can retry, I'm curious how I can suppress these errors when I am retrying after the error.
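For illustration, the kind of application-level retry described above might look like the following minimal sketch. The `with_retries` helper and the local `ServerBrokeConnection` class are hypothetical: the class stands in for `RestClient::ServerBrokeConnection` so the example runs without the rest-client gem, and the simulated block stands in for a call such as `RestClient::Request.execute`.

```ruby
# Hypothetical stand-in for RestClient::ServerBrokeConnection, so the sketch
# runs without the rest-client gem installed.
class ServerBrokeConnection < StandardError; end

# Hypothetical helper: runs the block, retrying on the listed error classes
# until max_attempts is reached, then re-raises the last error.
def with_retries(max_attempts: 3, on: [ServerBrokeConnection])
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *on
    retry if attempts < max_attempts
    raise
  end
end

# Simulated flaky call: fails twice, then succeeds on the third attempt.
# In the real app this block would wrap the RestClient request.
response = with_retries(max_attempts: 3) do |attempt|
  raise ServerBrokeConnection, "server broke connection" if attempt < 3
  "200 OK"
end
```

Note that every failed attempt still raises from inside the instrumented call before the `rescue` runs, which is why each attempt's span ends up carrying the error.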

Describe the goal of the feature

Guidance and/or functionality for handling error retries within an application when the retry crosses the Datadog patching boundary, ideally in a way that applies to other integrations as well.

Additionally, it'd be nice to have some insight into these transient errors, or retries.

Describe alternatives you've considered
Switching from RestClient to Faraday and using the retry middleware, in the hope that the retries would then be handled within Datadog's error tracking of the span.

@liaden liaden added community Was opened by a community member feature-request A request for a new feature or change to an existing one labels Aug 8, 2024
liaden (Author) commented Aug 8, 2024

Created this as a new issue instead of commenting on #3820 since the perspectives are different (patching Redis, which does its own retries, vs. the app handling retries outside of Datadog and the library).

marcotc (Member) commented Aug 8, 2024

Hey @liaden! 👋

> We are using RestClient to make a request to an external API which is raising RestClient::ServerBrokeConnection, which we plan on addressing using Ruby's retry since the majority of requests are succeeding; however, I do not see

Because you are creating a custom wrapper to perform the retrying, I recommend creating a span that represents your wrapper. This way you'll have a custom span representing the specific operation that you created.

> Given that datadog's RestClient patch is going to associate the error with the span and finish the span before my app code can retry, I'm curious how I can suppress these errors when I am retrying the error.

If you create a span to represent your wrapper, the RestClient requests will be encapsulated in a single parent span, so you won't have issues with Datadog spans finishing before your code runs. The errors will also not be propagated, given your wrapper will capture them, so there's no concern with error reporting.
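A rough sketch of such a wrapper span, under stated assumptions: the tracer is injected so the example runs without the datadog or rest-client gems, and `FakeTracer` and `FakeSpan` are stand-ins that mimic only the block form of `Datadog::Tracing.trace` and `set_tag` on the yielded span. The span name `vendor_api.request_with_retries`, the `retry.attempts` tag, and the `request_with_retries` helper are all made up for illustration.

```ruby
# Stand-ins so the sketch runs without the datadog and rest-client gems.
# FakeTracer mimics only the block form of Datadog::Tracing.trace, and
# FakeSpan only set_tag; both are assumptions for illustration.
class FakeSpan
  attr_reader :name, :tags

  def initialize(name)
    @name = name
    @tags = {}
  end

  def set_tag(key, value)
    @tags[key] = value
  end
end

class FakeTracer
  attr_reader :spans

  def initialize
    @spans = []
  end

  def trace(name)
    span = FakeSpan.new(name)
    @spans << span
    yield span
  end
end

# Hypothetical stand-in for RestClient::ServerBrokeConnection.
class ServerBrokeConnection < StandardError; end

# Hypothetical wrapper: a single parent span surrounds every attempt, and the
# attempt count is recorded as a tag so retries remain visible.
def request_with_retries(tracer:, max_attempts: 3)
  tracer.trace("vendor_api.request_with_retries") do |span|
    attempts = 0
    begin
      attempts += 1
      span.set_tag("retry.attempts", attempts)
      yield
    rescue ServerBrokeConnection
      retry if attempts < max_attempts
      raise
    end
  end
end

# Simulated request that fails once and then succeeds.
tracer = FakeTracer.new
calls = 0
body = request_with_retries(tracer: tracer) do
  calls += 1
  raise ServerBrokeConnection, "server broke connection" if calls < 2
  "payload"
end
```

In a real application one would presumably pass `tracer: Datadog::Tracing` and make the RestClient call inside the block, so that the integration's request spans become children of the wrapper span. An error that is rescued and retried successfully never leaves the wrapper, while the tag keeps some record of how many attempts were needed.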

Please let me know if I'm misunderstanding your setup.

liaden (Author) commented Aug 15, 2024

@marcotc I don't think I understand fully what you are proposing, and that may be because I am missing/misunderstanding some of the capabilities of Datadog?

If I wrap the RestClient::Request.execute that my code is doing with my own span, that span will be associated with my sidekiq.job operation and not the rest_client.request operation that is attached to the api.external-vendor.com service where I was looking at the error.

Is there something special about manually creating a span that wraps only one other span?

marcotc (Member) commented Aug 21, 2024

🤔 Maybe I misunderstand your current scenario.

> which we plan on addressing using ruby's retry since the majority of requests are succeeding

What would your retry code look like? More specifically, how would it interact with the RestClient calls?

My suggestion from the earlier comment is based on the fact that errors in the Datadog Error Tracking product only count if they bubble up all the way to the top span of a trace. Errors in internal spans that get rescued do not count for error tracking. Those spans will still be marked as errors, because that's an accurate representation of the Ruby process control flow, but they will not trigger Datadog Error Tracking.
