-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More robust gRPC connections, using tenacity
where possible
#521
Conversation
tenacity
where possible
@MaxiBoether Not fully what the problem with partial result states is, but have you considered using retry contexts? |
Ah, nope. I guess I can refactor the code to retry contexts then |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #521 +/- ##
==========================================
- Coverage 82.55% 82.50% -0.05%
==========================================
Files 214 214
Lines 9975 10039 +64
==========================================
+ Hits 8235 8283 +48
- Misses 1740 1756 +16 ☔ View full report in Codecov by Sentry. |
Thanks for suggesting that. I adapted the code. However, I needed to manually catch and reraise exceptions sometimes because I find their callback system to be quite ugly in object-oriented contexts (we need to write a static function that recovers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the refactoring! Looks good now :)
Sometimes, we face outages/random disconnections during training. This fixes it in places where I encountered it last night. I tried to integrate
tenacity
as suggested by @robinholzi, but it's not always possible since the retry logic involves keeping track of already done work, which I don't want to put into class statePart 6/n of porting over SIGMOD changes.