-
Notifications
You must be signed in to change notification settings - Fork 96
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow rdkafka versions newer than 0.15.0 #388
Conversation
@Thomascountz @bquorning @razumau i could use 👀 on this please |
No breaking changes that would affect racecar in this release (rdkafka maintainer here). |
@mensfeld How might we go about getting this into a release? |
I don't know and really don't care as I maintain aside from rdkafka (the gem you were updating) another kafka framework: https://github.com/karafka/karafka/ I am not a maintainer of racecar and just giving a friendly heads-up that I did not break any APIs in the release you upgraded to |
c0ae82f
to
9b149ae
Compare
Some of the tests are failing with |
9b149ae
to
f73e7d1
Compare
@bquorning I already pushed a fix for this |
we have a new test failure after latest run, looks like cooperative-sticky assignment during a rebalance allows healthy consumers to keep processing their paritions is having issues. is it related to karafka/rdkafka-ruby#493? @mensfeld @bquorning |
can you explain it better? This patch was needed to prevent cooperative sticky from hanging all processing endlessly and I had no problems with it. Exactly that code was later on merged to librdkafka. |
rdkafka 0.18.0 uses librdkafka 2.5.0. It was fixed in 2.6.0:
Aside from that there were few other fixes. If you show me the repro I can relate to that. |
@@ -77,7 +77,7 @@ def deliver! | |||
# The raising can be avoided if max_wait_timeout below is greater than | |||
# config.message_timeout, but config is not available here (without | |||
# changing the interface). | |||
handle.wait(max_wait_timeout: 60, wait_timeout: 0.1) | |||
handle.wait(max_wait_timeout: 60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment (line 71) is wrong now.
Also, there is another use of wait_time
in spec/runner_spec.rb
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Deleted wait_time
in spec/runner_spec.rb and also updated the comment in customer.rb. Let me know what the comment should be if the updated one doesn't make sense
The integration test doesn't work here: cooperative_sticky_assignment_spec.rb and cooperative_sticky_assignment_spec.rb |
@hoangshopify it seems to fail because of some segfaults. It seems like mishandling of rdkafka in racecar. Beyond scope of my support sorry 🙏 |
These segmentation faults turned out to be a red herring: they happen when we shut down consumers and don’t affect test results. This is something we should investigate separately. The real reason why integration tests are failing is that on newer versions of rdkafka we process messages too fast. Wait what? The failing spec is this sticky assignment test:
In this test, we create two consumers, generate 20 messages, wait for a bit, then shut down one of the consumers and check that there are revocation events for this consumer and not for the other one. That “wait a bit” actually means “start yet another consumer and go through five messages”. The reason why this spec fails is that by the time this bonus consumer goes through its five messages, the main consumers complete all 20 messages, so when we shut down one of them, there are no revocations. I am a bit confused by why this happens. I bisected rdkafka changes and found a commit after which the test starts to fail. But this was a change in producer setup, and the time it takes to publish these message hasn’t changed (according to my measurements). The time it takes for producers to finish their 20 messages has decreased, and that’s why the test fails. Regardless of more specific reasons, I think that changing the overall number of messages we send in this spec is a good enough fix. @hoangshopify you can cherry-pick this commit or make the same change yourself. |
60ded23
to
89b48aa
Compare
@razumau I really appreciate your investigation and detailed writeup, also thanks @razumau @bquorning so much to take that long to look at and support this PR! @mensfeld Thank you for your input for rdkafka |
What
This PR completes a housekeeping item: