Add a check to ensure broker connection is ready during poll if auto commit is disabled#2672
Add a check to ensure broker connection is ready during poll if auto commit is disabled#2672TribuneX wants to merge 1 commit intodpkp:masterfrom
Conversation
|
@dpkp Waiting for your feedback on this. We would be happy to integrate this soon in order to fix our issue. |
|
To confirm, we have added connection liveness check and it fixed all the issues - with both CPU and wired UnknownMembeID exceptions |
|
Any updates @dpkp since we would love to integrate this into an official release? |
|
The design and implementation of the kafka consumer is intended to mirror the upstream java consumer. If this is a bug in the java implementation you should file a bug report there. If not (my assumption), then we need to understand whether there is a bug in the python re-implementation. Fixes like this are generally not accepted because they introduce artificial divergence and interfere with keeping the library in sync with upstream changes. My quick review of the code suggests the bug here is that the java client uses a non-zero poll timeout after sending new fetches, while kafka-python always uses 0 timeout. If there is no connection, no sends can be sent and this likely causes a busy poll loop. |
|
So is there a plan to fix it somehow like change that non-zero timeout? |
|
After looking at this more closely I think the best fix is #2695 |
Potentially fixes #2667
If
enable_auto_commit=Falsethe CPU usage of a kafka consumer increases to 100% if the broker connection is lost. Ifenable_auto_commit=True, the connection loss is detected by the periodic auto commit, which acts as a sort of health check in this case.I added a similar check to the
pollmethod. I am not sure if this the best location, thats why I also did not add any test yet.We might also only check the readiness in case
enabled_auto_commit=False.Looking forward to your feedback.