This repository has been archived by the owner on Apr 15, 2024. It is now read-only.
We use BookKeeper extensively in our project. In general BookKeeper provides good write performance, but we noticed that under heavy load the BookKeeper client may fail with errors such as `BKNotEnoughBookiesException: Not enough non-faulty bookies available`.
As I understand it, this problem may be caused by the lack of throttling between the BookKeeper client (4.8.2) and server (4.9.2): the client can queue up too many requests and thereby overload the server. I reached this conclusion because the `BKNotEnoughBookiesException` is normally preceded by errors like `ERROR o.a.bookkeeper.client.PendingAddOp - Write of ledger entry to quorum failed: LXXX EYYY`, which appear after one of the Bookies has been "disconnected" during the high-load period (e.g., `INFO o.a.b.proto.PerChannelBookieClient - Disconnected from bookie channel` and `WARN o.a.b.c.RackawareEnsemblePlacementPolicyImpl - Failed to find 1 bookies : excludeBookies`).
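To make concrete what I mean by client-side throttling, here is a minimal sketch that caps the number of in-flight asynchronous writes with a semaphore. It assumes the write call can be wrapped as a function returning a `CompletableFuture`; `ThrottledWriter`, `maxOutstanding` and `asyncWrite` are hypothetical names for illustration and are not part of the BookKeeper API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.Function;

// Hypothetical helper (not a BookKeeper class): limits outstanding async writes.
final class ThrottledWriter<T> {
    private final Semaphore permits;
    private final Function<T, CompletableFuture<Void>> asyncWrite;

    ThrottledWriter(int maxOutstanding, Function<T, CompletableFuture<Void>> asyncWrite) {
        this.permits = new Semaphore(maxOutstanding);
        this.asyncWrite = asyncWrite;
    }

    // Blocks the caller while maxOutstanding writes are already in flight.
    CompletableFuture<Void> write(T entry) {
        permits.acquireUninterruptibly();
        CompletableFuture<Void> result;
        try {
            result = asyncWrite.apply(entry);
        } catch (RuntimeException e) {
            // The write was never issued, so return the permit right away.
            permits.release();
            throw e;
        }
        // Release the permit once the write completes, successfully or not.
        return result.whenComplete((ignored, error) -> permits.release());
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

With something like this in front of the ledger writes, the producer slows down to the rate the Bookies can acknowledge instead of queueing requests without bound. Whether that alone would avoid the disconnections we see is an open question.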
I understand that Bookies can be temporarily unresponsive under high load, so my question is: how do we handle this situation? Apparently, the BookKeeper client tags the overloaded Bookies as "faulty" and leaves them that way, right? Is there a way for the BookKeeper client to start using Bookies classified as "faulty" again? I ask because, after inducing high load on a 3-Bookie ensemble and hitting this issue, the Bookies are usable afterwards (they have not permanently crashed). However, the BookKeeper client is left in a state in which some of the Bookies are still tagged as "faulty".
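The behaviour we are hoping for can be sketched as a quarantine that expires. This is only an illustration of the idea under my own assumptions, not a description of how the BookKeeper client tracks faulty Bookies; `QuarantineTracker`, `markFaulty` and `isUsable` are hypothetical names, and the clock is injected so the expiry can be exercised without sleeping.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch: a faulty bookie becomes eligible again after a fixed period.
final class QuarantineTracker {
    private final Map<String, Long> quarantinedUntil = new ConcurrentHashMap<>();
    private final long quarantineMillis;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    QuarantineTracker(long quarantineMillis, LongSupplier clock) {
        this.quarantineMillis = quarantineMillis;
        this.clock = clock;
    }

    // Record that a bookie misbehaved and should be avoided for a while.
    void markFaulty(String bookieId) {
        quarantinedUntil.put(bookieId, clock.getAsLong() + quarantineMillis);
    }

    // A bookie is usable if it was never quarantined or its quarantine has expired.
    boolean isUsable(String bookieId) {
        Long until = quarantinedUntil.get(bookieId);
        if (until == null) {
            return true;
        }
        if (clock.getAsLong() >= until) {
            // Remove only if no newer quarantine has replaced this entry meanwhile.
            quarantinedUntil.remove(bookieId, until);
            return true;
        }
        return false;
    }
}
```

In our case the equivalent of `isUsable` never seems to return `true` again for the Bookies that were disconnected under load, even though they are healthy afterwards.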
PS: I understand that "having more Bookies" could be a workaround, but my question is specifically about what to do when the BookKeeper client quarantines a "faulty" Bookie and we want to use that Bookie again later on.
Original Issue: apache#2277