-
Notifications
You must be signed in to change notification settings - Fork 583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: convert some vasserts to exceptions #22822
storage: convert some vasserts to exceptions #22822
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, seems that we need to be a bit more conservative with our uses of vassert()
versus exceptions/error codes in the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: I'll post a separate fix for the codepath in offsets commit that caused the vassert to fire.
shouldn't we fix this, and leave the asserts in place?
Until we make it significantly more difficult to build empty batches, the burden of proof of ensuring storage applications never write empty batches is too high. Ideally we could point to something in our code that guarantees this to never fire, rather than proof by audit. Short of that, throwing at least lets us debug more easily in a pinch. FWIW in the cluster we saw this fire, all nodes went down and it made chasing this much more difficult. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there are places where empty batches are ok (for better user experience). For example the RPC in question here offset_commit_request, (I think) it makes sense to return success if there is nothing to commit but that decision should probably be done on a case by case basis via code audit. Until then exception instead of assert makes sense to me.
We've seen at least one instance where empty batches are pushed to the log. Rather than bringing down the process, this updates a couple vasserts to throw instead (converted a couple that seemed related to data inputs, and left in a vassert that seems much more problematic related to being on the correct shard).
5868f7b
to
5838337
Compare
Tweaked the test to assert on the error types (for my own curiosity) |
Yea maybe we should make the batch builder return a result<offset, error_code> and make callers handle some empty code. |
ya that sounds like a good idea. |
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/52730#01913901-e324-4a45-8d03-558649d670a4 |
/backport v24.2.x |
/backport v24.1.x |
We've seen at least one instance where empty batches are pushed to the log. Rather than bringing down the process, this updates a couple vasserts to throw instead (converted a couple that seemed related to data inputs, and left in a vassert that seems much more problematic related to being on the correct shard).
Fix for the codepath in offsets commit that caused the vassert to fire is here #22824
Backports Required
Release Notes
Bug Fixes