-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PWX-21175: handle kernel crash in fastpath #229
Conversation
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please add a description for the issue being fixed.
mutex_unlock(&pxd_dev->disk->queue->sysfs_lock); | ||
mutex_unlock(&pxd_dev->disk->queue->sysfs_lock); | ||
#else | ||
blk_set_queue_dying(pxd_dev->disk->queue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does QUEUE_FLAG_SET() not work for all cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may. There are additional actions done while marking a device queue as removing, please look inside blk_set_queue_dying. It would be safe to use a direct exported API instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right but the reason we were not using blk_set_queue_dying() is because it's not available on all kernels. Please check if PX_BLKMQ covers that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PX_BLKMQ is defined from kernel 4.18+ or if el8 is defined. And in vanilla kernel is available from 3.19. We should be safe with this change. Will look out for any compilation issues after merge.
BUG_ON("unexpected condition"); | ||
} | ||
#endif | ||
BUG_ON(!rq_is_special(rq)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rq_is_special() only checks for discard, Based on the below it should also check for WRITE_SAME and zeroout if there's an op for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. px device only supports read/write and discard, write same and write zeros are not supported on px device.
The below logic converts the received discard to map to the backing device (btrfs file or dmthin volume).
It may convert discard internally to write same or write zero if discard is not available.
Please see detailed inline comments on the first PR #227 |
mutex_unlock(&pxd_dev->disk->queue->sysfs_lock); | ||
mutex_unlock(&pxd_dev->disk->queue->sysfs_lock); | ||
#else | ||
blk_set_queue_dying(pxd_dev->disk->queue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right but the reason we were not using blk_set_queue_dying() is because it's not available on all kernels. Please check if PX_BLKMQ covers that.
rq_for_each_segment(bv, rq, rq_iter) nr_bvec++; | ||
if (!specialops) | ||
rq_for_each_segment(bv, rq, rq_iter) nr_bvec++; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine. But we need a take a closer look at it.
Since the compilation doesn't fail a kernel upgrade on an existing px node will result in a kernel panic and likely in a loop. We either not use apis like this or have a way of sanity checking the arguments before the call.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code cannot be future proof, since our driver is outside tree. I think the only focus we can make is issue these api calls only on a need basis and not assume they will behave the same otherwise. Like in this example, if this is a discard request, we do know there aren't any iovec carried, so skipping the call, but for read/writes this has to exist.
Yes exactly the latter from your argument, check if discard, then avoid making the call.
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan lns@portworx.com
What this PR does / why we need it:
Which issue(s) this PR fixes (optional)
Closes # https://portworx.atlassian.net/browse/PWX-21175
Special notes for your reviewer:
PASSED last 6 consecutive runs in the automation test.