-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PWX-18756 fix race in restart and new req processing #190
Conversation
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
This reverts commit 033aeed.
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
@prabirpaul or @maxkozlovsky can you please review this ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syncrhonize_rcu() is necessary. Otherwise the thread can check the values of connected/!allow_disconnected, find them false and proceed queueing request. synchronize_rcu() makes sure that all operations that found disconnected false will finish before synchronize_rcu() returns to the caller. READ_ONCE() is not really necessary here as far as I can tell, even if they change while being read it is only can go towards disconnected state. d do READ_ONCE() is necessary with a rcu protected pointer value to avoid operating on two different pointers.
The race seems to be calling fuse_end_queued_requests() while the allocation is in progress so they end up operating on the same request. Just moving the check for disconnected before allocation should be enough.
pxd.c
Outdated
@@ -2097,6 +2101,7 @@ static int pxd_control_open(struct inode *inode, struct file *file) | |||
file->private_data = fc; | |||
|
|||
pxdctx_set_connected(ctx, true); | |||
wmb(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memory barrier is redundant here since spin_unlock just happened which does memory barrier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allow_disconnected is outside lock. And this path is called only once. wmb() ensures values get committed and sure visible after this call.
pxd.c
Outdated
@@ -2120,6 +2125,7 @@ static int pxd_control_release(struct inode *inode, struct file *file) | |||
pxd_printk("%s: not opened\n", __func__); | |||
} else { | |||
ctx->fc.connected = 0; | |||
wmb(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a barrier inside schedule_delayed_work this is redundant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wmb even if redundant assures of the modified value.
pxd.c
Outdated
/* Let other threads see the value of allow_disconnected. */ | ||
synchronize_rcu(); | ||
wmb(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wmb by itself does not do much.
synchronize_rcu() has a stronger memory barrier guarantee. By the time synchronize_rcu() returns all threads including this one has done a memory barrier and will see the updated value of allow_disconnected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It did not happen. Earlier the usage was with synchronize_rcu() and read_[lock|unlock]_rcu. It did not ensure updated values get reflected in the IO enqueue path and that was the reason new requests got enqueued while abort handling was running. Also access to variables without READ/WRITE_ONCE macros allows those access to be reordered during execution.
In any case, with the posted changes the original issue of invalid index and assertion is not seen anymore.
How could this race be open if the values of allow_disconnected is consistent? This is precisely addressed by wmb(). |
pxd.c
Outdated
@@ -993,7 +994,8 @@ static void pxd_rq_fn(struct request_queue *q) | |||
break; | |||
|
|||
/* Filter out block requests we don't understand. */ | |||
if (BLK_RQ_IS_PASSTHROUGH(rq)) { | |||
if (BLK_RQ_IS_PASSTHROUGH(rq) || | |||
(!fc->allow_disconnected && !fc->connected)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
early return based on allow_disconnected and connected state. This ensures no allocation and immediate freeing of requests happen later.
pxd.c
Outdated
|
||
if (BLK_RQ_IS_PASSTHROUGH(rq)) | ||
if (BLK_RQ_IS_PASSTHROUGH(rq) || | ||
(!fc->allow_disconnected && !fc->connected)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
early return based on allow_disconnected and connected state. This ensures no allocation and immediate freeing of requests happen later.
use READ/WRITE_ONCE macro to work with connected/allow_disconnected var Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue not seen even with posted changes.
/* | ||
* Ensures checking the value of allow_disconnected and adding request to | ||
* queue is done atomically. | ||
*/ | ||
rcu_read_lock(); | ||
|
||
if (force) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only allow queueing request to userspace if either of connected or allow_disconnected is true, even for fastpath requests. So removed force option.
{ | ||
req->in.unique = fuse_get_unique(fc); | ||
fc->request_map[req->in.unique & (FUSE_MAX_REQUEST_IDS - 1)] = req; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified queue_request to also assign unique ids. queue_request should get called only if userspace enqueue conditions are met.
dev.c
Outdated
} | ||
rcu_read_unlock(); | ||
} else if (fc->connected || fc->allow_disconnected) { | ||
if (READ_ONCE(fc->connected) || READ_ONCE(fc->allow_disconnected)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
modified all connected and allow_disconnected to use READ/WRITE_ONCE macros so no compiler optimizations/reordering happens.
@@ -222,24 +231,15 @@ static void request_end(struct fuse_conn *fc, struct fuse_req *req, | |||
if (shouldfree) fuse_request_free(req); | |||
} | |||
|
|||
void fuse_request_send_nowait(struct fuse_conn *fc, struct fuse_req *req, bool force) | |||
void fuse_request_send_nowait(struct fuse_conn *fc, struct fuse_req *req) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only allow queueing request to userspace if either of connected or allow_disconnected is true, even for fastpath requests. So removed force option.
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
@@ -141,7 +141,13 @@ static void fuse_put_unique(struct fuse_conn *fc, u64 uid) | |||
{ | |||
struct fuse_per_cpu_ids *my_ids; | |||
int num_free; | |||
int cpu = get_cpu(); | |||
int cpu; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possible to call put without get unique id, handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok,
note though that READ_ONCE()
does not actually inhibit reordering. READ_ONCE()
is equivalent to
std::atomic_load(&val, std::memory_order_relaxed);
Any other loads and stores can be reordered around it. It does guarantee that the value will be loaded once and only once. If READ_ONCE()
happens in a loop, it would be done once per loop iteration.
pxd.c
Outdated
|
||
if (BLK_RQ_IS_PASSTHROUGH(rq)) | ||
if (BLK_RQ_IS_PASSTHROUGH(rq) || | ||
(!READ_ONCE(fc->allow_disconnected) && !READ_ONCE(fc->connected))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the duplicate check necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fc->connected can be killed, checking just for allow_disconnected is sufficient.
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
introducing READ_ONCE() ensures that these variables have concurrent updates and values do not get cached before the point of access. A necessity for rcu critical section to work, without which compiler can cache/reorder access of vars as needed. |
* address race between restart and new req processing Signed-off-by: Lakshmi Narasimhan Sundararajan <lns@portworx.com>
Issue:
Stuck requests from block layer but fuse has none.
Assertion with invalid uid seen.
JIRA:
https://portworx.atlassian.net/browse/PWX-18756
Root cause:
1/ rcu usage incorrect, code accessing protected variables naked causing stale values to be read.
fc->connected and fc->allow_disconnected though protected with rcu read and sync calls (not all places),
were never protected against compiler optimization by accessing them WRITE_ONCE(), READ_ONCE() macros.
There are many places where updates to these happen, hence removed them altogether and switched to use memory barriers instead.
2/ when new requests get submitted in disconnected state, uids are still allocated, request initialized and freed again.
This can all be avoided by early return.
NOTES: