| Message ID | AA2DB53B-DFC7-4B88-9515-E4C9AFA6435D@gmail.com (mailing list archive) |
|---|---|
| State | New, archived |
| Delegated to: | Eric Van Hensbergen |
| Series | 9p/trans_fd: avoid sending req to a cancelled conn |
Sishuai Gong wrote on Tue, Aug 08, 2023 at 12:44:31PM -0400:
> When a connection is cancelled by p9_conn_cancel(), all requests on it
> should be cancelled---mark req->status as REQ_STATUS_ERROR. However,
> because of a race over m->err between p9_conn_cancel() and p9_fd_request(),
> p9_fd_request might see the old value of m->err, think that the connection
> is NOT cancelled, and then add new requests to this cancelled connection.
>
> Fix this issue by lock-protecting the check on m->err.

Sorry for the (long) delay in reviewing this, and thanks for the patch!

I've got somewhat mixed feelings on this, as there are other obviously
unlocked m->err accesses in the work functions and we don't really want
to take the req lock there every time. These should be ok in the normal
p9_conn_destroy path, which disarms poll and cancels work before calling
p9_conn_cancel, but the other calls to p9_conn_cancel are similarly racy:
for example, a cancel initiated by the write end could still process a
new request in p9_read_work after the reqs have been drained out...

On the other hand, this is better than the current situation, so I think
we should take it for now until someone has time to do better; it's just
a leak in the unlikely case this race is hit, and we might as well close
part of the problem.

I don't see how this would break anything anyway, so I'll push it to my
next branch tomorrow after testing the rest of my patches.

> Signed-off-by: Sishuai Gong <sishuai.system@gmail.com>
> ---
>  net/9p/trans_fd.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> index 00b684616e8d..e43a850f5190 100644
> --- a/net/9p/trans_fd.c
> +++ b/net/9p/trans_fd.c
> @@ -671,10 +671,14 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
>
>  	p9_debug(P9_DEBUG_TRANS, "mux %p task %p tcall %p id %d\n",
>  		 m, current, &req->tc, req->tc.id);
> -	if (m->err < 0)
> -		return m->err;
>
>  	spin_lock(&m->req_lock);
> +
> +	if (m->err < 0) {
> +		spin_unlock(&m->req_lock);
> +		return m->err;
> +	}
> +
>  	WRITE_ONCE(req->status, REQ_STATUS_UNSENT);
>  	list_add_tail(&req->req_list, &m->unsent_req_list);
>  	spin_unlock(&m->req_lock);
> --
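For context, the unlocked m->err accesses mentioned in the review follow roughly the pattern sketched below. This paraphrases the read-work path of net/9p/trans_fd.c from memory, so field names and surrounding details may differ from the actual upstream code:

```c
/*
 * Paraphrased sketch of the fd transport's read worker, not the exact
 * upstream code: m->err is tested without m->req_lock, so a concurrent
 * p9_conn_cancel() can set the error and drain the request lists right
 * after the check.
 */
static void p9_read_work(struct work_struct *work)
{
	struct p9_conn *m = container_of(work, struct p9_conn, rq);

	if (m->err < 0)		/* unlocked read of m->err */
		return;

	/*
	 * If a cancel initiated from the write side lands here, this
	 * worker can still pull in a reply and touch m->req_list even
	 * though the connection is already marked dead -- the remaining
	 * race the review describes.
	 */
	/* ... receive a header, look up the request, complete it ... */
}
```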
On Monday, October 23, 2023 2:54:53 PM CEST asmadeus@codewreck.org wrote:
> Sishuai Gong wrote on Tue, Aug 08, 2023 at 12:44:31PM -0400:
> > When a connection is cancelled by p9_conn_cancel(), all requests on it
> > should be cancelled---mark req->status as REQ_STATUS_ERROR. However,
> > because of a race over m->err between p9_conn_cancel() and p9_fd_request(),
> > p9_fd_request might see the old value of m->err, think that the connection
> > is NOT cancelled, and then add new requests to this cancelled connection.
> >
> > Fix this issue by lock-protecting the check on m->err.
>
> Sorry for the (long) delay in reviewing this, and thanks for the patch!
>
> I've got somewhat mixed feelings on this, as there are other obviously
> unlocked m->err accesses in the work functions and we don't really want
> to take the req lock there every time. These should be ok in the normal
> p9_conn_destroy path, which disarms poll and cancels work before calling
> p9_conn_cancel, but the other calls to p9_conn_cancel are similarly racy:
> for example, a cancel initiated by the write end could still process a
> new request in p9_read_work after the reqs have been drained out...
>
> On the other hand, this is better than the current situation, so I think
> we should take it for now until someone has time to do better; it's just
> a leak in the unlikely case this race is hit, and we might as well close
> part of the problem.
>
> I don't see how this would break anything anyway, so I'll push it to my
> next branch tomorrow after testing the rest of my patches.

Same here: I don't see it making things worse, so it is at least a small
improvement:

Reviewed-by: Christian Schoenebeck <linux_oss@crudebyte.com>

> > Signed-off-by: Sishuai Gong <sishuai.system@gmail.com>
> > ---
> >  net/9p/trans_fd.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
> > index 00b684616e8d..e43a850f5190 100644
> > --- a/net/9p/trans_fd.c
> > +++ b/net/9p/trans_fd.c
> > @@ -671,10 +671,14 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
> >
> >  	p9_debug(P9_DEBUG_TRANS, "mux %p task %p tcall %p id %d\n",
> >  		 m, current, &req->tc, req->tc.id);
> > -	if (m->err < 0)
> > -		return m->err;
> >
> >  	spin_lock(&m->req_lock);
> > +
> > +	if (m->err < 0) {
> > +		spin_unlock(&m->req_lock);
> > +		return m->err;
> > +	}
> > +
> >  	WRITE_ONCE(req->status, REQ_STATUS_UNSENT);
> >  	list_add_tail(&req->req_list, &m->unsent_req_list);
> >  	spin_unlock(&m->req_lock);
> > --
```diff
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c
index 00b684616e8d..e43a850f5190 100644
--- a/net/9p/trans_fd.c
+++ b/net/9p/trans_fd.c
@@ -671,10 +671,14 @@ static int p9_fd_request(struct p9_client *client, struct p9_req_t *req)
 
 	p9_debug(P9_DEBUG_TRANS, "mux %p task %p tcall %p id %d\n",
 		 m, current, &req->tc, req->tc.id);
-	if (m->err < 0)
-		return m->err;
 
 	spin_lock(&m->req_lock);
+
+	if (m->err < 0) {
+		spin_unlock(&m->req_lock);
+		return m->err;
+	}
+
 	WRITE_ONCE(req->status, REQ_STATUS_UNSENT);
 	list_add_tail(&req->req_list, &m->unsent_req_list);
 	spin_unlock(&m->req_lock);
```
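The change closes the window because the cancel side already takes the same lock for both the error flag and the request-list drain. The sketch below paraphrases p9_conn_cancel() from memory of the upstream file; debug prints are dropped and details may differ:

```c
/* Simplified sketch of the cancel side, not the exact upstream code. */
static void p9_conn_cancel(struct p9_conn *m, int err)
{
	struct p9_req_t *req, *rtmp;
	LIST_HEAD(cancel_list);

	spin_lock(&m->req_lock);

	if (m->err) {			/* already cancelled */
		spin_unlock(&m->req_lock);
		return;
	}
	m->err = err;			/* written under m->req_lock */

	/* drain everything queued so far */
	list_for_each_entry_safe(req, rtmp, &m->req_list, req_list)
		list_move(&req->req_list, &cancel_list);
	list_for_each_entry_safe(req, rtmp, &m->unsent_req_list, req_list)
		list_move(&req->req_list, &cancel_list);

	spin_unlock(&m->req_lock);

	/* error out the drained requests outside the lock */
	list_for_each_entry_safe(req, rtmp, &cancel_list, req_list) {
		list_del(&req->req_list);
		if (!req->t_err)
			req->t_err = err;
		p9_client_cb(m->client, req, REQ_STATUS_ERROR);
	}
}
```

With the check moved under m->req_lock, p9_fd_request() either observes m->err and returns, or queues the request before the cancel drains the lists; either way a request can no longer be appended to an already-drained list and forgotten.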
When a connection is cancelled by p9_conn_cancel(), all requests on it
should be cancelled---mark req->status as REQ_STATUS_ERROR. However,
because of a race over m->err between p9_conn_cancel() and p9_fd_request(),
p9_fd_request might see the old value of m->err, think that the connection
is NOT cancelled, and then add new requests to this cancelled connection.

Fix this issue by lock-protecting the check on m->err.

Signed-off-by: Sishuai Gong <sishuai.system@gmail.com>
---
 net/9p/trans_fd.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
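The window described in the commit message can be modeled outside the kernel. Below is a minimal user-space sketch with hypothetical names (struct conn, submit_buggy, submit_fixed; a pthread mutex stands in for the spinlock and a counter for unsent_req_list). It illustrates the locking pattern only and is not 9p code:

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical model of the race, not kernel code. */
struct conn {
	pthread_mutex_t req_lock;
	int err;	/* 0 = healthy, <0 = cancelled */
	int queued;	/* stands in for unsent_req_list */
};

/* Cancel side: error flag and queue drain under the same lock. */
static void conn_cancel(struct conn *c, int err)
{
	pthread_mutex_lock(&c->req_lock);
	c->err = err;
	c->queued = 0;		/* "drain" pending requests */
	pthread_mutex_unlock(&c->req_lock);
}

/* Buggy submit: checks err before taking the lock. */
static int submit_buggy(struct conn *c)
{
	if (c->err < 0)		/* may read a stale value */
		return c->err;
	/* the whole cancel can run in this window */
	pthread_mutex_lock(&c->req_lock);
	c->queued++;		/* leaked: nothing will ever flush it */
	pthread_mutex_unlock(&c->req_lock);
	return 0;
}

/* Fixed submit: the check and the enqueue form one critical section. */
static int submit_fixed(struct conn *c)
{
	pthread_mutex_lock(&c->req_lock);
	if (c->err < 0) {
		pthread_mutex_unlock(&c->req_lock);
		return c->err;
	}
	c->queued++;
	pthread_mutex_unlock(&c->req_lock);
	return 0;
}

int main(void)
{
	struct conn c = { PTHREAD_MUTEX_INITIALIZER, 0, 0 };

	conn_cancel(&c, -5 /* stand-in for -EIO */);
	printf("fixed submit after cancel: %d (queued=%d)\n",
	       submit_fixed(&c), c.queued);
	return 0;
}
```

Here main() only shows the fixed path rejecting a submit after cancel; the bug itself is the interleaving noted in submit_buggy(), where the entire cancel can complete between the unlocked check and the enqueue.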