Message ID | 20181025194057.16973-1-sagi@grimberg.me (mailing list archive) |
---|---|
State | Accepted |
Commit | e48d8ed9c6193502d849b35767fd18e20bbd7ba2 |
Headers | show |
Series | rxe: fix error completion wr_id and qp_num | expand |
On 2018/10/26 3:40, Sagi Grimberg wrote: > Error completions must still contain a valid wr_id and > qp_num such that the consumer can rely on. Correctly > fill these fields in receive error completions. > > Reported-by: Walker Benjamin <benjamin.walker@intel.com> > Cc: stable@vger.kernel.org > Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Tested in my linux environment. This patch worked well. Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> > --- > drivers/infiniband/sw/rxe/rxe_resp.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c > index aa5833318372..3438f0653df0 100644 > --- a/drivers/infiniband/sw/rxe/rxe_resp.c > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c > @@ -841,11 +841,16 @@ static enum resp_states do_complete(struct rxe_qp *qp, > > memset(&cqe, 0, sizeof(cqe)); > > - wc->wr_id = wqe->wr_id; > - wc->status = qp->resp.status; > - wc->qp = &qp->ibqp; > + if (qp->rcq->is_user) { > + uwc->status = qp->resp.status; > + uwc->qp_num = qp->ibqp.qp_num; > + uwc->wr_id = wqe->wr_id; > + } else { > + wc->status = qp->resp.status; > + wc->qp = &qp->ibqp; > + wc->wr_id = wqe->wr_id; > + } > > - /* fields after status are not required for errors */ > if (wc->status == IB_WC_SUCCESS) { > wc->opcode = (pkt->mask & RXE_IMMDT_MASK && > pkt->mask & RXE_WRITE_MASK) ?
Hi, I just checked IB sepc. It looks like, IB spec in relevant chapter (11.4.2) doesn't have any requirement for qp_num to be valid in case of error. Only wr_id and status must be valid. I think, wr_id and status should be valid in existing implementation. struct rxe_cqe { union { struct ib_wc ibwc; struct ib_uverbs_wc uibwc; }; }; In both cases wr_id and status are in the same places So lines wc->wr_id = wqe->wr_id; wc->status = qp->resp.status set the valid values for rxe too. But I agree, that at the first glance, the code looks invalid and some refactoring can be done. Regards Sasha On 25/10/2018 22:40, Sagi Grimberg wrote: > Error completions must still contain a valid wr_id and > qp_num such that the consumer can rely on. Correctly > fill these fields in receive error completions. > > Reported-by: Walker Benjamin <benjamin.walker@intel.com> > Cc: stable@vger.kernel.org > Signed-off-by: Sagi Grimberg <sagi@grimberg.me> > --- > drivers/infiniband/sw/rxe/rxe_resp.c | 13 +++++++++---- > 1 file changed, 9 insertions(+), 4 deletions(-) > > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c > index aa5833318372..3438f0653df0 100644 > --- a/drivers/infiniband/sw/rxe/rxe_resp.c > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c > @@ -841,11 +841,16 @@ static enum resp_states do_complete(struct rxe_qp *qp, > > memset(&cqe, 0, sizeof(cqe)); > > - wc->wr_id = wqe->wr_id; > - wc->status = qp->resp.status; > - wc->qp = &qp->ibqp; > + if (qp->rcq->is_user) { > + uwc->status = qp->resp.status; > + uwc->qp_num = qp->ibqp.qp_num; > + uwc->wr_id = wqe->wr_id; > + } else { > + wc->status = qp->resp.status; > + wc->qp = &qp->ibqp; > + wc->wr_id = wqe->wr_id; > + } > > - /* fields after status are not required for errors */ > if (wc->status == IB_WC_SUCCESS) { > wc->opcode = (pkt->mask & RXE_IMMDT_MASK && > pkt->mask & RXE_WRITE_MASK) ? >
On Mon, 2018-10-29 at 10:16 +0200, Sasha Kotchubievsky wrote: > I think, wr_id and status should be valid in existing implementation. > struct rxe_cqe { > union { > struct ib_wc ibwc; > struct ib_uverbs_wc uibwc; > }; > }; > In both cases wr_id and status are in the same places > So lines > wc->wr_id = wqe->wr_id; The original bug report clearly called out wr_id as being invalid in their specific scenario. > wc->status = qp->resp.status > set the valid values for rxe too. > > But I agree, that at the first glance, the code looks invalid and some > refactoring can be done. Personally, I like Sagi's patch. It's clear and unambiguous and you don't need to understand that there is a union in play making things work. And the original code obviously got the qp setting for user space wrong, period. You can argue that it doesn't need to provide the qpnum to user space, but the original code set it to the internal qp address for both user and kernel space and needs to be fixed. I'm inclined to take the patch, especially since it has verified test results to go along with it. So, applied to for-next, please post any fixups as incremental patches. > Regards > Sasha > > On 25/10/2018 22:40, Sagi Grimberg wrote: > > Error completions must still contain a valid wr_id and > > qp_num such that the consumer can rely on. Correctly > > fill these fields in receive error completions. > > > > Reported-by: Walker Benjamin <benjamin.walker@intel.com> > > Cc: stable@vger.kernel.org > > Signed-off-by: Sagi Grimberg <sagi@grimberg.me> > > --- > > drivers/infiniband/sw/rxe/rxe_resp.c | 13 +++++++++---- > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c > > index aa5833318372..3438f0653df0 100644 > > --- a/drivers/infiniband/sw/rxe/rxe_resp.c > > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c > > @@ -841,11 +841,16 @@ static enum resp_states do_complete(struct rxe_qp *qp, > > > > memset(&cqe, 0, sizeof(cqe)); > > > > - wc->wr_id = wqe->wr_id; > > - wc->status = qp->resp.status; > > - wc->qp = &qp->ibqp; > > + if (qp->rcq->is_user) { > > + uwc->status = qp->resp.status; > > + uwc->qp_num = qp->ibqp.qp_num; > > + uwc->wr_id = wqe->wr_id; > > + } else { > > + wc->status = qp->resp.status; > > + wc->qp = &qp->ibqp; > > + wc->wr_id = wqe->wr_id; > > + } > > > > - /* fields after status are not required for errors */ > > if (wc->status == IB_WC_SUCCESS) { > > wc->opcode = (pkt->mask & RXE_IMMDT_MASK && > > pkt->mask & RXE_WRITE_MASK) ? > >
On Tue, 2018-11-06 at 16:24 -0500, Doug Ledford wrote: > On Mon, 2018-10-29 at 10:16 +0200, Sasha Kotchubievsky wrote: > > I think, wr_id and status should be valid in existing implementation. > > struct rxe_cqe { > > union { > > struct ib_wc ibwc; > > struct ib_uverbs_wc uibwc; > > }; > > }; > > In both cases wr_id and status are in the same places > > So lines > > wc->wr_id = wqe->wr_id; > > The original bug report clearly called out wr_id as being invalid in > their specific scenario. I thought it was, but only qp_num was actually invalid. My interpretation of wr_id was dependent on an earlier use of qp_num and I didn't realize that was the fundamental problem. > > > wc->status = qp->resp.status > > set the valid values for rxe too. > > > > But I agree, that at the first glance, the code looks invalid and some > > refactoring can be done. > > Personally, I like Sagi's patch. It's clear and unambiguous and you > don't need to understand that there is a union in play making things > work. And the original code obviously got the qp setting for user space > wrong, period. You can argue that it doesn't need to provide the qpnum > to user space, but the original code set it to the internal qp address > for both user and kernel space and needs to be fixed. The man page for ibv_poll_cq says qp_num should be valid. I understand the verbs specification may indicate otherwise, but given what the man page says and how easy it is to make qp_num valid, I am very glad to see that this is the approach being taken. That said, I've since reworked my code to avoid needing the qp_num on the error path and everything is working both with and without this patch. Thanks for the help everyone. > > I'm inclined to take the patch, especially since it has verified test > results to go along with it. So, applied to for-next, please post any > fixups as incremental patches. > > > Regards > > Sasha > > > > On 25/10/2018 22:40, Sagi Grimberg wrote: > > > Error completions must still contain a valid wr_id and > > > qp_num such that the consumer can rely on. Correctly > > > fill these fields in receive error completions. > > > > > > Reported-by: Walker Benjamin <benjamin.walker@intel.com> > > > Cc: stable@vger.kernel.org > > > Signed-off-by: Sagi Grimberg <sagi@grimberg.me> > > > --- > > > drivers/infiniband/sw/rxe/rxe_resp.c | 13 +++++++++---- > > > 1 file changed, 9 insertions(+), 4 deletions(-) > > > > > > diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c > > > b/drivers/infiniband/sw/rxe/rxe_resp.c > > > index aa5833318372..3438f0653df0 100644 > > > --- a/drivers/infiniband/sw/rxe/rxe_resp.c > > > +++ b/drivers/infiniband/sw/rxe/rxe_resp.c > > > @@ -841,11 +841,16 @@ static enum resp_states do_complete(struct rxe_qp > > > *qp, > > > > > > memset(&cqe, 0, sizeof(cqe)); > > > > > > - wc->wr_id = wqe->wr_id; > > > - wc->status = qp->resp.status; > > > - wc->qp = &qp->ibqp; > > > + if (qp->rcq->is_user) { > > > + uwc->status = qp->resp.status; > > > + uwc->qp_num = qp->ibqp.qp_num; > > > + uwc->wr_id = wqe->wr_id; > > > + } else { > > > + wc->status = qp->resp.status; > > > + wc->qp = &qp->ibqp; > > > + wc->wr_id = wqe->wr_id; > > > + } > > > > > > - /* fields after status are not required for errors */ > > > if (wc->status == IB_WC_SUCCESS) { > > > wc->opcode = (pkt->mask & RXE_IMMDT_MASK && > > > pkt->mask & RXE_WRITE_MASK) ? > > > > >
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c index aa5833318372..3438f0653df0 100644 --- a/drivers/infiniband/sw/rxe/rxe_resp.c +++ b/drivers/infiniband/sw/rxe/rxe_resp.c @@ -841,11 +841,16 @@ static enum resp_states do_complete(struct rxe_qp *qp, memset(&cqe, 0, sizeof(cqe)); - wc->wr_id = wqe->wr_id; - wc->status = qp->resp.status; - wc->qp = &qp->ibqp; + if (qp->rcq->is_user) { + uwc->status = qp->resp.status; + uwc->qp_num = qp->ibqp.qp_num; + uwc->wr_id = wqe->wr_id; + } else { + wc->status = qp->resp.status; + wc->qp = &qp->ibqp; + wc->wr_id = wqe->wr_id; + } - /* fields after status are not required for errors */ if (wc->status == IB_WC_SUCCESS) { wc->opcode = (pkt->mask & RXE_IMMDT_MASK && pkt->mask & RXE_WRITE_MASK) ?
Error completions must still contain a valid wr_id and qp_num such that the consumer can rely on. Correctly fill these fields in receive error completions. Reported-by: Walker Benjamin <benjamin.walker@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Sagi Grimberg <sagi@grimberg.me> --- drivers/infiniband/sw/rxe/rxe_resp.c | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-)