Message ID | 1431632941-3430-1-git-send-email-sean.hefty@intel.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
On Thu, 2015-05-14 at 12:49 -0700, sean.hefty@intel.com wrote: > From: Ted Kim <ted.h.kim@oracle.com> > > Problem reported by: Ted Kim <ted.h.kim@oracle.com>: > > We have a case where a Linux system and a non-Linux system are > trying to interoperate. The Linux host is the active side and > starts the connection establishment, but later decides to not go > through with the connection setup and does rdma_destroy_id(). > > The rdma_destroy_id() eventually works its way down to cm_destroy_id() > in core/cm.c, where a REJ is sent. The non-Linux system > has some trouble recognizing the REJ because of: > > A. CM states which can't receive the REJ > B. Some issues about REJ formatting (missing comm ID) > > ISSUE A: That part of the spec says, a Consumer Reject REJ can be > sent for a connection abort, but it goes further > and says: can send a REJ message with a "Consumer Reject" > Reason code if they are in a CM state (i.e. REP > Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows > a REJ to be sent (lines 35-38). > > Of the states listed there in that sentence, it would > seem to limit the active side to using the Consumer Reject > (for the abort case) in just the REP-Rcvd and MRA-REP-Sent > states. That is basically only after the active side > sees a REP (or alternatively goes down the state transitions > to timeout in which case a Timeout REJ is sent). > > As a fix, in cm-destroy-id() move the IB-CM-MRA-REQ-RCVD case > to the same as REQ-SENT. Essentially, make a REJ sent after > getting an MRA on active side a timeout rather than Consumer- > Reject, which is arguably more correct with the CM state > diagrams previous to getting a REP. > > Signed-off-by: Ted Kim <ted.h.kim@oracle.com> > Signed-off-by: Sean Hefty <sean.hefty@intel.com> Sean, a few questions: Are you proposing for 4.1-rc? I assume this passed testing on the linux/non-linux setup? Was it tested in a similar situation on linux/linux setup? How do we handle this specific rej scenario? Did you intend to add an ISSUE B: section to the commit message? > --- > drivers/infiniband/core/cm.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c > index e28a494..346243e 100644 > --- a/drivers/infiniband/core/cm.c > +++ b/drivers/infiniband/core/cm.c > @@ -862,6 +862,7 @@ retest: > cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT); > break; > case IB_CM_REQ_SENT: > + case IB_CM_MRA_REQ_RCVD: > ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg); > spin_unlock_irq(&cm_id_priv->lock); > ib_send_cm_rej(cm_id, IB_CM_REJ_TIMEOUT, > @@ -880,7 +881,6 @@ retest: > NULL, 0, NULL, 0); > } > break; > - case IB_CM_MRA_REQ_RCVD: > case IB_CM_REP_SENT: > case IB_CM_MRA_REP_RCVD: > ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
> Are you proposing for 4.1-rc? The code has been this way forever, so IMO 4.2 should be fine. > I assume this passed testing on the linux/non-linux setup? The patch came from Ted. I'm assuming that he tested with this. > Was it tested in a similar situation on linux/linux setup? How do we > handle this specific rej scenario? This change effectively changes the reject code that is reported in the reject message from 'consumer reject' to 'timeout'. The code already handles this. > Did you intend to add an ISSUE B: section to the commit message? The issue is listed as the missing comm id. Though, thinking about it again, the issue may have been a missing node GUID. I can't recall now. Maybe Ted does. The IB spec is goofy here in that it only allows a CM REQ, once sent, to timeout. It doesn't define an 'abort' case. The spec also treats the connection is being in a single 'REP wait' state, regardless of whether an MRA has been received or not. This change results in the 'I got an MRA but I'm waiting for a REP' to be handled the same as 'I'm waiting for a REP', at least for the destruction case. - Sean
On Thu, 2015-05-14 at 12:49 -0700, sean.hefty@intel.com wrote: > From: Ted Kim <ted.h.kim@oracle.com> > > Problem reported by: Ted Kim <ted.h.kim@oracle.com>: > > We have a case where a Linux system and a non-Linux system are > trying to interoperate. The Linux host is the active side and > starts the connection establishment, but later decides to not go > through with the connection setup and does rdma_destroy_id(). > > The rdma_destroy_id() eventually works its way down to cm_destroy_id() > in core/cm.c, where a REJ is sent. The non-Linux system > has some trouble recognizing the REJ because of: > > A. CM states which can't receive the REJ > B. Some issues about REJ formatting (missing comm ID) > > ISSUE A: That part of the spec says, a Consumer Reject REJ can be > sent for a connection abort, but it goes further > and says: can send a REJ message with a "Consumer Reject" > Reason code if they are in a CM state (i.e. REP > Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows > a REJ to be sent (lines 35-38). > > Of the states listed there in that sentence, it would > seem to limit the active side to using the Consumer Reject > (for the abort case) in just the REP-Rcvd and MRA-REP-Sent > states. That is basically only after the active side > sees a REP (or alternatively goes down the state transitions > to timeout in which case a Timeout REJ is sent). > > As a fix, in cm-destroy-id() move the IB-CM-MRA-REQ-RCVD case > to the same as REQ-SENT. Essentially, make a REJ sent after > getting an MRA on active side a timeout rather than Consumer- > Reject, which is arguably more correct with the CM state > diagrams previous to getting a REP. > > Signed-off-by: Ted Kim <ted.h.kim@oracle.com> > Signed-off-by: Sean Hefty <sean.hefty@intel.com> I've picked this up for 4.2.
On 05/15/15 10:06 AM, Hefty, Sean wrote: >> Did you intend to add an ISSUE B: section to the commit message? > > The issue is listed as the missing comm id. Though, thinking about it > again, the issue may have been a missing node GUID. I can't recall now. > Maybe Ted does. Imagine the case of the Linux active side sending REQ with it's CM communication id. When it gets the REP back, the other side's communication ID will be received and internal CM state updated. So any REJ from this point on can send both communication IDs. Now imagine a slightly different case. The active side sends the REQ, and the passive side responds with MRA. At this point the CM should update the state with the second communication ID coming in the MRA. But it does not do so. This is "ISSUE B". This is will require a different fix from the patch in question to add the communication ID updating when getting an MRA. So as a side effect, a REJ issued at this point by the active side has only its own communication ID. The problem with only one communication ID REJs is that they are all generated by remote systems from the one being sent the REJ and therefore might not be unique. The way that TIMEOUT REJ addresses this is to have a GUID in the ARI to make it unique. But other REJ codes don't have this. -ted > The IB spec is goofy here in that it only allows a CM REQ, once sent, > to timeout. It doesn't define an 'abort' case. The spec also treats > the connection is being in a single 'REP wait' state, regardless of > whether an MRA has been received or not. This change results in > the 'I got an MRA but I'm waiting for a REP' to be handled the > same as 'I'm waiting for a REP', at least for the destruction case.
On Fri, May 15, 2015 at 05:06:52PM +0000, Hefty, Sean wrote: > > Are you proposing for 4.1-rc? > > The code has been this way forever, so IMO 4.2 should be fine. Yes, but surely we want this backported to -stable and distros? It isn't tagged for that.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Yes, but surely we want this backported to -stable and distros? It > isn't tagged for that.. I don't think that the problem justifies backporting the patch. The passive side drops the reject, which could have happened anyway, and gets a reject message when sending a REP. - Sean -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c index e28a494..346243e 100644 --- a/drivers/infiniband/core/cm.c +++ b/drivers/infiniband/core/cm.c @@ -862,6 +862,7 @@ retest: cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT); break; case IB_CM_REQ_SENT: + case IB_CM_MRA_REQ_RCVD: ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg); spin_unlock_irq(&cm_id_priv->lock); ib_send_cm_rej(cm_id, IB_CM_REJ_TIMEOUT, @@ -880,7 +881,6 @@ retest: NULL, 0, NULL, 0); } break; - case IB_CM_MRA_REQ_RCVD: case IB_CM_REP_SENT: case IB_CM_MRA_REP_RCVD: ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);