diff mbox

ib/cm: Change reject message type when destroying cm_id

Message ID 1431632941-3430-1-git-send-email-sean.hefty@intel.com (mailing list archive)
State Accepted
Headers show

Commit Message

Hefty, Sean May 14, 2015, 7:49 p.m. UTC
From: Ted Kim <ted.h.kim@oracle.com>

Problem reported by: Ted Kim <ted.h.kim@oracle.com>:

We have a case where a Linux system and a non-Linux system are
trying to interoperate.  The Linux host is the active side and
starts the connection establishment, but later decides to not go
through with the connection setup and does rdma_destroy_id().

The rdma_destroy_id() eventually works its way down to cm_destroy_id()
in core/cm.c, where a REJ is sent. The non-Linux system
has some trouble recognizing the REJ because of:

A. CM states which can't receive the REJ
B. Some issues about REJ formatting (missing comm ID)

ISSUE A: That part of the spec says, a Consumer Reject REJ can be
sent for a connection abort, but it goes further
and says: can send a REJ message with a "Consumer Reject"
Reason code if they are in a CM state (i.e. REP
Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows
a REJ to be sent (lines 35-38).

Of the states listed there in that sentence, it would
seem to limit the active side to using the Consumer Reject
(for the abort case) in just the REP-Rcvd and MRA-REP-Sent
states. That is basically only after the active side
sees a REP (or alternatively goes down the state transitions
to timeout in which case a Timeout REJ is sent).

As a fix, in cm-destroy-id() move the IB-CM-MRA-REQ-RCVD case
to the same as REQ-SENT.  Essentially, make a REJ sent after
getting an MRA on active side a timeout rather than Consumer-
Reject, which is arguably more correct with the CM state
diagrams previous to getting a REP.

Signed-off-by: Ted Kim <ted.h.kim@oracle.com>
Signed-off-by: Sean Hefty <sean.hefty@intel.com>
---
 drivers/infiniband/core/cm.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

Doug Ledford May 15, 2015, 4:40 p.m. UTC | #1
On Thu, 2015-05-14 at 12:49 -0700, sean.hefty@intel.com wrote:
> From: Ted Kim <ted.h.kim@oracle.com>
> 
> Problem reported by: Ted Kim <ted.h.kim@oracle.com>:
> 
> We have a case where a Linux system and a non-Linux system are
> trying to interoperate.  The Linux host is the active side and
> starts the connection establishment, but later decides to not go
> through with the connection setup and does rdma_destroy_id().
> 
> The rdma_destroy_id() eventually works its way down to cm_destroy_id()
> in core/cm.c, where a REJ is sent. The non-Linux system
> has some trouble recognizing the REJ because of:
> 
> A. CM states which can't receive the REJ
> B. Some issues about REJ formatting (missing comm ID)
> 
> ISSUE A: That part of the spec says, a Consumer Reject REJ can be
> sent for a connection abort, but it goes further
> and says: can send a REJ message with a "Consumer Reject"
> Reason code if they are in a CM state (i.e. REP
> Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows
> a REJ to be sent (lines 35-38).
> 
> Of the states listed there in that sentence, it would
> seem to limit the active side to using the Consumer Reject
> (for the abort case) in just the REP-Rcvd and MRA-REP-Sent
> states. That is basically only after the active side
> sees a REP (or alternatively goes down the state transitions
> to timeout in which case a Timeout REJ is sent).
> 
> As a fix, in cm-destroy-id() move the IB-CM-MRA-REQ-RCVD case
> to the same as REQ-SENT.  Essentially, make a REJ sent after
> getting an MRA on active side a timeout rather than Consumer-
> Reject, which is arguably more correct with the CM state
> diagrams previous to getting a REP.
> 
> Signed-off-by: Ted Kim <ted.h.kim@oracle.com>
> Signed-off-by: Sean Hefty <sean.hefty@intel.com>

Sean, a few questions:

Are you proposing for 4.1-rc?
I assume this passed testing on the linux/non-linux setup?
Was it tested in a similar situation on linux/linux setup?  How do we
handle this specific rej scenario?
Did you intend to add an ISSUE B: section to the commit message?

> ---
>  drivers/infiniband/core/cm.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
> index e28a494..346243e 100644
> --- a/drivers/infiniband/core/cm.c
> +++ b/drivers/infiniband/core/cm.c
> @@ -862,6 +862,7 @@ retest:
>  		cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
>  		break;
>  	case IB_CM_REQ_SENT:
> +	case IB_CM_MRA_REQ_RCVD:
>  		ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
>  		spin_unlock_irq(&cm_id_priv->lock);
>  		ib_send_cm_rej(cm_id, IB_CM_REJ_TIMEOUT,
> @@ -880,7 +881,6 @@ retest:
>  				       NULL, 0, NULL, 0);
>  		}
>  		break;
> -	case IB_CM_MRA_REQ_RCVD:
>  	case IB_CM_REP_SENT:
>  	case IB_CM_MRA_REP_RCVD:
>  		ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
Hefty, Sean May 15, 2015, 5:06 p.m. UTC | #2
> Are you proposing for 4.1-rc?


The code has been this way forever, so IMO 4.2 should be fine.

> I assume this passed testing on the linux/non-linux setup?


The patch came from Ted.  I'm assuming that he tested with this.

> Was it tested in a similar situation on linux/linux setup?  How do we

> handle this specific rej scenario?


This change effectively changes the reject code that is reported in the reject message from 'consumer reject' to 'timeout'.  The code already handles this.

> Did you intend to add an ISSUE B: section to the commit message?


The issue is listed as the missing comm id.  Though, thinking about it again, the issue may have been a missing node GUID.  I can't recall now.  Maybe Ted does.

The IB spec is goofy here in that it only allows a CM REQ, once sent, to timeout.  It doesn't define an 'abort' case.  The spec also treats the connection is being in a single 'REP wait' state, regardless of whether an MRA has been received or not.  This change results in the 'I got an MRA but I'm waiting for a REP' to be handled the same as 'I'm waiting for a REP', at least for the destruction case.

- Sean
Doug Ledford May 18, 2015, 3:04 p.m. UTC | #3
On Thu, 2015-05-14 at 12:49 -0700, sean.hefty@intel.com wrote:
> From: Ted Kim <ted.h.kim@oracle.com>
> 
> Problem reported by: Ted Kim <ted.h.kim@oracle.com>:
> 
> We have a case where a Linux system and a non-Linux system are
> trying to interoperate.  The Linux host is the active side and
> starts the connection establishment, but later decides to not go
> through with the connection setup and does rdma_destroy_id().
> 
> The rdma_destroy_id() eventually works its way down to cm_destroy_id()
> in core/cm.c, where a REJ is sent. The non-Linux system
> has some trouble recognizing the REJ because of:
> 
> A. CM states which can't receive the REJ
> B. Some issues about REJ formatting (missing comm ID)
> 
> ISSUE A: That part of the spec says, a Consumer Reject REJ can be
> sent for a connection abort, but it goes further
> and says: can send a REJ message with a "Consumer Reject"
> Reason code if they are in a CM state (i.e. REP
> Rcvd, MRA(REP) Sent, REQ Rcvd, MRA Sent) that allows
> a REJ to be sent (lines 35-38).
> 
> Of the states listed there in that sentence, it would
> seem to limit the active side to using the Consumer Reject
> (for the abort case) in just the REP-Rcvd and MRA-REP-Sent
> states. That is basically only after the active side
> sees a REP (or alternatively goes down the state transitions
> to timeout in which case a Timeout REJ is sent).
> 
> As a fix, in cm-destroy-id() move the IB-CM-MRA-REQ-RCVD case
> to the same as REQ-SENT.  Essentially, make a REJ sent after
> getting an MRA on active side a timeout rather than Consumer-
> Reject, which is arguably more correct with the CM state
> diagrams previous to getting a REP.
> 
> Signed-off-by: Ted Kim <ted.h.kim@oracle.com>
> Signed-off-by: Sean Hefty <sean.hefty@intel.com>

I've picked this up for 4.2.
Ted H. Kim May 18, 2015, 9:27 p.m. UTC | #4
On 05/15/15 10:06 AM, Hefty, Sean wrote:
>> Did you intend to add an ISSUE B: section to the commit message?
>
> The issue is listed as the missing comm id.  Though, thinking about it
> again, the issue may have been a missing node GUID.  I can't recall now.
> Maybe Ted does.

Imagine the case of the Linux active side sending REQ with it's CM communication
id. When it gets the REP back, the other side's communication ID
will be received and internal CM state updated. So any REJ from this
point on can send both communication IDs.

Now imagine a slightly different case. The active side sends the REQ,
and the passive side responds with MRA. At this point the CM should
update the state with the second communication ID coming in the
MRA. But it does not do so. This is "ISSUE B". This is will require
a different fix from the patch in question to add the communication ID
updating when getting an MRA.

So as a side effect, a REJ issued at this point by the active side
has only its own communication ID. The problem with only one communication ID
REJs is that they are all generated by remote systems from the one being
sent the REJ and therefore might not be unique. The way that TIMEOUT
REJ addresses this is to have a GUID in the ARI to make it unique.
But other REJ codes don't have this.

-ted

> The IB spec is goofy here in that it only allows a CM REQ, once sent,
> to timeout.  It doesn't define an 'abort' case.  The spec also treats
> the connection is being in a single 'REP wait' state, regardless of
> whether an MRA has been received or not.  This change results in
> the 'I got an MRA but I'm waiting for a REP' to be handled the
> same as 'I'm waiting for a REP', at least for the destruction case.
Jason Gunthorpe May 19, 2015, 6:14 p.m. UTC | #5
On Fri, May 15, 2015 at 05:06:52PM +0000, Hefty, Sean wrote:
> > Are you proposing for 4.1-rc?
> 
> The code has been this way forever, so IMO 4.2 should be fine.

Yes, but surely we want this backported to -stable and distros? It
isn't tagged for that..

Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hefty, Sean May 19, 2015, 6:25 p.m. UTC | #6
> Yes, but surely we want this backported to -stable and distros? It
> isn't tagged for that..

I don't think that the problem justifies backporting the patch.  The passive side drops the reject, which could have happened anyway, and gets a reject message when sending a REP.

- Sean
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index e28a494..346243e 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -862,6 +862,7 @@  retest:
 		cm_reject_sidr_req(cm_id_priv, IB_SIDR_REJECT);
 		break;
 	case IB_CM_REQ_SENT:
+	case IB_CM_MRA_REQ_RCVD:
 		ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);
 		spin_unlock_irq(&cm_id_priv->lock);
 		ib_send_cm_rej(cm_id, IB_CM_REJ_TIMEOUT,
@@ -880,7 +881,6 @@  retest:
 				       NULL, 0, NULL, 0);
 		}
 		break;
-	case IB_CM_MRA_REQ_RCVD:
 	case IB_CM_REP_SENT:
 	case IB_CM_MRA_REP_RCVD:
 		ib_cancel_mad(cm_id_priv->av.port->mad_agent, cm_id_priv->msg);