
RDMA/nes: Adding queue drain functions

Message ID 20160329175837.GA9868@TENIKOLO-MOBL2 (mailing list archive)
State Accepted

Commit Message

Nikolova, Tatyana E March 29, 2016, 5:58 p.m. UTC
Adding sq and rq drain functions, which block until all
previously posted wr-s in the specified queue have completed.
A completion object is signaled to unblock the thread,
when the last cqe for the corresponding queue is processed.

Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Faisal Latif <faisal.latif@intel.com>
---
 drivers/infiniband/hw/nes/nes_verbs.c | 34 ++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/nes/nes_verbs.h |  2 ++
 2 files changed, 36 insertions(+)

Comments

Steve Wise March 29, 2016, 8:49 p.m. UTC | #1
> 
> Adding sq and rq drain functions, which block until all
> previously posted wr-s in the specified queue have completed.
> A completion object is signaled to unblock the thread,
> when the last cqe for the corresponding queue is processed.
> 
> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> Signed-off-by: Faisal Latif <faisal.latif@intel.com>

Looks good to me.  No locking needed though?

Reviewed-by: Steve Wise <swise@opengridcomputing.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Nikolova, Tatyana E March 29, 2016, 9:18 p.m. UTC | #2
There is locking in place in nes_poll_cq(), which also protects the added queue drain functionality.

Thank you,
Tatyana

-----Original Message-----
From: Steve Wise [mailto:swise@opengridcomputing.com] 
Sent: Tuesday, March 29, 2016 3:49 PM
To: Nikolova, Tatyana E <tatyana.e.nikolova@intel.com>; 'Doug Ledford' <dledford@redhat.com>
Cc: linux-rdma@vger.kernel.org; Latif, Faisal <faisal.latif@intel.com>; leon@leon.nu
Subject: RE: [PATCH] RDMA/nes: Adding queue drain functions

> 
> Adding sq and rq drain functions, which block until all previously 
> posted wr-s in the specified queue have completed.
> A completion object is signaled to unblock the thread, when the last 
> cqe for the corresponding queue is processed.
> 
> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> Signed-off-by: Faisal Latif <faisal.latif@intel.com>

Looks good to me.  No locking needed though?

Reviewed-by: Steve Wise <swise@opengridcomputing.com>

Leon Romanovsky March 30, 2016, 7:11 a.m. UTC | #3
On Tue, Mar 29, 2016 at 12:58:37PM -0500, Tatyana Nikolova wrote:
> Adding sq and rq drain functions, which block until all
> previously posted wr-s in the specified queue have completed.
> A completion object is signaled to unblock the thread,
> when the last cqe for the corresponding queue is processed.
> 
> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> Signed-off-by: Faisal Latif <faisal.latif@intel.com>
> ---
>  drivers/infiniband/hw/nes/nes_verbs.c | 34 ++++++++++++++++++++++++++++++++++
>  drivers/infiniband/hw/nes/nes_verbs.h |  2 ++
>  2 files changed, 36 insertions(+)
> 
> diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
> index fba69a3..170d43a 100644
> --- a/drivers/infiniband/hw/nes/nes_verbs.c
> +++ b/drivers/infiniband/hw/nes/nes_verbs.c
> @@ -1315,6 +1315,8 @@ static struct ib_qp *nes_create_qp(struct ib_pd *ibpd,
>  			nes_debug(NES_DBG_QP, "Invalid QP type: %d\n", init_attr->qp_type);
>  			return ERR_PTR(-EINVAL);
>  	}
> +	init_completion(&nesqp->sq_drained);
> +	init_completion(&nesqp->rq_drained);
>  
>  	nesqp->sig_all = (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR);
>  	init_timer(&nesqp->terminate_timer);
> @@ -3452,6 +3454,29 @@ out:
>  	return err;
>  }
>  
> +/**
> + * nes_drain_sq - drain sq
> + * @ibqp: pointer to ibqp
> + */
> +static void nes_drain_sq(struct ib_qp *ibqp)
> +{
> +	struct nes_qp *nesqp = to_nesqp(ibqp);
> +
> +	if (nesqp->hwqp.sq_tail != nesqp->hwqp.sq_head)

I'm not sure you need these checks in all places.
They won't protect against anything.
Leon Romanovsky March 31, 2016, 1:19 a.m. UTC | #4
On Tue, Mar 29, 2016 at 03:49:18PM -0500, Steve Wise wrote:
> > 
> > Adding sq and rq drain functions, which block until all
> > previously posted wr-s in the specified queue have completed.
> > A completion object is signaled to unblock the thread,
> > when the last cqe for the corresponding queue is processed.
> > 
> > Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> > Signed-off-by: Faisal Latif <faisal.latif@intel.com>
> 
> Looks good to me.  No locking needed though?
> 
> Reviewed-by: Steve Wise <swise@opengridcomputing.com>

Steve,
I see that this implementation follows your reference implementation of
iw_cxgb4 which was iWARP specific. Can you point me to the relevant
information which explain why this specific case exists?

Thanks.
Sagi Grimberg April 3, 2016, 4:17 p.m. UTC | #5
Looks fine.

Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

P.S.
Which ULP was used to test the patch?
Leon Romanovsky April 4, 2016, 1:31 a.m. UTC | #6
On Sun, Apr 03, 2016 at 07:17:02PM +0300, sagig wrote:
> Looks fine.
> 
> Reviewed-by: Sagi Grimberg <sagi@grimberg.me>

Sagi,
Can you please explain why iWARP devices need explicit drain QP logic?

> 
> P.S.
> Which ULP was used to test the patch?
Sagi Grimberg April 4, 2016, 6:39 a.m. UTC | #7
> Sagi,
> Can you please explain why iWARP devices need explicit drain QP logic?

Steve can explain it much better than I can (so I'd wait for him to
confirm).

My understanding is that in iWARP, after moving the QP to error state,
all the posts should fail instead of completing with a flush error (I
think that SQ_DRAIN is the state that triggers flush errors).

This is why iWARP devices need a special handling for draining a
queue-pair.
Steve Wise April 4, 2016, 1:54 p.m. UTC | #8
> On Tue, Mar 29, 2016 at 03:49:18PM -0500, Steve Wise wrote:
> > >
> > > Adding sq and rq drain functions, which block until all
> > > previously posted wr-s in the specified queue have completed.
> > > A completion object is signaled to unblock the thread,
> > > when the last cqe for the corresponding queue is processed.
> > >
> > > Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> > > Signed-off-by: Faisal Latif <faisal.latif@intel.com>
> >
> > Looks good to me.  No locking needed though?
> >
> > Reviewed-by: Steve Wise <swise@opengridcomputing.com>
> 
> Steve,
> I see that this implementation follows your reference implementation of
> iw_cxgb4 which was iWARP specific. Can you point me to the relevant
> information which explain why this specific case exists?

The iWARP Verbs spec mandates that when the QP is in ERROR, post_send() and
post_recv() must, at some point, fail synchronously.  See:

http://tools.ietf.org/html/draft-hilland-rddp-verbs-00#section-6.2.4

Steve.

Steve Wise April 4, 2016, 2:42 p.m. UTC | #9
> > Sagi,
> > Can you please explain why iWARP devices need explicit drain QP logic?
> 
> Steve can explain it much better than I can (so I'd wait for him to
> confirm).
> 
> My understanding is that in iWARP, after moving the QP to error state,
> all the posts should fail instead of completing with a flush error (I
> think that SQ_DRAIN is the state that triggers flush errors).
> 
> This is why iWARP devices need a special handling for draining a
> queue-pair.

Sorry for the delayed reply Leon, I replied to your original email today (I was
on vacation).

Sagi is correct: iWARP QPs need to fail post_send/recv requests synchronously,
and I didn't want to deviate from the iWARP Verbs specification.

Steve.

Leon Romanovsky April 4, 2016, 2:52 p.m. UTC | #10
On Mon, Apr 04, 2016 at 08:54:57AM -0500, Steve Wise wrote:
> > On Tue, Mar 29, 2016 at 03:49:18PM -0500, Steve Wise wrote:
> > > >
> > > > Adding sq and rq drain functions, which block until all
> > > > previously posted wr-s in the specified queue have completed.
> > > > A completion object is signaled to unblock the thread,
> > > > when the last cqe for the corresponding queue is processed.
> > > >
> > > > Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> > > > Signed-off-by: Faisal Latif <faisal.latif@intel.com>
> > >
> > > Looks good to me.  No locking needed though?
> > >
> > > Reviewed-by: Steve Wise <swise@opengridcomputing.com>
> > 
> > Steve,
> > I see that this implementation follows your reference implementation of
> > iw_cxgb4 which was iWARP specific. Can you point me to the relevant
> > information which explain why this specific case exists?
> 
> The iWARP Verbs spec mandates that when the QP is in ERROR, post_send() and
> post_recv() must, at some point, fail synchronously.  See:
> 
> http://tools.ietf.org/html/draft-hilland-rddp-verbs-00#section-6.2.4

OK,
I see it; the iWARP devices need to flush the QP before moving to
the drain (error) state.

This leads to another question: why don't we have common drain
functions for iWARP devices? Right now, we have two drivers with
similar code.

The suggested refactoring can be as follows:

post_marker_func(..)
{
    post_send ....
}

ib_drain_qp(..)
{
    move_to_error
    call_post_marker_func
}

iw_drain_qp(..)
{
    call_to_post_marker_func
    move_to_drain
}

What do you think?

> 
> Steve.
> 
Steve Wise April 4, 2016, 3:01 p.m. UTC | #11
> > The iWARP Verbs spec mandates that when the QP is in ERROR, post_send() and
> > post_recv() must, at some point, fail synchronously.  See:
> >
> > http://tools.ietf.org/html/draft-hilland-rddp-verbs-00#section-6.2.4
> 
> OK,
> I see it, the iWARP devices need to flush QP before moving to
> drain(error) state.
> 
> This leads to another question, why don't we have common specific
> functions for iWARP devices? Right now, we have two drivers with the
> similar code.
> 

Based on the initial review/discussion of the drain API, it was agreed that the
drain logic wouldn't have separate IB and iWARP cases, but rather a common case
and a device-specific case.

> The suggested refactoring can be as follows:
> 
> post_marker_func(..)
> {
>     post_send ....
> }
> 
> ib_drain_qp(..)
> {
>     move_to_error
>     call_post_marker_func
> }
> 
> iw_drain_qp(..)
> {
>     call_to_post_marker_func
>     move_to_drain
> }
>

This won't work if the QP is already out of RTS, which happens as part of
rdma_disconnect().  And currently the ULP usage is to disconnect and then drain.

Steve.


Leon Romanovsky April 4, 2016, 5:21 p.m. UTC | #12
On Mon, Apr 04, 2016 at 10:01:49AM -0500, Steve Wise wrote:
> > The suggested refactoring can be as follows:
> > 
> > post_marker_func(..)
> > {
> >     post_send ....
> > }
> > 
> > ib_drain_qp(..)
> > {
> >     move_to_error
> >     call_post_marker_func
> > }
> > 
> > iw_drain_qp(..)
> > {
> >     call_to_post_marker_func
> >     move_to_drain
> > }
> >
> 
> This won't work if the QP is already in out of RTS which happens as part of
> rdma_disconnect().  And currently the ULP usage is to disconnect and then drain.

Ok, just to summarize it.

It is not only that we shouldn't transition the QP to the error state
ourselves; we also can't post new work requests once it has left RTS.
The RDMA disconnect serves as a gatekeeper which prevents new work
requests from entering the drained QP.

Thank you for the explanation.

> 
> Steve.
> 
> 
Doug Ledford May 13, 2016, 7:59 p.m. UTC | #13
On 03/29/2016 01:58 PM, Tatyana Nikolova wrote:
> Adding sq and rq drain functions, which block until all
> previously posted wr-s in the specified queue have completed.
> A completion object is signaled to unblock the thread,
> when the last cqe for the corresponding queue is processed.
> 
> Signed-off-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
> Signed-off-by: Faisal Latif <faisal.latif@intel.com>
> ---
>  drivers/infiniband/hw/nes/nes_verbs.c | 34 ++++++++++++++++++++++++++++++++++
>  drivers/infiniband/hw/nes/nes_verbs.h |  2 ++
>  2 files changed, 36 insertions(+)

Thanks, applied.

Patch

diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index fba69a3..170d43a 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -1315,6 +1315,8 @@  static struct ib_qp *nes_create_qp(struct ib_pd *ibpd,
 			nes_debug(NES_DBG_QP, "Invalid QP type: %d\n", init_attr->qp_type);
 			return ERR_PTR(-EINVAL);
 	}
+	init_completion(&nesqp->sq_drained);
+	init_completion(&nesqp->rq_drained);
 
 	nesqp->sig_all = (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR);
 	init_timer(&nesqp->terminate_timer);
@@ -3452,6 +3454,29 @@  out:
 	return err;
 }
 
+/**
+ * nes_drain_sq - drain sq
+ * @ibqp: pointer to ibqp
+ */
+static void nes_drain_sq(struct ib_qp *ibqp)
+{
+	struct nes_qp *nesqp = to_nesqp(ibqp);
+
+	if (nesqp->hwqp.sq_tail != nesqp->hwqp.sq_head)
+		wait_for_completion(&nesqp->sq_drained);
+}
+
+/**
+ * nes_drain_rq - drain rq
+ * @ibqp: pointer to ibqp
+ */
+static void nes_drain_rq(struct ib_qp *ibqp)
+{
+	struct nes_qp *nesqp = to_nesqp(ibqp);
+
+	if (nesqp->hwqp.rq_tail != nesqp->hwqp.rq_head)
+		wait_for_completion(&nesqp->rq_drained);
+}
 
 /**
  * nes_poll_cq
@@ -3581,6 +3606,13 @@  static int nes_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry)
 					wq_tail = nesqp->hwqp.rq_tail;
 				}
 			}
+
+			if (nesqp->iwarp_state > NES_CQP_QP_IWARP_STATE_RTS) {
+				if (nesqp->hwqp.sq_tail == nesqp->hwqp.sq_head)
+					complete(&nesqp->sq_drained);
+				if (nesqp->hwqp.rq_tail == nesqp->hwqp.rq_head)
+					complete(&nesqp->rq_drained);
+			}
 
 			entry->wr_id = wrid;
 			entry++;
@@ -3754,6 +3786,8 @@  struct nes_ib_device *nes_init_ofa_device(struct net_device *netdev)
 	nesibdev->ibdev.req_notify_cq = nes_req_notify_cq;
 	nesibdev->ibdev.post_send = nes_post_send;
 	nesibdev->ibdev.post_recv = nes_post_recv;
+	nesibdev->ibdev.drain_sq = nes_drain_sq;
+	nesibdev->ibdev.drain_rq = nes_drain_rq;
 
 	nesibdev->ibdev.iwcm = kzalloc(sizeof(*nesibdev->ibdev.iwcm), GFP_KERNEL);
 	if (nesibdev->ibdev.iwcm == NULL) {
diff --git a/drivers/infiniband/hw/nes/nes_verbs.h b/drivers/infiniband/hw/nes/nes_verbs.h
index 7029088..e02a566 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.h
+++ b/drivers/infiniband/hw/nes/nes_verbs.h
@@ -189,6 +189,8 @@  struct nes_qp {
 	u8                    pau_pending;
 	u8                    pau_state;
 	__u64                 nesuqp_addr;
+	struct completion     sq_drained;
+	struct completion     rq_drained;
 };
 
 struct ib_mr *nes_reg_phys_mr(struct ib_pd *ib_pd,