From patchwork Thu Jan 24 03:31:46 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Dalessandro X-Patchwork-Id: 10778281 X-Patchwork-Delegate: dledford@redhat.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 2B62813BF for ; Thu, 24 Jan 2019 03:31:50 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 192B12DEA1 for ; Thu, 24 Jan 2019 03:31:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 110AE2DEBB; Thu, 24 Jan 2019 03:31:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3789E2DEBB for ; Thu, 24 Jan 2019 03:31:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727126AbfAXDbt (ORCPT ); Wed, 23 Jan 2019 22:31:49 -0500 Received: from mga12.intel.com ([192.55.52.136]:50893 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726881AbfAXDbs (ORCPT ); Wed, 23 Jan 2019 22:31:48 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 23 Jan 2019 19:31:47 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,514,1539673200"; d="scan'208";a="128424305" Received: from scymds02.sc.intel.com ([10.82.195.37]) by orsmga002.jf.intel.com with ESMTP; 23 Jan 2019 19:31:47 -0800 Received: from scvm10.sc.intel.com (scvm10.sc.intel.com [10.82.195.27]) by scymds02.sc.intel.com with ESMTP id x0O3VlPQ018747; Wed, 23 Jan 2019 19:31:47 -0800 Received: from scvm10.sc.intel.com (localhost [127.0.0.1]) by scvm10.sc.intel.com with ESMTP id x0O3VkcT007936; Wed, 23 Jan 2019 19:31:46 -0800 Subject: [PATCH for-next 12/17] IB/hfi1: Add functions for restarting TID RDMA READ request From: Dennis Dalessandro To: jgg@ziepe.ca, dledford@redhat.com Cc: linux-rdma@vger.kernel.org, Mike Marciniszyn , Kaike Wan Date: Wed, 23 Jan 2019 19:31:46 -0800 Message-ID: <20190124033138.3371.78175.stgit@scvm10.sc.intel.com> In-Reply-To: <20190124032400.3371.74628.stgit@scvm10.sc.intel.com> References: <20190124032400.3371.74628.stgit@scvm10.sc.intel.com> User-Agent: StGit/0.17.1-18-g2e886-dirty MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Kaike Wan This patch adds functions to retry TID RDMA READ request. Since TID RDMA READ request could be retried from any segment boundary, it requires a number of tracking fields in various structures and those fields should be reset properly. The qp->s_num_rd_atomic field is reset before retry and therefore should be incremented for each new or retried RDMA READ or atomic request. Reviewed-by: Mike Marciniszyn Signed-off-by: Kaike Wan Signed-off-by: Dennis Dalessandro --- drivers/infiniband/hw/hfi1/rc.c | 94 ++++++++++++++++++++++++++------- drivers/infiniband/hw/hfi1/tid_rdma.c | 82 +++++++++++++++++++++++++++++ drivers/infiniband/hw/hfi1/tid_rdma.h | 2 + 3 files changed, 158 insertions(+), 20 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c index 6e74cd3..e478a0b 100644 --- a/drivers/infiniband/hw/hfi1/rc.c +++ b/drivers/infiniband/hw/hfi1/rc.c @@ -503,16 +503,14 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps) * Don't allow more operations to be started * than the QP limits allow. */ - if (newreq) { - if (qp->s_num_rd_atomic >= - qp->s_max_rd_atomic) { - qp->s_flags |= RVT_S_WAIT_RDMAR; - goto bail; - } - qp->s_num_rd_atomic++; - if (!(qp->s_flags & RVT_S_UNLIMITED_CREDIT)) - qp->s_lsn++; + if (qp->s_num_rd_atomic >= + qp->s_max_rd_atomic) { + qp->s_flags |= RVT_S_WAIT_RDMAR; + goto bail; } + qp->s_num_rd_atomic++; + if (newreq && !(qp->s_flags & RVT_S_UNLIMITED_CREDIT)) + qp->s_lsn++; put_ib_reth_vaddr( wqe->rdma_wr.remote_addr, &ohdr->u.rc.reth); @@ -534,14 +532,12 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps) * Don't allow more operations to be started * than the QP limits allow. */ - if (newreq) { - if (qp->s_num_rd_atomic >= - qp->s_max_rd_atomic) { - qp->s_flags |= RVT_S_WAIT_RDMAR; - goto bail; - } - qp->s_num_rd_atomic++; + if (qp->s_num_rd_atomic >= + qp->s_max_rd_atomic) { + qp->s_flags |= RVT_S_WAIT_RDMAR; + goto bail; } + qp->s_num_rd_atomic++; /* FALLTHROUGH */ case IB_WR_OPFN: @@ -971,6 +967,43 @@ void hfi1_send_rc_ack(struct hfi1_packet *packet, bool is_fecn) } /** + * update_num_rd_atomic - update the qp->s_num_rd_atomic + * @qp: the QP + * @psn: the packet sequence number to restart at + * @wqe: the wqe + * + * This is called from reset_psn() to update qp->s_num_rd_atomic + * for the current wqe. + * Called at interrupt level with the QP s_lock held. + */ +static void update_num_rd_atomic(struct rvt_qp *qp, u32 psn, + struct rvt_swqe *wqe) +{ + u32 opcode = wqe->wr.opcode; + + if (opcode == IB_WR_RDMA_READ || + opcode == IB_WR_ATOMIC_CMP_AND_SWP || + opcode == IB_WR_ATOMIC_FETCH_AND_ADD) { + qp->s_num_rd_atomic++; + } else if (opcode == IB_WR_TID_RDMA_READ) { + struct tid_rdma_request *req = wqe_to_tid_req(wqe); + struct hfi1_qp_priv *priv = qp->priv; + + if (cmp_psn(psn, wqe->lpsn) <= 0) { + u32 cur_seg; + + cur_seg = (psn - wqe->psn) / priv->pkts_ps; + req->ack_pending = cur_seg - req->comp_seg; + priv->pending_tid_r_segs += req->ack_pending; + qp->s_num_rd_atomic += req->ack_pending; + } else { + priv->pending_tid_r_segs += req->total_segs; + qp->s_num_rd_atomic += req->total_segs; + } + } +} + +/** * reset_psn - reset the QP state to send starting from PSN * @qp: the QP * @psn: the packet sequence number to restart at @@ -984,9 +1017,12 @@ static void reset_psn(struct rvt_qp *qp, u32 psn) u32 n = qp->s_acked; struct rvt_swqe *wqe = rvt_get_swqe_ptr(qp, n); u32 opcode; + struct hfi1_qp_priv *priv = qp->priv; lockdep_assert_held(&qp->s_lock); qp->s_cur = n; + priv->pending_tid_r_segs = 0; + qp->s_num_rd_atomic = 0; /* * If we are starting the request from the beginning, @@ -996,9 +1032,9 @@ static void reset_psn(struct rvt_qp *qp, u32 psn) qp->s_state = OP(SEND_LAST); goto done; } + update_num_rd_atomic(qp, psn, wqe); /* Find the work request opcode corresponding to the given PSN. */ - opcode = wqe->wr.opcode; for (;;) { int diff; @@ -1008,8 +1044,11 @@ static void reset_psn(struct rvt_qp *qp, u32 psn) break; wqe = rvt_get_swqe_ptr(qp, n); diff = cmp_psn(psn, wqe->psn); - if (diff < 0) + if (diff < 0) { + /* Point wqe back to the previous one*/ + wqe = rvt_get_swqe_ptr(qp, qp->s_cur); break; + } qp->s_cur = n; /* * If we are starting the request from the beginning, @@ -1019,8 +1058,10 @@ static void reset_psn(struct rvt_qp *qp, u32 psn) qp->s_state = OP(SEND_LAST); goto done; } - opcode = wqe->wr.opcode; + + update_num_rd_atomic(qp, psn, wqe); } + opcode = wqe->wr.opcode; /* * Set the state to restart in the middle of a request. @@ -1042,6 +1083,10 @@ static void reset_psn(struct rvt_qp *qp, u32 psn) qp->s_state = OP(RDMA_READ_RESPONSE_MIDDLE); break; + case IB_WR_TID_RDMA_READ: + qp->s_state = TID_OP(READ_RESP); + break; + default: /* * This case shouldn't happen since its only @@ -1095,6 +1140,14 @@ void hfi1_restart_rc(struct rvt_qp *qp, u32 psn, int wait) wqe = do_rc_completion(qp, wqe, ibp); qp->s_flags &= ~RVT_S_WAIT_ACK; } else { + if (wqe->wr.opcode == IB_WR_TID_RDMA_READ) { + struct tid_rdma_request *req; + + req = wqe_to_tid_req(wqe); + hfi1_kern_exp_rcv_clear_all(req); + hfi1_kern_clear_hw_flow(priv->rcd, qp); + } + rvt_send_complete(qp, wqe, IB_WC_RETRY_EXC_ERR); rvt_error_qp(qp, IB_WC_WR_FLUSH_ERR); @@ -1108,7 +1161,8 @@ void hfi1_restart_rc(struct rvt_qp *qp, u32 psn, int wait) } ibp = to_iport(qp->ibqp.device, qp->port_num); - if (wqe->wr.opcode == IB_WR_RDMA_READ) + if (wqe->wr.opcode == IB_WR_RDMA_READ || + wqe->wr.opcode == IB_WR_TID_RDMA_READ) ibp->rvp.n_rc_resends++; else ibp->rvp.n_rc_resends += delta_psn(qp->s_psn, psn); diff --git a/drivers/infiniband/hw/hfi1/tid_rdma.c b/drivers/infiniband/hw/hfi1/tid_rdma.c index 4636d5c..410f293 100644 --- a/drivers/infiniband/hw/hfi1/tid_rdma.c +++ b/drivers/infiniband/hw/hfi1/tid_rdma.c @@ -1622,6 +1622,27 @@ u64 hfi1_access_sw_tid_wait(const struct cntr_entry *entry, return dd->verbs_dev.n_tidwait; } +static struct tid_rdma_flow *find_flow_ib(struct tid_rdma_request *req, + u32 psn, u16 *fidx) +{ + u16 head, tail; + struct tid_rdma_flow *flow; + + head = req->setup_head; + tail = req->clear_tail; + for ( ; CIRC_CNT(head, tail, MAX_FLOWS); + tail = CIRC_NEXT(tail, MAX_FLOWS)) { + flow = &req->flows[tail]; + if (cmp_psn(psn, flow->flow_state.ib_spsn) >= 0 && + cmp_psn(psn, flow->flow_state.ib_lpsn) <= 0) { + if (fidx) + *fidx = tail; + return flow; + } + } + return NULL; +} + static struct tid_rdma_flow * __find_flow_ranged(struct tid_rdma_request *req, u16 head, u16 tail, u32 psn, u16 *fidx) @@ -2710,3 +2731,64 @@ bool hfi1_handle_kdeth_eflags(struct hfi1_ctxtdata *rcd, drop: return ret; } + +/* + * "Rewind" the TID request information. + * This means that we reset the state back to ACTIVE, + * find the proper flow, set the flow index to that flow, + * and reset the flow information. + */ +void hfi1_tid_rdma_restart_req(struct rvt_qp *qp, struct rvt_swqe *wqe, + u32 *bth2) +{ + struct tid_rdma_request *req = wqe_to_tid_req(wqe); + struct tid_rdma_flow *flow; + int diff; + u32 tididx = 0; + u16 fidx; + + if (wqe->wr.opcode == IB_WR_TID_RDMA_READ) { + *bth2 = mask_psn(qp->s_psn); + flow = find_flow_ib(req, *bth2, &fidx); + if (!flow) + return; + } else { + return; + } + + diff = delta_psn(*bth2, flow->flow_state.ib_spsn); + + flow->sent = 0; + flow->pkt = 0; + flow->tid_idx = 0; + flow->tid_offset = 0; + if (diff) { + for (tididx = 0; tididx < flow->tidcnt; tididx++) { + u32 tidentry = flow->tid_entry[tididx], tidlen, + tidnpkts, npkts; + + flow->tid_offset = 0; + tidlen = EXP_TID_GET(tidentry, LEN) * PAGE_SIZE; + tidnpkts = rvt_div_round_up_mtu(qp, tidlen); + npkts = min_t(u32, diff, tidnpkts); + flow->pkt += npkts; + flow->sent += (npkts == tidnpkts ? tidlen : + npkts * qp->pmtu); + flow->tid_offset += npkts * qp->pmtu; + diff -= npkts; + if (!diff) + break; + } + } + + if (flow->tid_offset == + EXP_TID_GET(flow->tid_entry[tididx], LEN) * PAGE_SIZE) { + tididx++; + flow->tid_offset = 0; + } + flow->tid_idx = tididx; + /* Move flow_idx to correct index */ + req->flow_idx = fidx; + + req->state = TID_REQUEST_ACTIVE; +} diff --git a/drivers/infiniband/hw/hfi1/tid_rdma.h b/drivers/infiniband/hw/hfi1/tid_rdma.h index d428236..beb5982 100644 --- a/drivers/infiniband/hw/hfi1/tid_rdma.h +++ b/drivers/infiniband/hw/hfi1/tid_rdma.h @@ -207,5 +207,7 @@ u32 hfi1_build_tid_rdma_read_resp(struct rvt_qp *qp, struct rvt_ack_entry *e, bool hfi1_handle_kdeth_eflags(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd, struct hfi1_packet *packet); +void hfi1_tid_rdma_restart_req(struct rvt_qp *qp, struct rvt_swqe *wqe, + u32 *bth2); #endif /* HFI1_TID_RDMA_H */