From patchwork Fri Jun 7 12:25:25 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Dalessandro X-Patchwork-Id: 10981609 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 38E4815E6 for ; Fri, 7 Jun 2019 12:25:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2855628B72 for ; Fri, 7 Jun 2019 12:25:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1CE6728BCD; Fri, 7 Jun 2019 12:25:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9786528B73 for ; Fri, 7 Jun 2019 12:25:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727693AbfFGMZe (ORCPT ); Fri, 7 Jun 2019 08:25:34 -0400 Received: from mga03.intel.com ([134.134.136.65]:8771 "EHLO mga03.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728526AbfFGMZe (ORCPT ); Fri, 7 Jun 2019 08:25:34 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Jun 2019 05:25:33 -0700 X-ExtLoop1: 1 Received: from sedona.ch.intel.com ([10.2.136.157]) by orsmga006.jf.intel.com with ESMTP; 07 Jun 2019 05:25:33 -0700 Received: from awfm-01.aw.intel.com (awfm-01.aw.intel.com [10.228.212.213]) by sedona.ch.intel.com (8.14.3/8.14.3/Standard MailSET/Hub) with ESMTP id x57CPRgH062796; Fri, 7 Jun 2019 05:25:27 -0700 Received: from awfm-01.aw.intel.com (localhost [127.0.0.1]) by awfm-01.aw.intel.com (8.14.7/8.14.7) with ESMTP id x57CPPbd158510; Fri, 7 Jun 2019 08:25:25 -0400 Subject: [PATCH for-rc 1/3] IB/hfi1: Validate fault injection opcode user input From: Dennis Dalessandro To: jgg@ziepe.ca, dledford@redhat.com Cc: linux-rdma@vger.kernel.org, Mike Marciniszyn , stable@vger.kernel.org, Dan Carpenter , Kaike Wan Date: Fri, 07 Jun 2019 08:25:25 -0400 Message-ID: <20190607122525.158478.61319.stgit@awfm-01.aw.intel.com> In-Reply-To: <20190607113807.157915.48581.stgit@awfm-01.aw.intel.com> References: <20190607113807.157915.48581.stgit@awfm-01.aw.intel.com> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Kaike Wan The opcode range for fault injection from user should be validated before it is applied to the fault->opcodes[] bitmap to avoid out-of-bound error. In addition, this patch also simplifies the code by using the BIT macro. Fixes: a74d5307caba ("IB/hfi1: Rework fault injection machinery") Cc: Reported-by: Dan Carpenter Reviewed-by: Mike Marciniszyn Signed-off-by: Kaike Wan Signed-off-by: Dennis Dalessandro --- drivers/infiniband/hw/hfi1/fault.c | 5 +++++ drivers/infiniband/hw/hfi1/fault.h | 6 +++--- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/fault.c b/drivers/infiniband/hw/hfi1/fault.c index 3fd3315..13ba291 100644 --- a/drivers/infiniband/hw/hfi1/fault.c +++ b/drivers/infiniband/hw/hfi1/fault.c @@ -153,6 +153,7 @@ static ssize_t fault_opcodes_write(struct file *file, const char __user *buf, char *dash; unsigned long range_start, range_end, i; bool remove = false; + unsigned long bound = BIT(BITS_PER_BYTE); end = strchr(ptr, ','); if (end) @@ -178,6 +179,10 @@ static ssize_t fault_opcodes_write(struct file *file, const char __user *buf, BITS_PER_BYTE); break; } + /* Check the inputs */ + if (range_start >= bound || range_end >= bound) + break; + for (i = range_start; i <= range_end; i++) { if (remove) clear_bit(i, fault->opcodes); diff --git a/drivers/infiniband/hw/hfi1/fault.h b/drivers/infiniband/hw/hfi1/fault.h index a833827..c61035c 100644 --- a/drivers/infiniband/hw/hfi1/fault.h +++ b/drivers/infiniband/hw/hfi1/fault.h @@ -60,13 +60,13 @@ struct fault { struct fault_attr attr; struct dentry *dir; - u64 n_rxfaults[(1U << BITS_PER_BYTE)]; - u64 n_txfaults[(1U << BITS_PER_BYTE)]; + u64 n_rxfaults[BIT(BITS_PER_BYTE)]; + u64 n_txfaults[BIT(BITS_PER_BYTE)]; u64 fault_skip; u64 skip; u64 fault_skip_usec; unsigned long skip_usec; - unsigned long opcodes[(1U << BITS_PER_BYTE) / BITS_PER_LONG]; + unsigned long opcodes[BIT(BITS_PER_BYTE) / BITS_PER_LONG]; bool enable; bool suppress_err; bool opcode; From patchwork Fri Jun 7 12:25:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Dalessandro X-Patchwork-Id: 10981611 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3D75C15E6 for ; Fri, 7 Jun 2019 12:25:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 2887A28BBD for ; Fri, 7 Jun 2019 12:25:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1974628BCD; Fri, 7 Jun 2019 12:25:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A2BA228BBD for ; Fri, 7 Jun 2019 12:25:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728826AbfFGMZh (ORCPT ); Fri, 7 Jun 2019 08:25:37 -0400 Received: from mga07.intel.com ([134.134.136.100]:39615 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728526AbfFGMZg (ORCPT ); Fri, 7 Jun 2019 08:25:36 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Jun 2019 05:25:36 -0700 X-ExtLoop1: 1 Received: from sedona.ch.intel.com ([10.2.136.157]) by fmsmga008.fm.intel.com with ESMTP; 07 Jun 2019 05:25:35 -0700 Received: from awfm-01.aw.intel.com (awfm-01.aw.intel.com [10.228.212.213]) by sedona.ch.intel.com (8.14.3/8.14.3/Standard MailSET/Hub) with ESMTP id x57CPXnv062822; Fri, 7 Jun 2019 05:25:34 -0700 Received: from awfm-01.aw.intel.com (localhost [127.0.0.1]) by awfm-01.aw.intel.com (8.14.7/8.14.7) with ESMTP id x57CPVAv158521; Fri, 7 Jun 2019 08:25:32 -0400 Subject: [PATCH for-rc 2/3] IB/hfi1: Close PSM sdma_progress sleep window From: Dennis Dalessandro To: jgg@ziepe.ca, dledford@redhat.com Cc: linux-rdma@vger.kernel.org, Mike Marciniszyn , stable@vger.kernel.org, Gary Leshner Date: Fri, 07 Jun 2019 08:25:31 -0400 Message-ID: <20190607122531.158478.35392.stgit@awfm-01.aw.intel.com> In-Reply-To: <20190607113807.157915.48581.stgit@awfm-01.aw.intel.com> References: <20190607113807.157915.48581.stgit@awfm-01.aw.intel.com> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Mike Marciniszyn The call to sdma_progress() is called outside the wait lock. In this case, there is a race condition where sdma_progress() can return false and the sdma_engine can idle. If that happens, there will be no more sdma interrupts to cause the wakeup and the user_sdma xmit will hang. Fix by moving the lock to enclose the sdma_progress() call. Also, delete busycount. The need for this was removed by: commit bcad29137a97 ("IB/hfi1: Serve the most starved iowait entry first") Cc: Fixes: 7724105686e7 ("IB/hfi1: add driver files") Reviewed-by: Gary Leshner Signed-off-by: Mike Marciniszyn Signed-off-by: Dennis Dalessandro --- drivers/infiniband/hw/hfi1/user_sdma.c | 12 ++++-------- drivers/infiniband/hw/hfi1/user_sdma.h | 1 - 2 files changed, 4 insertions(+), 9 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c index 8bfbc6d..fd754a1 100644 --- a/drivers/infiniband/hw/hfi1/user_sdma.c +++ b/drivers/infiniband/hw/hfi1/user_sdma.c @@ -130,20 +130,16 @@ static int defer_packet_queue( { struct hfi1_user_sdma_pkt_q *pq = container_of(wait->iow, struct hfi1_user_sdma_pkt_q, busy); - struct user_sdma_txreq *tx = - container_of(txreq, struct user_sdma_txreq, txreq); - if (sdma_progress(sde, seq, txreq)) { - if (tx->busycount++ < MAX_DEFER_RETRY_COUNT) - goto eagain; - } + write_seqlock(&sde->waitlock); + if (sdma_progress(sde, seq, txreq)) + goto eagain; /* * We are assuming that if the list is enqueued somewhere, it * is to the dmawait list since that is the only place where * it is supposed to be enqueued. */ xchg(&pq->state, SDMA_PKT_Q_DEFERRED); - write_seqlock(&sde->waitlock); if (list_empty(&pq->busy.list)) { iowait_get_priority(&pq->busy); iowait_queue(pkts_sent, &pq->busy, &sde->dmawait); @@ -151,6 +147,7 @@ static int defer_packet_queue( write_sequnlock(&sde->waitlock); return -EBUSY; eagain: + write_sequnlock(&sde->waitlock); return -EAGAIN; } @@ -804,7 +801,6 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, u16 maxpkts) tx->flags = 0; tx->req = req; - tx->busycount = 0; INIT_LIST_HEAD(&tx->list); /* diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h index 14dfd75..4d8510b0 100644 --- a/drivers/infiniband/hw/hfi1/user_sdma.h +++ b/drivers/infiniband/hw/hfi1/user_sdma.h @@ -245,7 +245,6 @@ struct user_sdma_txreq { struct list_head list; struct user_sdma_request *req; u16 flags; - unsigned int busycount; u16 seqnum; }; From patchwork Fri Jun 7 12:25:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dennis Dalessandro X-Patchwork-Id: 10981613 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0F87514E5 for ; Fri, 7 Jun 2019 12:25:48 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F1E3A28B73 for ; Fri, 7 Jun 2019 12:25:47 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E64D528BCC; Fri, 7 Jun 2019 12:25:47 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5BF0228BCD for ; Fri, 7 Jun 2019 12:25:47 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728906AbfFGMZq (ORCPT ); Fri, 7 Jun 2019 08:25:46 -0400 Received: from mga07.intel.com ([134.134.136.100]:39619 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728597AbfFGMZp (ORCPT ); Fri, 7 Jun 2019 08:25:45 -0400 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Jun 2019 05:25:44 -0700 X-ExtLoop1: 1 Received: from sedona.ch.intel.com ([10.2.136.157]) by orsmga004.jf.intel.com with ESMTP; 07 Jun 2019 05:25:42 -0700 Received: from awfm-01.aw.intel.com (awfm-01.aw.intel.com [10.228.212.213]) by sedona.ch.intel.com (8.14.3/8.14.3/Standard MailSET/Hub) with ESMTP id x57CPdoO062826; Fri, 7 Jun 2019 05:25:40 -0700 Received: from awfm-01.aw.intel.com (localhost [127.0.0.1]) by awfm-01.aw.intel.com (8.14.7/8.14.7) with ESMTP id x57CPcUF158531; Fri, 7 Jun 2019 08:25:38 -0400 Subject: [PATCH for-rc 3/3] IB/hfi1: Correct tid qp rcd to match verbs context From: Dennis Dalessandro To: jgg@ziepe.ca, dledford@redhat.com Cc: linux-rdma@vger.kernel.org, Mike Marciniszyn , stable@vger.kernel.org, Kaike Wan Date: Fri, 07 Jun 2019 08:25:38 -0400 Message-ID: <20190607122538.158478.62945.stgit@awfm-01.aw.intel.com> In-Reply-To: <20190607113807.157915.48581.stgit@awfm-01.aw.intel.com> References: <20190607113807.157915.48581.stgit@awfm-01.aw.intel.com> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Mike Marciniszyn The qp priv rcd pointer doesn't match the context being used for verbs causing issues when 9B and kdeth packets are processed by different receive contexts and hence different CPUs. When running on different CPUs the following panic can occur: [476262.398106] WARNING: CPU: 3 PID: 2584 at lib/list_debug.c:59 __list_del_entry+0xa1/0xd0 [476262.398109] list_del corruption. prev->next should be ffff9a7ac31f7a30, but was ffff9a7c3bc89230 [476262.398266] CPU: 3 PID: 2584 Comm: z_wr_iss Kdump: loaded Tainted: P OE ------------ 3.10.0-862.2.3.el7_lustre.x86_64 #1 [476262.398272] Call Trace: [476262.398277] [] dump_stack+0x19/0x1b [476262.398314] [] __warn+0xd8/0x100 [476262.398317] [] warn_slowpath_fmt+0x5f/0x80 [476262.398320] [] __list_del_entry+0xa1/0xd0 [476262.398402] [] process_rcv_qp_work+0xb5/0x160 [hfi1] [476262.398424] [] handle_receive_interrupt_nodma_rtail+0x20b/0x2b0 [hfi1] [476262.398438] [] receive_context_interrupt+0x23/0x40 [hfi1] [476262.398447] [] __handle_irq_event_percpu+0x44/0x1c0 [476262.398450] [] handle_irq_event_percpu+0x32/0x80 [476262.398454] [] handle_irq_event+0x3c/0x60 [476262.398460] [] handle_edge_irq+0x7f/0x150 [476262.398469] [] handle_irq+0xe4/0x1a0 [476262.398475] [] do_IRQ+0x4d/0xf0 [476262.398481] [] common_interrupt+0x162/0x162 [476262.398482] [] ? memcpy+0x6/0x110 [476262.398645] [] ? abd_copy_from_buf_off_cb+0x1d/0x30 [zfs] [476262.398678] [] ? abd_copy_to_buf_off_cb+0x30/0x30 [zfs] [476262.398696] [] abd_iterate_func+0x97/0x120 [zfs] [476262.398710] [] abd_copy_from_buf_off+0x39/0x60 [zfs] [476262.398726] [] arc_write_ready+0x178/0x300 [zfs] [476262.398732] [] ? mutex_lock+0x12/0x2f [476262.398734] [] ? mutex_lock+0x12/0x2f [476262.398837] [] zio_ready+0x65/0x3d0 [zfs] [476262.398884] [] ? tsd_get_by_thread+0x2e/0x50 [spl] [476262.398893] [] ? taskq_member+0x18/0x30 [spl] [476262.398968] [] zio_execute+0xa2/0x100 [zfs] [476262.398982] [] taskq_thread+0x2ac/0x4f0 [spl] [476262.399001] [] ? wake_up_state+0x20/0x20 [476262.399043] [] ? zio_taskq_member.isra.7.constprop.10+0x80/0x80 [zfs] [476262.399055] [] ? taskq_thread_spawn+0x60/0x60 [spl] [476262.399067] [] kthread+0xd1/0xe0 [476262.399072] [] ? insert_kthread_work+0x40/0x40 [476262.399082] [] ret_from_fork_nospec_begin+0x21/0x21 [476262.399087] [] ? insert_kthread_work+0x40/0x40 Fix by reading the map entry in the same manner as the hardware so that the kdeth and verbs contexts match. Fixes: 5190f052a365 ("IB/hfi1: Allow the driver to initialize QP priv struct") Cc: Reviewed-by: Kaike Wan Signed-off-by: Mike Marciniszyn Signed-off-by: Dennis Dalessandro --- drivers/infiniband/hw/hfi1/chip.c | 13 +++++++++++++ drivers/infiniband/hw/hfi1/chip.h | 1 + drivers/infiniband/hw/hfi1/tid_rdma.c | 5 ++--- 3 files changed, 16 insertions(+), 3 deletions(-) diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c index 4221a99e..674f62a 100644 --- a/drivers/infiniband/hw/hfi1/chip.c +++ b/drivers/infiniband/hw/hfi1/chip.c @@ -14032,6 +14032,19 @@ static void init_kdeth_qp(struct hfi1_devdata *dd) } /** + * hfi1_get_qp_map + * @dd: device data + * @idx: index to read + */ +u8 hfi1_get_qp_map(struct hfi1_devdata *dd, u8 idx) +{ + u64 reg = read_csr(dd, RCV_QP_MAP_TABLE + (idx / 8) * 8); + + reg >>= (idx % 8) * 8; + return (u8)reg; +} + +/** * init_qpmap_table * @dd - device data * @first_ctxt - first context diff --git a/drivers/infiniband/hw/hfi1/chip.h b/drivers/infiniband/hw/hfi1/chip.h index 4e6c355..b76cf81 100644 --- a/drivers/infiniband/hw/hfi1/chip.h +++ b/drivers/infiniband/hw/hfi1/chip.h @@ -1445,6 +1445,7 @@ int hfi1_set_ctxt_pkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *ctxt, void remap_intr(struct hfi1_devdata *dd, int isrc, int msix_intr); void remap_sdma_interrupts(struct hfi1_devdata *dd, int engine, int msix_intr); void reset_interrupts(struct hfi1_devdata *dd); +u8 hfi1_get_qp_map(struct hfi1_devdata *dd, u8 idx); /* * Interrupt source table. diff --git a/drivers/infiniband/hw/hfi1/tid_rdma.c b/drivers/infiniband/hw/hfi1/tid_rdma.c index 6fb9303..d77276d 100644 --- a/drivers/infiniband/hw/hfi1/tid_rdma.c +++ b/drivers/infiniband/hw/hfi1/tid_rdma.c @@ -312,9 +312,8 @@ static struct hfi1_ctxtdata *qp_to_rcd(struct rvt_dev_info *rdi, if (qp->ibqp.qp_num == 0) ctxt = 0; else - ctxt = ((qp->ibqp.qp_num >> dd->qos_shift) % - (dd->n_krcv_queues - 1)) + 1; - + ctxt = hfi1_get_qp_map(dd, + (u8)(qp->ibqp.qp_num >> dd->qos_shift)); return dd->rcd[ctxt]; }