From patchwork Mon Apr 25 18:17:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiraz Saleem X-Patchwork-Id: 12826035 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6E1DBC433FE for ; Mon, 25 Apr 2022 18:17:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S244278AbiDYSUT (ORCPT ); Mon, 25 Apr 2022 14:20:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47970 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239876AbiDYSUT (ORCPT ); Mon, 25 Apr 2022 14:20:19 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 028293B28B for ; Mon, 25 Apr 2022 11:17:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1650910635; x=1682446635; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4Tbq8hEjCHTn0hqCh6+mAIGAPWI62oyksZcO3JD0MPE=; b=Im2/H7eb6vgf3XVD1/AR3PF/CJGU66IM24ckUJ4pMjnL4Qt0zsyVDxwR aUR3gSGWh+dP9TzL3Mf3S6AhOhJylUODcfS/bgdWGpdLYEj5JrU+9Cu9J 3gKhdY24Sm/UVx5VSsrFL3h6ruO/wdyvI186v/ALo3JKFnGm0+w3CAz8o XYKxIvb0ciaEZm/isqNBp9A5+JQOZ5UAgtWDkrKr6UkABFoXO5sHx1f7I nFGzu1Rz/XGa0V2WrwzVjlQiPOoURSlT010eyvyR3xej5kg200g7XR3tc SoXsgIp7PAjlhDsQ/P6n0oF1qPP0vBG8ihlcKvOQhAUN1XP3YBE3sMcDr A==; X-IronPort-AV: E=McAfee;i="6400,9594,10328"; a="252697319" X-IronPort-AV: E=Sophos;i="5.90,289,1643702400"; d="scan'208";a="252697319" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 11:17:14 -0700 X-IronPort-AV: E=Sophos;i="5.90,289,1643702400"; d="scan'208";a="537708092" Received: from ssaleem-mobl1.amr.corp.intel.com ([10.212.50.71]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 11:17:13 -0700 From: Shiraz Saleem To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, Tatyana Nikolova , Shiraz Saleem Subject: [PATCH for-rc 1/3] RDMA/irdma: Flush iWARP QP if modified to ERR from RTR state Date: Mon, 25 Apr 2022 13:17:01 -0500 Message-Id: <20220425181703.1634-2-shiraz.saleem@intel.com> X-Mailer: git-send-email 2.35.2 In-Reply-To: <20220425181703.1634-1-shiraz.saleem@intel.com> References: <20220425181703.1634-1-shiraz.saleem@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Tatyana Nikolova When connection establishment fails in iWARP mode, an app can drain the QPs and hang because flush isn't issued when the QP is modified from RTR state to error. Issue a flush in this case using function irdma_cm_disconn(). Update irdma_cm_disconn() to do flush when cm_id is NULL, which is the case when the QP is in RTR state and there is an error in the connection establishment. Fixes: b48c24c2d710 ("RDMA/irdma: Implement device supported verb APIs") Signed-off-by: Tatyana Nikolova Signed-off-by: Shiraz Saleem --- drivers/infiniband/hw/irdma/cm.c | 16 +++++----------- drivers/infiniband/hw/irdma/verbs.c | 4 ++-- 2 files changed, 7 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c index dedb3b7..d185f3a0 100644 --- a/drivers/infiniband/hw/irdma/cm.c +++ b/drivers/infiniband/hw/irdma/cm.c @@ -3467,12 +3467,6 @@ static void irdma_cm_disconn_true(struct irdma_qp *iwqp) } cm_id = iwqp->cm_id; - /* make sure we havent already closed this connection */ - if (!cm_id) { - spin_unlock_irqrestore(&iwqp->lock, flags); - return; - } - original_hw_tcp_state = iwqp->hw_tcp_state; original_ibqp_state = iwqp->ibqp_state; last_ae = iwqp->last_aeq; @@ -3494,11 +3488,11 @@ static void irdma_cm_disconn_true(struct irdma_qp *iwqp) disconn_status = -ECONNRESET; } - if ((original_hw_tcp_state == IRDMA_TCP_STATE_CLOSED || - original_hw_tcp_state == IRDMA_TCP_STATE_TIME_WAIT || - last_ae == IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE || - last_ae == IRDMA_AE_BAD_CLOSE || - last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->rf->reset)) { + if (original_hw_tcp_state == IRDMA_TCP_STATE_CLOSED || + original_hw_tcp_state == IRDMA_TCP_STATE_TIME_WAIT || + last_ae == IRDMA_AE_RDMAP_ROE_BAD_LLP_CLOSE || + last_ae == IRDMA_AE_BAD_CLOSE || + last_ae == IRDMA_AE_LLP_CONNECTION_RESET || iwdev->rf->reset || !cm_id) { issue_close = 1; iwqp->cm_id = NULL; qp->term_flags = 0; diff --git a/drivers/infiniband/hw/irdma/verbs.c b/drivers/infiniband/hw/irdma/verbs.c index 46f4753..52f3e88 100644 --- a/drivers/infiniband/hw/irdma/verbs.c +++ b/drivers/infiniband/hw/irdma/verbs.c @@ -1618,13 +1618,13 @@ int irdma_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr, int attr_mask, if (issue_modify_qp && iwqp->ibqp_state > IB_QPS_RTS) { if (dont_wait) { - if (iwqp->cm_id && iwqp->hw_tcp_state) { + if (iwqp->hw_tcp_state) { spin_lock_irqsave(&iwqp->lock, flags); iwqp->hw_tcp_state = IRDMA_TCP_STATE_CLOSED; iwqp->last_aeq = IRDMA_AE_RESET_SENT; spin_unlock_irqrestore(&iwqp->lock, flags); - irdma_cm_disconn(iwqp); } + irdma_cm_disconn(iwqp); } else { int close_timer_started; From patchwork Mon Apr 25 18:17:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiraz Saleem X-Patchwork-Id: 12826036 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 78639C433EF for ; Mon, 25 Apr 2022 18:17:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239876AbiDYSUV (ORCPT ); Mon, 25 Apr 2022 14:20:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48034 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244282AbiDYSUU (ORCPT ); Mon, 25 Apr 2022 14:20:20 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB18C3B282 for ; Mon, 25 Apr 2022 11:17:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1650910635; x=1682446635; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pWZv4l5LQ+CoAV5A3VkGyMs2OCn1dMWyctZXhNHzAzA=; b=DzIWQXY2vHnLNL0RmQw3CAbLlnjMK599oWlpogi2sujYpL5A5DOc02fF HSB/jhyhd+qm8YjdRSTR6Di8WuoZ+WER7IjwGI4L6RD7JWW/tgiy45YPG YCneX/QCfInxSoP07fXwI/evd7dNAl6VLRQRDRYFqhP5cqDw3pMrQrhO9 bvd0FczvBDjsb3cxhxx/qFnyQhTq+BGZLYZ7mSZTSnvXkOcBR4YOj/RgT RISDusrpeWWeuaP8La1a4nqh2JT7kfJYEKaZFlETA0/fTpp6rqel8MTLz IHP3OLabQPdDv5XGclX94sjfn6nIaPdEdUNGISjqaOMCBj6osu+qe5vBd Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10328"; a="252697320" X-IronPort-AV: E=Sophos;i="5.90,289,1643702400"; d="scan'208";a="252697320" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 11:17:15 -0700 X-IronPort-AV: E=Sophos;i="5.90,289,1643702400"; d="scan'208";a="537708105" Received: from ssaleem-mobl1.amr.corp.intel.com ([10.212.50.71]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 11:17:14 -0700 From: Shiraz Saleem To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, Shiraz Saleem Subject: [PATCH for-rc 2/3] RDMA/irdma: Reduce iWARP QP destroy time Date: Mon, 25 Apr 2022 13:17:02 -0500 Message-Id: <20220425181703.1634-3-shiraz.saleem@intel.com> X-Mailer: git-send-email 2.35.2 In-Reply-To: <20220425181703.1634-1-shiraz.saleem@intel.com> References: <20220425181703.1634-1-shiraz.saleem@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org QP destroy is synchronous and waits for its refcnt to be decremented in irdma_cm_node_free_cb (for iWARP) which fires after the RCU grace period elapses. Applications running a large number of connections are exposed to high wait times on destroy QP for events like SIGABORT. The long poll for this wait time is the firing of the call_rcu callback during a CM node destroy which can be slow. It holds the QP reference count and blocks the destroy QP from completing. call_rcu only needs to make sure that list walkers have a reference to the cm_node object before freeing it and thus need to wait for grace period elapse. The rest of the connection teardown in irdma_cm_node_free_cb is moved out of the grace period wait in irdma_destroy_connection. Also, replace call_rcu with a simple kfree_rcu as it just needs to do a kfree on the cm_node Fixes: 146b9756f14c ("RDMA/irdma: Add connection manager") Signed-off-by: Shiraz Saleem --- drivers/infiniband/hw/irdma/cm.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/hw/irdma/cm.c b/drivers/infiniband/hw/irdma/cm.c index d185f3a0..656baa3 100644 --- a/drivers/infiniband/hw/irdma/cm.c +++ b/drivers/infiniband/hw/irdma/cm.c @@ -2308,10 +2308,8 @@ static void irdma_cm_free_ah(struct irdma_cm_node *cm_node) return NULL; } -static void irdma_cm_node_free_cb(struct rcu_head *rcu_head) +static void irdma_destroy_connection(struct irdma_cm_node *cm_node) { - struct irdma_cm_node *cm_node = - container_of(rcu_head, struct irdma_cm_node, rcu_head); struct irdma_cm_core *cm_core = cm_node->cm_core; struct irdma_qp *iwqp; struct irdma_cm_info nfo; @@ -2359,7 +2357,6 @@ static void irdma_cm_node_free_cb(struct rcu_head *rcu_head) } cm_core->cm_free_ah(cm_node); - kfree(cm_node); } /** @@ -2387,8 +2384,9 @@ void irdma_rem_ref_cm_node(struct irdma_cm_node *cm_node) spin_unlock_irqrestore(&cm_core->ht_lock, flags); - /* wait for all list walkers to exit their grace period */ - call_rcu(&cm_node->rcu_head, irdma_cm_node_free_cb); + irdma_destroy_connection(cm_node); + + kfree_rcu(cm_node, rcu_head); } /** From patchwork Mon Apr 25 18:17:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shiraz Saleem X-Patchwork-Id: 12826037 X-Patchwork-Delegate: jgg@ziepe.ca Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C05C6C433EF for ; Mon, 25 Apr 2022 18:17:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240925AbiDYSUX (ORCPT ); Mon, 25 Apr 2022 14:20:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48176 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S244283AbiDYSUW (ORCPT ); Mon, 25 Apr 2022 14:20:22 -0400 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7E0F93B297 for ; Mon, 25 Apr 2022 11:17:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1650910637; x=1682446637; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=wsCBqUyEsgptXEVyPY0aIeCxZWrZQwX9pGgCuybLr5o=; b=LSkLaJuYi0mT6SjENgYNNjMCFyx3dNGKn0FB2MtFJ6ufDG89hTOkjtw7 XG3reBqeMlJydtBKe/AT/IV8YV0w5nHSAUo+gXZSDuCACiWUNlt+8QO1E nRhUqcFBBLZCYPYKq7cMS4VCNq7m/sVB7zEVvQgvTkGrG3kA0/t0sS2+F sco0b3VmNkCnV0kwfTfQpsDCdUFRUqg0lSFJmcxIynSkgjiDAljhI7AHh RevEdnWW3CmviK6DC+z3uPvHZOsHSNFzuQG2s5fMDdQakA+Vp8FZdo/// Utqisg1y7VCKsifIC/X0CIXJ7JQIYKozD4xj6CyWWGWDXzcQVnaCsA2kI w==; X-IronPort-AV: E=McAfee;i="6400,9594,10328"; a="252697324" X-IronPort-AV: E=Sophos;i="5.90,289,1643702400"; d="scan'208";a="252697324" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 11:17:15 -0700 X-IronPort-AV: E=Sophos;i="5.90,289,1643702400"; d="scan'208";a="537708115" Received: from ssaleem-mobl1.amr.corp.intel.com ([10.212.50.71]) by orsmga002-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Apr 2022 11:17:15 -0700 From: Shiraz Saleem To: jgg@nvidia.com, leon@kernel.org Cc: linux-rdma@vger.kernel.org, Mustafa Ismail , Shiraz Saleem Subject: [PATCH for-rc 3/3] RDMA/irdma: Fix possible crash due to NULL netdev in notifier Date: Mon, 25 Apr 2022 13:17:03 -0500 Message-Id: <20220425181703.1634-4-shiraz.saleem@intel.com> X-Mailer: git-send-email 2.35.2 In-Reply-To: <20220425181703.1634-1-shiraz.saleem@intel.com> References: <20220425181703.1634-1-shiraz.saleem@intel.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org From: Mustafa Ismail For some net events in irdma_net_event notifier, the netdev can be NULL which will cause a crash in rdma_vlan_dev_real_dev. Fix this by moving all processing to the NETEVENT_NEIGH_UPDATE case where the netdev is guaranteed to not be NULL. Fixes: 6702bc147448 ("RDMA/irdma: Fix netdev notifications for vlan's") Signed-off-by: Mustafa Ismail Signed-off-by: Shiraz Saleem --- drivers/infiniband/hw/irdma/utils.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/drivers/infiniband/hw/irdma/utils.c b/drivers/infiniband/hw/irdma/utils.c index 346c2c5..8176041 100644 --- a/drivers/infiniband/hw/irdma/utils.c +++ b/drivers/infiniband/hw/irdma/utils.c @@ -258,18 +258,16 @@ int irdma_net_event(struct notifier_block *notifier, unsigned long event, u32 local_ipaddr[4] = {}; bool ipv4 = true; - real_dev = rdma_vlan_dev_real_dev(netdev); - if (!real_dev) - real_dev = netdev; - - ibdev = ib_device_get_by_netdev(real_dev, RDMA_DRIVER_IRDMA); - if (!ibdev) - return NOTIFY_DONE; - - iwdev = to_iwdev(ibdev); - switch (event) { case NETEVENT_NEIGH_UPDATE: + real_dev = rdma_vlan_dev_real_dev(netdev); + if (!real_dev) + real_dev = netdev; + ibdev = ib_device_get_by_netdev(real_dev, RDMA_DRIVER_IRDMA); + if (!ibdev) + return NOTIFY_DONE; + + iwdev = to_iwdev(ibdev); p = (__be32 *)neigh->primary_key; if (neigh->tbl->family == AF_INET6) { ipv4 = false; @@ -290,13 +288,12 @@ int irdma_net_event(struct notifier_block *notifier, unsigned long event, irdma_manage_arp_cache(iwdev->rf, neigh->ha, local_ipaddr, ipv4, IRDMA_ARP_DELETE); + ib_device_put(ibdev); break; default: break; } - ib_device_put(ibdev); - return NOTIFY_DONE; }