From patchwork Thu Feb 16 15:35:31 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Abeni
X-Patchwork-Id: 9577463
From: Paolo Abeni
To: linux-rdma@vger.kernel.org
Cc: Doug Ledford, Sean Hefty, Hal Rosenstock
Subject: [PATCH] ipoib: clean ib tx ring periodically
Date: Thu, 16 Feb 2017 16:35:31 +0100
Message-Id: <589591340739f0ceeea9ca449b6de3df01caadc4.1487259121.git.pabeni@redhat.com>
X-Mailing-List: linux-rdma@vger.kernel.org

The skbs transmitted via ipoib_send() are freed only if there are 16
or more outstanding work requests or if the send queue is full.
If there is very little networking activity, the transmitted skbs can
be held by the device driver for an unlimited amount of time, starving
other subsystems.

E.g. assuming IPv6 is enabled, with the following sequence:

    systemctl start firewalld
    modprobe ib_ipoib
    ip addr add dev ib0 fc00::1/64
    systemctl stop firewalld

a CPU will hang: rmmod conntrack will keep a core busy spinning while
waiting for nf_conntrack_untracked to drop to 0, since some ICMP6 ND
packets are generated and transmitted when the IPv6 address is
attached to the device, and such packets get a notrack ct entry.

This change addresses the issue by introducing a periodic timer that
performs "garbage collection" on the send ring at low frequency (once
every second). The new timer runs independently of the currently used
poll_timer, so that no additional delay is introduced when cleaning
the ring after errors or ring-full events.

Reported-by: Thomas Cameron
Fixes: f56bcd801356 ("IPoIB: Use separate CQ for UD send completions")
Signed-off-by: Paolo Abeni
---
 drivers/infiniband/ulp/ipoib/ipoib.h    |  2 ++
 drivers/infiniband/ulp/ipoib/ipoib_ib.c | 35 +++++++++++++++++++++++++++------
 2 files changed, 31 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index da12717..3b5039b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -402,6 +402,8 @@ struct ipoib_dev_priv {
 	u64 hca_caps;
 	struct ipoib_ethtool_st ethtool;
 	struct timer_list poll_timer;
+	struct timer_list tx_gc_timer;
+	unsigned long drain_tx_cq_stamp;
 	unsigned max_send_sge;
 	bool sm_fullmember_sendonly_support;
 };
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 5038f9d..e565848 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -490,13 +490,20 @@ void ipoib_ib_completion(struct ib_cq *cq, void *dev_ptr)
 	napi_schedule(&priv->napi);
 }
 
+static void __drain_tx_cq(struct ipoib_dev_priv *priv)
+{
+	while (poll_tx(priv))
+		; /* nothing */
+
+	priv->drain_tx_cq_stamp = jiffies;
+}
+
 static void drain_tx_cq(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
 
 	netif_tx_lock(dev);
-	while (poll_tx(priv))
-		; /* nothing */
+	__drain_tx_cq(priv);
 
 	if (netif_queue_stopped(dev))
 		mod_timer(&priv->poll_timer, jiffies + 1);
@@ -637,8 +644,7 @@ void ipoib_send(struct net_device *dev, struct sk_buff *skb,
 	}
 
 	if (unlikely(priv->tx_outstanding > MAX_SEND_CQE))
-		while (poll_tx(priv))
-			; /* nothing */
+		__drain_tx_cq(priv);
 }
 
 static void __ipoib_reap_ah(struct net_device *dev)
@@ -697,6 +703,19 @@ static void ipoib_ib_tx_timer_func(unsigned long ctx)
 	drain_tx_cq((struct net_device *)ctx);
 }
 
+static void ipoib_tx_gc_timer_func(unsigned long ctx)
+{
+	struct net_device *dev = (struct net_device *)ctx;
+	struct ipoib_dev_priv *priv = netdev_priv(dev);
+
+	netif_tx_lock(dev);
+	if (time_after(jiffies, priv->drain_tx_cq_stamp + HZ))
+		__drain_tx_cq(priv);
+	netif_tx_unlock(dev);
+
+	mod_timer(&priv->tx_gc_timer, jiffies + HZ);
+}
+
 int ipoib_ib_dev_open(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = netdev_priv(dev);
@@ -834,8 +853,7 @@ void ipoib_drain_cq(struct net_device *dev)
 		}
 	} while (n == IPOIB_NUM_WC);
 
-	while (poll_tx(priv))
-		; /* nothing */
+	__drain_tx_cq(priv);
 
 	local_bh_enable();
 }
@@ -906,6 +924,7 @@ int ipoib_ib_dev_stop(struct net_device *dev)
 
 timeout:
 	del_timer_sync(&priv->poll_timer);
+	del_timer_sync(&priv->tx_gc_timer);
 	qp_attr.qp_state = IB_QPS_RESET;
 	if (ib_modify_qp(priv->qp, &qp_attr, IB_QP_STATE))
 		ipoib_warn(priv, "Failed to modify QP to RESET state\n");
@@ -932,6 +951,10 @@ int ipoib_ib_dev_init(struct net_device *dev, struct ib_device *ca, int port)
 
 	setup_timer(&priv->poll_timer, ipoib_ib_tx_timer_func,
 		    (unsigned long) dev);
+	setup_timer(&priv->tx_gc_timer, ipoib_tx_gc_timer_func,
+		    (unsigned long) dev);
+	mod_timer(&priv->tx_gc_timer, jiffies + HZ);
+	priv->drain_tx_cq_stamp = jiffies;
 
 	if (dev->flags & IFF_UP) {
 		if (ipoib_ib_dev_open(dev)) {
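
For readers who want to follow the stamp check in ipoib_tx_gc_timer_func() outside the kernel, the logic can be sketched in userspace C. This is only an illustration under stated assumptions, not driver code: HZ_TICKS, struct tx_ring, drain_ring() and gc_tick() are hypothetical stand-ins, and tick_after() mimics the wrap-safe comparison performed by the kernel's time_after() macro.

```c
#include <assert.h>
#include <stdbool.h>

#define HZ_TICKS 1000UL /* hypothetical: number of ticks in one second */

/* Hypothetical stand-in for the driver's send ring state. */
struct tx_ring {
	unsigned int outstanding;  /* sends whose skbs were not reaped yet */
	unsigned long drain_stamp; /* tick at which the ring was last drained */
};

/* Wrap-safe "a is later than b", in the spirit of time_after(a, b). */
static bool tick_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Reap every pending completion and record when that happened. */
static void drain_ring(struct tx_ring *ring, unsigned long now)
{
	ring->outstanding = 0;
	ring->drain_stamp = now;
}

/*
 * Periodic GC step: drain only if more than one second has passed
 * since the last drain, whoever performed it. Returns true when a
 * drain actually ran.
 */
static bool gc_tick(struct tx_ring *ring, unsigned long now)
{
	assert(ring);

	if (!tick_after(now, ring->drain_stamp + HZ_TICKS))
		return false;

	drain_ring(ring, now);
	return true;
}
```

Calling gc_tick() exactly HZ_TICKS ticks after the last drain leaves the ring alone, while one tick later it drains the ring and refreshes the stamp. Because every drain path updates the stamp, a ring that is already being cleaned by the send path is not drained a second time by the periodic step.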