From patchwork Thu Feb 12 01:43:25 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Doug Ledford X-Patchwork-Id: 5815021 Return-Path: X-Original-To: patchwork-linux-rdma@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id 0BD07BF440 for ; Thu, 12 Feb 2015 01:44:17 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 3D5A1201F4 for ; Thu, 12 Feb 2015 01:44:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 45D28201FA for ; Thu, 12 Feb 2015 01:44:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753289AbbBLBoK (ORCPT ); Wed, 11 Feb 2015 20:44:10 -0500 Received: from mx1.redhat.com ([209.132.183.28]:44252 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753484AbbBLBoJ (ORCPT ); Wed, 11 Feb 2015 20:44:09 -0500 Received: from int-mx14.intmail.prod.int.phx2.redhat.com (int-mx14.intmail.prod.int.phx2.redhat.com [10.5.11.27]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t1C1i3wY021717 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Wed, 11 Feb 2015 20:44:03 -0500 Received: from linux-ws.xsintricity.com ([10.10.116.22]) by int-mx14.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t1C1i1Ed029536; Wed, 11 Feb 2015 20:44:03 -0500 From: Doug Ledford To: linux-rdma@vger.kernel.org, roland@kernel.org Cc: Or Gerlitz , Erez Shitrit , Doug Ledford Subject: [PATCH 02/22] IB/ipoib: Make the carrier_on_task race aware Date: Wed, 11 Feb 2015 20:43:25 -0500 Message-Id: <2306bc8da2aed5ea1645445920e84afc46496de5.1423703861.git.dledford@redhat.com> In-Reply-To: References: In-Reply-To: References: X-Scanned-By: MIMEDefang 2.68 on 10.5.11.27 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP We blindly assume that we can just take the rtnl lock and that will prevent races with downing this interface. Unfortunately, that's not the case. In ipoib_mcast_stop_thread() we will call flush_workqueue() in an attempt to clear out all remaining instances of ipoib_join_task. But, since this task is put on the same workqueue as the join task, the flush_workqueue waits on this thread too. But this thread is deadlocked on the rtnl lock. The better thing here is to use trylock and loop on that until we either get the lock or we see that FLAG_ADMIN_UP has been cleared, in which case we don't need to do anything anyway and we just return. Signed-off-by: Doug Ledford --- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 21 +++++++++++++++------ 1 file changed, 15 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index eee66d13e5b..9862c76a83f 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -353,18 +353,27 @@ void ipoib_mcast_carrier_on_task(struct work_struct *work) carrier_on_task); struct ib_port_attr attr; - /* - * Take rtnl_lock to avoid racing with ipoib_stop() and - * turning the carrier back on while a device is being - * removed. - */ if (ib_query_port(priv->ca, priv->port, &attr) || attr.state != IB_PORT_ACTIVE) { ipoib_dbg(priv, "Keeping carrier off until IB port is active\n"); return; } - rtnl_lock(); + /* + * Take rtnl_lock to avoid racing with ipoib_stop() and + * turning the carrier back on while a device is being + * removed. However, ipoib_stop() will attempt to flush + * the workqueue while holding the rtnl lock, so loop + * on trylock until either we get the lock or we see + * FLAG_ADMIN_UP go away as that signals that we are bailing + * and can safely ignore the carrier on work + */ + while (!rtnl_trylock()) { + if (!test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)) + return; + else + msleep(20); + } if (!ipoib_cm_admin_enabled(priv->dev)) dev_set_mtu(priv->dev, min(priv->mcast_mtu, priv->admin_mtu)); netif_carrier_on(priv->dev);