From patchwork Thu Jan 22 14:31:26 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Doug Ledford X-Patchwork-Id: 5685171 Return-Path: X-Original-To: patchwork-linux-rdma@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 280519F358 for ; Thu, 22 Jan 2015 14:32:07 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 4E1D42034B for ; Thu, 22 Jan 2015 14:32:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5D83F20304 for ; Thu, 22 Jan 2015 14:32:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752783AbbAVOcC (ORCPT ); Thu, 22 Jan 2015 09:32:02 -0500 Received: from mx1.redhat.com ([209.132.183.28]:46106 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751584AbbAVOb7 (ORCPT ); Thu, 22 Jan 2015 09:31:59 -0500 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id t0MEVnNT006892 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 22 Jan 2015 09:31:49 -0500 Received: from linux-ws.xsintricity.com ([10.10.116.28]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t0MEVlBu011531; Thu, 22 Jan 2015 09:31:48 -0500 From: Doug Ledford To: linux-rdma@vger.kernel.org, roland@kernel.org Cc: Amir Vadai , Eyal Perry , Or Gerlitz , Erez Shitrit , Doug Ledford Subject: [PATCH FIX For-3.19 v5 01/10] IB/ipoib: fix IPOIB_MCAST_RUN flag usage Date: Thu, 22 Jan 2015 09:31:26 -0500 Message-Id: <68022bd26f94145c5d9371fee18e1f2dd9807545.1421936879.git.dledford@redhat.com> In-Reply-To: References: In-Reply-To: References: X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22 Sender: linux-rdma-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-rdma@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The usage of IPOIB_MCAST_RUN as a flag is inconsistent. In some places it is used to mean "our device is administratively allowed to send multicast joins/leaves/packets" and in other places it means "our multicast join task thread is currently running and will process your request if you put it on the queue". However, this latter meaning is in fact flawed as there is a race condition between the join task testing the mcast list and finding it empty of remaining work, dropping the mcast mutex and also the priv->lock spinlock, and clearing the IPOIB_MCAST_RUN flag. Further, there are numerous locations that use the flag in the former fashion, and when all tasks complete and the task thread clears the RUN flag, all of those other locations will fail to ever again queue any work. This results in the interface coming up fine initially, but having problems adding new multicast groups after the first round of groups have all been added and the RUN flag is cleared by the join task thread when it thinks it is done. To resolve this issue, convert all locations in the code to treat the RUN flag as an indicator that the multicast portion of this interface is in fact administratively up and joins/leaves/sends can be performed. There is no harm (other than a slight performance penalty) to never clearing this flag and using it in this fashion as it simply means that a few places that used to micro-optimize how often this task was queued on a work queue will now queue the task a few extra times. We can address that suboptimal behavior in future patches. Signed-off-by: Doug Ledford --- drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c index bc50dd0d0e4..91b8fe118ec 100644 --- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c +++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c @@ -630,8 +630,6 @@ void ipoib_mcast_join_task(struct work_struct *work) } ipoib_dbg_mcast(priv, "successfully joined all multicast groups\n"); - - clear_bit(IPOIB_MCAST_RUN, &priv->flags); } int ipoib_mcast_start_thread(struct net_device *dev) @@ -641,8 +639,8 @@ int ipoib_mcast_start_thread(struct net_device *dev) ipoib_dbg_mcast(priv, "starting multicast thread\n"); mutex_lock(&mcast_mutex); - if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags)) - queue_delayed_work(priv->wq, &priv->mcast_task, 0); + set_bit(IPOIB_MCAST_RUN, &priv->flags); + queue_delayed_work(priv->wq, &priv->mcast_task, 0); mutex_unlock(&mcast_mutex); return 0; @@ -725,7 +723,7 @@ void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb) memcpy(mcast->mcmember.mgid.raw, mgid, sizeof (union ib_gid)); __ipoib_mcast_add(dev, mcast); list_add_tail(&mcast->list, &priv->multicast_list); - if (!test_and_set_bit(IPOIB_MCAST_RUN, &priv->flags)) + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) queue_delayed_work(priv->wq, &priv->mcast_task, 0); } @@ -951,7 +949,8 @@ void ipoib_mcast_restart_task(struct work_struct *work) /* * Restart our join task if needed */ - ipoib_mcast_start_thread(dev); + if (test_bit(IPOIB_MCAST_RUN, &priv->flags)) + queue_delayed_work(priv->wq, &priv->mcast_task, 0); rtnl_unlock(); }