From patchwork Thu Sep 17 10:16:12 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Moni Shoua
X-Patchwork-Id: 48265
Received: from vger.kernel.org (vger.kernel.org [209.132.176.167])
	by demeter.kernel.org (8.14.2/8.14.2) with ESMTP id n8HAD0qZ014324
	for ; Thu, 17 Sep 2009 10:16:26 GMT
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751447AbZIQKQV (ORCPT );
	Thu, 17 Sep 2009 06:16:21 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S1751464AbZIQKQV (ORCPT );
	Thu, 17 Sep 2009 06:16:21 -0400
Received: from fwil.voltaire.com ([193.47.165.2]:12886 "EHLO exil.voltaire.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1751447AbZIQKQV (ORCPT );
	Thu, 17 Sep 2009 06:16:21 -0400
Received: from [172.25.1.23] ([172.25.1.23]) by exil.voltaire.com with
	Microsoft SMTPSVC(6.0.3790.3959); Thu, 17 Sep 2009 13:16:12 +0300
Message-ID: <4AB20C6C.9090005@Voltaire.COM>
Date: Thu, 17 Sep 2009 13:16:12 +0300
From: Moni Shoua
User-Agent: Thunderbird 1.5.0.12 (X11/20070718)
MIME-Version: 1.0
To: Roland Dreier
CC: linux-rdma, OpenFabrics General, Vlad
Subject: [PATCH] IB/ipoib: Do not turn on carrier to a non active port
X-OriginalArrivalTime: 17 Sep 2009 10:16:12.0123 (UTC)
	FILETIME=[E1BEC2B0:01CA377F]
Sender: linux-rdma-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-rdma@vger.kernel.org

This patch fixes https://bugs.openfabrics.org/show_bug.cgi?id=1726

Multicast join can succeed even if the IB port is down. This happens
when OpenSM runs on the same port as the requesting port. IPoIB, on the
other hand, calls netif_carrier_on() when the join succeeds, without
checking the state of the IB port. The result is an IPoIB interface in
RUNNING state but without an active IB port to support it. If a bonding
interface uses this IPoIB interface as a slave, it might not detect that
this slave is almost useless, and failover will be impaired.

The fix here is to check the state of the IB port in the carrier_task
before calling netif_carrier_on().

Signed-off-by: Moni Shoua
---
 drivers/infiniband/ulp/ipoib/ipoib.h           |    2 +-
 drivers/infiniband/ulp/ipoib/ipoib_ib.c        |    2 ++
 drivers/infiniband/ulp/ipoib/ipoib_main.c      |    2 +-
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c |   13 +++++++++++--
 4 files changed, 15 insertions(+), 4 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 753a983..f29ce14 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -292,7 +292,7 @@ struct ipoib_dev_priv {
 
 	struct delayed_work pkey_poll_task;
 	struct delayed_work mcast_task;
-	struct work_struct carrier_on_task;
+	struct delayed_work carrier_on_task;
 	struct work_struct flush_light;
 	struct work_struct flush_normal;
 	struct work_struct flush_heavy;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index e35f4a0..c452089 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -724,6 +724,8 @@ int ipoib_ib_dev_down(struct net_device *dev, int flush)
 	ipoib_dbg(priv, "downing ib_dev\n");
 
 	clear_bit(IPOIB_FLAG_OPER_UP, &priv->flags);
+	cancel_delayed_work(&priv->carrier_on_task);
+
 	netif_carrier_off(dev);
 
 	/* Shutdown the P_Key thread if still active */
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 2bf5116..5242e0d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -1079,7 +1079,7 @@ static void ipoib_setup(struct net_device *dev)
 
 	INIT_DELAYED_WORK(&priv->pkey_poll_task, ipoib_pkey_poll);
 	INIT_DELAYED_WORK(&priv->mcast_task, ipoib_mcast_join_task);
-	INIT_WORK(&priv->carrier_on_task, ipoib_mcast_carrier_on_task);
+	INIT_DELAYED_WORK(&priv->carrier_on_task, ipoib_mcast_carrier_on_task);
 	INIT_WORK(&priv->flush_light, ipoib_ib_dev_flush_light);
 	INIT_WORK(&priv->flush_normal, ipoib_ib_dev_flush_normal);
 	INIT_WORK(&priv->flush_heavy, ipoib_ib_dev_flush_heavy);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 25874fc..b4b4016 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -361,13 +361,22 @@ static int ipoib_mcast_sendonly_join(struct ipoib_mcast *mcast)
 void ipoib_mcast_carrier_on_task(struct work_struct *work)
 {
 	struct ipoib_dev_priv *priv = container_of(work, struct ipoib_dev_priv,
-						   carrier_on_task);
+						   carrier_on_task.work);
+	struct ib_port_attr attr;
 
 	/*
 	 * Take rtnl_lock to avoid racing with ipoib_stop() and
 	 * turning the carrier back on while a device is being
 	 * removed.
 	 */
+
+	if (ib_query_port(priv->ca, priv->port, &attr) ||
+	    attr.state != IB_PORT_ACTIVE) {
+		ipoib_dbg(priv, "wait with carrier until IB port is active\n");
+		if (test_bit(IPOIB_FLAG_OPER_UP, &priv->flags))
+			queue_delayed_work(ipoib_workqueue, &priv->carrier_on_task, HZ);
+		return;
+	}
 	rtnl_lock();
 	netif_carrier_on(priv->dev);
 	rtnl_unlock();
@@ -403,7 +412,7 @@ static int ipoib_mcast_join_complete(int status,
 	 * deadlock on rtnl_lock here.
 	 */
 	if (mcast == priv->broadcast)
-		queue_work(ipoib_workqueue, &priv->carrier_on_task);
+		queue_delayed_work(ipoib_workqueue, &priv->carrier_on_task, 0);
 
 	return 0;
 }
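
For readers who want to reason about the retry behaviour outside the kernel, here is a rough user-space model of the logic this patch introduces. It is only a sketch: struct model_dev, carrier_on_task() and dev_down() below are hypothetical stand-ins, not kernel APIs, and the one-second delayed requeue is reduced to a retry_pending flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; none of these names exist in the kernel. */
enum port_state { PORT_DOWN, PORT_INIT, PORT_ARMED, PORT_ACTIVE };

struct model_dev {
	enum port_state port;	/* stands in for attr.state from ib_query_port() */
	bool oper_up;		/* stands in for IPOIB_FLAG_OPER_UP */
	bool carrier;		/* stands in for the netif carrier state */
	bool retry_pending;	/* stands in for a queued delayed work item */
};

/*
 * Model of the patched carrier-on work function: raise the carrier only
 * when the port is active; otherwise ask for a retry, but only while the
 * interface is still operationally up.
 */
static void carrier_on_task(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->retry_pending = false;	/* this run consumes the queued work */

	if (dev->port != PORT_ACTIVE) {
		if (dev->oper_up)
			dev->retry_pending = true;	/* try again later */
		return;
	}
	dev->carrier = true;
}

/*
 * Model of the ipoib_ib_dev_down() change: clear the OPER_UP flag,
 * cancel any pending retry, then drop the carrier.
 */
static void dev_down(struct model_dev *dev)
{
	assert(dev != NULL);
	dev->oper_up = false;
	dev->retry_pending = false;
	dev->carrier = false;
}
```

In this model the carrier never comes up while the port is not active, and once dev_down() has run the task no longer reschedules itself. That mirrors the role of the test_bit(IPOIB_FLAG_OPER_UP, ...) check in the patch.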