From patchwork Fri Sep 11 01:21:05 2015
X-Patchwork-Submitter: Doug Ledford
X-Patchwork-Id: 7156751
From: Doug Ledford
To: linux-rdma@vger.kernel.org
Cc: Doug Ledford
Subject: [PATCH for-4.3] IB/ipoib: add module option for auto-creating mcast groups
Date: Thu, 10 Sep 2015 21:21:05 -0400
Message-Id: <980e8b0a82042d7e5801e02bf16fe72a0bde6759.1441934465.git.dledford@redhat.com>
X-Mailing-List: linux-rdma@vger.kernel.org

During the recent rework of the mcast handling in ipoib, the join tasks
for regular and send-only joins were merged. In the old code, the
comments indicated that the ipoib driver didn't send enough information
to auto-create IB multicast groups when the join was a send-only join.
The reality is that the comments said we didn't, but we actually did.
Since we merged the two join tasks, we now follow the comments and don't
auto-create IB multicast groups for an ipoib send-only multicast join.

This has been reported to cause problems in certain environments that
rely on the old auto-create behavior. Specifically, if you have an
IB <-> Ethernet gateway, then there is a fundamental mismatch between
the methodologies used on the two fabrics. On Ethernet, an app need not
subscribe to a multicast group, merely listen. As such, the Ethernet
side of the gateway has no way of knowing if there are listeners. If we
don't create groups for sends in this case, and the listeners are only
on the Ethernet side of the gateway, the listeners will not get any of
the packets sent on the IB side of the gateway.

Some installations have hundreds (maybe thousands) of multicast groups,
so static creation of all the groups is not practical, and they rely on
the send-only joins creating the IB multicast group in order to
function. To preserve these existing installations, add a module option
to the ipoib module to restore the previous behavior.

Signed-off-by: Doug Ledford
---
 drivers/infiniband/ulp/ipoib/ipoib_multicast.c | 32 +++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 09a1748f9d13..2d95b8ae379b 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -47,6 +47,11 @@
 
 #include "ipoib.h"
 
+static bool __read_mostly mcast_auto_create;
+
+module_param(mcast_auto_create, bool, 0644);
+MODULE_PARM_DESC(mcast_auto_create, "Should multicast sends auto-create the IB multicast group? (Default: false)");
+
 #ifdef CONFIG_INFINIBAND_IPOIB_DEBUG
 static int mcast_debug_level;
 
@@ -514,9 +519,34 @@ static void ipoib_mcast_join(struct net_device *dev, struct ipoib_mcast *mcast)
 		 * detect if there are full members or not. A major problem
 		 * with supporting SEND ONLY is detecting when the group is
 		 * auto-destroyed as IPoIB will cache the MLID..
+		 *
+		 * An additional problem is that if we auto-create the IB
+		 * mcast group in response to a send-only action, then we
+		 * will be the creating entity, but we will not have any
+		 * mechanism by which we will track when we should leave
+		 * the group ourselves. We will occasionally leave and
+		 * re-join the group when these events occur:
+		 *
+		 *    1) ifdown/ifup
+		 *    2) a regular mcast join/leave happens and we run
+		 *       ipoib_mcast_restart_task
+		 *    3) a REREGISTER event comes in from the SM
+		 *    4) any other event that might cause a mcast flush
+		 *
+		 * However, these events are not deterministic and we can
+		 * leave unused groups subscribed for long periods of time.
+		 * In addition, since the core IB layer does not yet support
+		 * send-only IB joins, we have to do a regular join and then
+		 * simply never attach a QP to listen to the incoming data.
+		 * This means that phantom, wasted data will end up coming
+		 * across our inbound physical link only to be thrown away
+		 * by the multicast dispatch mechanism on the card or in
+		 * the kernel driver. For these reasons, we default to not
+		 * auto creating groups for send-only multicast operations.
 		 */
 #if 1
-		if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
+		if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags) &&
+		    !mcast_auto_create)
 			comp_mask &= ~IB_SA_MCMEMBER_REC_TRAFFIC_CLASS;
 #else
 		if (test_bit(IPOIB_MCAST_FLAG_SENDONLY, &mcast->flags))
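
For readers who want to check the truth table of the changed condition outside the kernel, here is a minimal userspace sketch. It is an illustration only, not the driver code: `demo_join_comp_mask` and the `DEMO_REC_*` bit values are hypothetical stand-ins, and the real masks are defined in the kernel's IB SA headers. Going by the commit message, the sketch assumes that clearing the traffic-class component is what leaves a send-only join without enough information to auto-create the group.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in bits; these are NOT the kernel's real mask values. */
#define DEMO_REC_QKEY          (UINT64_C(1) << 0)
#define DEMO_REC_TRAFFIC_CLASS (UINT64_C(1) << 1)

/*
 * Models the patched condition: the traffic-class component is dropped
 * only for a send-only join when auto-creation is disabled (the default).
 * With auto_create set, a send-only join keeps the full component mask,
 * which is the pre-rework behavior the module option restores.
 */
uint64_t demo_join_comp_mask(uint64_t comp_mask, bool send_only,
			     bool auto_create)
{
	if (send_only && !auto_create)
		comp_mask &= ~DEMO_REC_TRAFFIC_CLASS;
	return comp_mask;
}
```

Calling `demo_join_comp_mask(DEMO_REC_QKEY | DEMO_REC_TRAFFIC_CLASS, true, false)` returns only `DEMO_REC_QKEY`. Every other combination of `send_only` and `auto_create` returns the mask unchanged, so regular joins are unaffected by the option.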