From patchwork Thu Jan 11 12:00:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517310 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-113.freemail.mail.aliyun.com (out30-113.freemail.mail.aliyun.com [115.124.30.113]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 02E8E15AC9; Thu, 11 Jan 2024 12:01:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R551e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045168;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-Pgn2k_1704974458; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-Pgn2k_1704974458) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:09 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 01/15] net/smc: improve SMC-D device dump for virtual ISM Date: Thu, 11 Jan 2024 20:00:22 +0800 Message-Id: <20240111120036.109903-2-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 
X-Patchwork-Delegate: kuba@kernel.org The introduction of virtual ISM requires improving the SMC-D device dump. Software-implemented non-PCI devices (loopback-ism) should be handled correctly, and the CHID reserved for virtual ISM should be obtained from the smcd_ops interface instead of from PCI information. Signed-off-by: Wen Gu --- net/smc/smc_ism.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c index ac88de2a06a0..66bcfddd3fcf 100644 --- a/net/smc/smc_ism.c +++ b/net/smc/smc_ism.c @@ -252,12 +252,11 @@ static int smc_nl_handle_smcd_dev(struct smcd_dev *smcd, char smc_pnet[SMC_MAX_PNETID_LEN + 1]; struct smc_pci_dev smc_pci_dev; struct nlattr *port_attrs; + struct device *device; struct nlattr *attrs; - struct ism_dev *ism; int use_cnt = 0; void *nlh; - ism = smcd->priv; nlh = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq, &smc_gen_nl_family, NLM_F_MULTI, SMC_NETLINK_GET_DEV_SMCD); @@ -272,7 +271,15 @@ static int smc_nl_handle_smcd_dev(struct smcd_dev *smcd, if (nla_put_u8(skb, SMC_NLA_DEV_IS_CRIT, use_cnt > 0)) goto errattr; memset(&smc_pci_dev, 0, sizeof(smc_pci_dev)); - smc_set_pci_values(to_pci_dev(ism->dev.parent), &smc_pci_dev); + device = smcd->ops->get_dev(smcd); + if (device->parent) + smc_set_pci_values(to_pci_dev(device->parent), &smc_pci_dev); + if (smc_ism_is_virtual(smcd)) { + smc_pci_dev.pci_pchid = smc_ism_get_chid(smcd); + if (!device->parent) + snprintf(smc_pci_dev.pci_id, sizeof(smc_pci_dev.pci_id), + "%s", dev_name(device)); + } if (nla_put_u32(skb, SMC_NLA_DEV_PCI_FID, smc_pci_dev.pci_fid)) goto errattr; if (nla_put_u16(skb, SMC_NLA_DEV_PCI_CHID, smc_pci_dev.pci_pchid)) From patchwork Thu Jan 11 12:00:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517311 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-113.freemail.mail.aliyun.com 
(out30-113.freemail.mail.aliyun.com [115.124.30.113]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CF06D15ACE; Thu, 11 Jan 2024 12:01:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PdqDE_1704974470; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PdqDE_1704974470) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:11 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 02/15] net/smc: decouple specialized struct from SMC-D DMB registration Date: Thu, 11 Jan 2024 20:00:23 +0800 Message-Id: <20240111120036.109903-3-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org The struct 'ism_client' is specialized for s390 platform firmware ISM. So replace it with 'void' to make SMCD DMB registration helper generic for both virtual ISM and existing ISM. 
Signed-off-by: Wen Gu --- drivers/s390/net/ism_drv.c | 2 +- include/net/smc.h | 4 ++-- net/smc/smc_ism.c | 7 ++----- 3 files changed, 5 insertions(+), 8 deletions(-) diff --git a/drivers/s390/net/ism_drv.c b/drivers/s390/net/ism_drv.c index 2c8e964425dc..9b2a52913e76 100644 --- a/drivers/s390/net/ism_drv.c +++ b/drivers/s390/net/ism_drv.c @@ -726,7 +726,7 @@ static int smcd_query_rgid(struct smcd_dev *smcd, struct smcd_gid *rgid, } static int smcd_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb, - struct ism_client *client) + void *client) { return ism_register_dmb(smcd->priv, (struct ism_dmb *)dmb, client); } diff --git a/include/net/smc.h b/include/net/smc.h index c9dcb30e3fd9..6273c3a8b24a 100644 --- a/include/net/smc.h +++ b/include/net/smc.h @@ -50,7 +50,6 @@ struct smcd_dmb { #define ISM_ERROR 0xFFFF struct smcd_dev; -struct ism_client; struct smcd_gid { u64 gid; @@ -61,7 +60,7 @@ struct smcd_ops { int (*query_remote_gid)(struct smcd_dev *dev, struct smcd_gid *rgid, u32 vid_valid, u32 vid); int (*register_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb, - struct ism_client *client); + void *client); int (*unregister_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb); int (*add_vlan_id)(struct smcd_dev *dev, u64 vlan_id); int (*del_vlan_id)(struct smcd_dev *dev, u64 vlan_id); @@ -81,6 +80,7 @@ struct smcd_ops { struct smcd_dev { const struct smcd_ops *ops; void *priv; + void *client; struct list_head list; spinlock_t lock; struct smc_connection **conn; diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c index 66bcfddd3fcf..fb1837d0a861 100644 --- a/net/smc/smc_ism.c +++ b/net/smc/smc_ism.c @@ -222,7 +222,6 @@ int smc_ism_unregister_dmb(struct smcd_dev *smcd, struct smc_buf_desc *dmb_desc) int smc_ism_register_dmb(struct smc_link_group *lgr, int dmb_len, struct smc_buf_desc *dmb_desc) { -#if IS_ENABLED(CONFIG_ISM) struct smcd_dmb dmb; int rc; @@ -231,7 +230,7 @@ int smc_ism_register_dmb(struct smc_link_group *lgr, int dmb_len, dmb.sba_idx = 
dmb_desc->sba_idx; dmb.vlan_id = lgr->vlan_id; dmb.rgid = lgr->peer_gid.gid; - rc = lgr->smcd->ops->register_dmb(lgr->smcd, &dmb, &smc_ism_client); + rc = lgr->smcd->ops->register_dmb(lgr->smcd, &dmb, lgr->smcd->client); if (!rc) { dmb_desc->sba_idx = dmb.sba_idx; dmb_desc->token = dmb.dmb_tok; @@ -240,9 +239,6 @@ int smc_ism_register_dmb(struct smc_link_group *lgr, int dmb_len, dmb_desc->len = dmb.dmb_len; } return rc; -#else - return 0; -#endif } static int smc_nl_handle_smcd_dev(struct smcd_dev *smcd, @@ -453,6 +449,7 @@ static void smcd_register_dev(struct ism_dev *ism) if (!smcd) return; smcd->priv = ism; + smcd->client = &smc_ism_client; ism_set_priv(ism, &smc_ism_client, smcd); if (smc_pnetid_by_dev_port(&ism->pdev->dev, 0, smcd->pnetid)) smc_pnetid_by_table_smcd(smcd); From patchwork Thu Jan 11 12:00:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517313 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-119.freemail.mail.aliyun.com (out30-119.freemail.mail.aliyun.com [115.124.30.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 444CA15AF8; Thu, 11 Jan 2024 12:01:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R171e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045176;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PgOqA_1704974472; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PgOqA_1704974472) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:13 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, 
agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 03/15] net/smc: introduce virtual ISM device loopback-ism Date: Thu, 11 Jan 2024 20:00:24 +0800 Message-Id: <20240111120036.109903-4-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org This introduces a kind of virtual ISM device loopback-ism for SMCDv2.1. loopback-ism is implemented by software and serves inter-process or inter-container SMC communication in the same OS instance. It is created during SMC module loading and destroyed upon unloading. The support for loopback-ism can be configured via CONFIG_SMC_LO. Signed-off-by: Wen Gu --- net/smc/Kconfig | 13 +++ net/smc/Makefile | 2 +- net/smc/af_smc.c | 12 ++- net/smc/smc_loopback.c | 181 +++++++++++++++++++++++++++++++++++++++++ net/smc/smc_loopback.h | 33 ++++++++ 5 files changed, 239 insertions(+), 2 deletions(-) create mode 100644 net/smc/smc_loopback.c create mode 100644 net/smc/smc_loopback.h diff --git a/net/smc/Kconfig b/net/smc/Kconfig index 746be3996768..e191f78551f4 100644 --- a/net/smc/Kconfig +++ b/net/smc/Kconfig @@ -20,3 +20,16 @@ config SMC_DIAG smcss. if unsure, say Y. + +config SMC_LO + bool "SMC_LO: virtual ISM loopback-ism for SMC" + depends on SMC + default n + help + SMC_LO provides a kind of virtual ISM device called loopback-ism + for SMCD to upgrade AF_INET TCP connections whose ends share the + same kernel. 
+ loopback-ism is a software implemented device that does not depend + on a specific architecture or hardware. + + if unsure, say N. diff --git a/net/smc/Makefile b/net/smc/Makefile index 875efcd126a2..a8c37111abe1 100644 --- a/net/smc/Makefile +++ b/net/smc/Makefile @@ -4,5 +4,5 @@ obj-$(CONFIG_SMC) += smc.o obj-$(CONFIG_SMC_DIAG) += smc_diag.o smc-y := af_smc.o smc_pnet.o smc_ib.o smc_clc.o smc_core.o smc_wr.o smc_llc.o smc-y += smc_cdc.o smc_tx.o smc_rx.o smc_close.o smc_ism.o smc_netlink.o smc_stats.o -smc-y += smc_tracepoint.o +smc-y += smc_tracepoint.o smc_loopback.o smc-$(CONFIG_SYSCTL) += smc_sysctl.o diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index a2cb30af46cb..189aea09b66e 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -53,6 +53,7 @@ #include "smc_stats.h" #include "smc_tracepoint.h" #include "smc_sysctl.h" +#include "smc_loopback.h" static DEFINE_MUTEX(smc_server_lgr_pending); /* serialize link group * creation on server @@ -3556,15 +3557,23 @@ static int __init smc_init(void) goto out_sock; } + rc = smc_loopback_init(); + if (rc) { + pr_err("%s: smc_loopback_init fails with %d\n", __func__, rc); + goto out_ib; + } + rc = tcp_register_ulp(&smc_ulp_ops); if (rc) { pr_err("%s: tcp_ulp_register fails with %d\n", __func__, rc); - goto out_ib; + goto out_lo; } static_branch_enable(&tcp_have_smc); return 0; +out_lo: + smc_loopback_exit(); out_ib: smc_ib_unregister_client(); out_sock: @@ -3602,6 +3611,7 @@ static void __exit smc_exit(void) tcp_unregister_ulp(&smc_ulp_ops); sock_unregister(PF_SMC); smc_core_exit(); + smc_loopback_exit(); smc_ib_unregister_client(); smc_ism_exit(); destroy_workqueue(smc_close_wq); diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c new file mode 100644 index 000000000000..cbb6625ccd0d --- /dev/null +++ b/net/smc/smc_loopback.c @@ -0,0 +1,181 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Shared Memory Communications Direct over loopback-ism device. + * + * Provide a SMC-D loopback-ism device. 
+ * + * Copyright (c) 2024, Alibaba Inc. + * + * Author: Wen Gu + * Tony Lu + * + */ + +#include +#include +#include + +#include "smc_ism.h" +#include "smc_loopback.h" + +#if IS_ENABLED(CONFIG_SMC_LO) +static const char smc_lo_dev_name[] = "loopback-ism"; +static struct smc_lo_dev *lo_dev; +static struct class *smc_class; + +static const struct smcd_ops lo_ops = { + .query_remote_gid = NULL, + .register_dmb = NULL, + .unregister_dmb = NULL, + .add_vlan_id = NULL, + .del_vlan_id = NULL, + .set_vlan_required = NULL, + .reset_vlan_required = NULL, + .signal_event = NULL, + .move_data = NULL, + .supports_v2 = NULL, + .get_local_gid = NULL, + .get_chid = NULL, + .get_dev = NULL, +}; + +static struct smcd_dev *smcd_lo_alloc_dev(const struct smcd_ops *ops, + int max_dmbs) +{ + struct smcd_dev *smcd; + + smcd = kzalloc(sizeof(*smcd), GFP_KERNEL); + if (!smcd) + return NULL; + + smcd->conn = kcalloc(max_dmbs, sizeof(struct smc_connection *), + GFP_KERNEL); + if (!smcd->conn) + goto out_smcd; + + smcd->ops = ops; + + spin_lock_init(&smcd->lock); + spin_lock_init(&smcd->lgr_lock); + INIT_LIST_HEAD(&smcd->vlan); + INIT_LIST_HEAD(&smcd->lgr_list); + init_waitqueue_head(&smcd->lgrs_deleted); + return smcd; + +out_smcd: + kfree(smcd); + return NULL; +} + +static int smcd_lo_register_dev(struct smc_lo_dev *ldev) +{ + struct smcd_dev *smcd; + + smcd = smcd_lo_alloc_dev(&lo_ops, SMC_LO_MAX_DMBS); + if (!smcd) + return -ENOMEM; + ldev->smcd = smcd; + smcd->priv = ldev; + + /* TODO: + * register loopback-ism to smcd_dev list. + */ + return 0; +} + +static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) +{ + struct smcd_dev *smcd = ldev->smcd; + + /* TODO: + * unregister loopback-ism from smcd_dev list. 
+ */ + kfree(smcd->conn); + kfree(smcd); +} + +static int smc_lo_dev_init(struct smc_lo_dev *ldev) +{ + return smcd_lo_register_dev(ldev); +} + +static void smc_lo_dev_exit(struct smc_lo_dev *ldev) +{ + smcd_lo_unregister_dev(ldev); +} + +static void smc_lo_dev_release(struct device *dev) +{ + struct smc_lo_dev *ldev = + container_of(dev, struct smc_lo_dev, dev); + + kfree(ldev); +} + +static int smc_lo_dev_probe(void) +{ + struct smc_lo_dev *ldev; + int ret; + + smc_class = class_create("smc"); + if (IS_ERR(smc_class)) + return PTR_ERR(smc_class); + + ldev = kzalloc(sizeof(*ldev), GFP_KERNEL); + if (!ldev) { + ret = -ENOMEM; + goto destroy_class; + } + + ldev->dev.parent = NULL; + ldev->dev.class = smc_class; + ldev->dev.release = smc_lo_dev_release; + device_initialize(&ldev->dev); + dev_set_name(&ldev->dev, smc_lo_dev_name); + ret = device_add(&ldev->dev); + if (ret) + goto free_dev; + + ret = smc_lo_dev_init(ldev); + if (ret) + goto del_dev; + + lo_dev = ldev; /* global loopback device */ + return 0; + +del_dev: + device_del(&ldev->dev); +free_dev: + put_device(&ldev->dev); +destroy_class: + class_destroy(smc_class); + return ret; +} + +static void smc_lo_dev_remove(void) +{ + if (!lo_dev) + return; + + smc_lo_dev_exit(lo_dev); + device_del(&lo_dev->dev); /* device_add in smc_lo_dev_probe */ + put_device(&lo_dev->dev); /* device_initialize in smc_lo_dev_probe */ + class_destroy(smc_class); +} +#endif + +int smc_loopback_init(void) +{ +#if IS_ENABLED(CONFIG_SMC_LO) + return smc_lo_dev_probe(); +#else + return 0; +#endif +} + +void smc_loopback_exit(void) +{ +#if IS_ENABLED(CONFIG_SMC_LO) + smc_lo_dev_remove(); +#endif +} diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h new file mode 100644 index 000000000000..9dd44d4c0ca3 --- /dev/null +++ b/net/smc/smc_loopback.h @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Shared Memory Communications Direct over loopback-ism device. + * + * Provide a SMC-D loopback-ism device. 
+ * + * Copyright (c) 2024, Alibaba Inc. + * + * Author: Wen Gu + * Tony Lu + * + */ + +#ifndef _SMC_LOOPBACK_H +#define _SMC_LOOPBACK_H + +#include +#include +#include + +#if IS_ENABLED(CONFIG_SMC_LO) +#define SMC_LO_MAX_DMBS 5000 + +struct smc_lo_dev { + struct smcd_dev *smcd; + struct device dev; +}; +#endif + +int smc_loopback_init(void); +void smc_loopback_exit(void); + +#endif /* _SMC_LOOPBACK_H */ From patchwork Thu Jan 11 12:00:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517314 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-99.freemail.mail.aliyun.com (out30-99.freemail.mail.aliyun.com [115.124.30.99]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E097F16414; Thu, 11 Jan 2024 12:01:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R101e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PewXf_1704974474; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PewXf_1704974474) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:15 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 04/15] net/smc: implement ID-related 
operations of loopback-ism Date: Thu, 11 Jan 2024 20:00:25 +0800 Message-Id: <20240111120036.109903-5-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org This implements GID and CHID related operations of loopback-ism device. loopback-ism acts as an ISMv2. Its GID is generated randomly using the UUIDv4 algorithm, and its CHID is the reserved value 0xFFFF. Signed-off-by: Wen Gu --- net/smc/smc_loopback.c | 62 ++++++++++++++++++++++++++++++++++++++---- net/smc/smc_loopback.h | 3 ++ 2 files changed, 60 insertions(+), 5 deletions(-) diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index cbb6625ccd0d..40dff28d837d 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -19,12 +19,63 @@ #include "smc_loopback.h" #if IS_ENABLED(CONFIG_SMC_LO) +#define SMC_LO_V2_CAPABLE 0x1 /* loopback-ism acts as ISMv2 */ + static const char smc_lo_dev_name[] = "loopback-ism"; static struct smc_lo_dev *lo_dev; static struct class *smc_class; +static void smc_lo_generate_id(struct smc_lo_dev *ldev) +{ + struct smcd_gid *lgid = &ldev->local_gid; + uuid_t uuid; + + uuid_gen(&uuid); + memcpy(&lgid->gid, &uuid, sizeof(lgid->gid)); + memcpy(&lgid->gid_ext, (u8 *)&uuid + sizeof(lgid->gid), + sizeof(lgid->gid_ext)); + + ldev->chid = SMC_LO_CHID; +} + +static int smc_lo_query_rgid(struct smcd_dev *smcd, struct smcd_gid *rgid, + u32 vid_valid, u32 vid) +{ + struct smc_lo_dev *ldev = smcd->priv; + + /* rgid should equal to lgid in loopback situation */ + if (!ldev || rgid->gid != ldev->local_gid.gid || + rgid->gid_ext != ldev->local_gid.gid_ext) + return -ENETUNREACH; + return 0; +} + +static int smc_lo_supports_v2(void) +{ + return SMC_LO_V2_CAPABLE; +} + +static void smc_lo_get_local_gid(struct 
smcd_dev *smcd, + struct smcd_gid *smcd_gid) +{ + struct smc_lo_dev *ldev = smcd->priv; + + smcd_gid->gid = ldev->local_gid.gid; + smcd_gid->gid_ext = ldev->local_gid.gid_ext; +} + +static u16 smc_lo_get_chid(struct smcd_dev *smcd) +{ + return ((struct smc_lo_dev *)smcd->priv)->chid; +} + +static struct device *smc_lo_get_dev(struct smcd_dev *smcd) +{ + return &((struct smc_lo_dev *)smcd->priv)->dev; +} + static const struct smcd_ops lo_ops = { - .query_remote_gid = NULL, + .query_remote_gid = smc_lo_query_rgid, .register_dmb = NULL, .unregister_dmb = NULL, .add_vlan_id = NULL, @@ -33,10 +84,10 @@ static const struct smcd_ops lo_ops = { .reset_vlan_required = NULL, .signal_event = NULL, .move_data = NULL, - .supports_v2 = NULL, - .get_local_gid = NULL, - .get_chid = NULL, - .get_dev = NULL, + .supports_v2 = smc_lo_supports_v2, + .get_local_gid = smc_lo_get_local_gid, + .get_chid = smc_lo_get_chid, + .get_dev = smc_lo_get_dev, }; static struct smcd_dev *smcd_lo_alloc_dev(const struct smcd_ops *ops, @@ -96,6 +147,7 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) static int smc_lo_dev_init(struct smc_lo_dev *ldev) { + smc_lo_generate_id(ldev); return smcd_lo_register_dev(ldev); } diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h index 9dd44d4c0ca3..55b41133a97f 100644 --- a/net/smc/smc_loopback.h +++ b/net/smc/smc_loopback.h @@ -20,10 +20,13 @@ #if IS_ENABLED(CONFIG_SMC_LO) #define SMC_LO_MAX_DMBS 5000 +#define SMC_LO_CHID 0xFFFF struct smc_lo_dev { struct smcd_dev *smcd; struct device dev; + u16 chid; + struct smcd_gid local_gid; }; #endif From patchwork Thu Jan 11 12:00:26 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517312 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-130.freemail.mail.aliyun.com (out30-130.freemail.mail.aliyun.com [115.124.30.130]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No 
client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6851415488; Thu, 11 Jan 2024 12:01:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R321e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046049;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-Pgn7c_1704974476; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-Pgn7c_1704974476) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:17 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 05/15] net/smc: implement some unsupported operations of loopback-ism Date: Thu, 11 Jan 2024 20:00:26 +0800 Message-Id: <20240111120036.109903-6-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org VLAN operations are currently not supported, since there does not seem to be a strong need for VLANs in the loopback case. The signal_event operation is also not supported, since the loopback-ism device currently has no events to process. 
Signed-off-by: Wen Gu --- net/smc/smc_loopback.c | 36 +++++++++++++++++++++++++++++++----- 1 file changed, 31 insertions(+), 5 deletions(-) diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index 40dff28d837d..353d4a2d69a1 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -50,6 +50,32 @@ static int smc_lo_query_rgid(struct smcd_dev *smcd, struct smcd_gid *rgid, return 0; } +static int smc_lo_add_vlan_id(struct smcd_dev *smcd, u64 vlan_id) +{ + return -EOPNOTSUPP; +} + +static int smc_lo_del_vlan_id(struct smcd_dev *smcd, u64 vlan_id) +{ + return -EOPNOTSUPP; +} + +static int smc_lo_set_vlan_required(struct smcd_dev *smcd) +{ + return -EOPNOTSUPP; +} + +static int smc_lo_reset_vlan_required(struct smcd_dev *smcd) +{ + return -EOPNOTSUPP; +} + +static int smc_lo_signal_event(struct smcd_dev *dev, struct smcd_gid *rgid, + u32 trigger_irq, u32 event_code, u64 info) +{ + return 0; +} + static int smc_lo_supports_v2(void) { return SMC_LO_V2_CAPABLE; @@ -78,11 +104,11 @@ static const struct smcd_ops lo_ops = { .query_remote_gid = smc_lo_query_rgid, .register_dmb = NULL, .unregister_dmb = NULL, - .add_vlan_id = NULL, - .del_vlan_id = NULL, - .set_vlan_required = NULL, - .reset_vlan_required = NULL, - .signal_event = NULL, + .add_vlan_id = smc_lo_add_vlan_id, + .del_vlan_id = smc_lo_del_vlan_id, + .set_vlan_required = smc_lo_set_vlan_required, + .reset_vlan_required = smc_lo_reset_vlan_required, + .signal_event = smc_lo_signal_event, .move_data = NULL, .supports_v2 = smc_lo_supports_v2, .get_local_gid = smc_lo_get_local_gid, From patchwork Thu Jan 11 12:00:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517316 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-124.freemail.mail.aliyun.com (out30-124.freemail.mail.aliyun.com [115.124.30.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate 
requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DF7AC171A0; Thu, 11 Jan 2024 12:01:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R161e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045192;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-Pgn8A_1704974477; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-Pgn8A_1704974477) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:19 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 06/15] net/smc: implement DMB-related operations of loopback-ism Date: Thu, 11 Jan 2024 20:00:27 +0800 Message-Id: <20240111120036.109903-7-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org This implements DMB (un)registration and data move operations of loopback-ism device. 
Signed-off-by: Wen Gu --- net/smc/smc_cdc.c | 6 ++ net/smc/smc_cdc.h | 1 + net/smc/smc_loopback.c | 133 ++++++++++++++++++++++++++++++++++++++++- net/smc/smc_loopback.h | 13 ++++ 4 files changed, 150 insertions(+), 3 deletions(-) diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c index 3c06625ceb20..c820ef197610 100644 --- a/net/smc/smc_cdc.c +++ b/net/smc/smc_cdc.c @@ -410,6 +410,12 @@ static void smc_cdc_msg_recv(struct smc_sock *smc, struct smc_cdc_msg *cdc) static void smcd_cdc_rx_tsklet(struct tasklet_struct *t) { struct smc_connection *conn = from_tasklet(conn, t, rx_tsklet); + + smcd_cdc_rx_handler(conn); +} + +void smcd_cdc_rx_handler(struct smc_connection *conn) +{ struct smcd_cdc_msg *data_cdc; struct smcd_cdc_msg cdc; struct smc_sock *smc; diff --git a/net/smc/smc_cdc.h b/net/smc/smc_cdc.h index 696cc11f2303..11559d4ebf2b 100644 --- a/net/smc/smc_cdc.h +++ b/net/smc/smc_cdc.h @@ -301,5 +301,6 @@ int smcr_cdc_msg_send_validation(struct smc_connection *conn, struct smc_wr_buf *wr_buf); int smc_cdc_init(void) __init; void smcd_cdc_rx_init(struct smc_connection *conn); +void smcd_cdc_rx_handler(struct smc_connection *conn); #endif /* SMC_CDC_H */ diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index 353d4a2d69a1..f72e7b24fc1a 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -15,11 +15,13 @@ #include #include +#include "smc_cdc.h" #include "smc_ism.h" #include "smc_loopback.h" #if IS_ENABLED(CONFIG_SMC_LO) #define SMC_LO_V2_CAPABLE 0x1 /* loopback-ism acts as ISMv2 */ +#define SMC_DMA_ADDR_INVALID (~(dma_addr_t)0) static const char smc_lo_dev_name[] = "loopback-ism"; static struct smc_lo_dev *lo_dev; @@ -50,6 +52,97 @@ static int smc_lo_query_rgid(struct smcd_dev *smcd, struct smcd_gid *rgid, return 0; } +static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb, + void *client_priv) +{ + struct smc_lo_dmb_node *dmb_node, *tmp_node; + struct smc_lo_dev *ldev = smcd->priv; + int sba_idx, order, rc; + struct page 
*pages; + + /* check space for new dmb */ + for_each_clear_bit(sba_idx, ldev->sba_idx_mask, SMC_LO_MAX_DMBS) { + if (!test_and_set_bit(sba_idx, ldev->sba_idx_mask)) + break; + } + if (sba_idx == SMC_LO_MAX_DMBS) + return -ENOSPC; + + dmb_node = kzalloc(sizeof(*dmb_node), GFP_KERNEL); + if (!dmb_node) { + rc = -ENOMEM; + goto err_bit; + } + + dmb_node->sba_idx = sba_idx; + order = get_order(dmb->dmb_len); + pages = alloc_pages(GFP_KERNEL | __GFP_NOWARN | + __GFP_NOMEMALLOC | __GFP_COMP | + __GFP_NORETRY | __GFP_ZERO, + order); + if (!pages) { + rc = -ENOMEM; + goto err_node; + } + dmb_node->cpu_addr = (void *)page_address(pages); + dmb_node->len = dmb->dmb_len; + dmb_node->dma_addr = SMC_DMA_ADDR_INVALID; + +again: + /* add new dmb into hash table */ + get_random_bytes(&dmb_node->token, sizeof(dmb_node->token)); + write_lock(&ldev->dmb_ht_lock); + hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb_node->token) { + if (tmp_node->token == dmb_node->token) { + write_unlock(&ldev->dmb_ht_lock); + goto again; + } + } + hash_add(ldev->dmb_ht, &dmb_node->list, dmb_node->token); + write_unlock(&ldev->dmb_ht_lock); + + dmb->sba_idx = dmb_node->sba_idx; + dmb->dmb_tok = dmb_node->token; + dmb->cpu_addr = dmb_node->cpu_addr; + dmb->dma_addr = dmb_node->dma_addr; + dmb->dmb_len = dmb_node->len; + + return 0; + +err_node: + kfree(dmb_node); +err_bit: + clear_bit(sba_idx, ldev->sba_idx_mask); + return rc; +} + +static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb) +{ + struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node; + struct smc_lo_dev *ldev = smcd->priv; + + /* remove dmb from hash table */ + write_lock(&ldev->dmb_ht_lock); + hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb->dmb_tok) { + if (tmp_node->token == dmb->dmb_tok) { + dmb_node = tmp_node; + break; + } + } + if (!dmb_node) { + write_unlock(&ldev->dmb_ht_lock); + return -EINVAL; + } + hash_del(&dmb_node->list); + write_unlock(&ldev->dmb_ht_lock); + + clear_bit(dmb_node->sba_idx, 
ldev->sba_idx_mask); + kfree(dmb_node->cpu_addr); + kfree(dmb_node); + + return 0; +} + static int smc_lo_add_vlan_id(struct smcd_dev *smcd, u64 vlan_id) { return -EOPNOTSUPP; @@ -76,6 +169,38 @@ static int smc_lo_signal_event(struct smcd_dev *dev, struct smcd_gid *rgid, return 0; } +static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok, + unsigned int idx, bool sf, unsigned int offset, + void *data, unsigned int size) +{ + struct smc_lo_dmb_node *rmb_node = NULL, *tmp_node; + struct smc_lo_dev *ldev = smcd->priv; + + read_lock(&ldev->dmb_ht_lock); + hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb_tok) { + if (tmp_node->token == dmb_tok) { + rmb_node = tmp_node; + break; + } + } + if (!rmb_node) { + read_unlock(&ldev->dmb_ht_lock); + return -EINVAL; + } + read_unlock(&ldev->dmb_ht_lock); + + memcpy((char *)rmb_node->cpu_addr + offset, data, size); + + if (sf) { + struct smc_connection *conn = + smcd->conn[rmb_node->sba_idx]; + + if (conn && !conn->killed) + smcd_cdc_rx_handler(conn); + } + return 0; +} + static int smc_lo_supports_v2(void) { return SMC_LO_V2_CAPABLE; @@ -102,14 +227,14 @@ static struct device *smc_lo_get_dev(struct smcd_dev *smcd) static const struct smcd_ops lo_ops = { .query_remote_gid = smc_lo_query_rgid, - .register_dmb = NULL, - .unregister_dmb = NULL, + .register_dmb = smc_lo_register_dmb, + .unregister_dmb = smc_lo_unregister_dmb, .add_vlan_id = smc_lo_add_vlan_id, .del_vlan_id = smc_lo_del_vlan_id, .set_vlan_required = smc_lo_set_vlan_required, .reset_vlan_required = smc_lo_reset_vlan_required, .signal_event = smc_lo_signal_event, - .move_data = NULL, + .move_data = smc_lo_move_data, .supports_v2 = smc_lo_supports_v2, .get_local_gid = smc_lo_get_local_gid, .get_chid = smc_lo_get_chid, @@ -174,6 +299,8 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) static int smc_lo_dev_init(struct smc_lo_dev *ldev) { smc_lo_generate_id(ldev); + rwlock_init(&ldev->dmb_ht_lock); + hash_init(ldev->dmb_ht); return 
smcd_lo_register_dev(ldev); } diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h index 55b41133a97f..24ab9d747613 100644 --- a/net/smc/smc_loopback.h +++ b/net/smc/smc_loopback.h @@ -20,13 +20,26 @@ #if IS_ENABLED(CONFIG_SMC_LO) #define SMC_LO_MAX_DMBS 5000 +#define SMC_LO_DMBS_HASH_BITS 12 #define SMC_LO_CHID 0xFFFF +struct smc_lo_dmb_node { + struct hlist_node list; + u64 token; + u32 len; + u32 sba_idx; + void *cpu_addr; + dma_addr_t dma_addr; +}; + struct smc_lo_dev { struct smcd_dev *smcd; struct device dev; u16 chid; struct smcd_gid local_gid; + rwlock_t dmb_ht_lock; + DECLARE_BITMAP(sba_idx_mask, SMC_LO_MAX_DMBS); + DECLARE_HASHTABLE(dmb_ht, SMC_LO_DMBS_HASH_BITS); }; #endif From patchwork Thu Jan 11 12:00:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517336 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-132.freemail.mail.aliyun.com (out30-132.freemail.mail.aliyun.com [115.124.30.132]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6708F15ACF; Thu, 11 Jan 2024 12:06:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R111e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PgOta_1704974479; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PgOta_1704974479) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:21 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, 
jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 07/15] net/smc: register loopback-ism into SMC-D device list Date: Thu, 11 Jan 2024 20:00:28 +0800 Message-Id: <20240111120036.109903-8-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org After loopback-ism device gets ready, add it to the SMC-D device list as an ISMv2 device. Signed-off-by: Wen Gu --- net/smc/smc_ism.c | 11 +++++++---- net/smc/smc_ism.h | 1 + net/smc/smc_loopback.c | 20 +++++++++++++------- 3 files changed, 21 insertions(+), 11 deletions(-) diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c index fb1837d0a861..4065ebd2e43d 100644 --- a/net/smc/smc_ism.c +++ b/net/smc/smc_ism.c @@ -91,6 +91,11 @@ bool smc_ism_is_v2_capable(void) return smc_ism_v2_capable; } +void smc_ism_set_v2_capable(void) +{ + smc_ism_v2_capable = true; +} + /* Set a connection using this DMBE. 
*/ void smc_ism_set_conn(struct smc_connection *conn) { @@ -454,11 +459,9 @@ static void smcd_register_dev(struct ism_dev *ism) if (smc_pnetid_by_dev_port(&ism->pdev->dev, 0, smcd->pnetid)) smc_pnetid_by_table_smcd(smcd); + if (smcd->ops->supports_v2()) + smc_ism_set_v2_capable(); mutex_lock(&smcd_dev_list.mutex); - if (list_empty(&smcd_dev_list.list)) { - if (smcd->ops->supports_v2()) - smc_ism_v2_capable = true; - } /* sort list: devices without pnetid before devices with pnetid */ if (smcd->pnetid[0]) list_add_tail(&smcd->list, &smcd_dev_list.list); diff --git a/net/smc/smc_ism.h b/net/smc/smc_ism.h index ffff40c30a06..6903cd5d4d4d 100644 --- a/net/smc/smc_ism.h +++ b/net/smc/smc_ism.h @@ -52,6 +52,7 @@ int smc_ism_signal_shutdown(struct smc_link_group *lgr); void smc_ism_get_system_eid(u8 **eid); u16 smc_ism_get_chid(struct smcd_dev *dev); bool smc_ism_is_v2_capable(void); +void smc_ism_set_v2_capable(void); int smc_ism_init(void); void smc_ism_exit(void); int smcd_nl_get_device(struct sk_buff *skb, struct netlink_callback *cb); diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index f72e7b24fc1a..db0b45f8560c 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -278,10 +278,12 @@ static int smcd_lo_register_dev(struct smc_lo_dev *ldev) return -ENOMEM; ldev->smcd = smcd; smcd->priv = ldev; - - /* TODO: - * register loopback-ism to smcd_dev list. - */ + smc_ism_set_v2_capable(); + mutex_lock(&smcd_dev_list.mutex); + list_add(&smcd->list, &smcd_dev_list.list); + mutex_unlock(&smcd_dev_list.mutex); + pr_warn_ratelimited("smc: adding smcd device %s\n", + smc_lo_dev_name); return 0; } @@ -289,9 +291,13 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) { struct smcd_dev *smcd = ldev->smcd; - /* TODO: - * unregister loopback-ism from smcd_dev list. 
- */ + pr_warn_ratelimited("smc: removing smcd device %s\n", + smc_lo_dev_name); + smcd->going_away = 1; + smc_smcd_terminate_all(smcd); + mutex_lock(&smcd_dev_list.mutex); + list_del_init(&smcd->list); + mutex_unlock(&smcd_dev_list.mutex); kfree(smcd->conn); kfree(smcd); } From patchwork Thu Jan 11 12:00:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517315 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-99.freemail.mail.aliyun.com (out30-99.freemail.mail.aliyun.com [115.124.30.99]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 29A93168A6; Thu, 11 Jan 2024 12:01:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R291e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046049;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PdqI7_1704974481; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PdqI7_1704974481) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:23 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 08/15] net/smc: introduce loopback-ism runtime switch Date: Thu, 11 Jan 2024 20:00:29 +0800 Message-Id: <20240111120036.109903-9-guwen@linux.alibaba.com> X-Mailer: 
git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org This provides a runtime switch to activate or deactivate loopback-ism device by echo {1|0} > /sys/devices/virtual/smc/loopback-ism/active. It will trigger the registration or removal of loopback-ism from the SMC-D device list. Signed-off-by: Wen Gu --- net/smc/smc_loopback.c | 55 ++++++++++++++++++++++++++++++++++++++++++ net/smc/smc_loopback.h | 1 + 2 files changed, 56 insertions(+) diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index db0b45f8560c..3bf7bf5e8c96 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -27,6 +27,58 @@ static const char smc_lo_dev_name[] = "loopback-ism"; static struct smc_lo_dev *lo_dev; static struct class *smc_class; +static int smcd_lo_register_dev(struct smc_lo_dev *ldev); +static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev); + +static ssize_t active_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct smc_lo_dev *ldev = + container_of(dev, struct smc_lo_dev, dev); + + return sysfs_emit(buf, "%d\n", ldev->active); +} + +static ssize_t active_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + struct smc_lo_dev *ldev = + container_of(dev, struct smc_lo_dev, dev); + bool active; + int ret; + + ret = kstrtobool(buf, &active); + if (ret) + return ret; + + if (active && !ldev->active) { + /* activate loopback-ism */ + ret = smcd_lo_register_dev(ldev); + if (ret) + return ret; + } else if (!active && ldev->active) { + /* deactivate loopback-ism */ + smcd_lo_unregister_dev(ldev); + } + + return count; +} +static DEVICE_ATTR_RW(active); +static struct attribute *smc_lo_attrs[] = { + &dev_attr_active.attr, + NULL, +}; + 
+static struct attribute_group smc_lo_attr_group = { + .attrs = smc_lo_attrs, +}; + +static const struct attribute_group *smc_lo_attr_groups[] = { + &smc_lo_attr_group, + NULL, +}; + static void smc_lo_generate_id(struct smc_lo_dev *ldev) { struct smcd_gid *lgid = &ldev->local_gid; @@ -282,6 +334,7 @@ static int smcd_lo_register_dev(struct smc_lo_dev *ldev) mutex_lock(&smcd_dev_list.mutex); list_add(&smcd->list, &smcd_dev_list.list); mutex_unlock(&smcd_dev_list.mutex); + ldev->active = 1; pr_warn_ratelimited("smc: adding smcd device %s\n", smc_lo_dev_name); return 0; @@ -293,6 +346,7 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) pr_warn_ratelimited("smc: removing smcd device %s\n", smc_lo_dev_name); + ldev->active = 0; smcd->going_away = 1; smc_smcd_terminate_all(smcd); mutex_lock(&smcd_dev_list.mutex); @@ -340,6 +394,7 @@ static int smc_lo_dev_probe(void) ldev->dev.parent = NULL; ldev->dev.class = smc_class; + ldev->dev.groups = smc_lo_attr_groups; ldev->dev.release = smc_lo_dev_release; device_initialize(&ldev->dev); dev_set_name(&ldev->dev, smc_lo_dev_name); diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h index 24ab9d747613..02a522e322b4 100644 --- a/net/smc/smc_loopback.h +++ b/net/smc/smc_loopback.h @@ -35,6 +35,7 @@ struct smc_lo_dmb_node { struct smc_lo_dev { struct smcd_dev *smcd; struct device dev; + u8 active; u16 chid; struct smcd_gid local_gid; rwlock_t dmb_ht_lock; From patchwork Thu Jan 11 12:00:30 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517317 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-119.freemail.mail.aliyun.com (out30-119.freemail.mail.aliyun.com [115.124.30.119]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 343A0171A3; Thu, 11 Jan 2024 12:01:28 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R161e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046059;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PfIj6_1704974483;
Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PfIj6_1704974483) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:25 +0800
From: Wen Gu
To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com
Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 09/15] net/smc: introduce loopback-ism statistics attributes
Date: Thu, 11 Jan 2024 20:00:30 +0800
Message-Id: <20240111120036.109903-10-guwen@linux.alibaba.com>
X-Mailer: git-send-email 2.32.0.3.g01195cf9f
In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com>
References: <20240111120036.109903-1-guwen@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: kuba@kernel.org

This introduces two statistics attributes for loopback-ism, xfer_bytes
and dmbs_cnt. They can be read from
/sys/devices/virtual/smc/loopback-ism/{xfer_bytes|dmbs_cnt}.
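
The per-CPU accounting behind these attributes can be illustrated with a small user-space sketch. It is not the kernel code: a fixed array stands in for the __percpu allocation, and every name in it (sketch_stats64, sketch_stat_add, sketch_get_stats, SKETCH_NR_CPUS) is hypothetical. It shows each CPU updating its own counters and a reader summing every u64 field across CPUs, with a decrement expressed as adding (uint64_t)-1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_NR_CPUS 4	/* hypothetical fixed CPU count */

/* Mirrors the shape of the statistics struct: only u64 fields. */
struct sketch_stats64 {
	uint64_t xfer_bytes;
	uint64_t dmbs_cnt;
};

/* Stand-in for a per-CPU allocation: one counter set per CPU. */
static struct sketch_stats64 sketch_percpu[SKETCH_NR_CPUS];

/* Add val to the counter at byte offset field_off on the given CPU.
 * A decrement is passed as (uint64_t)-1 and relies on unsigned
 * wraparound, so the cross-CPU sum still comes out right.
 */
static void sketch_stat_add(int cpu, size_t field_off, uint64_t val)
{
	uint64_t *ctr;

	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	assert(field_off % sizeof(uint64_t) == 0);
	assert(field_off + sizeof(uint64_t) <= sizeof(struct sketch_stats64));

	ctr = (uint64_t *)((uint8_t *)&sketch_percpu[cpu] + field_off);
	*ctr += val;
}

/* Sum every u64 field across all CPUs into *out. */
static void sketch_get_stats(struct sketch_stats64 *out)
{
	size_t nfields = sizeof(*out) / sizeof(uint64_t);
	uint64_t *sum = (uint64_t *)out;
	size_t i;
	int cpu;

	memset(out, 0, sizeof(*out));
	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		const uint64_t *src = (const uint64_t *)&sketch_percpu[cpu];

		for (i = 0; i < nfields; i++)
			sum[i] += src[i];
	}
}
```

A counter incremented on one CPU and decremented on another leaves a wrapped value in one per-CPU slot, but the sum across CPUs is still correct, which is why a reader has to aggregate before reporting.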
Signed-off-by: Wen Gu --- net/smc/smc_loopback.c | 74 ++++++++++++++++++++++++++++++++++++++++++ net/smc/smc_loopback.h | 22 +++++++++++++ 2 files changed, 96 insertions(+) diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index 3bf7bf5e8c96..a89dbf84aea5 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -30,6 +30,65 @@ static struct class *smc_class; static int smcd_lo_register_dev(struct smc_lo_dev *ldev); static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev); +static void smc_lo_clear_stats(struct smc_lo_dev *ldev) +{ + struct smc_lo_dev_stats64 *tmp; + int cpu; + + for_each_possible_cpu(cpu) { + tmp = per_cpu_ptr(ldev->stats, cpu); + tmp->xfer_bytes = 0; + } +} + +static void smc_lo_get_stats(struct smc_lo_dev *ldev, + struct smc_lo_dev_stats64 *stats) +{ + int size, cpu, i; + u64 *src, *sum; + + memset(stats, 0, sizeof(*stats)); + size = sizeof(*stats) / sizeof(u64); + for_each_possible_cpu(cpu) { + src = (u64 *)per_cpu_ptr(ldev->stats, cpu); + sum = (u64 *)stats; + for (i = 0; i < size; i++) + *(sum++) += *(src++); + } +} + +static ssize_t smc_lo_show_stats(struct device *dev, + struct device_attribute *attr, + char *buf, unsigned long offset) +{ + struct smc_lo_dev *ldev = + container_of(dev, struct smc_lo_dev, dev); + struct smc_lo_dev_stats64 stats; + ssize_t ret = -EINVAL; + + if (WARN_ON(offset > sizeof(struct smc_lo_dev_stats64) || + offset % sizeof(u64) != 0)) + goto out; + + smc_lo_get_stats(ldev, &stats); + ret = sysfs_emit(buf, "%llu\n", *(u64 *)(((u8 *)&stats) + offset)); +out: + return ret; +} + +/* generate a read-only statistics attribute */ +#define SMC_LO_DEVICE_ATTR_RO(name) \ +static ssize_t name##_show(struct device *dev, \ + struct device_attribute *attr, char *buf) \ +{ \ + return smc_lo_show_stats(dev, attr, buf, \ + offsetof(struct smc_lo_dev_stats64, name)); \ +} \ +static DEVICE_ATTR_RO(name) + +SMC_LO_DEVICE_ATTR_RO(xfer_bytes); +SMC_LO_DEVICE_ATTR_RO(dmbs_cnt); + static ssize_t active_show(struct 
device *dev, struct device_attribute *attr, char *buf) { @@ -67,6 +126,8 @@ static ssize_t active_store(struct device *dev, static DEVICE_ATTR_RW(active); static struct attribute *smc_lo_attrs[] = { &dev_attr_active.attr, + &dev_attr_xfer_bytes.attr, + &dev_attr_dmbs_cnt.attr, NULL, }; @@ -152,6 +213,7 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb, } hash_add(ldev->dmb_ht, &dmb_node->list, dmb_node->token); write_unlock(&ldev->dmb_ht_lock); + SMC_LO_STAT_DMBS_INC(ldev); dmb->sba_idx = dmb_node->sba_idx; dmb->dmb_tok = dmb_node->token; @@ -191,6 +253,7 @@ static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb) clear_bit(dmb_node->sba_idx, ldev->sba_idx_mask); kfree(dmb_node->cpu_addr); kfree(dmb_node); + SMC_LO_STAT_DMBS_DEC(ldev); return 0; } @@ -249,6 +312,8 @@ static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok, if (conn && !conn->killed) smcd_cdc_rx_handler(conn); + } else { + SMC_LO_STAT_XFER_BYTES(ldev, size); } return 0; } @@ -354,6 +419,7 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) mutex_unlock(&smcd_dev_list.mutex); kfree(smcd->conn); kfree(smcd); + smc_lo_clear_stats(ldev); } static int smc_lo_dev_init(struct smc_lo_dev *ldev) @@ -374,6 +440,7 @@ static void smc_lo_dev_release(struct device *dev) struct smc_lo_dev *ldev = container_of(dev, struct smc_lo_dev, dev); + free_percpu(ldev->stats); kfree(ldev); } @@ -392,6 +459,13 @@ static int smc_lo_dev_probe(void) goto destroy_class; } + ldev->stats = alloc_percpu(struct smc_lo_dev_stats64); + if (!ldev->stats) { + ret = -ENOMEM; + kfree(ldev); + goto destroy_class; + } + ldev->dev.parent = NULL; ldev->dev.class = smc_class; ldev->dev.groups = smc_lo_attr_groups; diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h index 02a522e322b4..d4572ca42f08 100644 --- a/net/smc/smc_loopback.h +++ b/net/smc/smc_loopback.h @@ -32,16 +32,38 @@ struct smc_lo_dmb_node { dma_addr_t dma_addr; }; +struct smc_lo_dev_stats64 { + __u64 
xfer_bytes; + __u64 dmbs_cnt; +}; + struct smc_lo_dev { struct smcd_dev *smcd; struct device dev; u8 active; u16 chid; struct smcd_gid local_gid; + struct smc_lo_dev_stats64 __percpu *stats; rwlock_t dmb_ht_lock; DECLARE_BITMAP(sba_idx_mask, SMC_LO_MAX_DMBS); DECLARE_HASHTABLE(dmb_ht, SMC_LO_DMBS_HASH_BITS); }; + +#define SMC_LO_STAT_SUB(ldev, key, val) \ +do { \ + struct smc_lo_dev_stats64 *_stats = (ldev)->stats; \ + this_cpu_add((*(_stats)).key, val); \ +} \ +while (0) + +#define SMC_LO_STAT_XFER_BYTES(ldev, val) \ + SMC_LO_STAT_SUB(ldev, xfer_bytes, val) + +#define SMC_LO_STAT_DMBS_INC(ldev) \ + SMC_LO_STAT_SUB(ldev, dmbs_cnt, 1) + +#define SMC_LO_STAT_DMBS_DEC(ldev) \ + SMC_LO_STAT_SUB(ldev, dmbs_cnt, -1) #endif int smc_loopback_init(void); From patchwork Thu Jan 11 12:00:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517323 X-Patchwork-Delegate: kuba@kernel.org Received: from out199-12.us.a.mail.aliyun.com (out199-12.us.a.mail.aliyun.com [47.90.199.12]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 739B83D55F; Thu, 11 Jan 2024 12:01:46 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R171e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046059;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PgOvd_1704974485; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PgOvd_1704974485) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:27 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, 
edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com
Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 10/15] net/smc: add operations to merge sndbuf with peer DMB
Date: Thu, 11 Jan 2024 20:00:31 +0800
Message-Id: <20240111120036.109903-11-guwen@linux.alibaba.com>
X-Mailer: git-send-email 2.32.0.3.g01195cf9f
In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com>
References: <20240111120036.109903-1-guwen@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: kuba@kernel.org

In some scenarios using a virtual ISM device, the sndbuf can share the
same physical memory region with the peer DMB to avoid copying data from
one side to the other. In that case the sndbuf is only a descriptor of
the shared memory and does not occupy memory of its own; it acts as a
ghost buffer.

           +----------+                    +----------+
           | socket A |                    | socket B |
           +----------+                    +----------+
                |                               |
            +--------+                      +--------+
            | sndbuf |                      |  DMB   |
            |  desc  |                      |  desc  |
            +--------+                      +--------+
                |                               |
                |                          +----v-----+
                +-------------------------->  memory  |
                                           +----------+

This patch therefore introduces three new SMC-D device operations: one
to check whether the device supports this feature, and two to attach a
ghost sndbuf to, or detach it from, the peer DMB. For now only
loopback-ism supports this.
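
The attach idea can be sketched in user space as a token lookup that hands back a descriptor pointing at the peer's memory instead of allocating a new buffer. This is only an illustration under assumptions: the table, the struct layouts, and the names sketch_register_dmb and sketch_attach_dmb are hypothetical and are not the smcd_ops implementations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DMBS 8	/* hypothetical table size */

/* A registered DMB: real memory owned by the receiving side. */
struct sketch_dmb {
	uint64_t token;
	void *cpu_addr;
	size_t len;
	int in_use;
};

/* A ghost sndbuf: a descriptor only, it owns no memory. */
struct sketch_buf_desc {
	uint64_t token;
	void *cpu_addr;
	size_t len;
};

static struct sketch_dmb sketch_dmbs[SKETCH_MAX_DMBS];

/* Record an existing memory region under a token. */
static int sketch_register_dmb(uint64_t token, void *mem, size_t len)
{
	int i;

	assert(mem != NULL && len > 0);
	for (i = 0; i < SKETCH_MAX_DMBS; i++) {
		if (!sketch_dmbs[i].in_use) {
			sketch_dmbs[i].token = token;
			sketch_dmbs[i].cpu_addr = mem;
			sketch_dmbs[i].len = len;
			sketch_dmbs[i].in_use = 1;
			return 0;
		}
	}
	return -1;	/* table full */
}

/* Fill desc so that it describes the peer's DMB; nothing is copied. */
static int sketch_attach_dmb(uint64_t token, struct sketch_buf_desc *desc)
{
	int i;

	for (i = 0; i < SKETCH_MAX_DMBS; i++) {
		if (sketch_dmbs[i].in_use && sketch_dmbs[i].token == token) {
			desc->token = token;
			desc->cpu_addr = sketch_dmbs[i].cpu_addr;
			desc->len = sketch_dmbs[i].len;
			return 0;
		}
	}
	return -1;	/* unknown token */
}
```

Because the descriptor aliases the peer's region, a write through it is immediately visible to the receiver, which is the data copy the ghost buffer avoids.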
Signed-off-by: Wen Gu --- include/net/smc.h | 3 +++ net/smc/smc_ism.c | 40 ++++++++++++++++++++++++++++++++++++++++ net/smc/smc_ism.h | 4 ++++ 3 files changed, 47 insertions(+) diff --git a/include/net/smc.h b/include/net/smc.h index 6273c3a8b24a..01387631d8a6 100644 --- a/include/net/smc.h +++ b/include/net/smc.h @@ -62,6 +62,9 @@ struct smcd_ops { int (*register_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb, void *client); int (*unregister_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb); + int (*support_dmb_nocopy)(struct smcd_dev *dev); + int (*attach_dmb)(struct smcd_dev *dev, struct smcd_dmb *dmb); + int (*detach_dmb)(struct smcd_dev *dev, u64 token); int (*add_vlan_id)(struct smcd_dev *dev, u64 vlan_id); int (*del_vlan_id)(struct smcd_dev *dev, u64 vlan_id); int (*set_vlan_required)(struct smcd_dev *dev); diff --git a/net/smc/smc_ism.c b/net/smc/smc_ism.c index 4065ebd2e43d..2d2781724932 100644 --- a/net/smc/smc_ism.c +++ b/net/smc/smc_ism.c @@ -246,6 +246,46 @@ int smc_ism_register_dmb(struct smc_link_group *lgr, int dmb_len, return rc; } +bool smc_ism_support_dmb_nocopy(struct smcd_dev *smcd) +{ + /* for now only loopback-ism supports + * merging sndbuf with peer DMB to avoid + * data copies between them. 
+ */ + return (smcd->ops->support_dmb_nocopy && + smcd->ops->support_dmb_nocopy(smcd)); +} + +int smc_ism_attach_dmb(struct smcd_dev *dev, u64 token, + struct smc_buf_desc *dmb_desc) +{ + struct smcd_dmb dmb; + int rc = 0; + + if (!dev->ops->attach_dmb) + return -EINVAL; + + memset(&dmb, 0, sizeof(dmb)); + dmb.dmb_tok = token; + rc = dev->ops->attach_dmb(dev, &dmb); + if (!rc) { + dmb_desc->sba_idx = dmb.sba_idx; + dmb_desc->token = dmb.dmb_tok; + dmb_desc->cpu_addr = dmb.cpu_addr; + dmb_desc->dma_addr = dmb.dma_addr; + dmb_desc->len = dmb.dmb_len; + } + return rc; +} + +int smc_ism_detach_dmb(struct smcd_dev *dev, u64 token) +{ + if (!dev->ops->detach_dmb) + return -EINVAL; + + return dev->ops->detach_dmb(dev, token); +} + static int smc_nl_handle_smcd_dev(struct smcd_dev *smcd, struct sk_buff *skb, struct netlink_callback *cb) diff --git a/net/smc/smc_ism.h b/net/smc/smc_ism.h index 6903cd5d4d4d..8ea5ab737c6f 100644 --- a/net/smc/smc_ism.h +++ b/net/smc/smc_ism.h @@ -48,6 +48,10 @@ int smc_ism_put_vlan(struct smcd_dev *dev, unsigned short vlan_id); int smc_ism_register_dmb(struct smc_link_group *lgr, int buf_size, struct smc_buf_desc *dmb_desc); int smc_ism_unregister_dmb(struct smcd_dev *dev, struct smc_buf_desc *dmb_desc); +bool smc_ism_support_dmb_nocopy(struct smcd_dev *smcd); +int smc_ism_attach_dmb(struct smcd_dev *dev, u64 token, + struct smc_buf_desc *dmb_desc); +int smc_ism_detach_dmb(struct smcd_dev *dev, u64 token); int smc_ism_signal_shutdown(struct smc_link_group *lgr); void smc_ism_get_system_eid(u8 **eid); u16 smc_ism_get_chid(struct smcd_dev *dev); From patchwork Thu Jan 11 12:00:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517320 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-112.freemail.mail.aliyun.com (out30-112.freemail.mail.aliyun.com [115.124.30.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No 
client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3A33F18ED6; Thu, 11 Jan 2024 12:01:37 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com
X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R911e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045170;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PgnAv_1704974487;
Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PgnAv_1704974487) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:29 +0800
From: Wen Gu
To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com
Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH net-next 11/15] net/smc: attach or detach ghost sndbuf to peer DMB
Date: Thu, 11 Jan 2024 20:00:32 +0800
Message-Id: <20240111120036.109903-12-guwen@linux.alibaba.com>
X-Mailer: git-send-email 2.32.0.3.g01195cf9f
In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com>
References: <20240111120036.109903-1-guwen@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
X-Patchwork-Delegate: kuba@kernel.org

The ghost sndbuf descriptor is created and attached to the peer DMB once
the peer token is obtained, and it is detached and freed when the
connection is freed.
Signed-off-by: Wen Gu --- net/smc/af_smc.c | 16 ++++++++++++ net/smc/smc_core.c | 61 +++++++++++++++++++++++++++++++++++++++++++++- net/smc/smc_core.h | 1 + 3 files changed, 77 insertions(+), 1 deletion(-) diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c index 189aea09b66e..96a6e5f13351 100644 --- a/net/smc/af_smc.c +++ b/net/smc/af_smc.c @@ -1437,6 +1437,14 @@ static int smc_connect_ism(struct smc_sock *smc, } smc_conn_save_peer_info(smc, aclc); + + if (smc_ism_support_dmb_nocopy(smc->conn.lgr->smcd)) { + rc = smcd_buf_attach(smc); + if (rc) { + rc = SMC_CLC_DECL_MEM; /* try to fallback */ + goto connect_abort; + } + } smc_close_init(smc); smc_rx_init(smc); smc_tx_init(smc); @@ -2541,6 +2549,14 @@ static void smc_listen_work(struct work_struct *work) mutex_unlock(&smc_server_lgr_pending); } smc_conn_save_peer_info(new_smc, cclc); + + if (ini->is_smcd && + smc_ism_support_dmb_nocopy(new_smc->conn.lgr->smcd)) { + rc = smcd_buf_attach(new_smc); + if (rc) + goto out_decl; + } + smc_listen_out_connected(new_smc); SMC_STAT_SERV_SUCC_INC(sock_net(newclcsock->sk), ini); goto out_free; diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index 95cc95458e2d..da6a8d9c81ea 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -1149,6 +1149,20 @@ static void smcr_buf_unuse(struct smc_buf_desc *buf_desc, bool is_rmb, } } +static void smcd_buf_detach(struct smc_connection *conn) +{ + struct smcd_dev *smcd = conn->lgr->smcd; + u64 peer_token = conn->peer_token; + + if (!conn->sndbuf_desc) + return; + + smc_ism_detach_dmb(smcd, peer_token); + + kfree(conn->sndbuf_desc); + conn->sndbuf_desc = NULL; +} + static void smc_buf_unuse(struct smc_connection *conn, struct smc_link_group *lgr) { @@ -1192,6 +1206,8 @@ void smc_conn_free(struct smc_connection *conn) if (lgr->is_smcd) { if (!list_empty(&lgr->list)) smc_ism_unset_conn(conn); + if (smc_ism_support_dmb_nocopy(lgr->smcd)) + smcd_buf_detach(conn); tasklet_kill(&conn->rx_tsklet); } else { smc_cdc_wait_pend_tx_wr(conn); @@ 
-1445,6 +1461,8 @@ static void smc_conn_kill(struct smc_connection *conn, bool soft) smc_sk_wake_ups(smc); if (conn->lgr->is_smcd) { smc_ism_unset_conn(conn); + if (smc_ism_support_dmb_nocopy(conn->lgr->smcd)) + smcd_buf_detach(conn); if (soft) tasklet_kill(&conn->rx_tsklet); else @@ -2458,12 +2476,18 @@ int smc_buf_create(struct smc_sock *smc, bool is_smcd) int rc; /* create send buffer */ + if (is_smcd && + smc_ism_support_dmb_nocopy(smc->conn.lgr->smcd)) + goto create_rmb; + rc = __smc_buf_create(smc, is_smcd, false); if (rc) return rc; + +create_rmb: /* create rmb */ rc = __smc_buf_create(smc, is_smcd, true); - if (rc) { + if (rc && smc->conn.sndbuf_desc) { down_write(&smc->conn.lgr->sndbufs_lock); list_del(&smc->conn.sndbuf_desc->list); up_write(&smc->conn.lgr->sndbufs_lock); @@ -2473,6 +2497,41 @@ int smc_buf_create(struct smc_sock *smc, bool is_smcd) return rc; } +int smcd_buf_attach(struct smc_sock *smc) +{ + struct smc_connection *conn = &smc->conn; + struct smcd_dev *smcd = conn->lgr->smcd; + u64 peer_token = conn->peer_token; + struct smc_buf_desc *buf_desc; + int rc; + + buf_desc = kzalloc(sizeof(*buf_desc), GFP_KERNEL); + if (!buf_desc) + return -ENOMEM; + + /* The ghost sndbuf_desc describes the same memory region as + * peer RMB. Its lifecycle is consistent with the connection's + * and it will be freed with the connections instead of the + * link group. 
+ */ + rc = smc_ism_attach_dmb(smcd, peer_token, buf_desc); + if (rc) + goto free; + + smc->sk.sk_sndbuf = buf_desc->len; + buf_desc->cpu_addr = + (u8 *)buf_desc->cpu_addr + sizeof(struct smcd_cdc_msg); + buf_desc->len -= sizeof(struct smcd_cdc_msg); + conn->sndbuf_desc = buf_desc; + conn->sndbuf_desc->used = 1; + atomic_set(&conn->sndbuf_space, conn->sndbuf_desc->len); + return 0; + +free: + kfree(buf_desc); + return rc; +} + static inline int smc_rmb_reserve_rtoken_idx(struct smc_link_group *lgr) { int i; diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h index 1f175376037b..d93cf51dbd7c 100644 --- a/net/smc/smc_core.h +++ b/net/smc/smc_core.h @@ -557,6 +557,7 @@ void smc_smcd_terminate(struct smcd_dev *dev, struct smcd_gid *peer_gid, void smc_smcd_terminate_all(struct smcd_dev *dev); void smc_smcr_terminate_all(struct smc_ib_device *smcibdev); int smc_buf_create(struct smc_sock *smc, bool is_smcd); +int smcd_buf_attach(struct smc_sock *smc); int smc_uncompress_bufsize(u8 compressed); int smc_rmb_rtoken_handling(struct smc_connection *conn, struct smc_link *link, struct smc_clc_msg_accept_confirm *clc); From patchwork Thu Jan 11 12:00:33 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517322 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-112.freemail.mail.aliyun.com (out30-112.freemail.mail.aliyun.com [115.124.30.112]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F33A92836B; Thu, 11 Jan 2024 12:01:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: 
AC=PASS;BC=-1|-1;BR=01201311R161e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046059;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PgOwg_1704974489; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-PgOwg_1704974489) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:31 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 12/15] net/smc: adapt cursor update when sndbuf and peer DMB are merged Date: Thu, 11 Jan 2024 20:00:33 +0800 Message-Id: <20240111120036.109903-13-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org Since the ghost sndbuf shares the same physical memory with the peer DMB, the cursor update processing needs to be adapted to ensure that data not yet consumed won't be overwritten. So in this case, the fin_curs and sndbuf_space, which were originally updated right after sending the CDC message, should not be updated until the peer updates cons_curs. 
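The deferred accounting described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: the `ghost_tx` structure and its helpers are hypothetical names that do not exist in the kernel, cursors are modelled as monotonically increasing byte counts rather than the kernel's wrap/count pairs, and writing and sending are folded into one step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a ghost sndbuf that aliases the peer DMB. */
struct ghost_tx {
	size_t len;   /* size of the shared buffer */
	size_t sent;  /* bytes written so far (role of tx_curs_sent) */
	size_t fin;   /* bytes the peer has consumed (role of tx_curs_fin) */
	size_t space; /* bytes that may still be written (role of sndbuf_space) */
};

static void ghost_tx_init(struct ghost_tx *tx, size_t len)
{
	tx->len = len;
	tx->sent = 0;
	tx->fin = 0;
	tx->space = len;
}

/* Write up to n bytes. Space is consumed here but NOT given back here,
 * because the peer may not have read the data yet.
 */
static size_t ghost_tx_send(struct ghost_tx *tx, size_t n)
{
	if (n > tx->space)
		n = tx->space;
	tx->sent += n;
	tx->space -= n;
	return n;
}

/* The peer reports its consumer cursor; only now is space returned. */
static void ghost_tx_peer_consumed(struct ghost_tx *tx, size_t cons)
{
	size_t diff;

	if (cons > tx->sent)
		cons = tx->sent; /* cannot consume more than was sent */
	if (cons <= tx->fin)
		return;          /* nothing new was consumed */
	diff = cons - tx->fin;
	tx->space += diff;
	tx->fin = cons;
	/* invariant: 0 <= space <= len */
	assert(tx->space <= tx->len);
}
```

In this model a sender that fills the buffer stays blocked until `ghost_tx_peer_consumed()` runs, which mirrors the intent that unread data in the shared region is never overwritten.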
Signed-off-by: Wen Gu --- net/smc/smc_cdc.c | 52 +++++++++++++++++++++++++++++++++++++---------- 1 file changed, 41 insertions(+), 11 deletions(-) diff --git a/net/smc/smc_cdc.c b/net/smc/smc_cdc.c index c820ef197610..e938fe3bcc7c 100644 --- a/net/smc/smc_cdc.c +++ b/net/smc/smc_cdc.c @@ -18,6 +18,7 @@ #include "smc_tx.h" #include "smc_rx.h" #include "smc_close.h" +#include "smc_ism.h" /********************************** send *************************************/ @@ -255,17 +256,25 @@ int smcd_cdc_msg_send(struct smc_connection *conn) return rc; smc_curs_copy(&conn->rx_curs_confirmed, &curs, conn); conn->local_rx_ctrl.prod_flags.cons_curs_upd_req = 0; - /* Calculate transmitted data and increment free send buffer space */ - diff = smc_curs_diff(conn->sndbuf_desc->len, &conn->tx_curs_fin, - &conn->tx_curs_sent); - /* increased by confirmed number of bytes */ - smp_mb__before_atomic(); - atomic_add(diff, &conn->sndbuf_space); - /* guarantee 0 <= sndbuf_space <= sndbuf_desc->len */ - smp_mb__after_atomic(); - smc_curs_copy(&conn->tx_curs_fin, &conn->tx_curs_sent, conn); + if (!smc_ism_support_dmb_nocopy(conn->lgr->smcd)) { + /* Ghost sndbuf shares the same memory region with + * peer DMB, so don't update the tx_curs_fin and + * sndbuf_space until peer has consumed the data. 
+ */ + /* Calculate transmitted data and increment free + * send buffer space + */ + diff = smc_curs_diff(conn->sndbuf_desc->len, &conn->tx_curs_fin, + &conn->tx_curs_sent); + /* increased by confirmed number of bytes */ + smp_mb__before_atomic(); + atomic_add(diff, &conn->sndbuf_space); + /* guarantee 0 <= sndbuf_space <= sndbuf_desc->len */ + smp_mb__after_atomic(); + smc_curs_copy(&conn->tx_curs_fin, &conn->tx_curs_sent, conn); - smc_tx_sndbuf_nonfull(smc); + smc_tx_sndbuf_nonfull(smc); + } return rc; } @@ -323,7 +332,7 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc, { union smc_host_cursor cons_old, prod_old; struct smc_connection *conn = &smc->conn; - int diff_cons, diff_prod; + int diff_cons, diff_prod, diff_tx; smc_curs_copy(&prod_old, &conn->local_rx_ctrl.prod, conn); smc_curs_copy(&cons_old, &conn->local_rx_ctrl.cons, conn); @@ -339,6 +348,27 @@ static void smc_cdc_msg_recv_action(struct smc_sock *smc, atomic_add(diff_cons, &conn->peer_rmbe_space); /* guarantee 0 <= peer_rmbe_space <= peer_rmbe_size */ smp_mb__after_atomic(); + + if (conn->lgr->is_smcd && + smc_ism_support_dmb_nocopy(conn->lgr->smcd)) { + /* Ghost sndbuf shares the same memory region with + * peer RMB, so update tx_curs_fin and sndbuf_space + * when peer has consumed the data. 
+ */ + /* calculate peer rmb consumed data */ + diff_tx = smc_curs_diff(conn->sndbuf_desc->len, + &conn->tx_curs_fin, + &conn->local_rx_ctrl.cons); + /* increase local sndbuf space and fin_curs */ + smp_mb__before_atomic(); + atomic_add(diff_tx, &conn->sndbuf_space); + /* guarantee 0 <= sndbuf_space <= sndbuf_desc->len */ + smp_mb__after_atomic(); + smc_curs_copy(&conn->tx_curs_fin, + &conn->local_rx_ctrl.cons, conn); + + smc_tx_sndbuf_nonfull(smc); + } } diff_prod = smc_curs_diff(conn->rmb_desc->len, &prod_old, From patchwork Thu Jan 11 12:00:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517318 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-113.freemail.mail.aliyun.com (out30-113.freemail.mail.aliyun.com [115.124.30.113]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 526B71802D; Thu, 11 Jan 2024 12:01:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R801e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046049;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-Pewe3_1704974491; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-Pewe3_1704974491) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:33 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, 
netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 13/15] net/smc: introduce loopback-ism DMB type control Date: Thu, 11 Jan 2024 20:00:34 +0800 Message-Id: <20240111120036.109903-14-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org This provides a way to {get|set} the type of DMB offered by loopback-ism, i.e. whether it is physically or virtually contiguous memory. echo 0 > /sys/devices/virtual/smc/loopback-ism/dmb_type # physically echo 1 > /sys/devices/virtual/smc/loopback-ism/dmb_type # virtually The settings take effect after re-activating loopback-ism by: echo 0 > /sys/devices/virtual/smc/loopback-ism/active echo 1 > /sys/devices/virtual/smc/loopback-ism/active After this, the link group and DMBs related to loopback-ism will be flushed and subsequent DMBs created will be of the desired type. The motivation for this control is that a physically contiguous DMB has the best performance but is usually expensive to allocate, while a virtually contiguous DMB is cheap and performs well in most scenarios. However, if sndbuf and DMB are merged, the virtual DMB will be accessed concurrently in Tx and Rx and there will be a bottleneck caused by lock contention in find_vmap_area when there are many CPUs and CONFIG_HARDENED_USERCOPY is set (see link below). So an option is provided. 
Link: https://lore.kernel.org/all/238e63cd-e0e8-4fbf-852f-bc4d5bc35d5a@linux.alibaba.com/ Signed-off-by: Wen Gu --- net/smc/smc_loopback.c | 80 +++++++++++++++++++++++++++++++++++------- net/smc/smc_loopback.h | 6 ++++ 2 files changed, 74 insertions(+), 12 deletions(-) diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index a89dbf84aea5..2e734f8e08f5 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -13,6 +13,7 @@ #include #include +#include #include #include "smc_cdc.h" @@ -24,6 +25,7 @@ #define SMC_DMA_ADDR_INVALID (~(dma_addr_t)0) static const char smc_lo_dev_name[] = "loopback-ism"; +static unsigned int smc_lo_dmb_type = SMC_LO_DMB_PHYS; static struct smc_lo_dev *lo_dev; static struct class *smc_class; @@ -124,8 +126,50 @@ static ssize_t active_store(struct device *dev, return count; } static DEVICE_ATTR_RW(active); + +static ssize_t dmb_type_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct smc_lo_dev *ldev = + container_of(dev, struct smc_lo_dev, dev); + const char *type; + + switch (ldev->dmb_type) { + case SMC_LO_DMB_PHYS: + type = "Physically contiguous buffer"; + break; + case SMC_LO_DMB_VIRT: + type = "Virtually contiguous buffer"; + break; + default: + type = "Unknown type"; + } + + return sysfs_emit(buf, "%d: %s\n", ldev->dmb_type, type); +} + +static ssize_t dmb_type_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + unsigned int dmb_type; + int ret; + + ret = kstrtouint(buf, 0, &dmb_type); + if (ret) + return ret; + + if (dmb_type != SMC_LO_DMB_PHYS && + dmb_type != SMC_LO_DMB_VIRT) + return -EINVAL; + + smc_lo_dmb_type = dmb_type; /* re-activate to take effect */ + return count; +} +static DEVICE_ATTR_RW(dmb_type); static struct attribute *smc_lo_attrs[] = { &dev_attr_active.attr, + &dev_attr_dmb_type.attr, &dev_attr_xfer_bytes.attr, &dev_attr_dmbs_cnt.attr, NULL, @@ -170,8 +214,7 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct 
smcd_dmb *dmb, { struct smc_lo_dmb_node *dmb_node, *tmp_node; struct smc_lo_dev *ldev = smcd->priv; - int sba_idx, order, rc; - struct page *pages; + int sba_idx, rc; /* check space for new dmb */ for_each_clear_bit(sba_idx, ldev->sba_idx_mask, SMC_LO_MAX_DMBS) { @@ -188,16 +231,27 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb, } dmb_node->sba_idx = sba_idx; - order = get_order(dmb->dmb_len); - pages = alloc_pages(GFP_KERNEL | __GFP_NOWARN | - __GFP_NOMEMALLOC | __GFP_COMP | - __GFP_NORETRY | __GFP_ZERO, - order); - if (!pages) { - rc = -ENOMEM; - goto err_node; + if (ldev->dmb_type == SMC_LO_DMB_PHYS) { + struct page *pages; + int order; + + order = get_order(dmb->dmb_len); + pages = alloc_pages(GFP_KERNEL | __GFP_NOWARN | + __GFP_NOMEMALLOC | __GFP_COMP | + __GFP_NORETRY | __GFP_ZERO, + order); + if (!pages) { + rc = -ENOMEM; + goto err_node; + } + dmb_node->cpu_addr = (void *)page_address(pages); + } else { + dmb_node->cpu_addr = vzalloc(dmb->dmb_len); + if (!dmb_node->cpu_addr) { + rc = -ENOMEM; + goto err_node; + } } - dmb_node->cpu_addr = (void *)page_address(pages); dmb_node->len = dmb->dmb_len; dmb_node->dma_addr = SMC_DMA_ADDR_INVALID; @@ -251,7 +305,7 @@ static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb) write_unlock(&ldev->dmb_ht_lock); clear_bit(dmb_node->sba_idx, ldev->sba_idx_mask); - kfree(dmb_node->cpu_addr); + kvfree(dmb_node->cpu_addr); kfree(dmb_node); SMC_LO_STAT_DMBS_DEC(ldev); @@ -396,6 +450,7 @@ static int smcd_lo_register_dev(struct smc_lo_dev *ldev) ldev->smcd = smcd; smcd->priv = ldev; smc_ism_set_v2_capable(); + ldev->dmb_type = smc_lo_dmb_type; mutex_lock(&smcd_dev_list.mutex); list_add(&smcd->list, &smcd_dev_list.list); mutex_unlock(&smcd_dev_list.mutex); @@ -419,6 +474,7 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) mutex_unlock(&smcd_dev_list.mutex); kfree(smcd->conn); kfree(smcd); + ldev->dmb_type = smc_lo_dmb_type; smc_lo_clear_stats(ldev); } diff --git 
a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h index d4572ca42f08..8ee5c6805fc4 100644 --- a/net/smc/smc_loopback.h +++ b/net/smc/smc_loopback.h @@ -23,6 +23,11 @@ #define SMC_LO_DMBS_HASH_BITS 12 #define SMC_LO_CHID 0xFFFF +enum { + SMC_LO_DMB_PHYS, + SMC_LO_DMB_VIRT, +}; + struct smc_lo_dmb_node { struct hlist_node list; u64 token; @@ -41,6 +46,7 @@ struct smc_lo_dev { struct smcd_dev *smcd; struct device dev; u8 active; + u8 dmb_type; u16 chid; struct smcd_gid local_gid; struct smc_lo_dev_stats64 __percpu *stats; From patchwork Thu Jan 11 12:00:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517319 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-99.freemail.mail.aliyun.com (out30-99.freemail.mail.aliyun.com [115.124.30.99]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A8F5015AC3; Thu, 11 Jan 2024 12:01:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R741e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018045192;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-Pewec_1704974493; Received: from localhost(mailfrom:guwen@linux.alibaba.com fp:SMTPD_---0W-Pewec_1704974493) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:35 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, 
netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 14/15] net/smc: introduce loopback-ism DMB data copy control Date: Thu, 11 Jan 2024 20:00:35 +0800 Message-Id: <20240111120036.109903-15-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org This provides a way to {get|set} whether loopback-ism device supports merging sndbuf with peer DMB to eliminate data copies between them. echo 0 > /sys/devices/virtual/smc/loopback-ism/dmb_copy # support echo 1 > /sys/devices/virtual/smc/loopback-ism/dmb_copy # not support The settings take effect after re-activating loopback-ism by: echo 0 > /sys/devices/virtual/smc/loopback-ism/active echo 1 > /sys/devices/virtual/smc/loopback-ism/active After this, the link group related to loopback-ism will be flushed and the sndbufs of subsequent connections will be merged or not merged with peer DMB. The motivation of this control is that the bandwidth will be highly improved when sndbuf and DMB are merged, but when virtually contiguous DMB is provided and merged with sndbuf, it will be concurrently accessed on Tx and Rx, then there will be a bottleneck caused by lock contention of find_vmap_area when there are many CPUs and CONFIG_HARDENED_USERCOPY is set (see link below). So an option is provided. 
Link: https://lore.kernel.org/all/238e63cd-e0e8-4fbf-852f-bc4d5bc35d5a@linux.alibaba.com/ Signed-off-by: Wen Gu --- net/smc/smc_loopback.c | 46 ++++++++++++++++++++++++++++++++++++++++++ net/smc/smc_loopback.h | 8 +++++++- 2 files changed, 53 insertions(+), 1 deletion(-) diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index 2e734f8e08f5..bfbb346ef01a 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -26,6 +26,7 @@ static const char smc_lo_dev_name[] = "loopback-ism"; static unsigned int smc_lo_dmb_type = SMC_LO_DMB_PHYS; +static unsigned int smc_lo_dmb_copy = SMC_LO_DMB_NOCOPY; static struct smc_lo_dev *lo_dev; static struct class *smc_class; @@ -167,9 +168,52 @@ static ssize_t dmb_type_store(struct device *dev, return count; } static DEVICE_ATTR_RW(dmb_type); + +static ssize_t dmb_copy_show(struct device *dev, + struct device_attribute *attr, char *buf) +{ + struct smc_lo_dev *ldev = + container_of(dev, struct smc_lo_dev, dev); + const char *copy; + + switch (ldev->dmb_copy) { + case SMC_LO_DMB_NOCOPY: + copy = "sndbuf and DMB merged and no data copied"; + break; + case SMC_LO_DMB_COPY: + copy = "sndbuf and DMB separated and data copied"; + break; + default: + copy = "Unknown setting"; + } + + return sysfs_emit(buf, "%d: %s\n", ldev->dmb_copy, copy); +} + +static ssize_t dmb_copy_store(struct device *dev, + struct device_attribute *attr, + const char *buf, size_t count) +{ + unsigned int dmb_copy; + int ret; + + ret = kstrtouint(buf, 0, &dmb_copy); + if (ret) + return ret; + + if (dmb_copy != SMC_LO_DMB_NOCOPY && + dmb_copy != SMC_LO_DMB_COPY) + return -EINVAL; + + smc_lo_dmb_copy = dmb_copy; /* re-activate to take effect */ + return count; +} +static DEVICE_ATTR_RW(dmb_copy); + static struct attribute *smc_lo_attrs[] = { &dev_attr_active.attr, &dev_attr_dmb_type.attr, + &dev_attr_dmb_copy.attr, &dev_attr_xfer_bytes.attr, &dev_attr_dmbs_cnt.attr, NULL, @@ -451,6 +495,7 @@ static int smcd_lo_register_dev(struct smc_lo_dev *ldev) 
smcd->priv = ldev; smc_ism_set_v2_capable(); ldev->dmb_type = smc_lo_dmb_type; + ldev->dmb_copy = smc_lo_dmb_copy; mutex_lock(&smcd_dev_list.mutex); list_add(&smcd->list, &smcd_dev_list.list); mutex_unlock(&smcd_dev_list.mutex); @@ -475,6 +520,7 @@ static void smcd_lo_unregister_dev(struct smc_lo_dev *ldev) kfree(smcd->conn); kfree(smcd); ldev->dmb_type = smc_lo_dmb_type; + ldev->dmb_copy = smc_lo_dmb_copy; smc_lo_clear_stats(ldev); } diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h index 8ee5c6805fc4..7ecb4a35eb36 100644 --- a/net/smc/smc_loopback.h +++ b/net/smc/smc_loopback.h @@ -28,6 +28,11 @@ enum { SMC_LO_DMB_VIRT, }; +enum { + SMC_LO_DMB_NOCOPY, + SMC_LO_DMB_COPY, +}; + struct smc_lo_dmb_node { struct hlist_node list; u64 token; @@ -45,7 +50,8 @@ struct smc_lo_dev_stats64 { struct smc_lo_dev { struct smcd_dev *smcd; struct device dev; - u8 active; + u8 active : 1; + u8 dmb_copy : 1; u8 dmb_type; u16 chid; struct smcd_gid local_gid; From patchwork Thu Jan 11 12:00:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Gu X-Patchwork-Id: 13517321 X-Patchwork-Delegate: kuba@kernel.org Received: from out30-118.freemail.mail.aliyun.com (out30-118.freemail.mail.aliyun.com [115.124.30.118]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B4FCC22093; Thu, 11 Jan 2024 12:01:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.alibaba.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.alibaba.com X-Alimail-AntiSpam: AC=PASS;BC=-1|-1;BR=01201311R421e4;CH=green;DM=||false|;DS=||;FP=0|-1|-1|-1|0|-1|-1|-1;HT=ay29a033018046059;MF=guwen@linux.alibaba.com;NM=1;PH=DS;RN=18;SR=0;TI=SMTPD_---0W-PfInN_1704974495; Received: from localhost(mailfrom:guwen@linux.alibaba.com 
fp:SMTPD_---0W-PfInN_1704974495) by smtp.aliyun-inc.com; Thu, 11 Jan 2024 20:01:36 +0800 From: Wen Gu To: wintera@linux.ibm.com, wenjia@linux.ibm.com, hca@linux.ibm.com, gor@linux.ibm.com, agordeev@linux.ibm.com, davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, jaka@linux.ibm.com Cc: borntraeger@linux.ibm.com, svens@linux.ibm.com, alibuda@linux.alibaba.com, tonylu@linux.alibaba.com, guwen@linux.alibaba.com, linux-s390@vger.kernel.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH net-next 15/15] net/smc: implement DMB-merged operations of loopback-ism Date: Thu, 11 Jan 2024 20:00:36 +0800 Message-Id: <20240111120036.109903-16-guwen@linux.alibaba.com> X-Mailer: git-send-email 2.32.0.3.g01195cf9f In-Reply-To: <20240111120036.109903-1-guwen@linux.alibaba.com> References: <20240111120036.109903-1-guwen@linux.alibaba.com> Precedence: bulk X-Mailing-List: netdev@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Patchwork-Delegate: kuba@kernel.org This implements operations related to merging sndbuf with peer DMB in loopback-ism. The DMB won't be unregistered until no sndbuf is attached to it. 
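The lifetime rule stated above (a DMB outlives every sndbuf attached to it) can be modelled in a few lines of user-space C. This is a single-threaded sketch with hypothetical names (`dmb_model`, `dmb_attach`, and so on), not the patch's code. The patch itself uses `refcount_t` and blocks in unregister with `wait_event()` until the count reaches zero, whereas this model simply releases the buffer when the last reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one DMB and its reference count. */
struct dmb_model {
	unsigned int refcnt; /* 1 for the owner plus 1 per attached sndbuf */
	bool freed;          /* set once the backing memory is released */
};

/* Registering the DMB takes the owner's reference. */
static void dmb_register(struct dmb_model *d)
{
	d->refcnt = 1;
	d->freed = false;
}

/* Attach a peer sndbuf. Fails once the count has already dropped to
 * zero, in the spirit of refcount_inc_not_zero().
 */
static bool dmb_attach(struct dmb_model *d)
{
	if (d->refcnt == 0)
		return false;
	d->refcnt++;
	return true;
}

/* Drop one reference; the last one releases the buffer. */
static void dmb_put(struct dmb_model *d)
{
	assert(d->refcnt > 0);
	if (--d->refcnt == 0)
		d->freed = true;
}

/* The owner unregisters: only the owner's reference is dropped. */
static void dmb_unregister(struct dmb_model *d)
{
	dmb_put(d);
}

/* A peer sndbuf detaches: its reference is dropped. */
static void dmb_detach(struct dmb_model *d)
{
	dmb_put(d);
}
```

With one sndbuf attached, `dmb_unregister()` alone leaves the buffer allocated, and it is released only after the matching `dmb_detach()`.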
Signed-off-by: Wen Gu --- net/smc/smc_loopback.c | 101 +++++++++++++++++++++++++++++++++++++++-- net/smc/smc_loopback.h | 4 ++ 2 files changed, 102 insertions(+), 3 deletions(-) diff --git a/net/smc/smc_loopback.c b/net/smc/smc_loopback.c index bfbb346ef01a..296a4d1f1a33 100644 --- a/net/smc/smc_loopback.c +++ b/net/smc/smc_loopback.c @@ -298,6 +298,7 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb, } dmb_node->len = dmb->dmb_len; dmb_node->dma_addr = SMC_DMA_ADDR_INVALID; + refcount_set(&dmb_node->refcnt, 1); again: /* add new dmb into hash table */ @@ -311,6 +312,7 @@ static int smc_lo_register_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb, } hash_add(ldev->dmb_ht, &dmb_node->list, dmb_node->token); write_unlock(&ldev->dmb_ht_lock); + atomic_inc(&ldev->dmb_cnt); SMC_LO_STAT_DMBS_INC(ldev); dmb->sba_idx = dmb_node->sba_idx; @@ -333,8 +335,8 @@ static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb) struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node; struct smc_lo_dev *ldev = smcd->priv; - /* remove dmb from hash table */ - write_lock(&ldev->dmb_ht_lock); + /* find dmb from hash table */ + read_lock(&ldev->dmb_ht_lock); hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb->dmb_tok) { if (tmp_node->token == dmb->dmb_tok) { dmb_node = tmp_node; @@ -342,9 +344,18 @@ static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb) } } if (!dmb_node) { - write_unlock(&ldev->dmb_ht_lock); + read_unlock(&ldev->dmb_ht_lock); return -EINVAL; } + read_unlock(&ldev->dmb_ht_lock); + + /* wait for peer sndbuf to detach from this dmb */ + if (!refcount_dec_and_test(&dmb_node->refcnt)) + wait_event(ldev->dmbs_release, + !refcount_read(&dmb_node->refcnt)); + + /* remove dmb from hash table */ + write_lock(&ldev->dmb_ht_lock); hash_del(&dmb_node->list); write_unlock(&ldev->dmb_ht_lock); @@ -353,6 +364,73 @@ static int smc_lo_unregister_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb) kfree(dmb_node); 
SMC_LO_STAT_DMBS_DEC(ldev); + if (atomic_dec_and_test(&ldev->dmb_cnt)) + wake_up(&ldev->ldev_release); + return 0; +} + +static int smc_lo_support_dmb_nocopy(struct smcd_dev *smcd) +{ + struct smc_lo_dev *ldev = smcd->priv; + + return (ldev->dmb_copy == SMC_LO_DMB_NOCOPY); +} + +static int smc_lo_attach_dmb(struct smcd_dev *smcd, struct smcd_dmb *dmb) +{ + struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node; + struct smc_lo_dev *ldev = smcd->priv; + + /* find dmb_node according to dmb->dmb_tok */ + read_lock(&ldev->dmb_ht_lock); + hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb->dmb_tok) { + if (tmp_node->token == dmb->dmb_tok) { + dmb_node = tmp_node; + break; + } + } + if (!dmb_node) { + read_unlock(&ldev->dmb_ht_lock); + return -EINVAL; + } + read_unlock(&ldev->dmb_ht_lock); + + if (!refcount_inc_not_zero(&dmb_node->refcnt)) + /* the dmb is being unregistered, but has + * not been removed from the hash table. + */ + return -EINVAL; + + /* provide dmb information */ + dmb->sba_idx = dmb_node->sba_idx; + dmb->dmb_tok = dmb_node->token; + dmb->cpu_addr = dmb_node->cpu_addr; + dmb->dma_addr = dmb_node->dma_addr; + dmb->dmb_len = dmb_node->len; + return 0; +} + +static int smc_lo_detach_dmb(struct smcd_dev *smcd, u64 token) +{ + struct smc_lo_dmb_node *dmb_node = NULL, *tmp_node; + struct smc_lo_dev *ldev = smcd->priv; + + /* find dmb_node according to dmb->dmb_tok */ + read_lock(&ldev->dmb_ht_lock); + hash_for_each_possible(ldev->dmb_ht, tmp_node, list, token) { + if (tmp_node->token == token) { + dmb_node = tmp_node; + break; + } + } + if (!dmb_node) { + read_unlock(&ldev->dmb_ht_lock); + return -EINVAL; + } + read_unlock(&ldev->dmb_ht_lock); + + if (refcount_dec_and_test(&dmb_node->refcnt)) + wake_up_all(&ldev->dmbs_release); return 0; } @@ -389,6 +467,14 @@ static int smc_lo_move_data(struct smcd_dev *smcd, u64 dmb_tok, struct smc_lo_dmb_node *rmb_node = NULL, *tmp_node; struct smc_lo_dev *ldev = smcd->priv; + /* if sndbuf is merged with peer DMB, there 
is + * no need to copy data from sndbuf to peer DMB. + */ + if (!sf && smc_lo_support_dmb_nocopy(smcd)) { + SMC_LO_STAT_XFER_BYTES(ldev, size); + return 0; + } + read_lock(&ldev->dmb_ht_lock); hash_for_each_possible(ldev->dmb_ht, tmp_node, list, dmb_tok) { if (tmp_node->token == dmb_tok) { @@ -444,6 +530,9 @@ static const struct smcd_ops lo_ops = { .query_remote_gid = smc_lo_query_rgid, .register_dmb = smc_lo_register_dmb, .unregister_dmb = smc_lo_unregister_dmb, + .support_dmb_nocopy = smc_lo_support_dmb_nocopy, + .attach_dmb = smc_lo_attach_dmb, + .detach_dmb = smc_lo_detach_dmb, .add_vlan_id = smc_lo_add_vlan_id, .del_vlan_id = smc_lo_del_vlan_id, .set_vlan_required = smc_lo_set_vlan_required, @@ -529,12 +618,18 @@ static int smc_lo_dev_init(struct smc_lo_dev *ldev) smc_lo_generate_id(ldev); rwlock_init(&ldev->dmb_ht_lock); hash_init(ldev->dmb_ht); + atomic_set(&ldev->dmb_cnt, 0); + init_waitqueue_head(&ldev->dmbs_release); + init_waitqueue_head(&ldev->ldev_release); + return smcd_lo_register_dev(ldev); } static void smc_lo_dev_exit(struct smc_lo_dev *ldev) { smcd_lo_unregister_dev(ldev); + if (atomic_read(&ldev->dmb_cnt)) + wait_event(ldev->ldev_release, !atomic_read(&ldev->dmb_cnt)); } static void smc_lo_dev_release(struct device *dev) diff --git a/net/smc/smc_loopback.h b/net/smc/smc_loopback.h index 7ecb4a35eb36..19a1eace2255 100644 --- a/net/smc/smc_loopback.h +++ b/net/smc/smc_loopback.h @@ -40,6 +40,7 @@ struct smc_lo_dmb_node { u32 sba_idx; void *cpu_addr; dma_addr_t dma_addr; + refcount_t refcnt; }; struct smc_lo_dev_stats64 { @@ -56,9 +57,12 @@ struct smc_lo_dev { u16 chid; struct smcd_gid local_gid; struct smc_lo_dev_stats64 __percpu *stats; + atomic_t dmb_cnt; rwlock_t dmb_ht_lock; DECLARE_BITMAP(sba_idx_mask, SMC_LO_MAX_DMBS); DECLARE_HASHTABLE(dmb_ht, SMC_LO_DMBS_HASH_BITS); + wait_queue_head_t dmbs_release; + wait_queue_head_t ldev_release; }; #define SMC_LO_STAT_SUB(ldev, key, val) \