From patchwork Wed Feb 26 14:17:27 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13992514 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A4BCA22171D for ; Wed, 26 Feb 2025 14:17:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579473; cv=none; b=BgnTBcqlEWs2s7+ehcm+9K7jtakq3YwRiOyzwBfH1vrV5SLxjwT+SpbQS1kF2et8+AO1NmDG20tJ53f8wRF01B1O762pm1Uk839jhUZVA8VsX1elpMaO21Sj3LTwPWkI0D855hhaNmlqDfBJ5P9ap+ShwLQdsJPTBGiL0BIe1GM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579473; c=relaxed/simple; bh=aHe8IYr/kBEGyQ7LM0iqt6dbzdnEOsEwl8WQVXE1cYA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WRMz80kotiRcMfHuqQaxPUmwebvsAWCCnZGroQkAikJmNTKKhatEiIqWRTF0256KeDtFtmfMA4vV57vW9SWWOLsiYRd2nkW5VuhdLW2IxZIIQbKGnrGxEm5eyRa28/ghVO2DlBvRO8Mnyf61W+8ckA03lzxYGApNtGmddn8SUPw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=ATCkGde0; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="ATCkGde0" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9CBD6C4CED6; Wed, 26 Feb 2025 14:17:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1740579473; bh=aHe8IYr/kBEGyQ7LM0iqt6dbzdnEOsEwl8WQVXE1cYA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ATCkGde0z+pKl1/Fl6JzH3P0ZmnpvhznVm7IyAEZPBeISt3SUCyDmwUuOF67VILeh 73MkjHxSKu7UQNg/AUm8cK1Pero88Zei8eTyPr6sqr4ZhpPjlafaVRC9zosNH7NIka OuHwRBRAOH1Q0vpdc647zTDDPH4A9Ak+Kq59Kxuw9Fe8MozLvo/wRusZUdMXz/e9Jp OhAg+ohMqTbXFmfFE4b1k2hi9PNk+68oBA8BIXm/3E+l28pkEo4JwPjokn6Fj9eGsj uFmSvRe3oJxfqcN9FYLekN2Ug4vpxm7X14xUe7xGQA8BYJQZU1bayEzD2wtTJ8qZMW 0SY2R/129YmVA== From: Leon Romanovsky To: Jason Gunthorpe Cc: Chiara Meiohas , linux-rdma@vger.kernel.org, Yishai Hadas Subject: [PATCH rdma-next 1/6] RDMA/uverbs: Introduce UCAP (User CAPabilities) API Date: Wed, 26 Feb 2025 16:17:27 +0200 Message-ID: <558b28bc07d2067478ec638da87e01a551caa367.1740574943.git.leon@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Chiara Meiohas Implement a new User CAPabilities (UCAP) API to provide fine-grained control over specific firmware features. This approach offers more granular capabilities than the existing Linux capabilities, which may be too generic for certain FW features. This mechanism represents each capability as a character device with root read-write access. Root processes can grant users special privileges by allowing access to these character devices (e.g., using chown). UCAP character devices are located in /dev/infiniband and the class path is /sys/class/infiniband_ucaps. Signed-off-by: Chiara Meiohas Reviewed-by: Yishai Hadas Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/Makefile | 3 +- drivers/infiniband/core/ucaps.c | 255 ++++++++++++++++++++++++++ drivers/infiniband/core/uverbs_main.c | 2 + include/rdma/ib_ucaps.h | 25 +++ 4 files changed, 284 insertions(+), 1 deletion(-) create mode 100644 drivers/infiniband/core/ucaps.c create mode 100644 include/rdma/ib_ucaps.h diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile index 8ab4eea5a0a5..d49ded7e95f0 100644 --- a/drivers/infiniband/core/Makefile +++ b/drivers/infiniband/core/Makefile @@ -39,6 +39,7 @@ ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \ uverbs_std_types_async_fd.o \ uverbs_std_types_srq.o \ uverbs_std_types_wq.o \ - uverbs_std_types_qp.o + uverbs_std_types_qp.o \ + ucaps.o ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o umem_dmabuf.o ib_uverbs-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o diff --git a/drivers/infiniband/core/ucaps.c b/drivers/infiniband/core/ucaps.c new file mode 100644 index 000000000000..82b1184891ed --- /dev/null +++ b/drivers/infiniband/core/ucaps.c @@ -0,0 +1,255 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* + * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved + */ + +#include +#include +#include +#include +#include + +#define RDMA_UCAP_FIRST RDMA_UCAP_MLX5_CTRL_LOCAL + +static DEFINE_MUTEX(ucaps_mutex); +static struct ib_ucap *ucaps_list[RDMA_UCAP_MAX]; +static bool ucaps_class_is_registered; +static dev_t ucaps_base_dev; + +struct ib_ucap { + struct cdev cdev; + struct device dev; + int refcount; +}; + +static const char *ucap_names[RDMA_UCAP_MAX] = { + [RDMA_UCAP_MLX5_CTRL_LOCAL] = "mlx5_perm_ctrl_local", + [RDMA_UCAP_MLX5_CTRL_OTHER_VHCA] = "mlx5_perm_ctrl_other_vhca" +}; + +static char *ucaps_devnode(const struct device *dev, umode_t *mode) +{ + if (mode) + *mode = 0600; + + return kasprintf(GFP_KERNEL, "infiniband/%s", dev_name(dev)); +} + +static const struct class ucaps_class = { + .name = "infiniband_ucaps", + .devnode = ucaps_devnode, +}; + +static const struct file_operations ucaps_cdev_fops = { + .owner = THIS_MODULE, + .open = simple_open, +}; + +/** + * ib_cleanup_ucaps - cleanup all API resources and class. + * + * This is called once, when removing the ib_uverbs module. + */ +void ib_cleanup_ucaps(void) +{ + mutex_lock(&ucaps_mutex); + if (!ucaps_class_is_registered) { + mutex_unlock(&ucaps_mutex); + return; + } + + for (int i = RDMA_UCAP_FIRST; i < RDMA_UCAP_MAX; i++) + WARN_ON(ucaps_list[i]); + + class_unregister(&ucaps_class); + ucaps_class_is_registered = false; + unregister_chrdev_region(ucaps_base_dev, RDMA_UCAP_MAX); + mutex_unlock(&ucaps_mutex); +} + +static int get_ucap_from_devt(dev_t devt, u64 *idx_mask) +{ + for (int type = RDMA_UCAP_FIRST; type < RDMA_UCAP_MAX; type++) { + if (ucaps_list[type] && ucaps_list[type]->dev.devt == devt) { + *idx_mask |= 1 << type; + return 0; + } + } + + return -EINVAL; +} + +static int get_devt_from_fd(unsigned int fd, dev_t *ret_dev) +{ + struct file *file; + + file = fget(fd); + if (!file) + return -EBADF; + + *ret_dev = file_inode(file)->i_rdev; + fput(file); + return 0; +} + +/** + * ib_ucaps_init - Initialization required before ucap creation. + * + * Return: 0 on success, or a negative errno value on failure + */ +static int ib_ucaps_init(void) +{ + int ret = 0; + + if (ucaps_class_is_registered) + return ret; + + ret = class_register(&ucaps_class); + if (ret) + return ret; + + ret = alloc_chrdev_region(&ucaps_base_dev, 0, RDMA_UCAP_MAX, + ucaps_class.name); + if (ret < 0) { + class_unregister(&ucaps_class); + return ret; + } + + ucaps_class_is_registered = true; + + return 0; +} + +static void ucap_dev_release(struct device *device) +{ + struct ib_ucap *ucap = container_of(device, struct ib_ucap, dev); + + kfree(ucap); +} + +/** + * ib_create_ucap - Add a ucap character device + * @type: UCAP type + * + * Creates a ucap character device in the /dev/infiniband directory. By default, + * the device has root-only read-write access. + * + * A driver may call this multiple times with the same UCAP type. A reference + * count tracks creations and deletions. + * + * Return: 0 on success, or a negative errno value on failure + */ +int ib_create_ucap(enum rdma_user_cap type) +{ + struct ib_ucap *ucap; + int ret; + + if (type >= RDMA_UCAP_MAX) + return -EINVAL; + + mutex_lock(&ucaps_mutex); + ret = ib_ucaps_init(); + if (ret) + goto unlock; + + ucap = ucaps_list[type]; + if (ucap) { + ucap->refcount++; + mutex_unlock(&ucaps_mutex); + return 0; + } + + ucap = kzalloc(sizeof(*ucap), GFP_KERNEL); + if (!ucap) { + ret = -ENOMEM; + goto unlock; + } + + device_initialize(&ucap->dev); + ucap->dev.class = &ucaps_class; + ucap->dev.devt = MKDEV(MAJOR(ucaps_base_dev), type); + ucap->dev.release = ucap_dev_release; + dev_set_name(&ucap->dev, ucap_names[type]); + + cdev_init(&ucap->cdev, &ucaps_cdev_fops); + ucap->cdev.owner = THIS_MODULE; + + ret = cdev_device_add(&ucap->cdev, &ucap->dev); + if (ret) + goto err_device; + + ucap->refcount = 1; + ucaps_list[type] = ucap; + mutex_unlock(&ucaps_mutex); + + return 0; + +err_device: + put_device(&ucap->dev); +unlock: + mutex_unlock(&ucaps_mutex); + return ret; +} +EXPORT_SYMBOL(ib_create_ucap); + +/** + * ib_remove_ucap - Remove a ucap character device + * @type: User cap type + * + * Removes the ucap character device according to type. The device is completely + * removed from the filesystem when its reference count reaches 0. + */ +void ib_remove_ucap(enum rdma_user_cap type) +{ + struct ib_ucap *ucap; + + mutex_lock(&ucaps_mutex); + ucap = ucaps_list[type]; + if (WARN_ON(!ucap)) + goto end; + + ucap->refcount--; + if (ucap->refcount) + goto end; + + ucaps_list[type] = NULL; + cdev_device_del(&ucap->cdev, &ucap->dev); + put_device(&ucap->dev); + +end: + mutex_unlock(&ucaps_mutex); +} +EXPORT_SYMBOL(ib_remove_ucap); + +/** + * ib_get_ucaps - Get bitmask of ucap types from file descriptors + * @fds: Array of file descriptors + * @fd_count: Number of file descriptors in the array + * @idx_mask: Bitmask to be updated based on the ucaps in the fd list + * + * Given an array of file descriptors, this function returns a bitmask of + * the ucaps where a bit is set if an FD for that ucap type was in the array. + * + * Return: 0 on success, or a negative errno value on failure + */ +int ib_get_ucaps(int *fds, int fd_count, uint64_t *idx_mask) +{ + int ret = 0; + dev_t dev; + + *idx_mask = 0; + mutex_lock(&ucaps_mutex); + for (int i = 0; i < fd_count; i++) { + ret = get_devt_from_fd(fds[i], &dev); + if (ret) + goto end; + + ret = get_ucap_from_devt(dev, idx_mask); + if (ret) + goto end; + } + +end: + mutex_unlock(&ucaps_mutex); + return ret; +} diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c index 85cfc790a7bb..973fe2c7ef53 100644 --- a/drivers/infiniband/core/uverbs_main.c +++ b/drivers/infiniband/core/uverbs_main.c @@ -52,6 +52,7 @@ #include #include #include +#include #include "uverbs.h" #include "core_priv.h" @@ -1345,6 +1346,7 @@ static void __exit ib_uverbs_cleanup(void) IB_UVERBS_NUM_FIXED_MINOR); unregister_chrdev_region(dynamic_uverbs_dev, IB_UVERBS_NUM_DYNAMIC_MINOR); + ib_cleanup_ucaps(); mmu_notifier_synchronize(); } diff --git a/include/rdma/ib_ucaps.h b/include/rdma/ib_ucaps.h new file mode 100644 index 000000000000..d044d02f087f --- /dev/null +++ b/include/rdma/ib_ucaps.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* + * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved + */ + +#ifndef _IB_UCAPS_H_ +#define _IB_UCAPS_H_ + +#define UCAP_ENABLED(ucaps, type) (!!((ucaps) & (1U << type))) + +enum rdma_user_cap { + RDMA_UCAP_MLX5_CTRL_LOCAL, + RDMA_UCAP_MLX5_CTRL_OTHER_VHCA, + RDMA_UCAP_MAX +}; + +void ib_cleanup_ucaps(void); + +int ib_create_ucap(enum rdma_user_cap type); + +void ib_remove_ucap(enum rdma_user_cap type); + +int ib_get_ucaps(int *fds, int fd_count, uint64_t *idx_mask); + +#endif /* _IB_UCAPS_H_ */ From patchwork Wed Feb 26 14:17:28 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13992516 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 948FC22171D for ; Wed, 26 Feb 2025 14:18:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579481; cv=none; b=KQCnyGHI3Tqyg1OS/avcmaSJc8gjgA4nkpSo24sHgYDW96HeOp0euaP/xBqKDfMJ9BOJdUHA7NOnpXwPswZ9v22QK7sUYdAf3Q+3+lA2oHn155iH65wTxbZx2eFoEsHGZA+FvGJu1jqm/GkFIjbgH5CNcbD9aEfl0NCWrQp6V8w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579481; c=relaxed/simple; bh=Aoz+QavRXw18FJ5GW865BMZ0YWdhhTInaHJmTTVVK78=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=PwNSG4+FenbMLet4ENUaFRdSaJw7lSybrmDgxGkJJHqH+m4G+TthZ3f960oKsiehrBIaXFQgDPL+RTXD4Pu0Z6SdAwHb0C72RapadO687opDd3JiwD+N+km57pTQEATlH4qh51TYSILM6+oBReFOTGUVXL1chhNf5qSZUgl3EAY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=QOlNda1y; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="QOlNda1y" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9AE23C4CED6; Wed, 26 Feb 2025 14:18:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1740579481; bh=Aoz+QavRXw18FJ5GW865BMZ0YWdhhTInaHJmTTVVK78=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=QOlNda1y59UsA0gETiqm2yrvq/Rszqcc6zl9+FLPpII4kLyVBtohxbMWNET5IDzI/ uMR1Rg/Hk1K0QZCmNfvu/q83nTRv0brmQY9HpLHl+DrN031ObCibrQy7Nz6+ma8kzC O8DrkOzD4/AzK7xKPZrRRz1JUETNAypz6VSQB6sVLNMln5vd2gMPELzbOJPj5WjZmI ZJRevaSrMQFKuc5pJ+n86AQp66YaNjns/dluGgStleRUGYX1oQO8redYU2ORbvlpMX IZOT2EvJdmLOC7wLawHqywFlJv+1p3IiX5LQblYez5T7xq4+ikcQYSybLN+0kYYpK6 vaaxhzFJgcYfQ== From: Leon Romanovsky To: Jason Gunthorpe Cc: Chiara Meiohas , linux-rdma@vger.kernel.org, Yishai Hadas Subject: [PATCH rdma-next 2/6] RDMA/mlx5: Create UCAP char devices for supported device capabilities Date: Wed, 26 Feb 2025 16:17:28 +0200 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Chiara Meiohas Create UCAP character devices when probing an IB device with supported firmware capabilities. If the RDMA_CTRL general object type is supported, check for specific UCTX capabilities: Create /dev/infiniband/mlx5_perm_ctrl_local for RDMA_UCAP_MLX5_CTRL_LOCAL Create /dev/infiniband/mlx5_perm_ctrl_other_vhca for RDMA_UCAP_MLX5_CTRL_OTHER_VHCA Signed-off-by: Chiara Meiohas Reviewed-by: Yishai Hadas Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/main.c | 47 +++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 81849eb671a1..04b489a6a449 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -47,6 +47,7 @@ #include #include #include +#include #include "macsec.h" #include "data_direct.h" @@ -4201,8 +4202,47 @@ static int mlx5_ib_init_var_table(struct mlx5_ib_dev *dev) return (var_table->bitmap) ? 0 : -ENOMEM; } +static void mlx5_ib_cleanup_ucaps(struct mlx5_ib_dev *dev) +{ + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL) + ib_remove_ucap(RDMA_UCAP_MLX5_CTRL_LOCAL); + + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & + MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA) + ib_remove_ucap(RDMA_UCAP_MLX5_CTRL_OTHER_VHCA); +} + +static int mlx5_ib_init_ucaps(struct mlx5_ib_dev *dev) +{ + int ret; + + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL) { + ret = ib_create_ucap(RDMA_UCAP_MLX5_CTRL_LOCAL); + if (ret) + return ret; + } + + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & + MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA) { + ret = ib_create_ucap(RDMA_UCAP_MLX5_CTRL_OTHER_VHCA); + if (ret) + goto remove_local; + } + + return 0; + +remove_local: + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL) + ib_remove_ucap(RDMA_UCAP_MLX5_CTRL_LOCAL); + return ret; +} + static void mlx5_ib_stage_caps_cleanup(struct mlx5_ib_dev *dev) { + if (MLX5_CAP_GEN_2_64(dev->mdev, general_obj_types_127_64) & + MLX5_HCA_CAP_2_GENERAL_OBJECT_TYPES_RDMA_CTRL) + mlx5_ib_cleanup_ucaps(dev); + bitmap_free(dev->var_table.bitmap); } @@ -4253,6 +4293,13 @@ static int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev) return err; } + if (MLX5_CAP_GEN_2_64(dev->mdev, general_obj_types_127_64) & + MLX5_HCA_CAP_2_GENERAL_OBJECT_TYPES_RDMA_CTRL) { + err = mlx5_ib_init_ucaps(dev); + if (err) + return err; + } + dev->ib_dev.use_cq_dim = true; return 0; From patchwork Wed Feb 26 14:17:29 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13992515 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4A8BB22171E for ; Wed, 26 Feb 2025 14:17:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579477; cv=none; b=fSdfVWzS/clJU6BrLcGAHaOWvhpH3Ke/CsXyQNBFf2zuCmWi4YPHsaNjF+BT/wRjTpclRIEyZOgh2/GX9PI0aLODOrpvDshmpLt9pVw+kw5azby3h1jvKRDz3IDrUumT2rK8vTp7/UqsPgmMs30ZyUxVQTzYZsMJsQzIZdAyZh4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579477; c=relaxed/simple; bh=K0nftxyvHz7gX5c75uVRWlp6980Ou0pQ+UbIuElFXZU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FQolf4bVa8vmXWufg9t9El/4PJ00nx22Q9UebfvC849jRBbJP9tFMk5zqmuEUcTUwJIZhSXTMWm08GfWo1BrAtDXjIJZ68b+QndEP3Q5EbLRsZ2zOtJ6vIRPdlPcFYLpS5Z75TIxfXXim9EBcJ94twMP+itpzpkWeynUiltSPJk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=j8wwQpIt; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="j8wwQpIt" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3F2A2C4CED6; Wed, 26 Feb 2025 14:17:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1740579476; bh=K0nftxyvHz7gX5c75uVRWlp6980Ou0pQ+UbIuElFXZU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=j8wwQpIt7zYOIrI4BGGd21DnhUD07E0tkIcsV95h7g17Ag4WMlS6Ln5v9KJIJ1/NA JUp5gDFzo76PxvCpDR+s2PZc050eM1qWq3q7lJr5Ih1emoIAXh+JlQeW/RgjRZfPC4 JV8dsFnUdqBsHCj8rMqpeYkgpJfuY/ZYU/lJsu0DoJdOcAtWi4tEOTKLHU/kC/c9Sn zSuVGPzE9P/+W/DXMUV/a0gQL28h9mFjFK6ELzWARZtdAxAiKzh5nIjBLAgw9ic4EQ s4dZmwnlrSdu8gHGVtW4TVK8F5WJeEBLaGVF7+LTHwugYDfTCIdxENMc52E6zABarY LM0tGsMhd+2jQ== From: Leon Romanovsky To: Jason Gunthorpe Cc: Chiara Meiohas , linux-rdma@vger.kernel.org, Yishai Hadas Subject: [PATCH rdma-next 3/6] RDMA/uverbs: Add support for UCAPs in context creation Date: Wed, 26 Feb 2025 16:17:29 +0200 Message-ID: <761767651e7dd0592f2977560321b88905ac3a65.1740574943.git.leon@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Chiara Meiohas Add support for file descriptor array attribute for GET_CONTEXT commands. Check that the file descriptor (fd) array represents fds for valid UCAPs. Store the enabled UCAPs from the fd array as a bitmask in ib_ucontext. Signed-off-by: Chiara Meiohas Reviewed-by: Yishai Hadas Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/uverbs_cmd.c | 19 +++++++++++++++++++ .../infiniband/core/uverbs_std_types_device.c | 4 ++++ include/rdma/ib_verbs.h | 1 + include/uapi/rdma/ib_user_ioctl_cmds.h | 1 + 4 files changed, 25 insertions(+) diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index 5ad14c39d48c..96d639e1ffa0 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -42,6 +42,7 @@ #include #include +#include #include "rdma_core.h" #include "uverbs.h" @@ -232,6 +233,8 @@ int ib_init_ucontext(struct uverbs_attr_bundle *attrs) { struct ib_ucontext *ucontext = attrs->context; struct ib_uverbs_file *file = attrs->ufile; + int *fd_array; + int fd_count; int ret; if (!down_read_trylock(&file->hw_destroy_rwsem)) @@ -247,6 +250,22 @@ int ib_init_ucontext(struct uverbs_attr_bundle *attrs) if (ret) goto err; + if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_GET_CONTEXT_FD_ARR)) { + fd_count = uverbs_attr_ptr_get_array_size(attrs, + UVERBS_ATTR_GET_CONTEXT_FD_ARR, + sizeof(int)); + if (fd_count < 0) { + ret = fd_count; + goto err_uncharge; + } + + fd_array = uverbs_attr_get_alloced_ptr(attrs, + UVERBS_ATTR_GET_CONTEXT_FD_ARR); + ret = ib_get_ucaps(fd_array, fd_count, &ucontext->enabled_caps); + if (ret) + goto err_uncharge; + } + ret = ucontext->device->ops.alloc_ucontext(ucontext, &attrs->driver_udata); if (ret) diff --git a/drivers/infiniband/core/uverbs_std_types_device.c b/drivers/infiniband/core/uverbs_std_types_device.c index fb0555647336..c0fd283d9d6c 100644 --- a/drivers/infiniband/core/uverbs_std_types_device.c +++ b/drivers/infiniband/core/uverbs_std_types_device.c @@ -437,6 +437,10 @@ DECLARE_UVERBS_NAMED_METHOD( UVERBS_ATTR_TYPE(u32), UA_OPTIONAL), UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_GET_CONTEXT_CORE_SUPPORT, UVERBS_ATTR_TYPE(u64), UA_OPTIONAL), + UVERBS_ATTR_PTR_IN(UVERBS_ATTR_GET_CONTEXT_FD_ARR, + UVERBS_ATTR_MIN_SIZE(sizeof(int)), + UA_OPTIONAL, + UA_ALLOC_AND_COPY), UVERBS_ATTR_UHW()); DECLARE_UVERBS_NAMED_METHOD( diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index b59bf30de430..84b6d64c9e3a 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1530,6 +1530,7 @@ struct ib_ucontext { struct ib_uverbs_file *ufile; struct ib_rdmacg_object cg_obj; + u64 enabled_caps; /* * Implementation details of the RDMA core, don't use in drivers: */ diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h index ec719053aab9..ac7b162611ed 100644 --- a/include/uapi/rdma/ib_user_ioctl_cmds.h +++ b/include/uapi/rdma/ib_user_ioctl_cmds.h @@ -88,6 +88,7 @@ enum uverbs_attrs_query_port_cmd_attr_ids { enum uverbs_attrs_get_context_attr_ids { UVERBS_ATTR_GET_CONTEXT_NUM_COMP_VECTORS, UVERBS_ATTR_GET_CONTEXT_CORE_SUPPORT, + UVERBS_ATTR_GET_CONTEXT_FD_ARR, }; enum uverbs_attrs_query_context_attr_ids { From patchwork Wed Feb 26 14:17:30 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13992519 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CB3CE215189 for ; Wed, 26 Feb 2025 14:18:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579495; cv=none; b=LkDlErOAQO3kWj8Y5pXc07tukUJHNwhCaZGzUn5wx7HqL4EtHZZEajqAWIMeYNzNCtZSBi+c5sfJy9rhUUF0O6tu5q6EWPbVhg3hFfma1vqWiHIFh1U7sFdxmVpUUzmSwzvoSOu44qd21oMvT3Nzo8MY84a3O4iJJ6dXffpy2ZI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579495; c=relaxed/simple; bh=/vHRZjDHxptSNmITw5F+tZoY6JhQa4E8jY/Eb7YwE0A=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FsQqGzbICTbZLknJUJvHSC0FWuCe3jsgG4OeKvWgPwCS8cx6ipvXqEw0g/N+ItL0yfpyTQasCJEJgX9FYnpynG7ovMytG/e6CJ7feJCRpGmhGgOXJM8Tl3PFhF4Ni1OQyP+gbugi5OiYPirpFzq2A/Y+8sOQ1+T+4J4GhSUALxs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=SLynyyle; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="SLynyyle" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 98228C4CED6; Wed, 26 Feb 2025 14:18:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1740579495; bh=/vHRZjDHxptSNmITw5F+tZoY6JhQa4E8jY/Eb7YwE0A=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SLynyylecOHpqUP8VfSRvFPLWsc09MRIaRDdYNTS6VBOGtNFRBWEZB99i6y1JaS7/ c0QuxhieAgLPJRD3R7M2e8GQYCV6SZApz3v/t3bI4oIBk3MJcO1gfCvICMPA0xjz/m erGvm1gUySnsSns9VXM5mhjeGuG2jE43xHxbfMxkhtCYEFKN+AlniXNajMMHEFpY7y bWZXxN6MUJeeRIjIoeqHPgTi1kCTOrDdD6vWiYL7oW1RJcU5ga/Dt/cVfCSN9HTXbp TSt9wZmG7sLLIKev2WyX/vzAOP2Y0z/LoOTyh5sj0tGjnYTTdZXSCyo0vgAVh5iYdl /wd7cd7SHiI/w== From: Leon Romanovsky To: Jason Gunthorpe Cc: Chiara Meiohas , linux-rdma@vger.kernel.org, Yishai Hadas Subject: [PATCH rdma-next 4/6] RDMA/mlx5: Check enabled UCAPs when creating ucontext Date: Wed, 26 Feb 2025 16:17:30 +0200 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Chiara Meiohas Verify that the enabled UCAPs are supported by the device before creating the ucontext. If supported, create the ucontext with the associated capabilities. Store the privileged ucontext UID on creation and remove it when destroying the privileged ucontext. This allows the command interface to recognize privileged commands through its UID. Signed-off-by: Chiara Meiohas Reviewed-by: Yishai Hadas Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/devx.c | 31 +++++++++++++++++++++++++++++-- drivers/infiniband/hw/mlx5/devx.h | 5 +++-- drivers/infiniband/hw/mlx5/main.c | 30 ++++++++++++++++++++++++++---- 3 files changed, 58 insertions(+), 8 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c index 4186884c66e1..5d42087c7777 100644 --- a/drivers/infiniband/hw/mlx5/devx.c +++ b/drivers/infiniband/hw/mlx5/devx.c @@ -13,6 +13,7 @@ #include #include #include +#include #include "mlx5_ib.h" #include "devx.h" #include "qp.h" @@ -122,7 +123,27 @@ devx_ufile2uctx(const struct uverbs_attr_bundle *attrs) return to_mucontext(ib_uverbs_get_ucontext(attrs)); } -int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user) +static int set_uctx_ucaps(struct mlx5_ib_dev *dev, u64 req_ucaps, u32 *cap) +{ + if (UCAP_ENABLED(req_ucaps, RDMA_UCAP_MLX5_CTRL_LOCAL)) { + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL) + *cap |= MLX5_UCTX_CAP_RDMA_CTRL; + else + return -EOPNOTSUPP; + } + + if (UCAP_ENABLED(req_ucaps, RDMA_UCAP_MLX5_CTRL_OTHER_VHCA)) { + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & + MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA) + *cap |= MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA; + else + return -EOPNOTSUPP; + } + + return 0; +} + +int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user, u64 req_ucaps) { u32 in[MLX5_ST_SZ_DW(create_uctx_in)] = {}; u32 out[MLX5_ST_SZ_DW(create_uctx_out)] = {}; @@ -144,6 +165,12 @@ int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user) MLX5_UCTX_CAP_INTERNAL_DEV_RES)) cap |= MLX5_UCTX_CAP_INTERNAL_DEV_RES; + if (req_ucaps) { + err = set_uctx_ucaps(dev, req_ucaps, &cap); + if (err) + return err; + } + MLX5_SET(create_uctx_in, in, opcode, MLX5_CMD_OP_CREATE_UCTX); MLX5_SET(uctx, uctx, cap, cap); @@ -2573,7 +2600,7 @@ int mlx5_ib_devx_init(struct mlx5_ib_dev *dev) struct mlx5_devx_event_table *table = &dev->devx_event_table; int uid; - uid = mlx5_ib_devx_create(dev, false); + uid = mlx5_ib_devx_create(dev, false, 0); if (uid > 0) { dev->devx_whitelist_uid = uid; xa_init(&table->event_xa); diff --git a/drivers/infiniband/hw/mlx5/devx.h b/drivers/infiniband/hw/mlx5/devx.h index 1344bf4c9d21..ee9e7d3af93f 100644 --- a/drivers/infiniband/hw/mlx5/devx.h +++ b/drivers/infiniband/hw/mlx5/devx.h @@ -24,13 +24,14 @@ struct devx_obj { struct list_head event_sub; /* holds devx_event_subscription entries */ }; #if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS) -int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user); +int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user, u64 req_ucaps); void mlx5_ib_devx_destroy(struct mlx5_ib_dev *dev, u16 uid); int mlx5_ib_devx_init(struct mlx5_ib_dev *dev); void mlx5_ib_devx_cleanup(struct mlx5_ib_dev *dev); void mlx5_ib_ufile_hw_cleanup(struct ib_uverbs_file *ufile); #else -static inline int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user) +static inline int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user, + u64 req_ucaps) { return -EOPNOTSUPP; } diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 04b489a6a449..d07cacaa0abd 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -1935,6 +1935,12 @@ static int set_ucontext_resp(struct ib_ucontext *uctx, return 0; } +static bool uctx_rdma_ctrl_is_enabled(u64 enabled_caps) +{ + return UCAP_ENABLED(enabled_caps, RDMA_UCAP_MLX5_CTRL_LOCAL) || + UCAP_ENABLED(enabled_caps, RDMA_UCAP_MLX5_CTRL_OTHER_VHCA); +} + static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata) { @@ -1977,10 +1983,17 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, return -EINVAL; if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) { - err = mlx5_ib_devx_create(dev, true); + err = mlx5_ib_devx_create(dev, true, uctx->enabled_caps); if (err < 0) goto out_ctx; context->devx_uid = err; + + if (uctx_rdma_ctrl_is_enabled(uctx->enabled_caps)) { + err = mlx5_cmd_add_privileged_uid(dev->mdev, + context->devx_uid); + if (err) + goto out_devx; + } } lib_uar_4k = req.lib_caps & MLX5_LIB_CAP_4K_UAR; @@ -1995,7 +2008,7 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, /* updates req->total_num_bfregs */ err = calc_total_bfregs(dev, lib_uar_4k, &req, bfregi); if (err) - goto out_devx; + goto out_ucap; mutex_init(&bfregi->lock); bfregi->lib_uar_4k = lib_uar_4k; @@ -2003,7 +2016,7 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, GFP_KERNEL); if (!bfregi->count) { err = -ENOMEM; - goto out_devx; + goto out_ucap; } bfregi->sys_pages = kcalloc(bfregi->num_sys_pages, @@ -2067,6 +2080,11 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, out_count: kfree(bfregi->count); +out_ucap: + if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX && + uctx_rdma_ctrl_is_enabled(uctx->enabled_caps)) + mlx5_cmd_remove_privileged_uid(dev->mdev, context->devx_uid); + out_devx: if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) mlx5_ib_devx_destroy(dev, context->devx_uid); @@ -2111,8 +2129,12 @@ static void mlx5_ib_dealloc_ucontext(struct ib_ucontext *ibcontext) kfree(bfregi->sys_pages); kfree(bfregi->count); - if (context->devx_uid) + if (context->devx_uid) { + if (uctx_rdma_ctrl_is_enabled(ibcontext->enabled_caps)) + mlx5_cmd_remove_privileged_uid(dev->mdev, + context->devx_uid); mlx5_ib_devx_destroy(dev, context->devx_uid); + } } static phys_addr_t uar_index2pfn(struct mlx5_ib_dev *dev, From patchwork Wed Feb 26 14:17:31 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13992517 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CC4A722171D for ; Wed, 26 Feb 2025 14:18:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579486; cv=none; b=Qe+6UAO2p4oQIWyj+lkWBqQzPU/jGBhIdZG9iemi40AkACkYsE1o9pdPIPpdvpUJ3HkluuMVsafDXCvYElfKDeWI6/IElH2M+lXd83JGCZ3c9AqUxXf/H7ivDot1DuIp/H8U2ZRPrFNSyZMTkJ+gFT41zYLY66sokPlNMn9I/+Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579486; c=relaxed/simple; bh=XS5CM3/uyQ/50M01PNMvY6GGSVw/MYZm3cFCQoed3xs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QCmC1so+ASCCf3ItagNHWjfPvisxdkqUXXM5T8f5DyUWW2O8OfdPrDZRtx1wVUnGJKWtbb/+tAtHxOVlO+Ytc6KFGF6ACvE76MvJ4ugQFUb1Q/sfVaWW59idRmaDP8Cur0/obPOU7IwsOu9NzoN8W1VGn3+z/xmveymXQ+ZqgG0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=uVt56MRK; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="uVt56MRK" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4DB17C4CED6; Wed, 26 Feb 2025 14:18:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1740579486; bh=XS5CM3/uyQ/50M01PNMvY6GGSVw/MYZm3cFCQoed3xs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=uVt56MRKwazbKtFqcwTqM2ILnKrJj/BZzsMZNGhYyWgtOIlGYYmwJ0fqPIhPQ7svu L6yhidchGZQqNzgXVFG00icRCj0oq6TCYgczZWV9SSUN0J1FosLtAGOdzswFOHPJAX fBGo1JkdMVfI24fWnS3awQq6JUX4fBaebmQ/n/qcpOn0B0DLoi66Dryy8J45CCuz50 la9r/TRJSjZXdiQQSfogDdiNJHFzs0jU7QP4TEfUHxcOFq4lpAG8icX5Ab01IHxep3 39okt0JJ6mSwJa3xVoAr5hI5jAoFlQ1+/aClpV6K/m9smHWbvtU6Q0S77Nv+iqF8V9 kLJQpf5gt7Ktg== From: Leon Romanovsky To: Jason Gunthorpe Cc: Patrisious Haddad , linux-rdma@vger.kernel.org, Mark Bloch Subject: [PATCH rdma-next 5/6] RDMA/mlx5: Expose RDMA TRANSPORT flow table types to userspace Date: Wed, 26 Feb 2025 16:17:31 +0200 Message-ID: <947aa307213dd9997bf74872ef548995cf67f566.1740574943.git.leon@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Patrisious Haddad This patch adds RDMA_TRANSPORT_RX and RDMA_TRANSPORT_TX as a new flow table type for matcher creation. Signed-off-by: Patrisious Haddad Reviewed-by: Mark Bloch Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/fs.c | 154 ++++++++++++++++++++-- drivers/infiniband/hw/mlx5/fs.h | 2 + drivers/infiniband/hw/mlx5/mlx5_ib.h | 3 + include/uapi/rdma/mlx5_user_ioctl_cmds.h | 1 + include/uapi/rdma/mlx5_user_ioctl_verbs.h | 2 + 5 files changed, 149 insertions(+), 13 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/fs.c b/drivers/infiniband/hw/mlx5/fs.c index 162814ae8cb4..6ae2801fa13f 100644 --- a/drivers/infiniband/hw/mlx5/fs.c +++ b/drivers/infiniband/hw/mlx5/fs.c @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -690,7 +691,7 @@ static struct mlx5_ib_flow_prio *_get_prio(struct mlx5_ib_dev *dev, struct mlx5_ib_flow_prio *prio, int priority, int num_entries, int num_groups, - u32 flags) + u32 flags, u16 vport) { struct mlx5_flow_table_attr ft_attr = {}; struct mlx5_flow_table *ft; @@ -698,6 +699,7 @@ static struct mlx5_ib_flow_prio *_get_prio(struct mlx5_ib_dev *dev, ft_attr.prio = priority; ft_attr.max_fte = num_entries; ft_attr.flags = flags; + ft_attr.vport = vport; ft_attr.autogroup.max_num_groups = num_groups; ft = mlx5_create_auto_grouped_flow_table(ns, &ft_attr); if (IS_ERR(ft)) @@ -792,7 +794,7 @@ static struct mlx5_ib_flow_prio *get_flow_table(struct mlx5_ib_dev *dev, ft = prio->flow_table; if (!ft) return _get_prio(dev, ns, prio, priority, max_table_size, - num_groups, flags); + num_groups, flags, 0); return prio; } @@ -935,7 +937,7 @@ int mlx5_ib_fs_add_op_fc(struct mlx5_ib_dev *dev, u32 port_num, prio = &dev->flow_db->opfcs[type]; if (!prio->flow_table) { prio = _get_prio(dev, ns, prio, priority, - dev->num_ports * MAX_OPFC_RULES, 1, 0); + dev->num_ports * MAX_OPFC_RULES, 1, 0, 0); if (IS_ERR(prio)) { err = PTR_ERR(prio); goto free; @@ -1413,17 +1415,51 @@ static struct ib_flow *mlx5_ib_create_flow(struct ib_qp *qp, return ERR_PTR(err); } +static int mlx5_ib_fill_transport_ns_info(struct mlx5_ib_dev *dev, + enum mlx5_flow_namespace_type type, + u32 *flags, u16 *vport_idx, + u16 *vport, + struct mlx5_core_dev **ft_mdev, + u32 ib_port) +{ + struct mlx5_core_dev *esw_mdev; + + if (!is_mdev_switchdev_mode(dev->mdev)) + return 0; + + if (!MLX5_CAP_ADV_RDMA(dev->mdev, rdma_transport_manager)) + return -EOPNOTSUPP; + + if (!dev->port[ib_port - 1].rep) + return -EINVAL; + + esw_mdev = mlx5_eswitch_get_core_dev(dev->port[ib_port - 1].rep->esw); + if (esw_mdev != dev->mdev) + return -EOPNOTSUPP; + + *flags |= MLX5_FLOW_TABLE_OTHER_VPORT; + *ft_mdev = esw_mdev; + *vport = dev->port[ib_port - 1].rep->vport; + *vport_idx = dev->port[ib_port - 1].rep->vport_index; + + return 0; +} + static struct mlx5_ib_flow_prio * _get_flow_table(struct mlx5_ib_dev *dev, u16 user_priority, enum mlx5_flow_namespace_type ns_type, - bool mcast) + bool mcast, u32 ib_port) { + struct mlx5_core_dev *ft_mdev = dev->mdev; struct mlx5_flow_namespace *ns = NULL; struct mlx5_ib_flow_prio *prio = NULL; int max_table_size = 0; + u16 vport_idx = 0; bool esw_encap; u32 flags = 0; + u16 vport = 0; int priority; + int ret; if (mcast) priority = MLX5_IB_FLOW_MCAST_PRIO; @@ -1471,13 +1507,38 @@ _get_flow_table(struct mlx5_ib_dev *dev, u16 user_priority, MLX5_CAP_FLOWTABLE_RDMA_TX(dev->mdev, log_max_ft_size)); priority = user_priority; break; + case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX: + case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX: + if (ib_port == 0 || user_priority > MLX5_RDMA_TRANSPORT_BYPASS_PRIO) + return ERR_PTR(-EINVAL); + ret = mlx5_ib_fill_transport_ns_info(dev, ns_type, &flags, + &vport_idx, &vport, + &ft_mdev, ib_port); + if (ret) + return ERR_PTR(ret); + + if (ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX) + max_table_size = + BIT(MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_RX( + ft_mdev, log_max_ft_size)); + else + max_table_size = + BIT(MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_TX( + ft_mdev, log_max_ft_size)); + priority = user_priority; + break; default: break; } max_table_size = min_t(int, max_table_size, MLX5_FS_MAX_ENTRIES); - ns = mlx5_get_flow_namespace(dev->mdev, ns_type); + if (ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX || + ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX) + ns = mlx5_get_flow_vport_namespace(ft_mdev, ns_type, vport_idx); + else + ns = mlx5_get_flow_namespace(ft_mdev, ns_type); + if (!ns) return ERR_PTR(-EOPNOTSUPP); @@ -1497,6 +1558,12 @@ _get_flow_table(struct mlx5_ib_dev *dev, u16 user_priority, case MLX5_FLOW_NAMESPACE_RDMA_TX: prio = &dev->flow_db->rdma_tx[priority]; break; + case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX: + prio = &dev->flow_db->rdma_transport_rx[ib_port - 1]; + break; + case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX: + prio = &dev->flow_db->rdma_transport_tx[ib_port - 1]; + break; default: return ERR_PTR(-EINVAL); } @@ -1507,7 +1574,7 @@ _get_flow_table(struct mlx5_ib_dev *dev, u16 user_priority, return prio; return _get_prio(dev, ns, prio, priority, max_table_size, - MLX5_FS_MAX_TYPES, flags); + MLX5_FS_MAX_TYPES, flags, vport); } static struct mlx5_ib_flow_handler * @@ -1626,7 +1693,8 @@ static struct mlx5_ib_flow_handler *raw_fs_rule_add( mutex_lock(&dev->flow_db->lock); ft_prio = _get_flow_table(dev, fs_matcher->priority, - fs_matcher->ns_type, mcast); + fs_matcher->ns_type, mcast, + fs_matcher->ib_port); if (IS_ERR(ft_prio)) { err = PTR_ERR(ft_prio); goto unlock; @@ -1742,6 +1810,12 @@ mlx5_ib_ft_type_to_namespace(enum mlx5_ib_uapi_flow_table_type table_type, case MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TX: *namespace = MLX5_FLOW_NAMESPACE_RDMA_TX; break; + case MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TRANSPORT_RX: + *namespace = MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX; + break; + case MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TRANSPORT_TX: + *namespace = MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX; + break; default: return -EINVAL; } @@ -1831,7 +1905,8 @@ static int get_dests(struct uverbs_attr_bundle *attrs, return -EINVAL; /* Allow only DEVX object or QP as dest when inserting to RDMA_RX */ - if ((fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX) && + if ((fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX || + fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX) && ((!dest_devx && !dest_qp) || (dest_devx && dest_qp))) return -EINVAL; @@ -1848,7 +1923,8 @@ static int get_dests(struct uverbs_attr_bundle *attrs, return -EINVAL; /* Allow only flow table as dest when inserting to FDB or RDMA_RX */ if ((fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_FDB_BYPASS || - fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX) && + fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX || + fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX) && *dest_type != MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE) return -EINVAL; } else if (dest_qp) { @@ -1869,14 +1945,16 @@ static int get_dests(struct uverbs_attr_bundle *attrs, *dest_id = mqp->raw_packet_qp.rq.tirn; *dest_type = MLX5_FLOW_DESTINATION_TYPE_TIR; } else if ((fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_EGRESS || - fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX) && + fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX || + fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX) && !(*flags & MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP)) { *dest_type = MLX5_FLOW_DESTINATION_TYPE_PORT; } if (*dest_type == MLX5_FLOW_DESTINATION_TYPE_TIR && (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_EGRESS || - fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX)) + fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX || + fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX)) return -EINVAL; return 0; @@ -2353,6 +2431,15 @@ static int mlx5_ib_matcher_ns(struct uverbs_attr_bundle *attrs, return 0; } +static bool verify_context_caps(struct mlx5_ib_dev *dev, u64 enabled_caps) +{ + if (is_mdev_switchdev_mode(dev->mdev)) + return UCAP_ENABLED(enabled_caps, + RDMA_UCAP_MLX5_CTRL_OTHER_VHCA); + + return UCAP_ENABLED(enabled_caps, RDMA_UCAP_MLX5_CTRL_LOCAL); +} + static int UVERBS_HANDLER(MLX5_IB_METHOD_FLOW_MATCHER_CREATE)( struct uverbs_attr_bundle *attrs) { @@ -2401,6 +2488,26 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_FLOW_MATCHER_CREATE)( goto end; } + if (uverbs_attr_is_valid(attrs, MLX5_IB_ATTR_FLOW_MATCHER_IB_PORT)) { + err = uverbs_copy_from(&obj->ib_port, attrs, + MLX5_IB_ATTR_FLOW_MATCHER_IB_PORT); + if (err) + goto end; + if (!rdma_is_port_valid(&dev->ib_dev, obj->ib_port)) { + err = -EINVAL; + goto end; + } + if (obj->ns_type != MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX && + obj->ns_type != MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX) { + err = -EINVAL; + goto end; + } + if (!verify_context_caps(dev, uobj->context->enabled_caps)) { + err = -EOPNOTSUPP; + goto end; + } + } + uobj->object = obj; obj->mdev = dev->mdev; atomic_set(&obj->usecnt, 0); @@ -2448,7 +2555,7 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_STEERING_ANCHOR_CREATE)( mutex_lock(&dev->flow_db->lock); - ft_prio = _get_flow_table(dev, priority, ns_type, 0); + ft_prio = _get_flow_table(dev, priority, ns_type, 0, 0); if (IS_ERR(ft_prio)) { err = PTR_ERR(ft_prio); goto free_obj; @@ -2834,7 +2941,10 @@ DECLARE_UVERBS_NAMED_METHOD( UA_OPTIONAL), UVERBS_ATTR_CONST_IN(MLX5_IB_ATTR_FLOW_MATCHER_FT_TYPE, enum mlx5_ib_uapi_flow_table_type, - UA_OPTIONAL)); + UA_OPTIONAL), + UVERBS_ATTR_PTR_IN(MLX5_IB_ATTR_FLOW_MATCHER_IB_PORT, + UVERBS_ATTR_TYPE(u32), + UA_OPTIONAL)); DECLARE_UVERBS_NAMED_METHOD_DESTROY( MLX5_IB_METHOD_FLOW_MATCHER_DESTROY, @@ -2904,8 +3014,26 @@ int mlx5_ib_fs_init(struct mlx5_ib_dev *dev) if (!dev->flow_db) return -ENOMEM; + dev->flow_db->rdma_transport_rx = kcalloc(dev->num_ports, + sizeof(struct mlx5_ib_flow_prio), + GFP_KERNEL); + if (!dev->flow_db->rdma_transport_rx) + goto free_flow_db; + + dev->flow_db->rdma_transport_tx = kcalloc(dev->num_ports, + sizeof(struct mlx5_ib_flow_prio), + GFP_KERNEL); + if (!dev->flow_db->rdma_transport_tx) + goto free_rdma_transport_rx; + mutex_init(&dev->flow_db->lock); ib_set_device_ops(&dev->ib_dev, &flow_ops); return 0; + +free_rdma_transport_rx: + kfree(dev->flow_db->rdma_transport_rx); +free_flow_db: + kfree(dev->flow_db); + return -ENOMEM; } diff --git a/drivers/infiniband/hw/mlx5/fs.h b/drivers/infiniband/hw/mlx5/fs.h index b9734904f5f0..0516555eb1c1 100644 --- a/drivers/infiniband/hw/mlx5/fs.h +++ b/drivers/infiniband/hw/mlx5/fs.h @@ -40,6 +40,8 @@ static inline void mlx5_ib_fs_cleanup(struct mlx5_ib_dev *dev) * is a safe assumption that all references are gone. */ mlx5_ib_fs_cleanup_anchor(dev); + kfree(dev->flow_db->rdma_transport_tx); + kfree(dev->flow_db->rdma_transport_rx); kfree(dev->flow_db); } #endif /* _MLX5_IB_FS_H */ diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h index 974a45c92fbb..ccaaef20f50d 100644 --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h @@ -276,6 +276,7 @@ struct mlx5_ib_flow_matcher { struct mlx5_core_dev *mdev; atomic_t usecnt; u8 match_criteria_enable; + u32 ib_port; }; struct mlx5_ib_steering_anchor { @@ -307,6 +308,8 @@ struct mlx5_ib_flow_db { struct mlx5_ib_flow_prio rdma_tx[MLX5_IB_NUM_FLOW_FT]; struct mlx5_ib_flow_prio opfcs[MLX5_IB_OPCOUNTER_MAX]; struct mlx5_flow_table *lag_demux_ft; + struct mlx5_ib_flow_prio *rdma_transport_rx; + struct mlx5_ib_flow_prio *rdma_transport_tx; /* Protect flow steering bypass flow tables * when add/del flow rules. * only single add/removal of flow steering rule could be done diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h index fd2e4a3a56b3..18f9fe070213 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h +++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h @@ -239,6 +239,7 @@ enum mlx5_ib_flow_matcher_create_attrs { MLX5_IB_ATTR_FLOW_MATCHER_MATCH_CRITERIA, MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS, MLX5_IB_ATTR_FLOW_MATCHER_FT_TYPE, + MLX5_IB_ATTR_FLOW_MATCHER_IB_PORT, }; enum mlx5_ib_flow_matcher_destroy_attrs { diff --git a/include/uapi/rdma/mlx5_user_ioctl_verbs.h b/include/uapi/rdma/mlx5_user_ioctl_verbs.h index 7c233df475e7..8f86e79d78a5 100644 --- a/include/uapi/rdma/mlx5_user_ioctl_verbs.h +++ b/include/uapi/rdma/mlx5_user_ioctl_verbs.h @@ -45,6 +45,8 @@ enum mlx5_ib_uapi_flow_table_type { MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB = 0x2, MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_RX = 0x3, MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TX = 0x4, + MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TRANSPORT_RX = 0x5, + MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TRANSPORT_TX = 0x6, }; enum mlx5_ib_uapi_flow_action_packet_reformat_type { From patchwork Wed Feb 26 14:17:32 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 13992518 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8CC8522171D; Wed, 26 Feb 2025 14:18:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579491; cv=none; b=ogYCEw7gnN0CFIrZ/V7t1eIk5Xp+Ox89RSCvqZOmckCKroSahVqcWT6UHBo1Cr0eebZWmMwfdclSHLwQS8qpnZc5MCOiIWCLtimavGsS+G7SJd2kSu7wrDK1aLqWyGtgbLTM/bEi7/Rz3NZFYvTIZaJBwNpxgV6+7xzfbwEGAnY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740579491; c=relaxed/simple; bh=4TNlmV4WiESdLqnmWqz+l+1OpeFDoosOidqN+VhVqhw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=dyR5yC3K6YypO6LskoeiRz4GCY9/eqkXD0JrjGo0BA/xZRphc0LPu5/+4hwByYH2vFyR4np6sKHGFVW+/ZxC+jXXQmifL7sIkTNFEzTkDXvrEq1LeQf0hiEElQ8O4KHM4RM5/3BzweHfsX+Wdu2ikFGpi4N03MIZFYBHDxtJzTU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=iGv3PfbE; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="iGv3PfbE" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 91DCCC4CED6; Wed, 26 Feb 2025 14:18:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1740579491; bh=4TNlmV4WiESdLqnmWqz+l+1OpeFDoosOidqN+VhVqhw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iGv3PfbE06PlNPnvVcKYUA2B5hdr8goRDXqswRbzCWj0alLbCjs3MxnteTpMOGztR RxCbRt93zI8HyzhHN444G+rlVX4CzmWPj9wUtIuCu5MT4NLFMPpQNA5xjB1aUKVLpY afcxY9ubhCTWmmoJhsHSprWZB+XbK4+pYvkgZklktwKezxTZijjebKPLMvLDLT9vJM uzJuG6qEUatssGjUxuwyRhCnNUGjiBySdXgbH1bdui1QghkIaLzqbjhsNXpI5Un4Hb zbK96+RX44TSayj0f1fISEuP1ufkO3/ojMEx13NypaGdRG05asGUKXmED969Uks2SP RJm1So55MksQw== From: Leon Romanovsky To: Jason Gunthorpe Cc: Chiara Meiohas , Jonathan Corbet , linux-doc@vger.kernel.org, linux-rdma@vger.kernel.org, Yishai Hadas Subject: [PATCH rdma-next 6/6] docs: infiniband: document the UCAP API Date: Wed, 26 Feb 2025 16:17:32 +0200 Message-ID: <56c1bbf06f10d6b352b59ebbb9bf31dc6be3a03e.1740574943.git.leon@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Chiara Meiohas Add an explanation on the newly added UCAP API. Signed-off-by: Chiara Meiohas Reviewed-by: Yishai Hadas Signed-off-by: Leon Romanovsky --- Documentation/infiniband/index.rst | 1 + Documentation/infiniband/ucaps.rst | 71 ++++++++++++++++++++++++++++++ 2 files changed, 72 insertions(+) create mode 100644 Documentation/infiniband/ucaps.rst diff --git a/Documentation/infiniband/index.rst b/Documentation/infiniband/index.rst index 9cd7615438b9..5b4c24125f66 100644 --- a/Documentation/infiniband/index.rst +++ b/Documentation/infiniband/index.rst @@ -12,6 +12,7 @@ InfiniBand opa_vnic sysfs tag_matching + ucaps user_mad user_verbs diff --git a/Documentation/infiniband/ucaps.rst b/Documentation/infiniband/ucaps.rst new file mode 100644 index 000000000000..b8b6927742f4 --- /dev/null +++ b/Documentation/infiniband/ucaps.rst @@ -0,0 +1,71 @@ +================================= +Infiniband Userspace Capabilities +================================= + + User CAPabilities (UCAPs) provide fine-grained control over specific + firmware features in Infiniband (IB) devices. This approach offers + more granular capabilities than the existing Linux capabilities, + which may be too generic for certain FW features. + + Each user capability is represented as a character device with root + read-write access. Root processes can grant users special privileges + by allowing access to these character devices (e.g., using chown). + +Usage +===== + + UCAPs allow control over specific features of an IB device using file + descriptors of UCAP character devices. Here is how a user enables + specific features of an IB device: + + * A root process grants the user access to the UCAP files that + represents the capabilities (e.g., using chown). + * The user opens the UCAP files, obtaining file descriptors. + * When opening an IB device, include an array of the UCAP file + descriptors as an attribute. + * The ib_uverbs driver recognizes the UCAP file descriptors and enables + the corresponding capabilities for the IB device. + +Creating UCAPs +============== + + To create a new UCAP, drivers must first define a type in the + rdma_user_cap enum in rdma/ib_ucaps.h. The name of the UCAP character + device should be added to the ucap_names array in + drivers/infiniband/core/ucaps.c. Then, the driver can create the UCAP + character device by calling the ib_create_ucap API with the UCAP + type. + + A reference count is stored for each UCAP to track creations and + removals of the UCAP device. If multiple creation calls are made with + the same type (e.g., for two IB devices), the UCAP character device + is created during the first call and subsequent calls increment the + reference count. + + The UCAP character device is created under /dev/infiniband, and its + permissions are set to allow root read and write access only. + +Removing UCAPs +============== + + Each removal decrements the reference count of the UCAP. The UCAP + character device is removed from the filesystem only when the + reference count is decreased to 0. + +/dev and /sys/class files +========================= + + The class:: + + /sys/class/infiniband_ucaps + + is created when the first UCAP character device is created. + + The UCAP character device is created under /dev/infiniband. + + For example, if mlx5_ib adds the rdma_user_cap + RDMA_UCAP_MLX5_CTRL_LOCAL with name "mlx5_perm_ctrl_local", this will + create the device node:: + + /dev/infiniband/mlx5_perm_ctrl_local +