From patchwork Thu Mar 6 11:51:26 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 14004376
From: Leon Romanovsky
To: Jason Gunthorpe
Cc: Chiara Meiohas, linux-rdma@vger.kernel.org, Yishai Hadas
Subject: [PATCH rdma-next v1 1/6] RDMA/uverbs: Introduce UCAP (User CAPabilities) API
Date: Thu, 6 Mar 2025 13:51:26 +0200
Message-ID: <5a1379187cd21178e8554afc81a3c941f21af22f.1741261611.git.leon@kernel.org>
X-Mailer: git-send-email 2.48.1
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

From: Chiara Meiohas

Implement a new User CAPabilities (UCAP) API to provide fine-grained
control over specific firmware features.

This approach offers more granular capabilities than the existing Linux
capabilities, which may be too generic for certain FW features.

This mechanism represents each capability as a character device with
root read-write access. Root processes can grant users special
privileges by allowing access to these character devices (e.g., using
chown).

UCAP character devices are located in /dev/infiniband and the class path
is /sys/class/infiniband_ucaps.
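
For illustration only, a minimal userspace sketch of how a process that has
been granted access to a UCAP device might obtain a file descriptor for it.
The helper names ucap_dev_path() and ucap_open() are hypothetical and not part
of this series; the /dev/infiniband location and the device name used in the
comments come from the description above. The O_RDWR open mode is an
assumption.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: build "/dev/infiniband/<name>" into buf.
 * Returns 0 on success, -1 if the path does not fit in len bytes.
 */
static int ucap_dev_path(char *buf, size_t len, const char *name)
{
	int n = snprintf(buf, len, "/dev/infiniband/%s", name);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Hypothetical helper: open a UCAP character device by name, e.g.
 * "mlx5_perm_ctrl_local". Returns the fd, or -1 on failure (for example
 * when the caller was not granted access, or the device does not exist).
 * O_RDWR is an assumption, chosen to match the read-write node mode.
 */
static int ucap_open(const char *name)
{
	char path[128];

	if (ucap_dev_path(path, sizeof(path), name))
		return -1;

	return open(path, O_RDWR);
}
```

The descriptor itself is what carries the privilege in this design: a process
that cannot open the device node has nothing to present later.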

Signed-off-by: Chiara Meiohas
Reviewed-by: Yishai Hadas
Signed-off-by: Leon Romanovsky
Reviewed-by: Zhu Yanjun
---
 drivers/infiniband/core/Makefile      |   3 +-
 drivers/infiniband/core/ucaps.c       | 267 ++++++++++++++++++++++++++
 drivers/infiniband/core/uverbs_main.c |   2 +
 include/rdma/ib_ucaps.h               |  25 +++
 4 files changed, 296 insertions(+), 1 deletion(-)
 create mode 100644 drivers/infiniband/core/ucaps.c
 create mode 100644 include/rdma/ib_ucaps.h

diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index 8ab4eea5a0a5..d49ded7e95f0 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -39,6 +39,7 @@ ib_uverbs-y := uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
 				uverbs_std_types_async_fd.o \
 				uverbs_std_types_srq.o \
 				uverbs_std_types_wq.o \
-				uverbs_std_types_qp.o
+				uverbs_std_types_qp.o \
+				ucaps.o
 ib_uverbs-$(CONFIG_INFINIBAND_USER_MEM) += umem.o umem_dmabuf.o
 ib_uverbs-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o
diff --git a/drivers/infiniband/core/ucaps.c b/drivers/infiniband/core/ucaps.c
new file mode 100644
index 000000000000..6853c6d078f9
--- /dev/null
+++ b/drivers/infiniband/core/ucaps.c
@@ -0,0 +1,267 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/*
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define RDMA_UCAP_FIRST RDMA_UCAP_MLX5_CTRL_LOCAL
+
+static DEFINE_MUTEX(ucaps_mutex);
+static struct ib_ucap *ucaps_list[RDMA_UCAP_MAX];
+static bool ucaps_class_is_registered;
+static dev_t ucaps_base_dev;
+
+struct ib_ucap {
+	struct cdev cdev;
+	struct device dev;
+	struct kref ref;
+};
+
+static const char *ucap_names[RDMA_UCAP_MAX] = {
+	[RDMA_UCAP_MLX5_CTRL_LOCAL] = "mlx5_perm_ctrl_local",
+	[RDMA_UCAP_MLX5_CTRL_OTHER_VHCA] = "mlx5_perm_ctrl_other_vhca"
+};
+
+static char *ucaps_devnode(const struct device *dev, umode_t *mode)
+{
+	if (mode)
+		*mode = 0600;
+
+	return kasprintf(GFP_KERNEL, "infiniband/%s", dev_name(dev));
+}
+
+static const struct class ucaps_class = {
+	.name = "infiniband_ucaps",
+	.devnode = ucaps_devnode,
+};
+
+static const struct file_operations ucaps_cdev_fops = {
+	.owner = THIS_MODULE,
+	.open = simple_open,
+};
+
+/**
+ * ib_cleanup_ucaps - cleanup all API resources and class.
+ *
+ * This is called once, when removing the ib_uverbs module.
+ */
+void ib_cleanup_ucaps(void)
+{
+	mutex_lock(&ucaps_mutex);
+	if (!ucaps_class_is_registered) {
+		mutex_unlock(&ucaps_mutex);
+		return;
+	}
+
+	for (int i = RDMA_UCAP_FIRST; i < RDMA_UCAP_MAX; i++)
+		WARN_ON(ucaps_list[i]);
+
+	class_unregister(&ucaps_class);
+	ucaps_class_is_registered = false;
+	unregister_chrdev_region(ucaps_base_dev, RDMA_UCAP_MAX);
+	mutex_unlock(&ucaps_mutex);
+}
+
+static int get_ucap_from_devt(dev_t devt, u64 *idx_mask)
+{
+	for (int type = RDMA_UCAP_FIRST; type < RDMA_UCAP_MAX; type++) {
+		if (ucaps_list[type] && ucaps_list[type]->dev.devt == devt) {
+			*idx_mask |= 1 << type;
+			return 0;
+		}
+	}
+
+	return -EINVAL;
+}
+
+static int get_devt_from_fd(unsigned int fd, dev_t *ret_dev)
+{
+	struct file *file;
+
+	file = fget(fd);
+	if (!file)
+		return -EBADF;
+
+	*ret_dev = file_inode(file)->i_rdev;
+	fput(file);
+	return 0;
+}
+
+/**
+ * ib_ucaps_init - Initialization required before ucap creation.
+ *
+ * Return: 0 on success, or a negative errno value on failure
+ */
+static int ib_ucaps_init(void)
+{
+	int ret = 0;
+
+	if (ucaps_class_is_registered)
+		return ret;
+
+	ret = class_register(&ucaps_class);
+	if (ret)
+		return ret;
+
+	ret = alloc_chrdev_region(&ucaps_base_dev, 0, RDMA_UCAP_MAX,
+				  ucaps_class.name);
+	if (ret < 0) {
+		class_unregister(&ucaps_class);
+		return ret;
+	}
+
+	ucaps_class_is_registered = true;
+
+	return 0;
+}
+
+static void ucap_dev_release(struct device *device)
+{
+	struct ib_ucap *ucap = container_of(device, struct ib_ucap, dev);
+
+	kfree(ucap);
+}
+
+/**
+ * ib_create_ucap - Add a ucap character device
+ * @type: UCAP type
+ *
+ * Creates a ucap character device in the /dev/infiniband directory. By default,
+ * the device has root-only read-write access.
+ *
+ * A driver may call this multiple times with the same UCAP type. A reference
+ * count tracks creations and deletions.
+ *
+ * Return: 0 on success, or a negative errno value on failure
+ */
+int ib_create_ucap(enum rdma_user_cap type)
+{
+	struct ib_ucap *ucap;
+	int ret;
+
+	if (type >= RDMA_UCAP_MAX)
+		return -EINVAL;
+
+	mutex_lock(&ucaps_mutex);
+	ret = ib_ucaps_init();
+	if (ret)
+		goto unlock;
+
+	ucap = ucaps_list[type];
+	if (ucap) {
+		kref_get(&ucap->ref);
+		mutex_unlock(&ucaps_mutex);
+		return 0;
+	}
+
+	ucap = kzalloc(sizeof(*ucap), GFP_KERNEL);
+	if (!ucap) {
+		ret = -ENOMEM;
+		goto unlock;
+	}
+
+	device_initialize(&ucap->dev);
+	ucap->dev.class = &ucaps_class;
+	ucap->dev.devt = MKDEV(MAJOR(ucaps_base_dev), type);
+	ucap->dev.release = ucap_dev_release;
+	ret = dev_set_name(&ucap->dev, ucap_names[type]);
+	if (ret)
+		goto err_device;
+
+	cdev_init(&ucap->cdev, &ucaps_cdev_fops);
+	ucap->cdev.owner = THIS_MODULE;
+
+	ret = cdev_device_add(&ucap->cdev, &ucap->dev);
+	if (ret)
+		goto err_device;
+
+	kref_init(&ucap->ref);
+	ucaps_list[type] = ucap;
+	mutex_unlock(&ucaps_mutex);
+
+	return 0;
+
+err_device:
+	put_device(&ucap->dev);
+unlock:
+	mutex_unlock(&ucaps_mutex);
+	return ret;
+}
+EXPORT_SYMBOL(ib_create_ucap);
+
+static void ib_release_ucap(struct kref *ref)
+{
+	struct ib_ucap *ucap = container_of(ref, struct ib_ucap, ref);
+	enum rdma_user_cap type;
+
+	for (type = RDMA_UCAP_FIRST; type < RDMA_UCAP_MAX; type++) {
+		if (ucaps_list[type] == ucap)
+			break;
+	}
+	WARN_ON(type == RDMA_UCAP_MAX);
+
+	ucaps_list[type] = NULL;
+	cdev_device_del(&ucap->cdev, &ucap->dev);
+	put_device(&ucap->dev);
+}
+
+/**
+ * ib_remove_ucap - Remove a ucap character device
+ * @type: User cap type
+ *
+ * Removes the ucap character device according to type. The device is completely
+ * removed from the filesystem when its reference count reaches 0.
+ */
+void ib_remove_ucap(enum rdma_user_cap type)
+{
+	struct ib_ucap *ucap;
+
+	mutex_lock(&ucaps_mutex);
+	ucap = ucaps_list[type];
+	if (WARN_ON(!ucap))
+		goto end;
+
+	kref_put(&ucap->ref, ib_release_ucap);
+end:
+	mutex_unlock(&ucaps_mutex);
+}
+EXPORT_SYMBOL(ib_remove_ucap);
+
+/**
+ * ib_get_ucaps - Get bitmask of ucap types from file descriptors
+ * @fds: Array of file descriptors
+ * @fd_count: Number of file descriptors in the array
+ * @idx_mask: Bitmask to be updated based on the ucaps in the fd list
+ *
+ * Given an array of file descriptors, this function returns a bitmask of
+ * the ucaps where a bit is set if an FD for that ucap type was in the array.
+ *
+ * Return: 0 on success, or a negative errno value on failure
+ */
+int ib_get_ucaps(int *fds, int fd_count, uint64_t *idx_mask)
+{
+	int ret = 0;
+	dev_t dev;
+
+	*idx_mask = 0;
+	mutex_lock(&ucaps_mutex);
+	for (int i = 0; i < fd_count; i++) {
+		ret = get_devt_from_fd(fds[i], &dev);
+		if (ret)
+			goto end;
+
+		ret = get_ucap_from_devt(dev, idx_mask);
+		if (ret)
+			goto end;
+	}
+
+end:
+	mutex_unlock(&ucaps_mutex);
+	return ret;
+}
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 85cfc790a7bb..973fe2c7ef53 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -52,6 +52,7 @@
 #include
 #include
 #include
+#include
 
 #include "uverbs.h"
 #include "core_priv.h"
@@ -1345,6 +1346,7 @@ static void __exit ib_uverbs_cleanup(void)
 				 IB_UVERBS_NUM_FIXED_MINOR);
 	unregister_chrdev_region(dynamic_uverbs_dev,
 				 IB_UVERBS_NUM_DYNAMIC_MINOR);
+	ib_cleanup_ucaps();
 	mmu_notifier_synchronize();
 }
 
diff --git a/include/rdma/ib_ucaps.h b/include/rdma/ib_ucaps.h
new file mode 100644
index 000000000000..8f0552a2b2b0
--- /dev/null
+++ b/include/rdma/ib_ucaps.h
@@ -0,0 +1,25 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */
+/*
+ * Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. All rights reserved
+ */
+
+#ifndef _IB_UCAPS_H_
+#define _IB_UCAPS_H_
+
+#define UCAP_ENABLED(ucaps, type) (!!((ucaps) & (1U << (type))))
+
+enum rdma_user_cap {
+	RDMA_UCAP_MLX5_CTRL_LOCAL,
+	RDMA_UCAP_MLX5_CTRL_OTHER_VHCA,
+	RDMA_UCAP_MAX
+};
+
+void ib_cleanup_ucaps(void);
+
+int ib_create_ucap(enum rdma_user_cap type);
+
+void ib_remove_ucap(enum rdma_user_cap type);
+
+int ib_get_ucaps(int *fds, int fd_count, uint64_t *idx_mask);
+
+#endif /* _IB_UCAPS_H_ */

From patchwork Thu Mar 6 11:51:27 2025
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 14004375
From: Leon Romanovsky
To: Jason Gunthorpe
Cc: Chiara Meiohas, linux-rdma@vger.kernel.org, Yishai Hadas
Subject: [PATCH rdma-next v1 2/6] RDMA/mlx5: Create UCAP char devices for supported device capabilities
Date: Thu, 6 Mar 2025 13:51:27 +0200
Message-ID: <30ed40e7a12a694cf4ee257459ed61b145b7837d.1741261611.git.leon@kernel.org>
X-Mailer: git-send-email 2.48.1

From: Chiara Meiohas

Create UCAP character devices when probing an IB device with supported
firmware capabilities.

If the RDMA_CTRL general object type is supported, check for specific
UCTX capabilities:
Create /dev/infiniband/mlx5_perm_ctrl_local for RDMA_UCAP_MLX5_CTRL_LOCAL
Create /dev/infiniband/mlx5_perm_ctrl_other_vhca for RDMA_UCAP_MLX5_CTRL_OTHER_VHCA

Signed-off-by: Chiara Meiohas
Reviewed-by: Yishai Hadas
Signed-off-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/main.c | 47 +++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 81849eb671a1..04b489a6a449 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -47,6 +47,7 @@
 #include
 #include
 #include
+#include
 
 #include "macsec.h"
 #include "data_direct.h"
@@ -4201,8 +4202,47 @@ static int mlx5_ib_init_var_table(struct mlx5_ib_dev *dev)
 	return (var_table->bitmap) ? 0 : -ENOMEM;
 }
 
+static void mlx5_ib_cleanup_ucaps(struct mlx5_ib_dev *dev)
+{
+	if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL)
+		ib_remove_ucap(RDMA_UCAP_MLX5_CTRL_LOCAL);
+
+	if (MLX5_CAP_GEN(dev->mdev, uctx_cap) &
+	    MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA)
+		ib_remove_ucap(RDMA_UCAP_MLX5_CTRL_OTHER_VHCA);
+}
+
+static int mlx5_ib_init_ucaps(struct mlx5_ib_dev *dev)
+{
+	int ret;
+
+	if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL) {
+		ret = ib_create_ucap(RDMA_UCAP_MLX5_CTRL_LOCAL);
+		if (ret)
+			return ret;
+	}
+
+	if (MLX5_CAP_GEN(dev->mdev, uctx_cap) &
+	    MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA) {
+		ret = ib_create_ucap(RDMA_UCAP_MLX5_CTRL_OTHER_VHCA);
+		if (ret)
+			goto remove_local;
+	}
+
+	return 0;
+
+remove_local:
+	if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL)
+		ib_remove_ucap(RDMA_UCAP_MLX5_CTRL_LOCAL);
+	return ret;
+}
+
 static void mlx5_ib_stage_caps_cleanup(struct mlx5_ib_dev *dev)
 {
+	if (MLX5_CAP_GEN_2_64(dev->mdev, general_obj_types_127_64) &
+	    MLX5_HCA_CAP_2_GENERAL_OBJECT_TYPES_RDMA_CTRL)
+		mlx5_ib_cleanup_ucaps(dev);
+
 	bitmap_free(dev->var_table.bitmap);
 }
 
@@ -4253,6 +4293,13 @@ static int mlx5_ib_stage_caps_init(struct mlx5_ib_dev *dev)
 		return err;
 	}
 
+	if (MLX5_CAP_GEN_2_64(dev->mdev, general_obj_types_127_64) &
+	    MLX5_HCA_CAP_2_GENERAL_OBJECT_TYPES_RDMA_CTRL) {
+		err = mlx5_ib_init_ucaps(dev);
+		if (err)
+			return err;
+	}
+
 	dev->ib_dev.use_cq_dim = true;
 
 	return 0;

From patchwork Thu Mar 6 11:51:28 2025
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 14004380
From: Leon Romanovsky
To: Jason Gunthorpe
Cc: Chiara Meiohas, linux-rdma@vger.kernel.org, Yishai Hadas
Subject: [PATCH rdma-next v1 3/6] RDMA/uverbs: Add support for UCAPs in context creation
Date: Thu, 6 Mar 2025 13:51:28 +0200
X-Mailer: git-send-email 2.48.1

From: Chiara Meiohas

Add support for file descriptor array attribute for GET_CONTEXT
commands.

Check that the file descriptor (fd) array represents fds for valid UCAPs.
Store the enabled UCAPs from the fd array as a bitmask in ib_ucontext.
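
As a rough userspace model of the bitmask described above (this is not the
kernel code), the sketch below folds a list of UCAP types into a 64-bit mask
and tests individual bits. The enum mirrors enum rdma_user_cap from patch 1/6;
the names ucap_type, ucap_mask_from_types() and ucap_enabled() are
hypothetical. The model takes UCAP types directly, whereas the kernel first
resolves each fd to a device number. A 64-bit constant is used for the shift
so that the model stays well defined for any bit of the u64 mask.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors enum rdma_user_cap from patch 1/6, for illustration only. */
enum ucap_type {
	UCAP_MLX5_CTRL_LOCAL,
	UCAP_MLX5_CTRL_OTHER_VHCA,
	UCAP_MAX
};

/*
 * Hypothetical model of the mask construction: start from an empty mask,
 * set one bit per UCAP type, and fail on the first entry that is not a
 * valid UCAP (the analogue of an fd that is not a UCAP device).
 */
static int ucap_mask_from_types(const int *types, size_t count, uint64_t *mask)
{
	*mask = 0;
	for (size_t i = 0; i < count; i++) {
		if (types[i] < 0 || types[i] >= UCAP_MAX)
			return -1;
		*mask |= UINT64_C(1) << types[i];
	}

	return 0;
}

/* Hypothetical counterpart of UCAP_ENABLED(): 1 if the bit is set, else 0. */
static int ucap_enabled(uint64_t mask, int type)
{
	return (int)((mask >> type) & 1);
}
```

Passing the same type more than once is harmless in this model, since setting
an already-set bit leaves the mask unchanged.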
Signed-off-by: Chiara Meiohas Reviewed-by: Yishai Hadas Signed-off-by: Leon Romanovsky --- drivers/infiniband/core/uverbs_cmd.c | 19 +++++++++++++++++++ .../infiniband/core/uverbs_std_types_device.c | 4 ++++ include/rdma/ib_verbs.h | 1 + include/uapi/rdma/ib_user_ioctl_cmds.h | 1 + 4 files changed, 25 insertions(+) diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c index 5ad14c39d48c..96d639e1ffa0 100644 --- a/drivers/infiniband/core/uverbs_cmd.c +++ b/drivers/infiniband/core/uverbs_cmd.c @@ -42,6 +42,7 @@ #include #include +#include #include "rdma_core.h" #include "uverbs.h" @@ -232,6 +233,8 @@ int ib_init_ucontext(struct uverbs_attr_bundle *attrs) { struct ib_ucontext *ucontext = attrs->context; struct ib_uverbs_file *file = attrs->ufile; + int *fd_array; + int fd_count; int ret; if (!down_read_trylock(&file->hw_destroy_rwsem)) @@ -247,6 +250,22 @@ int ib_init_ucontext(struct uverbs_attr_bundle *attrs) if (ret) goto err; + if (uverbs_attr_is_valid(attrs, UVERBS_ATTR_GET_CONTEXT_FD_ARR)) { + fd_count = uverbs_attr_ptr_get_array_size(attrs, + UVERBS_ATTR_GET_CONTEXT_FD_ARR, + sizeof(int)); + if (fd_count < 0) { + ret = fd_count; + goto err_uncharge; + } + + fd_array = uverbs_attr_get_alloced_ptr(attrs, + UVERBS_ATTR_GET_CONTEXT_FD_ARR); + ret = ib_get_ucaps(fd_array, fd_count, &ucontext->enabled_caps); + if (ret) + goto err_uncharge; + } + ret = ucontext->device->ops.alloc_ucontext(ucontext, &attrs->driver_udata); if (ret) diff --git a/drivers/infiniband/core/uverbs_std_types_device.c b/drivers/infiniband/core/uverbs_std_types_device.c index fb0555647336..c0fd283d9d6c 100644 --- a/drivers/infiniband/core/uverbs_std_types_device.c +++ b/drivers/infiniband/core/uverbs_std_types_device.c @@ -437,6 +437,10 @@ DECLARE_UVERBS_NAMED_METHOD( UVERBS_ATTR_TYPE(u32), UA_OPTIONAL), UVERBS_ATTR_PTR_OUT(UVERBS_ATTR_GET_CONTEXT_CORE_SUPPORT, UVERBS_ATTR_TYPE(u64), UA_OPTIONAL), + UVERBS_ATTR_PTR_IN(UVERBS_ATTR_GET_CONTEXT_FD_ARR, + 
UVERBS_ATTR_MIN_SIZE(sizeof(int)), + UA_OPTIONAL, + UA_ALLOC_AND_COPY), UVERBS_ATTR_UHW()); DECLARE_UVERBS_NAMED_METHOD( diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h index a5761038935d..9941f4185c79 100644 --- a/include/rdma/ib_verbs.h +++ b/include/rdma/ib_verbs.h @@ -1530,6 +1530,7 @@ struct ib_ucontext { struct ib_uverbs_file *ufile; struct ib_rdmacg_object cg_obj; + u64 enabled_caps; /* * Implementation details of the RDMA core, don't use in drivers: */ diff --git a/include/uapi/rdma/ib_user_ioctl_cmds.h b/include/uapi/rdma/ib_user_ioctl_cmds.h index ec719053aab9..ac7b162611ed 100644 --- a/include/uapi/rdma/ib_user_ioctl_cmds.h +++ b/include/uapi/rdma/ib_user_ioctl_cmds.h @@ -88,6 +88,7 @@ enum uverbs_attrs_query_port_cmd_attr_ids { enum uverbs_attrs_get_context_attr_ids { UVERBS_ATTR_GET_CONTEXT_NUM_COMP_VECTORS, UVERBS_ATTR_GET_CONTEXT_CORE_SUPPORT, + UVERBS_ATTR_GET_CONTEXT_FD_ARR, }; enum uverbs_attrs_query_context_attr_ids { From patchwork Thu Mar 6 11:51:29 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 14004377 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 11360156225 for ; Thu, 6 Mar 2025 11:52:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741261948; cv=none; b=ARJBJQhu3NHCt8KP/R0Ng8RYihEcNmlIQxf7eqQY3mRbgwfh5JEae5t/TLZKBmT/8nwWAgHzU0uGQJIMxm4VPoc+0Ad2uVwsLndIS0721Ryp/4t2CIQ+hwJilKAdnWU4pEPO+4PuWfZWtRuPPC1+Gull3tZ9dREPak0xYwwQynE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741261948; c=relaxed/simple; bh=bTciPm5xpivDDQ68z1qsRozR+LSvcKNYFP1QeDj14ig=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AiPuD//NudNGdPgsUs59MndSVqakJYas9QBEhzH+/9QpkLs96EPHzmCjrB0Gx6s/Ve6XYZO859at0N/HAMI2RhHODx0gGhIYeQ3EJW0RMsxnjXPd/mhmBV0WKkk7HDrqiTFU+VrLDbhFiSBx8HE86gm6vo6yDFk4tU5MD8mTTq8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=T4GK9LpL; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="T4GK9LpL" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1690FC4CEE0; Thu, 6 Mar 2025 11:52:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1741261947; bh=bTciPm5xpivDDQ68z1qsRozR+LSvcKNYFP1QeDj14ig=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=T4GK9LpLYW7vBdObT4U5/N1TBgpfWp/VLBydmQs/ErO4QGRRqiFvNkwfKxnMspaP4 L8OajwH87HB5RjkPhldyB7Ab0WEuCubxVHbRxhizgzM+6jNOr2HqqJK2bUTut9lQlY Y3XpIZDfcAUvW6BRkBnwb9oKP7tzLAj+pKgWz2gFJmC7j6nhHG4yedPMnsRebk70jv Iy1wi2qkQKNCLKqS0e23ADFRu4E9UA0uOOrlp5mRi96swTg1pR1QQNSQdcb4HwdGW1 F2KQhJ1oh4XhjohRW0ztKClIjSJauguHix5wUJYqdDGgaBXEL7E3JkQ5rn0NP9rSip go1x2JyxIe6+Q== From: Leon Romanovsky To: Jason Gunthorpe Cc: Chiara Meiohas , linux-rdma@vger.kernel.org, Yishai Hadas Subject: [PATCH rdma-next v1 4/6] RDMA/mlx5: Check enabled UCAPs when creating ucontext Date: Thu, 6 Mar 2025 13:51:29 +0200 Message-ID: <8b180583a207cb30deb7a2967934079749cdcc44.1741261611.git.leon@kernel.org> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-rdma@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Chiara Meiohas Verify that the enabled UCAPs are supported by the device before creating the ucontext. If supported, create the ucontext with the associated capabilities. Store the privileged ucontext UID on creation and remove it when destroying the privileged ucontext. 
This allows the command interface to recognize privileged commands through its UID. Signed-off-by: Chiara Meiohas Reviewed-by: Yishai Hadas Signed-off-by: Leon Romanovsky --- drivers/infiniband/hw/mlx5/devx.c | 31 +++++++++++++++++++++++++++++-- drivers/infiniband/hw/mlx5/devx.h | 5 +++-- drivers/infiniband/hw/mlx5/main.c | 30 ++++++++++++++++++++++++++---- 3 files changed, 58 insertions(+), 8 deletions(-) diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c index 39304cae5b10..2479da8620ca 100644 --- a/drivers/infiniband/hw/mlx5/devx.c +++ b/drivers/infiniband/hw/mlx5/devx.c @@ -13,6 +13,7 @@ #include #include #include +#include #include "mlx5_ib.h" #include "devx.h" #include "qp.h" @@ -122,7 +123,27 @@ devx_ufile2uctx(const struct uverbs_attr_bundle *attrs) return to_mucontext(ib_uverbs_get_ucontext(attrs)); } -int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user) +static int set_uctx_ucaps(struct mlx5_ib_dev *dev, u64 req_ucaps, u32 *cap) +{ + if (UCAP_ENABLED(req_ucaps, RDMA_UCAP_MLX5_CTRL_LOCAL)) { + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & MLX5_UCTX_CAP_RDMA_CTRL) + *cap |= MLX5_UCTX_CAP_RDMA_CTRL; + else + return -EOPNOTSUPP; + } + + if (UCAP_ENABLED(req_ucaps, RDMA_UCAP_MLX5_CTRL_OTHER_VHCA)) { + if (MLX5_CAP_GEN(dev->mdev, uctx_cap) & + MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA) + *cap |= MLX5_UCTX_CAP_RDMA_CTRL_OTHER_VHCA; + else + return -EOPNOTSUPP; + } + + return 0; +} + +int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user, u64 req_ucaps) { u32 in[MLX5_ST_SZ_DW(create_uctx_in)] = {}; u32 out[MLX5_ST_SZ_DW(create_uctx_out)] = {}; @@ -146,6 +167,12 @@ int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user) capable(CAP_SYS_RAWIO)) cap |= MLX5_UCTX_CAP_INTERNAL_DEV_RES; + if (req_ucaps) { + err = set_uctx_ucaps(dev, req_ucaps, &cap); + if (err) + return err; + } + MLX5_SET(create_uctx_in, in, opcode, MLX5_CMD_OP_CREATE_UCTX); MLX5_SET(uctx, uctx, cap, cap); @@ -2575,7 +2602,7 @@ int mlx5_ib_devx_init(struct 
mlx5_ib_dev *dev) struct mlx5_devx_event_table *table = &dev->devx_event_table; int uid; - uid = mlx5_ib_devx_create(dev, false); + uid = mlx5_ib_devx_create(dev, false, 0); if (uid > 0) { dev->devx_whitelist_uid = uid; xa_init(&table->event_xa); diff --git a/drivers/infiniband/hw/mlx5/devx.h b/drivers/infiniband/hw/mlx5/devx.h index 1344bf4c9d21..ee9e7d3af93f 100644 --- a/drivers/infiniband/hw/mlx5/devx.h +++ b/drivers/infiniband/hw/mlx5/devx.h @@ -24,13 +24,14 @@ struct devx_obj { struct list_head event_sub; /* holds devx_event_subscription entries */ }; #if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS) -int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user); +int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user, u64 req_ucaps); void mlx5_ib_devx_destroy(struct mlx5_ib_dev *dev, u16 uid); int mlx5_ib_devx_init(struct mlx5_ib_dev *dev); void mlx5_ib_devx_cleanup(struct mlx5_ib_dev *dev); void mlx5_ib_ufile_hw_cleanup(struct ib_uverbs_file *ufile); #else -static inline int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user) +static inline int mlx5_ib_devx_create(struct mlx5_ib_dev *dev, bool is_user, + u64 req_ucaps) { return -EOPNOTSUPP; } diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c index 04b489a6a449..d07cacaa0abd 100644 --- a/drivers/infiniband/hw/mlx5/main.c +++ b/drivers/infiniband/hw/mlx5/main.c @@ -1935,6 +1935,12 @@ static int set_ucontext_resp(struct ib_ucontext *uctx, return 0; } +static bool uctx_rdma_ctrl_is_enabled(u64 enabled_caps) +{ + return UCAP_ENABLED(enabled_caps, RDMA_UCAP_MLX5_CTRL_LOCAL) || + UCAP_ENABLED(enabled_caps, RDMA_UCAP_MLX5_CTRL_OTHER_VHCA); +} + static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata) { @@ -1977,10 +1983,17 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, return -EINVAL; if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) { - err = mlx5_ib_devx_create(dev, true); + err = mlx5_ib_devx_create(dev, true, uctx->enabled_caps); if 
(err < 0) goto out_ctx; context->devx_uid = err; + + if (uctx_rdma_ctrl_is_enabled(uctx->enabled_caps)) { + err = mlx5_cmd_add_privileged_uid(dev->mdev, + context->devx_uid); + if (err) + goto out_devx; + } } lib_uar_4k = req.lib_caps & MLX5_LIB_CAP_4K_UAR; @@ -1995,7 +2008,7 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, /* updates req->total_num_bfregs */ err = calc_total_bfregs(dev, lib_uar_4k, &req, bfregi); if (err) - goto out_devx; + goto out_ucap; mutex_init(&bfregi->lock); bfregi->lib_uar_4k = lib_uar_4k; @@ -2003,7 +2016,7 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, GFP_KERNEL); if (!bfregi->count) { err = -ENOMEM; - goto out_devx; + goto out_ucap; } bfregi->sys_pages = kcalloc(bfregi->num_sys_pages, @@ -2067,6 +2080,11 @@ static int mlx5_ib_alloc_ucontext(struct ib_ucontext *uctx, out_count: kfree(bfregi->count); +out_ucap: + if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX && + uctx_rdma_ctrl_is_enabled(uctx->enabled_caps)) + mlx5_cmd_remove_privileged_uid(dev->mdev, context->devx_uid); + out_devx: if (req.flags & MLX5_IB_ALLOC_UCTX_DEVX) mlx5_ib_devx_destroy(dev, context->devx_uid); @@ -2111,8 +2129,12 @@ static void mlx5_ib_dealloc_ucontext(struct ib_ucontext *ibcontext) kfree(bfregi->sys_pages); kfree(bfregi->count); - if (context->devx_uid) + if (context->devx_uid) { + if (uctx_rdma_ctrl_is_enabled(ibcontext->enabled_caps)) + mlx5_cmd_remove_privileged_uid(dev->mdev, + context->devx_uid); mlx5_ib_devx_destroy(dev, context->devx_uid); + } } static phys_addr_t uar_index2pfn(struct mlx5_ib_dev *dev, From patchwork Thu Mar 6 11:51:30 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Leon Romanovsky X-Patchwork-Id: 14004378 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with 
ESMTPS id 2AF5920A5CA for ; Thu, 6 Mar 2025 11:52:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741261952; cv=none; b=Qh/JgtgOt1+BGbykPmVEVIGxtbc5V1PWMpXTvd37JKZr2EwIgGAuV6JvM59jiNnLT40naECTHVejRchdcfp9pUUrrDWrKbRCxKU6/VbU8YiEITjvaPDxoqcE0agJdD6zl3kphQvIuNXoYkNro2syCQ1pyNvsfPL6ehjOl6x3NAM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1741261952; c=relaxed/simple; bh=XS5CM3/uyQ/50M01PNMvY6GGSVw/MYZm3cFCQoed3xs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=O+q74ZsM8bFFBh9RU5WphU5VtLgOmA1AFPcyPyqdrCJZglHuw47RHucnwJZXNAWvp6IjT1eWCw0cBKruQ2fLNr9wtLNZ+vxqycvkpi4w/ZFz2Xf2MANtvtPUKGCHhE/2D3qJxhNm7uDlhkueeq216biDXBZFpwnj8q+yFxYRNPM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=HneCMTI+; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="HneCMTI+" Received: by smtp.kernel.org (Postfix) with ESMTPSA id A7412C4CEE0; Thu, 6 Mar 2025 11:52:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1741261951; bh=XS5CM3/uyQ/50M01PNMvY6GGSVw/MYZm3cFCQoed3xs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=HneCMTI+r3k/h3iuWAQflNwrA5bKvswbtdUPzqkhO6CvJycrh8CcdUCFY+WWl8G3Z TwUPo75y8qZXONWq5sJ1e02IKfI7IkbBE8CjpyvpOqiIIEYTWiJWcjdWX2pxCFqnjz 2eSJZJTWtlbn6Bc6Br/nSQnponhdSRlX1tT/au4ZS8iq3UbSxXoihwfVIjhDUFzZUZ 9ZCpinMpaBTQRGim5Ej/Gu0GYjOT6zy7B6UyissBH2pjO6KSFlWlR5ubDxZdgRHmDD kXKoUjF1K15gdU25uw2PSVGEhUd/A5KjypziWf4c+VJUmIbEGVvDLLLlWy++8hneTt /DMRMrVOnjNYg== From: Leon Romanovsky To: Jason Gunthorpe Cc: Patrisious Haddad , linux-rdma@vger.kernel.org, Mark Bloch Subject: [PATCH rdma-next v1 5/6] RDMA/mlx5: Expose RDMA TRANSPORT flow table types to 
 userspace
Date: Thu, 6 Mar 2025 13:51:30 +0200
Message-ID: <2287d8c50483e880450c7e8e08d9de34cdec1b14.1741261611.git.leon@kernel.org>
X-Mailer: git-send-email 2.48.1
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

From: Patrisious Haddad

This patch adds RDMA_TRANSPORT_RX and RDMA_TRANSPORT_TX as new flow
table types for matcher creation.

Signed-off-by: Patrisious Haddad
Reviewed-by: Mark Bloch
Signed-off-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/fs.c           | 154 ++++++++++++++++++++--
 drivers/infiniband/hw/mlx5/fs.h           |   2 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h      |   3 +
 include/uapi/rdma/mlx5_user_ioctl_cmds.h  |   1 +
 include/uapi/rdma/mlx5_user_ioctl_verbs.h |   2 +
 5 files changed, 149 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/fs.c b/drivers/infiniband/hw/mlx5/fs.c
index 162814ae8cb4..6ae2801fa13f 100644
--- a/drivers/infiniband/hw/mlx5/fs.c
+++ b/drivers/infiniband/hw/mlx5/fs.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -690,7 +691,7 @@ static struct mlx5_ib_flow_prio *_get_prio(struct mlx5_ib_dev *dev,
 					   struct mlx5_ib_flow_prio *prio,
 					   int priority,
 					   int num_entries, int num_groups,
-					   u32 flags)
+					   u32 flags, u16 vport)
 {
 	struct mlx5_flow_table_attr ft_attr = {};
 	struct mlx5_flow_table *ft;
@@ -698,6 +699,7 @@ static struct mlx5_ib_flow_prio *_get_prio(struct mlx5_ib_dev *dev,
 	ft_attr.prio = priority;
 	ft_attr.max_fte = num_entries;
 	ft_attr.flags = flags;
+	ft_attr.vport = vport;
 	ft_attr.autogroup.max_num_groups = num_groups;
 	ft = mlx5_create_auto_grouped_flow_table(ns, &ft_attr);
 	if (IS_ERR(ft))
@@ -792,7 +794,7 @@ static struct mlx5_ib_flow_prio *get_flow_table(struct mlx5_ib_dev *dev,
 	ft = prio->flow_table;
 	if (!ft)
 		return _get_prio(dev, ns, prio, priority, max_table_size,
-				 num_groups, flags);
+				 num_groups, flags, 0);
 
 	return prio;
 }
@@ -935,7 +937,7 @@ int mlx5_ib_fs_add_op_fc(struct mlx5_ib_dev *dev, u32 port_num,
 	prio = &dev->flow_db->opfcs[type];
 	if (!prio->flow_table) {
 		prio = _get_prio(dev, ns, prio, priority,
-				 dev->num_ports * MAX_OPFC_RULES, 1, 0);
+				 dev->num_ports * MAX_OPFC_RULES, 1, 0, 0);
 		if (IS_ERR(prio)) {
 			err = PTR_ERR(prio);
 			goto free;
@@ -1413,17 +1415,51 @@ static struct ib_flow *mlx5_ib_create_flow(struct ib_qp *qp,
 	return ERR_PTR(err);
 }
 
+static int mlx5_ib_fill_transport_ns_info(struct mlx5_ib_dev *dev,
+					  enum mlx5_flow_namespace_type type,
+					  u32 *flags, u16 *vport_idx,
+					  u16 *vport,
+					  struct mlx5_core_dev **ft_mdev,
+					  u32 ib_port)
+{
+	struct mlx5_core_dev *esw_mdev;
+
+	if (!is_mdev_switchdev_mode(dev->mdev))
+		return 0;
+
+	if (!MLX5_CAP_ADV_RDMA(dev->mdev, rdma_transport_manager))
+		return -EOPNOTSUPP;
+
+	if (!dev->port[ib_port - 1].rep)
+		return -EINVAL;
+
+	esw_mdev = mlx5_eswitch_get_core_dev(dev->port[ib_port - 1].rep->esw);
+	if (esw_mdev != dev->mdev)
+		return -EOPNOTSUPP;
+
+	*flags |= MLX5_FLOW_TABLE_OTHER_VPORT;
+	*ft_mdev = esw_mdev;
+	*vport = dev->port[ib_port - 1].rep->vport;
+	*vport_idx = dev->port[ib_port - 1].rep->vport_index;
+
+	return 0;
+}
+
 static struct mlx5_ib_flow_prio *
 _get_flow_table(struct mlx5_ib_dev *dev, u16 user_priority,
 		enum mlx5_flow_namespace_type ns_type,
-		bool mcast)
+		bool mcast, u32 ib_port)
 {
+	struct mlx5_core_dev *ft_mdev = dev->mdev;
 	struct mlx5_flow_namespace *ns = NULL;
 	struct mlx5_ib_flow_prio *prio = NULL;
 	int max_table_size = 0;
+	u16 vport_idx = 0;
 	bool esw_encap;
 	u32 flags = 0;
+	u16 vport = 0;
 	int priority;
+	int ret;
 
 	if (mcast)
 		priority = MLX5_IB_FLOW_MCAST_PRIO;
@@ -1471,13 +1507,38 @@ _get_flow_table(struct mlx5_ib_dev *dev, u16 user_priority,
 			MLX5_CAP_FLOWTABLE_RDMA_TX(dev->mdev, log_max_ft_size));
 		priority = user_priority;
 		break;
+	case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX:
+	case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX:
+		if (ib_port == 0 || user_priority > MLX5_RDMA_TRANSPORT_BYPASS_PRIO)
+			return ERR_PTR(-EINVAL);
+		ret = mlx5_ib_fill_transport_ns_info(dev, ns_type, &flags,
+						     &vport_idx, &vport,
+						     &ft_mdev, ib_port);
+		if (ret)
+			return ERR_PTR(ret);
+
+		if (ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX)
+			max_table_size =
+				BIT(MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_RX(
+					ft_mdev, log_max_ft_size));
+		else
+			max_table_size =
+				BIT(MLX5_CAP_FLOWTABLE_RDMA_TRANSPORT_TX(
+					ft_mdev, log_max_ft_size));
+		priority = user_priority;
+		break;
 	default:
 		break;
 	}
 
 	max_table_size = min_t(int, max_table_size, MLX5_FS_MAX_ENTRIES);
 
-	ns = mlx5_get_flow_namespace(dev->mdev, ns_type);
+	if (ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX ||
+	    ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX)
+		ns = mlx5_get_flow_vport_namespace(ft_mdev, ns_type, vport_idx);
+	else
+		ns = mlx5_get_flow_namespace(ft_mdev, ns_type);
+
 	if (!ns)
 		return ERR_PTR(-EOPNOTSUPP);
 
@@ -1497,6 +1558,12 @@ _get_flow_table(struct mlx5_ib_dev *dev, u16 user_priority,
 	case MLX5_FLOW_NAMESPACE_RDMA_TX:
 		prio = &dev->flow_db->rdma_tx[priority];
 		break;
+	case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX:
+		prio = &dev->flow_db->rdma_transport_rx[ib_port - 1];
+		break;
+	case MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX:
+		prio = &dev->flow_db->rdma_transport_tx[ib_port - 1];
+		break;
 	default: return ERR_PTR(-EINVAL);
 	}
 
@@ -1507,7 +1574,7 @@ _get_flow_table(struct mlx5_ib_dev *dev, u16 user_priority,
 		return prio;
 
 	return _get_prio(dev, ns, prio, priority, max_table_size,
-			 MLX5_FS_MAX_TYPES, flags);
+			 MLX5_FS_MAX_TYPES, flags, vport);
 }
 
 static struct mlx5_ib_flow_handler *
@@ -1626,7 +1693,8 @@ static struct mlx5_ib_flow_handler *raw_fs_rule_add(
 	mutex_lock(&dev->flow_db->lock);
 
 	ft_prio = _get_flow_table(dev, fs_matcher->priority,
-				  fs_matcher->ns_type, mcast);
+				  fs_matcher->ns_type, mcast,
+				  fs_matcher->ib_port);
 	if (IS_ERR(ft_prio)) {
 		err = PTR_ERR(ft_prio);
 		goto unlock;
@@ -1742,6 +1810,12 @@ mlx5_ib_ft_type_to_namespace(enum mlx5_ib_uapi_flow_table_type table_type,
 	case MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TX:
 		*namespace = MLX5_FLOW_NAMESPACE_RDMA_TX;
 		break;
+	case MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TRANSPORT_RX:
+		*namespace = MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX;
+		break;
+	case MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TRANSPORT_TX:
+		*namespace = MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX;
+		break;
 	default:
 		return -EINVAL;
 	}
@@ -1831,7 +1905,8 @@ static int get_dests(struct uverbs_attr_bundle *attrs,
 		return -EINVAL;
 
 	/* Allow only DEVX object or QP as dest when inserting to RDMA_RX */
-	if ((fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX) &&
+	if ((fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX ||
+	     fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX) &&
 	    ((!dest_devx && !dest_qp) || (dest_devx && dest_qp)))
 		return -EINVAL;
 
@@ -1848,7 +1923,8 @@ static int get_dests(struct uverbs_attr_bundle *attrs,
 			return -EINVAL;
 		/* Allow only flow table as dest when inserting to FDB or RDMA_RX */
 		if ((fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_FDB_BYPASS ||
-		     fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX) &&
+		     fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_RX ||
+		     fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX) &&
 		    *dest_type != MLX5_FLOW_DESTINATION_TYPE_FLOW_TABLE)
 			return -EINVAL;
 	} else if (dest_qp) {
@@ -1869,14 +1945,16 @@ static int get_dests(struct uverbs_attr_bundle *attrs,
 			*dest_id = mqp->raw_packet_qp.rq.tirn;
 		*dest_type = MLX5_FLOW_DESTINATION_TYPE_TIR;
 	} else if ((fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_EGRESS ||
-		    fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX) &&
+		    fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX ||
+		    fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX) &&
 		   !(*flags & MLX5_IB_ATTR_CREATE_FLOW_FLAGS_DROP)) {
 		*dest_type = MLX5_FLOW_DESTINATION_TYPE_PORT;
 	}
 
 	if (*dest_type == MLX5_FLOW_DESTINATION_TYPE_TIR &&
 	    (fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_EGRESS ||
-	     fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX))
+	     fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TX ||
+	     fs_matcher->ns_type == MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX))
 		return -EINVAL;
 
 	return 0;
@@ -2353,6 +2431,15 @@ static int mlx5_ib_matcher_ns(struct uverbs_attr_bundle *attrs,
 	return 0;
 }
 
+static bool verify_context_caps(struct mlx5_ib_dev *dev, u64 enabled_caps)
+{
+	if (is_mdev_switchdev_mode(dev->mdev))
+		return UCAP_ENABLED(enabled_caps,
+				    RDMA_UCAP_MLX5_CTRL_OTHER_VHCA);
+
+	return UCAP_ENABLED(enabled_caps, RDMA_UCAP_MLX5_CTRL_LOCAL);
+}
+
 static int UVERBS_HANDLER(MLX5_IB_METHOD_FLOW_MATCHER_CREATE)(
 	struct uverbs_attr_bundle *attrs)
 {
@@ -2401,6 +2488,26 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_FLOW_MATCHER_CREATE)(
 		goto end;
 	}
 
+	if (uverbs_attr_is_valid(attrs, MLX5_IB_ATTR_FLOW_MATCHER_IB_PORT)) {
+		err = uverbs_copy_from(&obj->ib_port, attrs,
+				       MLX5_IB_ATTR_FLOW_MATCHER_IB_PORT);
+		if (err)
+			goto end;
+		if (!rdma_is_port_valid(&dev->ib_dev, obj->ib_port)) {
+			err = -EINVAL;
+			goto end;
+		}
+		if (obj->ns_type != MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_RX &&
+		    obj->ns_type != MLX5_FLOW_NAMESPACE_RDMA_TRANSPORT_TX) {
+			err = -EINVAL;
+			goto end;
+		}
+		if (!verify_context_caps(dev, uobj->context->enabled_caps)) {
+			err = -EOPNOTSUPP;
+			goto end;
+		}
+	}
+
 	uobj->object = obj;
 	obj->mdev = dev->mdev;
 	atomic_set(&obj->usecnt, 0);
@@ -2448,7 +2555,7 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_STEERING_ANCHOR_CREATE)(
 
 	mutex_lock(&dev->flow_db->lock);
 
-	ft_prio = _get_flow_table(dev, priority, ns_type, 0);
+	ft_prio = _get_flow_table(dev, priority, ns_type, 0, 0);
 	if (IS_ERR(ft_prio)) {
 		err = PTR_ERR(ft_prio);
 		goto free_obj;
@@ -2834,7 +2941,10 @@ DECLARE_UVERBS_NAMED_METHOD(
 			    UA_OPTIONAL),
 	UVERBS_ATTR_CONST_IN(MLX5_IB_ATTR_FLOW_MATCHER_FT_TYPE,
 			     enum mlx5_ib_uapi_flow_table_type,
-			     UA_OPTIONAL));
+			     UA_OPTIONAL),
+	UVERBS_ATTR_PTR_IN(MLX5_IB_ATTR_FLOW_MATCHER_IB_PORT,
+			   UVERBS_ATTR_TYPE(u32),
+			   UA_OPTIONAL));
 
 DECLARE_UVERBS_NAMED_METHOD_DESTROY(
 	MLX5_IB_METHOD_FLOW_MATCHER_DESTROY,
@@ -2904,8 +3014,26 @@ int mlx5_ib_fs_init(struct mlx5_ib_dev *dev)
 	if (!dev->flow_db)
 		return -ENOMEM;
 
+	dev->flow_db->rdma_transport_rx = kcalloc(dev->num_ports,
+					sizeof(struct mlx5_ib_flow_prio),
+					GFP_KERNEL);
+	if (!dev->flow_db->rdma_transport_rx)
+		goto free_flow_db;
+
+	dev->flow_db->rdma_transport_tx = kcalloc(dev->num_ports,
+					sizeof(struct mlx5_ib_flow_prio),
+					GFP_KERNEL);
+	if (!dev->flow_db->rdma_transport_tx)
+		goto free_rdma_transport_rx;
+
 	mutex_init(&dev->flow_db->lock);
 
 	ib_set_device_ops(&dev->ib_dev, &flow_ops);
 	return 0;
+
+free_rdma_transport_rx:
+	kfree(dev->flow_db->rdma_transport_rx);
+free_flow_db:
+	kfree(dev->flow_db);
+	return -ENOMEM;
 }
diff --git a/drivers/infiniband/hw/mlx5/fs.h b/drivers/infiniband/hw/mlx5/fs.h
index b9734904f5f0..0516555eb1c1 100644
--- a/drivers/infiniband/hw/mlx5/fs.h
+++ b/drivers/infiniband/hw/mlx5/fs.h
@@ -40,6 +40,8 @@ static inline void mlx5_ib_fs_cleanup(struct mlx5_ib_dev *dev)
 	 * is a safe assumption that all references are gone.
 	 */
 	mlx5_ib_fs_cleanup_anchor(dev);
+	kfree(dev->flow_db->rdma_transport_tx);
+	kfree(dev->flow_db->rdma_transport_rx);
 	kfree(dev->flow_db);
 }
 #endif /* _MLX5_IB_FS_H */
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 974a45c92fbb..ccaaef20f50d 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -276,6 +276,7 @@ struct mlx5_ib_flow_matcher {
 	struct mlx5_core_dev *mdev;
 	atomic_t usecnt;
 	u8 match_criteria_enable;
+	u32 ib_port;
 };
 
 struct mlx5_ib_steering_anchor {
@@ -307,6 +308,8 @@ struct mlx5_ib_flow_db {
 	struct mlx5_ib_flow_prio rdma_tx[MLX5_IB_NUM_FLOW_FT];
 	struct mlx5_ib_flow_prio opfcs[MLX5_IB_OPCOUNTER_MAX];
 	struct mlx5_flow_table *lag_demux_ft;
+	struct mlx5_ib_flow_prio *rdma_transport_rx;
+	struct mlx5_ib_flow_prio *rdma_transport_tx;
 	/* Protect flow steering bypass flow tables
 	 * when add/del flow rules.
 	 * only single add/removal of flow steering rule could be done
diff --git a/include/uapi/rdma/mlx5_user_ioctl_cmds.h b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
index fd2e4a3a56b3..18f9fe070213 100644
--- a/include/uapi/rdma/mlx5_user_ioctl_cmds.h
+++ b/include/uapi/rdma/mlx5_user_ioctl_cmds.h
@@ -239,6 +239,7 @@ enum mlx5_ib_flow_matcher_create_attrs {
 	MLX5_IB_ATTR_FLOW_MATCHER_MATCH_CRITERIA,
 	MLX5_IB_ATTR_FLOW_MATCHER_FLOW_FLAGS,
 	MLX5_IB_ATTR_FLOW_MATCHER_FT_TYPE,
+	MLX5_IB_ATTR_FLOW_MATCHER_IB_PORT,
 };
 
 enum mlx5_ib_flow_matcher_destroy_attrs {
diff --git a/include/uapi/rdma/mlx5_user_ioctl_verbs.h b/include/uapi/rdma/mlx5_user_ioctl_verbs.h
index 7c233df475e7..8f86e79d78a5 100644
--- a/include/uapi/rdma/mlx5_user_ioctl_verbs.h
+++ b/include/uapi/rdma/mlx5_user_ioctl_verbs.h
@@ -45,6 +45,8 @@ enum mlx5_ib_uapi_flow_table_type {
 	MLX5_IB_UAPI_FLOW_TABLE_TYPE_FDB = 0x2,
 	MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_RX = 0x3,
 	MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TX = 0x4,
+	MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TRANSPORT_RX = 0x5,
+	MLX5_IB_UAPI_FLOW_TABLE_TYPE_RDMA_TRANSPORT_TX = 0x6,
 };
 
 enum mlx5_ib_uapi_flow_action_packet_reformat_type {

From patchwork Thu Mar 6 11:51:31 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Leon Romanovsky
X-Patchwork-Id: 14004379
From: Leon Romanovsky
To: Jason Gunthorpe
Cc: Chiara Meiohas, Jonathan Corbet, linux-doc@vger.kernel.org, linux-rdma@vger.kernel.org, Yishai Hadas
Subject: [PATCH rdma-next v1 6/6] docs: infiniband: document the UCAP API
Date: Thu, 6 Mar 2025 13:51:31 +0200
X-Mailer: git-send-email 2.48.1
Precedence: bulk
X-Mailing-List: linux-rdma@vger.kernel.org

From: Chiara Meiohas

Add an explanation of the newly added UCAP API.

Signed-off-by: Chiara Meiohas
Reviewed-by: Yishai Hadas
Signed-off-by: Leon Romanovsky
---
 Documentation/infiniband/index.rst |  1 +
 Documentation/infiniband/ucaps.rst | 71 ++++++++++++++++++++++++++++++
 2 files changed, 72 insertions(+)
 create mode 100644 Documentation/infiniband/ucaps.rst

diff --git a/Documentation/infiniband/index.rst b/Documentation/infiniband/index.rst
index 9cd7615438b9..5b4c24125f66 100644
--- a/Documentation/infiniband/index.rst
+++ b/Documentation/infiniband/index.rst
@@ -12,6 +12,7 @@ InfiniBand
    opa_vnic
    sysfs
    tag_matching
+   ucaps
    user_mad
    user_verbs
 
diff --git a/Documentation/infiniband/ucaps.rst b/Documentation/infiniband/ucaps.rst
new file mode 100644
index 000000000000..b8b6927742f4
--- /dev/null
+++ b/Documentation/infiniband/ucaps.rst
@@ -0,0 +1,71 @@
+=================================
+Infiniband Userspace Capabilities
+=================================
+
+  User CAPabilities (UCAPs) provide fine-grained control over specific
+  firmware features in Infiniband (IB) devices. This approach offers
+  more granular capabilities than the existing Linux capabilities,
+  which may be too generic for certain FW features.
+
+  Each user capability is represented as a character device with root
+  read-write access. Root processes can grant users special privileges
+  by allowing access to these character devices (e.g., using chown).
+
+Usage
+=====
+
+  UCAPs allow control over specific features of an IB device using file
+  descriptors of UCAP character devices. Here is how a user enables
+  specific features of an IB device:
+
+  * A root process grants the user access to the UCAP files that
+    represent the capabilities (e.g., using chown).
+  * The user opens the UCAP files, obtaining file descriptors.
+  * When opening an IB device, include an array of the UCAP file
+    descriptors as an attribute.
+  * The ib_uverbs driver recognizes the UCAP file descriptors and enables
+    the corresponding capabilities for the IB device.
+
+Creating UCAPs
+==============
+
+  To create a new UCAP, drivers must first define a type in the
+  rdma_user_cap enum in rdma/ib_ucaps.h. The name of the UCAP character
+  device should be added to the ucap_names array in
+  drivers/infiniband/core/ucaps.c. Then, the driver can create the UCAP
+  character device by calling the ib_create_ucap API with the UCAP
+  type.
+
+  A reference count is stored for each UCAP to track creations and
+  removals of the UCAP device. If multiple creation calls are made with
+  the same type (e.g., for two IB devices), the UCAP character device
+  is created during the first call and subsequent calls increment the
+  reference count.
+
+  The UCAP character device is created under /dev/infiniband, and its
+  permissions are set to allow root read and write access only.
+
+Removing UCAPs
+==============
+
+  Each removal decrements the reference count of the UCAP. The UCAP
+  character device is removed from the filesystem only when the
+  reference count is decreased to 0.
+
+/dev and /sys/class files
+=========================
+
+  The class::
+
+    /sys/class/infiniband_ucaps
+
+  is created when the first UCAP character device is created.
+
+  The UCAP character device is created under /dev/infiniband.
+
+  For example, if mlx5_ib adds the rdma_user_cap
+  RDMA_UCAP_MLX5_CTRL_LOCAL with name "mlx5_perm_ctrl_local", this will
+  create the device node::
+
+    /dev/infiniband/mlx5_perm_ctrl_local
+
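
The first two steps of the Usage section in ucaps.rst (access is granted,
then the user opens the UCAP files) can be sketched from the user side in
plain C. This is only a minimal sketch under stated assumptions:
open_ucap_fds() is a hypothetical helper name that does not appear in this
series, opening with O_RDWR is assumed from the "read-write access" wording,
and the step that passes the descriptor array to the IB device at open time
is left out because this documentation does not name the userspace call.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of this series): open each UCAP character
 * device read-write and store its descriptor in fds[]. Returns the number
 * of descriptors opened, or -1 after closing the ones already opened.
 */
static int open_ucap_fds(const char *const *paths, size_t n, int *fds)
{
	size_t i;

	for (i = 0; i < n; i++) {
		fds[i] = open(paths[i], O_RDWR);
		if (fds[i] < 0) {
			/* Undo the partial work so no descriptor leaks. */
			while (i > 0)
				close(fds[--i]);
			return -1;
		}
	}
	return (int)n;
}
```

A caller would pass paths such as /dev/infiniband/mlx5_perm_ctrl_local (the
example node named in the documentation) once root has granted access, and
keep the resulting array to supply when opening the IB device.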