From patchwork Wed Mar 8 13:28:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165798 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2AA75C678D5 for ; Wed, 8 Mar 2023 13:29:10 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8BC4210E5D3; Wed, 8 Mar 2023 13:29:09 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 906D710E5D3; Wed, 8 Mar 2023 13:29:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282148; x=1709818148; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FMM5X18RLC+uCAkDceiTtJ8A5U1bc1jKPTXVv6mvnxw=; b=ZqqO1TzNYtPbP0sA6S9+wT9Ow8XMZO2G621P5ydSB75lUhUSgJjFbmZX X3sqeVx7kfC2+Mrge7w6RoEqNiXZSOEKvBJDwhIRjSQnorBDG6mza9xoV CLB3IyKtZVyGVNXExKoaLDCXVPrMH8V63DBBUApRLMj2EvvlR24SBkJ+3 xOsiN3Rd0kWTHGrEmvFH13UW6s7qjoaKJOEdsXYzC7gn4o3jUJpI9Gx5G rW3/EuMuJh2+U3DH+rWq/8c2WK1Fu69aE+C4rAFkfytqPm6+mZPYL4fZk b2gpctDht3QbJJ3ZD86SllCaaVxXybu9qpRIPRMhbO6/Vc9tTSHDHIGdx g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165083" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165083" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:08 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789224" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789224" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:07 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:40 -0800 Message-Id: <20230308132903.465159-2-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 01/24] vfio: Allocate per device file structure X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This is preparation for adding vfio device cdev support. vfio device cdev requires: 1) A per device file memory to store the kvm pointer set by KVM. It will be propagated to vfio_device:kvm after the device cdev file is bound to an iommufd. 2) A mechanism to block device access through device cdev fd before it is bound to an iommufd. To address above requirements, this adds a per device file structure named vfio_device_file. For now, it's only a wrapper of struct vfio_device pointer. Other fields will be added to this per file structure in future commits. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/vfio/group.c | 13 +++++++++++-- drivers/vfio/vfio.h | 6 ++++++ drivers/vfio/vfio_main.c | 31 ++++++++++++++++++++++++++----- 3 files changed, 43 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 27d5ba7cf9dc..af2ef8006e1d 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -218,19 +218,26 @@ void vfio_device_group_close(struct vfio_device *device) static struct file *vfio_device_open_file(struct vfio_device *device) { + struct vfio_device_file *df; struct file *filep; int ret; + df = vfio_allocate_device_file(device); + if (IS_ERR(df)) { + ret = PTR_ERR(df); + goto err_out; + } + ret = vfio_device_group_open(device); if (ret) - goto err_out; + goto err_free; /* * We can't use anon_inode_getfd() because we need to modify * the f_mode flags directly to allow more than just ioctls */ filep = anon_inode_getfile("[vfio-device]", &vfio_device_fops, - device, O_RDWR); + df, O_RDWR); if (IS_ERR(filep)) { ret = PTR_ERR(filep); goto err_close_device; @@ -254,6 +261,8 @@ static struct file *vfio_device_open_file(struct vfio_device *device) err_close_device: vfio_device_group_close(device); +err_free: + kfree(df); err_out: return ERR_PTR(ret); } diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 7b19c621e0e6..87d3dd6b9ef9 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -16,11 +16,17 @@ struct iommufd_ctx; struct iommu_group; struct vfio_container; +struct vfio_device_file { + struct vfio_device *device; +}; + void vfio_device_put_registration(struct vfio_device *device); bool vfio_device_try_get_registration(struct vfio_device *device); int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd); void vfio_device_close(struct vfio_device *device, struct iommufd_ctx *iommufd); +struct vfio_device_file * +vfio_allocate_device_file(struct vfio_device *device); extern const struct file_operations vfio_device_fops; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 89497c933490..f0b9151e3ba7 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -404,6 +404,20 @@ static bool vfio_assert_device_open(struct vfio_device *device) return !WARN_ON_ONCE(!READ_ONCE(device->open_count)); } +struct vfio_device_file * +vfio_allocate_device_file(struct vfio_device *device) +{ + struct vfio_device_file *df; + + df = kzalloc(sizeof(*df), GFP_KERNEL_ACCOUNT); + if (!df) + return ERR_PTR(-ENOMEM); + + df->device = device; + + return df; +} + static int vfio_device_first_open(struct vfio_device *device, struct iommufd_ctx *iommufd) { @@ -517,12 +531,15 @@ static inline void vfio_device_pm_runtime_put(struct vfio_device *device) */ static int vfio_device_fops_release(struct inode *inode, struct file *filep) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; vfio_device_group_close(device); vfio_device_put_registration(device); + kfree(df); + return 0; } @@ -1087,7 +1104,8 @@ static int vfio_ioctl_device_feature(struct vfio_device *device, static long vfio_device_fops_unl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; int ret; ret = vfio_device_pm_runtime_get(device); @@ -1114,7 +1132,8 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf, size_t count, loff_t *ppos) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->read)) return -EINVAL; @@ -1126,7 +1145,8 @@ static ssize_t vfio_device_fops_write(struct file *filep, const char __user *buf, size_t count, loff_t *ppos) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->write)) return -EINVAL; @@ -1136,7 +1156,8 @@ static ssize_t vfio_device_fops_write(struct file *filep, static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->mmap)) return -EINVAL; From patchwork Wed Mar 8 13:28:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165801 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 89152C6FD20 for ; Wed, 8 Mar 2023 13:29:20 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D5E6310E5EA; Wed, 8 Mar 2023 13:29:19 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 21AB210E5DB; Wed, 8 Mar 2023 13:29:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282150; x=1709818150; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=e3LO6kn7d4PFkjrmP53RPj3vcLfhQ+eZNQFxcM7y67w=; b=GYdo5U574o6mjglK8qEAJN6a1qNmAO/nNL7+KDkOzugNYpMKmohIUpx2 TEMkEhUKGSGC4ZrKSxuKBSRTi7WIHENVlzseWXxwBQTJh6jVk40J/qy3T b5mjKDNGJG8rdfNWVxLbvc7DPyNJ18jPtvgwTxxvL4nDuYfvK6vr3pJ2Y WWmeX6Xh1KwUHItwtiSjLfT7Gme59riESZ/lr15lMeKb8ECtItyiiwsgG wfArza8/oqVn4UgL92ZAWiNkeaUcBuQrLHDWrO7uPbRFnns/J7///M22V XQrOJc70GiPJ+7ewqnN/61Tjbwwfir8PnYAgal8RgEopW+pGYkdQNeTF/ Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165095" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165095" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:09 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789249" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789249" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:08 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:41 -0800 Message-Id: <20230308132903.465159-3-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 02/24] vfio: Refine vfio file kAPIs for KVM X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This prepares for making the below kAPIs to accept both group file and device file instead of only vfio group file. bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); Besides the above change, vfio_file_is_valid() is added to check if a given file is a valid vfio file. It would be extended to check both vfio group file and vfio device file later. vfio_file_is_group() is kept to for the VFIO PCI hot reset path. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/vfio/group.c | 57 +++++++++++++++------------------------- drivers/vfio/vfio.h | 3 +++ drivers/vfio/vfio_main.c | 45 +++++++++++++++++++++++++++++++ include/linux/vfio.h | 1 + virt/kvm/vfio.c | 10 +++---- 5 files changed, 75 insertions(+), 41 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index af2ef8006e1d..a293e1f92e89 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -754,6 +754,15 @@ bool vfio_device_has_container(struct vfio_device *device) return device->group->container; } +struct vfio_group *vfio_group_from_file(struct file *file) +{ + struct vfio_group *group = file->private_data; + + if (file->f_op != &vfio_group_fops) + return NULL; + return group; +} + /** * vfio_file_iommu_group - Return the struct iommu_group for the vfio group file * @file: VFIO group file @@ -764,13 +773,13 @@ bool vfio_device_has_container(struct vfio_device *device) */ struct iommu_group *vfio_file_iommu_group(struct file *file) { - struct vfio_group *group = file->private_data; + struct vfio_group *group = vfio_group_from_file(file); struct iommu_group *iommu_group = NULL; if (!IS_ENABLED(CONFIG_SPAPR_TCE_IOMMU)) return NULL; - if (!vfio_file_is_group(file)) + if (!group) return NULL; mutex_lock(&group->group_lock); @@ -784,33 +793,20 @@ struct iommu_group *vfio_file_iommu_group(struct file *file) EXPORT_SYMBOL_GPL(vfio_file_iommu_group); /** - * vfio_file_is_group - True if the file is usable with VFIO aPIS + * vfio_file_is_group - True if the file is a vfio group file * @file: VFIO group file */ bool vfio_file_is_group(struct file *file) { - return file->f_op == &vfio_group_fops; + return vfio_group_from_file(file); } EXPORT_SYMBOL_GPL(vfio_file_is_group); -/** - * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file - * is always CPU cache coherent - * @file: VFIO group file - * - * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop - * bit in DMA transactions. A return of false indicates that the user has - * rights to access additional instructions such as wbinvd on x86. - */ -bool vfio_file_enforced_coherent(struct file *file) +bool vfio_group_enforced_coherent(struct vfio_group *group) { - struct vfio_group *group = file->private_data; struct vfio_device *device; bool ret = true; - if (!vfio_file_is_group(file)) - return true; - /* * If the device does not have IOMMU_CAP_ENFORCE_CACHE_COHERENCY then * any domain later attached to it will also not support it. If the cap @@ -828,28 +824,17 @@ bool vfio_file_enforced_coherent(struct file *file) mutex_unlock(&group->device_lock); return ret; } -EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); -/** - * vfio_file_set_kvm - Link a kvm with VFIO drivers - * @file: VFIO group file - * @kvm: KVM to link - * - * When a VFIO device is first opened the KVM will be available in - * device->kvm if one was associated with the group. - */ -void vfio_file_set_kvm(struct file *file, struct kvm *kvm) +void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm) { - struct vfio_group *group = file->private_data; - - if (!vfio_file_is_group(file)) - return; - + /* + * When a VFIO device is first opened the KVM will be available in + * device->kvm if one was associated with the group. + */ spin_lock(&group->kvm_ref_lock); group->kvm = kvm; spin_unlock(&group->kvm_ref_lock); } -EXPORT_SYMBOL_GPL(vfio_file_set_kvm); /** * vfio_file_has_dev - True if the VFIO file is a handle for device @@ -860,9 +845,9 @@ EXPORT_SYMBOL_GPL(vfio_file_set_kvm); */ bool vfio_file_has_dev(struct file *file, struct vfio_device *device) { - struct vfio_group *group = file->private_data; + struct vfio_group *group = vfio_group_from_file(file); - if (!vfio_file_is_group(file)) + if (!group) return false; return group == device->group; diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 87d3dd6b9ef9..b1e327a85a32 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -90,6 +90,9 @@ void vfio_device_group_unregister(struct vfio_device *device); int vfio_device_group_use_iommu(struct vfio_device *device); void vfio_device_group_unuse_iommu(struct vfio_device *device); void vfio_device_group_close(struct vfio_device *device); +struct vfio_group *vfio_group_from_file(struct file *file); +bool vfio_group_enforced_coherent(struct vfio_group *group); +void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm); bool vfio_device_has_container(struct vfio_device *device); int __init vfio_group_init(void); void vfio_group_cleanup(void); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index f0b9151e3ba7..edadfac9be49 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1175,6 +1175,51 @@ const struct file_operations vfio_device_fops = { .mmap = vfio_device_fops_mmap, }; +/** + * vfio_file_is_valid - True if the file is valid vfio file + * @file: VFIO group file or VFIO device file + */ +bool vfio_file_is_valid(struct file *file) +{ + return vfio_group_from_file(file); +} +EXPORT_SYMBOL_GPL(vfio_file_is_valid); + +/** + * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file + * is always CPU cache coherent + * @file: VFIO group file or VFIO device file + * + * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop + * bit in DMA transactions. A return of false indicates that the user has + * rights to access additional instructions such as wbinvd on x86. + */ +bool vfio_file_enforced_coherent(struct file *file) +{ + struct vfio_group *group = vfio_group_from_file(file); + + if (group) + return vfio_group_enforced_coherent(group); + + return true; +} +EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); + +/** + * vfio_file_set_kvm - Link a kvm with VFIO drivers + * @file: VFIO group file or VFIO device file + * @kvm: KVM to link + * + */ +void vfio_file_set_kvm(struct file *file, struct kvm *kvm) +{ + struct vfio_group *group = vfio_group_from_file(file); + + if (group) + vfio_group_set_kvm(group, kvm); +} +EXPORT_SYMBOL_GPL(vfio_file_set_kvm); + /* * Sub-module support */ diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 3188d8a374bd..b14dcdd0b71f 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -245,6 +245,7 @@ int vfio_mig_get_next_state(struct vfio_device *device, */ struct iommu_group *vfio_file_iommu_group(struct file *file); bool vfio_file_is_group(struct file *file); +bool vfio_file_is_valid(struct file *file); bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); bool vfio_file_has_dev(struct file *file, struct vfio_device *device); diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 9584eb57e0ed..8bac308ba630 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -64,18 +64,18 @@ static bool kvm_vfio_file_enforced_coherent(struct file *file) return ret; } -static bool kvm_vfio_file_is_group(struct file *file) +static bool kvm_vfio_file_is_valid(struct file *file) { bool (*fn)(struct file *file); bool ret; - fn = symbol_get(vfio_file_is_group); + fn = symbol_get(vfio_file_is_valid); if (!fn) return false; ret = fn(file); - symbol_put(vfio_file_is_group); + symbol_put(vfio_file_is_valid); return ret; } @@ -154,8 +154,8 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) if (!filp) return -EBADF; - /* Ensure the FD is a vfio group FD.*/ - if (!kvm_vfio_file_is_group(filp)) { + /* Ensure the FD is a vfio FD.*/ + if (!kvm_vfio_file_is_valid(filp)) { ret = -EINVAL; goto err_fput; } From patchwork Wed Mar 8 13:28:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165800 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 00880C6FD20 for ; Wed, 8 Mar 2023 13:29:16 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C663B10E5DC; Wed, 8 Mar 2023 13:29:15 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id D8E2F10E5DF; Wed, 8 Mar 2023 13:29:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282151; x=1709818151; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=d/S6N5Lv6Y2oP1cjAAUp6VHx7dhKOZ1WBGAZjsiqARQ=; b=VPN+by3miQC77kZh+sDBrnzuroELsp8pygjYkPrqHy2NkTs0QecvmJqK VMTZJ897x8eUmlpX7WsHqZwogzGguOXs4ZrNDuDsNPjuF/ymDcFzl0UQC 2HJm62R9bI4u3HHwOjp8298V24VY73MeilJQg5Ly2doR5Z20fNXkC7fhg D9fQkJrE5Gwn938S2WRF5PP+6FjJLtMG2mpOVI4pxIcUuug1j+uXaZM44 cVdrcftNfOiVqlNU3bLkfdvHKF9JKRAwsdStGj1+tzoPynfP58kn/kw8q +k70UhMm+CSMlhF5lD4a3OHGzswESE2I2hA/JIy16GpLwCxuKm5jD+VC3 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165110" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165110" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:11 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789282" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789282" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:10 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:42 -0800 Message-Id: <20230308132903.465159-4-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 03/24] vfio: Accept vfio device file in the KVM facing kAPI X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This makes the vfio file kAPIs to accepte vfio device files, also a preparation for vfio device cdev support. For the kvm set with vfio device file, kvm pointer is stored in struct vfio_device_file, and use kvm_ref_lock to protect kvm set and kvm pointer usage within VFIO. This kvm pointer will be set to vfio_device after device file is bound to iommufd in the cdev path. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/vfio/vfio.h | 2 ++ drivers/vfio/vfio_main.c | 42 +++++++++++++++++++++++++++++++++++++--- 2 files changed, 41 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index b1e327a85a32..69e1a0692b06 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -18,6 +18,8 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + spinlock_t kvm_ref_lock; /* protect kvm field */ + struct kvm *kvm; }; void vfio_device_put_registration(struct vfio_device *device); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index edadfac9be49..03d5b2979f79 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -414,6 +414,7 @@ vfio_allocate_device_file(struct vfio_device *device) return ERR_PTR(-ENOMEM); df->device = device; + spin_lock_init(&df->kvm_ref_lock); return df; } @@ -1175,13 +1176,23 @@ const struct file_operations vfio_device_fops = { .mmap = vfio_device_fops_mmap, }; +static struct vfio_device *vfio_device_from_file(struct file *file) +{ + struct vfio_device_file *df = file->private_data; + + if (file->f_op != &vfio_device_fops) + return NULL; + return df->device; +} + /** * vfio_file_is_valid - True if the file is valid vfio file * @file: VFIO group file or VFIO device file */ bool vfio_file_is_valid(struct file *file) { - return vfio_group_from_file(file); + return vfio_group_from_file(file) || + vfio_device_from_file(file); } EXPORT_SYMBOL_GPL(vfio_file_is_valid); @@ -1196,15 +1207,36 @@ EXPORT_SYMBOL_GPL(vfio_file_is_valid); */ bool vfio_file_enforced_coherent(struct file *file) { - struct vfio_group *group = vfio_group_from_file(file); + struct vfio_group *group; + struct vfio_device *device; + group = vfio_group_from_file(file); if (group) return vfio_group_enforced_coherent(group); + device = vfio_device_from_file(file); + if (device) + return device_iommu_capable(device->dev, + IOMMU_CAP_ENFORCE_CACHE_COHERENCY); + return true; } EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); +static void vfio_device_file_set_kvm(struct file *file, struct kvm *kvm) +{ + struct vfio_device_file *df = file->private_data; + + /* + * The kvm is first recorded in the vfio_device_file, and will + * be propagated to vfio_device::kvm when the file is bound to + * iommufd successfully in the vfio device cdev path. + */ + spin_lock(&df->kvm_ref_lock); + df->kvm = kvm; + spin_unlock(&df->kvm_ref_lock); +} + /** * vfio_file_set_kvm - Link a kvm with VFIO drivers * @file: VFIO group file or VFIO device file @@ -1213,10 +1245,14 @@ EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); */ void vfio_file_set_kvm(struct file *file, struct kvm *kvm) { - struct vfio_group *group = vfio_group_from_file(file); + struct vfio_group *group; + group = vfio_group_from_file(file); if (group) vfio_group_set_kvm(group, kvm); + + if (vfio_device_from_file(file)) + vfio_device_file_set_kvm(file, kvm); } EXPORT_SYMBOL_GPL(vfio_file_set_kvm); From patchwork Wed Mar 8 13:28:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165802 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 99025C76186 for ; Wed, 8 Mar 2023 13:29:22 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A9FAB10E5F0; Wed, 8 Mar 2023 13:29:21 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1455910E5DC; Wed, 8 Mar 2023 13:29:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282155; x=1709818155; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=z18H5rnG5rdWAosAE0Cknh0TC7WZCPg+Sxra/8Wu7Cg=; b=V71iFcOzXG9gRSCV2at5OBYo/NRDuEQceSk+nmMAvu+JBczQcDOw4RyM 3H0CLfNeuH/YDuDUz0ww+p+zY+/fVPLRO0BBdRCgYBT2RXWXoM7/8sEkI 9A1gLJjHsPHVlkrWg7BwlS5mgg/DUi0+oyCAX+HPFdh0bbxks1NCpfCSj sLCpY5JCA+efuFIkfqb8IPeyhRunB+x82eshANctGutjEk/+9s3u92N5V Q1M0G+eo46vwRqqFU7UQrRlhQtuz2Op6wN/gCzKEBHJCPHh2ZY7cRlYTU Se6UMUqt8KNVa/AOH7Go8Hp16GGJ+H8XygPCmt5GwfR871iS/UqMQeCiS Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165130" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165130" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:14 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789298" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789298" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:12 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:43 -0800 Message-Id: <20230308132903.465159-5-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 04/24] kvm/vfio: Rename kvm_vfio_group to prepare for accepting vfio device fd X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Meanwhile, rename related helpers. No functional change is intended. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- virt/kvm/vfio.c | 115 ++++++++++++++++++++++++------------------------ 1 file changed, 58 insertions(+), 57 deletions(-) diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 8bac308ba630..857d6ba349e1 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -21,7 +21,7 @@ #include #endif -struct kvm_vfio_group { +struct kvm_vfio_file { struct list_head node; struct file *file; #ifdef CONFIG_SPAPR_TCE_IOMMU @@ -30,7 +30,7 @@ struct kvm_vfio_group { }; struct kvm_vfio { - struct list_head group_list; + struct list_head file_list; struct mutex lock; bool noncoherent; }; @@ -98,34 +98,35 @@ static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file) } static void kvm_spapr_tce_release_vfio_group(struct kvm *kvm, - struct kvm_vfio_group *kvg) + struct kvm_vfio_file *kvf) { - if (WARN_ON_ONCE(!kvg->iommu_group)) + if (WARN_ON_ONCE(!kvf->iommu_group)) return; - kvm_spapr_tce_release_iommu_group(kvm, kvg->iommu_group); - iommu_group_put(kvg->iommu_group); - kvg->iommu_group = NULL; + kvm_spapr_tce_release_iommu_group(kvm, kvf->iommu_group); + iommu_group_put(kvf->iommu_group); + kvf->iommu_group = NULL; } #endif /* - * Groups can use the same or different IOMMU domains. If the same then - * adding a new group may change the coherency of groups we've previously - * been told about. We don't want to care about any of that so we retest - * each group and bail as soon as we find one that's noncoherent. This - * means we only ever [un]register_noncoherent_dma once for the whole device. + * Groups/devices can use the same or different IOMMU domains. If the same + * then adding a new group/device may change the coherency of groups/devices + * we've previously been told about. We don't want to care about any of + * that so we retest each group/device and bail as soon as we find one that's + * noncoherent. This means we only ever [un]register_noncoherent_dma once + * for the whole device. */ static void kvm_vfio_update_coherency(struct kvm_device *dev) { struct kvm_vfio *kv = dev->private; bool noncoherent = false; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (!kvm_vfio_file_enforced_coherent(kvg->file)) { + list_for_each_entry(kvf, &kv->file_list, node) { + if (!kvm_vfio_file_enforced_coherent(kvf->file)) { noncoherent = true; break; } @@ -143,10 +144,10 @@ static void kvm_vfio_update_coherency(struct kvm_device *dev) mutex_unlock(&kv->lock); } -static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) +static int kvm_vfio_file_add(struct kvm_device *dev, unsigned int fd) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct file *filp; int ret; @@ -162,27 +163,27 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file == filp) { + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file == filp) { ret = -EEXIST; goto err_unlock; } } - kvg = kzalloc(sizeof(*kvg), GFP_KERNEL_ACCOUNT); - if (!kvg) { + kvf = kzalloc(sizeof(*kvf), GFP_KERNEL_ACCOUNT); + if (!kvf) { ret = -ENOMEM; goto err_unlock; } - kvg->file = filp; - list_add_tail(&kvg->node, &kv->group_list); + kvf->file = filp; + list_add_tail(&kvf->node, &kv->file_list); kvm_arch_start_assignment(dev->kvm); mutex_unlock(&kv->lock); - kvm_vfio_file_set_kvm(kvg->file, dev->kvm); + kvm_vfio_file_set_kvm(kvf->file, dev->kvm); kvm_vfio_update_coherency(dev); return 0; @@ -193,10 +194,10 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) return ret; } -static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) +static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct fd f; int ret; @@ -208,18 +209,18 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file != f.file) + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file != f.file) continue; - list_del(&kvg->node); + list_del(&kvf->node); kvm_arch_end_assignment(dev->kvm); #ifdef CONFIG_SPAPR_TCE_IOMMU - kvm_spapr_tce_release_vfio_group(dev->kvm, kvg); + kvm_spapr_tce_release_vfio_group(dev->kvm, kvf); #endif - kvm_vfio_file_set_kvm(kvg->file, NULL); - fput(kvg->file); - kfree(kvg); + kvm_vfio_file_set_kvm(kvf->file, NULL); + fput(kvf->file); + kfree(kvf); ret = 0; break; } @@ -234,12 +235,12 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) } #ifdef CONFIG_SPAPR_TCE_IOMMU -static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, - void __user *arg) +static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev, + void __user *arg) { struct kvm_vfio_spapr_tce param; struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct fd f; int ret; @@ -254,20 +255,20 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file != f.file) + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file != f.file) continue; - if (!kvg->iommu_group) { - kvg->iommu_group = kvm_vfio_file_iommu_group(kvg->file); - if (WARN_ON_ONCE(!kvg->iommu_group)) { + if (!kvf->iommu_group) { + kvf->iommu_group = kvm_vfio_file_iommu_group(kvf->file); + if (WARN_ON_ONCE(!kvf->iommu_group)) { ret = -EIO; goto err_fdput; } } ret = kvm_spapr_tce_attach_iommu_group(dev->kvm, param.tablefd, - kvg->iommu_group); + kvf->iommu_group); break; } @@ -278,8 +279,8 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, } #endif -static int kvm_vfio_set_group(struct kvm_device *dev, long attr, - void __user *arg) +static int kvm_vfio_set_file(struct kvm_device *dev, long attr, + void __user *arg) { int32_t __user *argp = arg; int32_t fd; @@ -288,16 +289,16 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, case KVM_DEV_VFIO_GROUP_ADD: if (get_user(fd, argp)) return -EFAULT; - return kvm_vfio_group_add(dev, fd); + return kvm_vfio_file_add(dev, fd); case KVM_DEV_VFIO_GROUP_DEL: if (get_user(fd, argp)) return -EFAULT; - return kvm_vfio_group_del(dev, fd); + return kvm_vfio_file_del(dev, fd); #ifdef CONFIG_SPAPR_TCE_IOMMU case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: - return kvm_vfio_group_set_spapr_tce(dev, arg); + return kvm_vfio_file_set_spapr_tce(dev, arg); #endif } @@ -309,8 +310,8 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, { switch (attr->group) { case KVM_DEV_VFIO_GROUP: - return kvm_vfio_set_group(dev, attr->attr, - u64_to_user_ptr(attr->addr)); + return kvm_vfio_set_file(dev, attr->attr, + u64_to_user_ptr(attr->addr)); } return -ENXIO; @@ -339,16 +340,16 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, static void kvm_vfio_release(struct kvm_device *dev) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg, *tmp; + struct kvm_vfio_file *kvf, *tmp; - list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) { + list_for_each_entry_safe(kvf, tmp, &kv->file_list, node) { #ifdef CONFIG_SPAPR_TCE_IOMMU - kvm_spapr_tce_release_vfio_group(dev->kvm, kvg); + kvm_spapr_tce_release_vfio_group(dev->kvm, kvf); #endif - kvm_vfio_file_set_kvm(kvg->file, NULL); - fput(kvg->file); - list_del(&kvg->node); - kfree(kvg); + kvm_vfio_file_set_kvm(kvf->file, NULL); + fput(kvf->file); + list_del(&kvf->node); + kfree(kvf); kvm_arch_end_assignment(dev->kvm); } @@ -382,7 +383,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type) if (!kv) return -ENOMEM; - INIT_LIST_HEAD(&kv->group_list); + INIT_LIST_HEAD(&kv->file_list); mutex_init(&kv->lock); dev->private = kv; From patchwork Wed Mar 8 13:28:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165807 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C2F5EC7618A for ; Wed, 8 Mar 2023 13:29:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6F07B10E602; Wed, 8 Mar 2023 13:29:26 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1B1FB10E5DF; Wed, 8 Mar 2023 13:29:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282156; x=1709818156; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZNI4S2NqNpSuZcZW9SeBoULjYW96XaLuTWlhEuCU3MU=; b=IXUQ69zeTgKqh+3xZmtZW3gh1T3xlZidfSsrQemM3T10Qr0iokS7PS94 T0w0VImludVhboCAuVJm20lMoBTrgKV6kb4aOf7AJ2CmAkIJRi/nWqsV5 6gY0M1Qlw3U8GFWOUfR66jn+0/gsWLPWskt8SUn+NxKU2mXyVt/z+i14e 2WRn/E0YmNd528YHY5QZVqNxTxns4l0jJkzjbqejAO00q6d3dFKQpFH2f 8UlbI6BxXetHI289cI/lMHnoVmS4eKg8V5B5T3Rln/kzF7zt/kMhLNEIj xKzkMbDbRzHxwljWOBOT/zxlux6ghmR3HKWMnA6F0cmE7hCZdOIsLI/DC A==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165140" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165140" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:15 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789308" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789308" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:14 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:44 -0800 Message-Id: <20230308132903.465159-6-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 05/24] kvm/vfio: Accept vfio device file from userspace X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*. Old userspace uses KVM_DEV_VFIO_GROUP* works as well. Signed-off-by: Yi Liu Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- Documentation/virt/kvm/devices/vfio.rst | 52 +++++++++++++++++-------- include/uapi/linux/kvm.h | 16 ++++++-- virt/kvm/vfio.c | 16 ++++---- 3 files changed, 55 insertions(+), 29 deletions(-) diff --git a/Documentation/virt/kvm/devices/vfio.rst b/Documentation/virt/kvm/devices/vfio.rst index 79b6811bb4f3..5b05b48abaab 100644 --- a/Documentation/virt/kvm/devices/vfio.rst +++ b/Documentation/virt/kvm/devices/vfio.rst @@ -9,24 +9,37 @@ Device types supported: - KVM_DEV_TYPE_VFIO Only one VFIO instance may be created per VM. The created device -tracks VFIO groups in use by the VM and features of those groups -important to the correctness and acceleration of the VM. As groups -are enabled and disabled for use by the VM, KVM should be updated -about their presence. When registered with KVM, a reference to the -VFIO-group is held by KVM. +tracks VFIO files (group or device) in use by the VM and features +of those groups/devices important to the correctness and acceleration +of the VM. As groups/devices are enabled and disabled for use by the +VM, KVM should be updated about their presence. When registered with +KVM, a reference to the VFIO file is held by KVM. Groups: - KVM_DEV_VFIO_GROUP - -KVM_DEV_VFIO_GROUP attributes: - KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking - kvm_device_attr.addr points to an int32_t file descriptor - for the VFIO group. - KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking - kvm_device_attr.addr points to an int32_t file descriptor - for the VFIO group. - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table + KVM_DEV_VFIO_FILE + alias: KVM_DEV_VFIO_GROUP + +KVM_DEV_VFIO_FILE attributes: + KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device + tracking + + alias: KVM_DEV_VFIO_GROUP_ADD + + kvm_device_attr.addr points to an int32_t file descriptor for the + VFIO file. + KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM + device tracking + + alias: KVM_DEV_VFIO_GROUP_DEL + + kvm_device_attr.addr points to an int32_t file descriptor for the + VFIO file. + + KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table allocated by sPAPR KVM. + + alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE + kvm_device_attr.addr points to a struct:: struct kvm_vfio_spapr_tce { @@ -40,9 +53,14 @@ KVM_DEV_VFIO_GROUP attributes: - @tablefd is a file descriptor for a TCE table allocated via KVM_CREATE_SPAPR_TCE. + only accepts vfio group file as SPAPR has no iommufd support + :: -The GROUP_ADD operation above should be invoked prior to accessing the +The FILE/GROUP_ADD operation above should be invoked prior to accessing the device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support drivers which require a kvm pointer to be set in their .open_device() -callback. +callback. It is the same for device file descriptor via character device +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file +descriptors, FILE_ADD should be invoked before VFIO_DEVICE_BIND_IOMMUFD +to support the drivers mentioned in piror sentence as well. diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index d77aef872a0a..a8eeca70a498 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1410,10 +1410,18 @@ struct kvm_device_attr { __u64 addr; /* userspace address of attr data */ }; -#define KVM_DEV_VFIO_GROUP 1 -#define KVM_DEV_VFIO_GROUP_ADD 1 -#define KVM_DEV_VFIO_GROUP_DEL 2 -#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE 3 +#define KVM_DEV_VFIO_FILE 1 + +#define KVM_DEV_VFIO_FILE_ADD 1 +#define KVM_DEV_VFIO_FILE_DEL 2 +#define KVM_DEV_VFIO_FILE_SET_SPAPR_TCE 3 + +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */ +#define KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE + +#define KVM_DEV_VFIO_GROUP_ADD KVM_DEV_VFIO_FILE_ADD +#define KVM_DEV_VFIO_GROUP_DEL KVM_DEV_VFIO_FILE_DEL +#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE KVM_DEV_VFIO_FILE_SET_SPAPR_TCE enum kvm_device_type { KVM_DEV_TYPE_FSL_MPIC_20 = 1, diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 857d6ba349e1..d869913baafd 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr, int32_t fd; switch (attr) { - case KVM_DEV_VFIO_GROUP_ADD: + case KVM_DEV_VFIO_FILE_ADD: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_add(dev, fd); - case KVM_DEV_VFIO_GROUP_DEL: + case KVM_DEV_VFIO_FILE_DEL: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_del(dev, fd); #ifdef CONFIG_SPAPR_TCE_IOMMU - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: return kvm_vfio_file_set_spapr_tce(dev, arg); #endif } @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { - case KVM_DEV_VFIO_GROUP: + case KVM_DEV_VFIO_FILE: return kvm_vfio_set_file(dev, attr->attr, u64_to_user_ptr(attr->addr)); } @@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { - case KVM_DEV_VFIO_GROUP: + case KVM_DEV_VFIO_FILE: switch (attr->attr) { - case KVM_DEV_VFIO_GROUP_ADD: - case KVM_DEV_VFIO_GROUP_DEL: + case KVM_DEV_VFIO_FILE_ADD: + case KVM_DEV_VFIO_FILE_DEL: #ifdef CONFIG_SPAPR_TCE_IOMMU - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: #endif return 0; } From patchwork Wed Mar 8 13:28:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165803 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 70472C6FD20 for ; Wed, 8 Mar 2023 13:29:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B0C8910E5F4; Wed, 8 Mar 2023 13:29:21 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0244F10E5E1; Wed, 8 Mar 2023 13:29:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282158; x=1709818158; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Xzsdz0GZ0x3LH67TtI4aQESYRlZwoGYHkjcQObdvaxo=; b=cD8+lHcY4TSd/DR1LFx1dXL/lcSrX0KqL/PdVEt1MZnioFGnkS8krjwy DRORd84Wm+NxMQ3QwjrRs03IfELrAgEDzkJR4eOhxnO3B+marI48O2prA 8oyKwCmXA0srZPVDv86E6ZN0TcKsVWnuc87rSM2wERJ/bcdElpT0eRUCh 0ODen5h7oyjagBljQEmzA3errR9+R4yDysAhTgyFGzxgETRyP6P6csPKV jo8rvOi1RTvVjJE32zeC7M4FEGxs2jasTha6AOuTmHBF006AjpU0HusX0 ibTXpj8mn5In6I4yFGWN6WrYihXL5Rw5A/xElgEC+1yOhe03V5uvX/i/P g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165156" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165156" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:17 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789318" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789318" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:16 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:45 -0800 Message-Id: <20230308132903.465159-7-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 06/24] vfio: Pass struct vfio_device_file * to vfio_device_open/close() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This avoids passing too much parameters in multiple functions. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/vfio/group.c | 19 +++++++++++++------ drivers/vfio/vfio.h | 8 ++++---- drivers/vfio/vfio_main.c | 25 +++++++++++++++---------- 3 files changed, 32 insertions(+), 20 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index a293e1f92e89..160a4c891dda 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -169,8 +169,9 @@ static void vfio_device_group_get_kvm_safe(struct vfio_device *device) spin_unlock(&device->group->kvm_ref_lock); } -static int vfio_device_group_open(struct vfio_device *device) +static int vfio_device_group_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; int ret; mutex_lock(&device->group->group_lock); @@ -190,7 +191,11 @@ static int vfio_device_group_open(struct vfio_device *device) if (device->open_count == 0) vfio_device_group_get_kvm_safe(device); - ret = vfio_device_open(device, device->group->iommufd); + df->iommufd = device->group->iommufd; + + ret = vfio_device_open(df); + if (ret) + df->iommufd = NULL; if (device->open_count == 0) vfio_device_put_kvm(device); @@ -202,12 +207,14 @@ static int vfio_device_group_open(struct vfio_device *device) return ret; } -void vfio_device_group_close(struct vfio_device *device) +void vfio_device_group_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + mutex_lock(&device->group->group_lock); mutex_lock(&device->dev_set->lock); - vfio_device_close(device, device->group->iommufd); + vfio_device_close(df); if (device->open_count == 0) vfio_device_put_kvm(device); @@ -228,7 +235,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device) goto err_out; } - ret = vfio_device_group_open(device); + ret = vfio_device_group_open(df); if (ret) goto err_free; @@ -260,7 +267,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device) return filep; err_close_device: - vfio_device_group_close(device); + vfio_device_group_close(df); err_free: kfree(df); err_out: diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 69e1a0692b06..7ced404526d9 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -20,13 +20,13 @@ struct vfio_device_file { struct vfio_device *device; spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; + struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ }; void vfio_device_put_registration(struct vfio_device *device); bool vfio_device_try_get_registration(struct vfio_device *device); -int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd); -void vfio_device_close(struct vfio_device *device, - struct iommufd_ctx *iommufd); +int vfio_device_open(struct vfio_device_file *df); +void vfio_device_close(struct vfio_device_file *df); struct vfio_device_file * vfio_allocate_device_file(struct vfio_device *device); @@ -91,7 +91,7 @@ void vfio_device_group_register(struct vfio_device *device); void vfio_device_group_unregister(struct vfio_device *device); int vfio_device_group_use_iommu(struct vfio_device *device); void vfio_device_group_unuse_iommu(struct vfio_device *device); -void vfio_device_group_close(struct vfio_device *device); +void vfio_device_group_close(struct vfio_device_file *df); struct vfio_group *vfio_group_from_file(struct file *file); bool vfio_group_enforced_coherent(struct vfio_group *group); void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 03d5b2979f79..8c9b05f540fd 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -419,9 +419,10 @@ vfio_allocate_device_file(struct vfio_device *device) return df; } -static int vfio_device_first_open(struct vfio_device *device, - struct iommufd_ctx *iommufd) +static int vfio_device_first_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + struct iommufd_ctx *iommufd = df->iommufd; int ret; lockdep_assert_held(&device->dev_set->lock); @@ -453,9 +454,11 @@ static int vfio_device_first_open(struct vfio_device *device, return ret; } -static void vfio_device_last_close(struct vfio_device *device, - struct iommufd_ctx *iommufd) +static void vfio_device_last_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + struct iommufd_ctx *iommufd = df->iommufd; + lockdep_assert_held(&device->dev_set->lock); if (device->ops->close_device) @@ -467,15 +470,16 @@ static void vfio_device_last_close(struct vfio_device *device, module_put(device->dev->driver->owner); } -int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd) +int vfio_device_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; int ret = 0; lockdep_assert_held(&device->dev_set->lock); device->open_count++; if (device->open_count == 1) { - ret = vfio_device_first_open(device, iommufd); + ret = vfio_device_first_open(df); if (ret) device->open_count--; } @@ -483,14 +487,15 @@ int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd) return ret; } -void vfio_device_close(struct vfio_device *device, - struct iommufd_ctx *iommufd) +void vfio_device_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + lockdep_assert_held(&device->dev_set->lock); vfio_assert_device_open(device); if (device->open_count == 1) - vfio_device_last_close(device, iommufd); + vfio_device_last_close(df); device->open_count--; } @@ -535,7 +540,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - vfio_device_group_close(device); + vfio_device_group_close(df); vfio_device_put_registration(device); From patchwork Wed Mar 8 13:28:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165804 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4273DC74A44 for ; Wed, 8 Mar 2023 13:29:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 533EE10E5FC; Wed, 8 Mar 2023 13:29:24 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 9598710E5E4; Wed, 8 Mar 2023 13:29:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282159; x=1709818159; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jI5t7+FTFbWU3k6ZtsKZ2cB5vkHyHBAEw/BOV4eh+Oo=; b=ZRiBAoneBqhCfwPUItxQqJ0aDEYAawI9b20HGPasrRj4pzLN37zhxZ8v KmcuUAr+W7LRZ1Kg3PsScCR25oOq0RbEEzJTZzWztECJDwTcDToE/UcnJ 0ElDCOMJ4Nqg20fjT9xgZSVkTrrWG4dkhadyG3sptFpWeeb/+ffcrcO0c l+LHkmzyL8dy1pGoJ5G3HAr2gufiHdovqVORlCVSlgNG4+7LPsQM1pIns vjG7yZ1b/WzijTG/af6MeqnZyCSvclhfLnI/APPLlUXqM5brzkUzPMwOi 4DEdgVmiuYVtZfbBQQ+Z3lWxMZPpDZyVjTxiGReNubb3k9llCjus/fyQ5 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165170" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165170" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:19 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789325" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789325" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:18 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:46 -0800 Message-Id: <20230308132903.465159-8-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 07/24] vfio: Block device access via device fd until device is opened X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Allow the vfio_device file to be in a state where the device FD is opened but the device cannot be used by userspace (i.e. its .open_device() hasn't been called). This inbetween state is not used when the device FD is spawned from the group FD, however when we create the device FD directly by opening a cdev it will be opened in the blocked state. The reason for the inbetween state is that userspace only gets a FD but doesn't gain access permission until binding the FD to an iommufd. So in the blocked state, only the bind operation is allowed. Completing bind will allow user to further access the device. This is implemented by adding a flag in struct vfio_device_file to mark the blocked state and using a simple smp_load_acquire() to obtain the flag value and serialize all the device setup with the thread accessing this device. Following this lockless scheme, it can safely handle the device FD unbound->bound but it cannot handle bound->unbound. To allow this we'd need to add a lock on all the vfio ioctls which seems costly. So once device FD is bound, it remains bound until the FD is closed. Suggested-by: Jason Gunthorpe Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/vfio/group.c | 11 ++++++++++- drivers/vfio/vfio.h | 1 + drivers/vfio/vfio_main.c | 16 ++++++++++++++++ 3 files changed, 27 insertions(+), 1 deletion(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 160a4c891dda..4a220d5bf79b 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -194,9 +194,18 @@ static int vfio_device_group_open(struct vfio_device_file *df) df->iommufd = device->group->iommufd; ret = vfio_device_open(df); - if (ret) + if (ret) { df->iommufd = NULL; + goto out_put_kvm; + } + + /* + * Paired with smp_load_acquire() in vfio_device_fops::ioctl/ + * read/write/mmap + */ + smp_store_release(&df->access_granted, true); +out_put_kvm: if (device->open_count == 0) vfio_device_put_kvm(device); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 7ced404526d9..e60c409868f8 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -18,6 +18,7 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + bool access_granted; spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 8c9b05f540fd..027410e8d4a8 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1114,6 +1114,10 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, struct vfio_device *device = df->device; int ret; + /* Paired with smp_store_release() in vfio_device_group_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + ret = vfio_device_pm_runtime_get(device); if (ret) return ret; @@ -1141,6 +1145,10 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() in vfio_device_group_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->read)) return -EINVAL; @@ -1154,6 +1162,10 @@ static ssize_t vfio_device_fops_write(struct file *filep, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() in vfio_device_group_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->write)) return -EINVAL; @@ -1165,6 +1177,10 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() in vfio_device_group_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->mmap)) return -EINVAL; From patchwork Wed Mar 8 13:28:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165805 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E5E4AC74A4B for ; Wed, 8 Mar 2023 13:29:30 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 2CDE610E5E2; Wed, 8 Mar 2023 13:29:30 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 46C3410E5E4; Wed, 8 Mar 2023 13:29:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282161; x=1709818161; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=eylPr5Mkuo6YZVW89IJnoIAgtKbWBWud+DBAB7nog88=; b=ORf3/3j0UZm5MlqKUBDAZVAJt8atGleuKRSMvDEIFzcuSxVJFDO/iMFm 7lNKeBZqzC5cSMUQdzPW5QZuenziV0mlo6QEYjZEJLHz5VQo4BKpGY4V0 ErYZgxGaV3wtwWvLYUiCV/8rNZggkYV/SpdU2Sy/4x4kcXDpZ6Ea3nGFD DD/CichvqU6bWcQ0151Ovb+jdrJ0TSous0nGkI1Pv9Bs++F1MYJ0ytN7z IG1RBD/Z7hcst2Qu9ZurA4C4pTF9BRDldNHJpfM8J/NAoKyiyAT9L6Rhc 3zH5PAQUA5qbUgp9oBOk83FA2oZfaKMKUKu7aQQe7wUK7rDFToDSNHnzU g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165183" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165183" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:21 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789335" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789335" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:20 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:47 -0800 Message-Id: <20230308132903.465159-9-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 08/24] vfio/pci: Update comment around group_fd get in vfio_pci_ioctl_pci_hot_reset() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" this suits more on what the code does. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe --- drivers/vfio/pci/vfio_pci_core.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index a5ab416cf476..65bbef562268 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1308,9 +1308,8 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, } /* - * For each group_fd, get the group through the vfio external user - * interface and store the group and iommu ID. This ensures the group - * is held across the reset. + * Get the group file for each fd to ensure the group held across + * the reset */ for (file_idx = 0; file_idx < hdr.count; file_idx++) { struct file *file = fget(group_fds[file_idx]); From patchwork Wed Mar 8 13:28:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165806 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 61C54C742A7 for ; Wed, 8 Mar 2023 13:29:31 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6924410E60B; Wed, 8 Mar 2023 13:29:30 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id C4E8F10E5FC; Wed, 8 Mar 2023 13:29:22 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282162; x=1709818162; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4KV5nYt7nyzNQTkdxfsQ1VW0eB5Eh57ig35tZj0LnWI=; b=hqeGGi38gt6LTgRdW9QWioo3TX9p+vvgsry7YvPKe1FTc3m7rX3089DY 1xQse+cArh4mpmNx2VPcqdUO0xIWvZ5N/eSlEciGOdCobl1G0qhhG5TCR PlqlWSGi0Zi9i96b/g/AnmmTNKkyrKfg1PbhYp5o2K3KGRvip/CKDnbKO SqimYZvN8YmaQnR8XyqFDCRaVgLKWLzr7EYtYV0JvJmUCg7Ybc3trZZgf yt0BW7TK8yNdzprc14abuk/5H52D5ajcRoi0CNQhw+2Gl7S9/qj+v5kl3 BlBsJMDdaGT6zuSj3yNmXEI8sufLr4GXuSLetLq2v8PF2Pr5/VX36UjyS g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165199" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165199" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:22 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789347" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789347" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:21 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:48 -0800 Message-Id: <20230308132903.465159-10-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 09/24] vfio/pci: Only need to check opened devices in the dev_set for hot reset X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" If the affected device is not opened by any user, it is not necessary to check its ownership as it will not be opened by any user if a user is hot resetting a device within this dev_set. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian --- drivers/vfio/pci/vfio_pci_core.c | 17 +++++++++++++++-- include/uapi/linux/vfio.h | 8 ++++++++ 2 files changed, 23 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 65bbef562268..f13b093557a9 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -2429,10 +2429,23 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, list_for_each_entry(cur_vma, &dev_set->device_list, vdev.dev_set_list) { /* - * Test whether all the affected devices are contained by the + * Test whether all the affected devices can be reset by the + * user. The affected devices may already been opened or not + * yet. + * + * For the devices not opened yet, user can reset them as it + * reason is that the hot reset is done under the protection + * of the dev_set->lock, and device open is also under this + * lock. During the hot reset, such devices can not be opened + * by other users. + * + * For the devices that have been opened, needs to check the + * ownership. If the user provides a set of group fds, test + * whether all the opened affected devices are contained by the * set of groups provided by the user. */ - if (!vfio_dev_in_groups(cur_vma, groups)) { + if (cur_vma->vdev.open_count && + !vfio_dev_in_groups(cur_vma, groups)) { ret = -EINVAL; goto err_undo; } diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 0552e8dcf0cb..f96e5689cffc 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -673,6 +673,14 @@ struct vfio_pci_hot_reset_info { * VFIO_DEVICE_PCI_HOT_RESET - _IOW(VFIO_TYPE, VFIO_BASE + 13, * struct vfio_pci_hot_reset) * + * Userspace requests hot reset for the devices it uses. Due to the + * underlying topology, multiple devices can be affected in the reset + * while some might be opened by another user. To avoid interference + * the calling user must ensure all affected devices, if opened, are + * owned by itself. + * + * The ownership is proved by an array of group fds. + * * Return: 0 on success, -errno on failure. */ struct vfio_pci_hot_reset { From patchwork Wed Mar 8 13:28:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165808 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F4068C678D5 for ; Wed, 8 Mar 2023 13:29:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DBCAA10E60C; Wed, 8 Mar 2023 13:29:30 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8873310E602; Wed, 8 Mar 2023 13:29:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282164; x=1709818164; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=z989pXr8cKO9ICuEG/SHwdMZNnkhbkrQGmWc0s0dx1c=; b=FXUCyU9sgZqpUUqSL6sTqIePQLNSVw4q3QZ/08nmCdOVYXtTLVOr7dKb WbGzYvnsc5MJSCoU3aUZhXWeD6AHEer998CDYb7MG6vx3K6Cqi9fIBmlV MusTNzJS+DkfIye/MzV3qaYcMJC9+BMHP4q/dHMkiFyEuBWhERMyENNZI gRBkeVSU1lL9RLWWVnY3inqRGcGgtW7NQPElinp8+PORHjyzKc7EdBnU7 9CenZgqaOPcPtaM5E4BXPI0It0sDTnxxgz/9673sDpMMK7Nu/7FNv6NyH U9ikBZLWI+JM4chfemM5trB61uMakRgFcX0T4sskFFuaXoUuXQJDss+wR A==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165208" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165208" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:24 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789357" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789357" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:23 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:49 -0800 Message-Id: <20230308132903.465159-11-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 10/24] vfio/pci: Rename the helpers and data in hot reset path to accept device fd X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" No function change is intended, just to make the helpers and structures to be prepared to accept device fds as proof of device ownership. Signed-off-by: Yi Liu --- drivers/vfio/pci/vfio_pci_core.c | 40 ++++++++++++++++---------------- 1 file changed, 20 insertions(+), 20 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index f13b093557a9..265a0058436c 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -177,10 +177,10 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev) } } -struct vfio_pci_group_info; +struct vfio_pci_user_file_info; static void vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set); static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, - struct vfio_pci_group_info *groups); + struct vfio_pci_user_file_info *user_info); /* * INTx masking requires the ability to disable INTx signaling via PCI_COMMAND @@ -799,7 +799,7 @@ static int vfio_pci_fill_devs(struct pci_dev *pdev, void *data) return 0; } -struct vfio_pci_group_info { +struct vfio_pci_user_file_info { int count; struct file **files; }; @@ -1260,9 +1260,9 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, { unsigned long minsz = offsetofend(struct vfio_pci_hot_reset, count); struct vfio_pci_hot_reset hdr; - int32_t *group_fds; + int32_t *user_fds; struct file **files; - struct vfio_pci_group_info info; + struct vfio_pci_user_file_info info; bool slot = false; int file_idx, count = 0, ret = 0; @@ -1292,17 +1292,17 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, if (!hdr.count || hdr.count > count) return -EINVAL; - group_fds = kcalloc(hdr.count, sizeof(*group_fds), GFP_KERNEL); + user_fds = kcalloc(hdr.count, sizeof(*user_fds), GFP_KERNEL); files = kcalloc(hdr.count, sizeof(*files), GFP_KERNEL); - if (!group_fds || !files) { - kfree(group_fds); + if (!user_fds || !files) { + kfree(user_fds); kfree(files); return -ENOMEM; } - if (copy_from_user(group_fds, arg->group_fds, - hdr.count * sizeof(*group_fds))) { - kfree(group_fds); + if (copy_from_user(user_fds, arg->group_fds, + hdr.count * sizeof(*user_fds))) { + kfree(user_fds); kfree(files); return -EFAULT; } @@ -1312,7 +1312,7 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, * the reset */ for (file_idx = 0; file_idx < hdr.count; file_idx++) { - struct file *file = fget(group_fds[file_idx]); + struct file *file = fget(user_fds[file_idx]); if (!file) { ret = -EBADF; @@ -1329,9 +1329,9 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, files[file_idx] = file; } - kfree(group_fds); + kfree(user_fds); - /* release reference to groups on error */ + /* release reference to user_fds on error */ if (ret) goto hot_reset_release; @@ -2312,13 +2312,13 @@ const struct pci_error_handlers vfio_pci_core_err_handlers = { }; EXPORT_SYMBOL_GPL(vfio_pci_core_err_handlers); -static bool vfio_dev_in_groups(struct vfio_pci_core_device *vdev, - struct vfio_pci_group_info *groups) +static bool vfio_dev_in_user_fds(struct vfio_pci_core_device *vdev, + struct vfio_pci_user_file_info *user_info) { unsigned int i; - for (i = 0; i < groups->count; i++) - if (vfio_file_has_dev(groups->files[i], &vdev->vdev)) + for (i = 0; i < user_info->count; i++) + if (vfio_file_has_dev(user_info->files[i], &vdev->vdev)) return true; return false; } @@ -2398,7 +2398,7 @@ static int vfio_pci_dev_set_pm_runtime_get(struct vfio_device_set *dev_set) * get each memory_lock. */ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, - struct vfio_pci_group_info *groups) + struct vfio_pci_user_file_info *user_info) { struct vfio_pci_core_device *cur_mem; struct vfio_pci_core_device *cur_vma; @@ -2445,7 +2445,7 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, * set of groups provided by the user. */ if (cur_vma->vdev.open_count && - !vfio_dev_in_groups(cur_vma, groups)) { + !vfio_dev_in_user_fds(cur_vma, user_info)) { ret = -EINVAL; goto err_undo; } From patchwork Wed Mar 8 13:28:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165819 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 354B5C742A7 for ; Wed, 8 Mar 2023 13:29:52 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id B6FB410E67F; Wed, 8 Mar 2023 13:29:49 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id F2F4D10E5E1; Wed, 8 Mar 2023 13:29:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282166; x=1709818166; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0L0vMMuWTJTwe95x2UEU2BefzXax/kTZVXcty/RFtNc=; b=TYfbn25KFbts+yju3XD/Svzkp8GK/0WA2MBUPkdZZVHl0mjfZD9C9CuV WFXBUiI1uky1rAYGLYKhlbZPJyhnjNfQ3oOKbuzPNGCQlMb9kk1tVCMyJ efDU+PKw2FaRwI+vZDIRlwmVEf7knlQiZDTM+2wWlbSdvTxQ+IgrlxEMl iwNKvnlPIL3g7L0NURcW18qfijOSMm5QIgAJmIJ98y0k7dc+9sSL+tnyn bCIF8AVfwvi7cZXX2A581sV8s8VnystNDGNsCKvHppLTjnAW510OjDCcy 8Npoa1HFByJbl/2cZqDOYm3HIC/mCQrCLk7dbdTDX+tBM/6pLz7WG7t4n Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165217" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165217" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:25 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789376" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789376" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:24 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:50 -0800 Message-Id: <20230308132903.465159-12-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 11/24] vfio/pci: Accept device fd in VFIO_DEVICE_PCI_HOT_RESET ioctl X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" VFIO PCI device hot reset requires user to provide a set of FDs to prove ownership on the affected devices in the hot reset. Either group fd or device fd can be used. But when user uses vfio device cdev, there is only device fd, hence VFIO_DEVICE_PCI_HOT_RESET needs to be extended to accept device fds. Signed-off-by: Yi Liu --- drivers/vfio/group.c | 15 +----------- drivers/vfio/pci/vfio_pci_core.c | 22 +++++++++++------ drivers/vfio/vfio.h | 1 + drivers/vfio/vfio_main.c | 42 ++++++++++++++++++++++++++++++++ include/linux/vfio.h | 1 + include/uapi/linux/vfio.h | 6 +++-- 6 files changed, 63 insertions(+), 24 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 4a220d5bf79b..6280368eb0bd 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -852,23 +852,10 @@ void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm) spin_unlock(&group->kvm_ref_lock); } -/** - * vfio_file_has_dev - True if the VFIO file is a handle for device - * @file: VFIO file to check - * @device: Device that must be part of the file - * - * Returns true if given file has permission to manipulate the given device. - */ -bool vfio_file_has_dev(struct file *file, struct vfio_device *device) +bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device) { - struct vfio_group *group = vfio_group_from_file(file); - - if (!group) - return false; - return group == device->group; } -EXPORT_SYMBOL_GPL(vfio_file_has_dev); static char *vfio_devnode(const struct device *dev, umode_t *mode) { diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 265a0058436c..123b468ead73 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -1300,7 +1300,7 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, return -ENOMEM; } - if (copy_from_user(user_fds, arg->group_fds, + if (copy_from_user(user_fds, arg->fds, hdr.count * sizeof(*user_fds))) { kfree(user_fds); kfree(files); @@ -1308,8 +1308,8 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, } /* - * Get the group file for each fd to ensure the group held across - * the reset + * Get the file for each fd to ensure the group/device file + * is held across the reset */ for (file_idx = 0; file_idx < hdr.count; file_idx++) { struct file *file = fget(user_fds[file_idx]); @@ -1319,8 +1319,14 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, break; } - /* Ensure the FD is a vfio group FD.*/ - if (!vfio_file_is_group(file)) { + /* + * For vfio group FD, sanitize the file is enough. + * For vfio device FD, needs to ensure it has got the + * access to device, otherwise it cannot be used as + * proof of device ownership. + */ + if (!vfio_file_is_valid(file) || + (!vfio_file_is_group(file) && !vfio_file_has_device_access(file))) { fput(file); ret = -EINVAL; break; @@ -2440,9 +2446,9 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, * by other users. * * For the devices that have been opened, needs to check the - * ownership. If the user provides a set of group fds, test - * whether all the opened affected devices are contained by the - * set of groups provided by the user. + * ownership. If the user provides a set of group/device + * fds, test whether all the opened devices are contained + * by the set of groups/devices provided by the user. */ if (cur_vma->vdev.open_count && !vfio_dev_in_user_fds(cur_vma, user_info)) { diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index e60c409868f8..464263288d16 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -96,6 +96,7 @@ void vfio_device_group_close(struct vfio_device_file *df); struct vfio_group *vfio_group_from_file(struct file *file); bool vfio_group_enforced_coherent(struct vfio_group *group); void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm); +bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device); bool vfio_device_has_container(struct vfio_device *device); int __init vfio_group_init(void); void vfio_group_cleanup(void); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 027410e8d4a8..cf9994a65df3 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1277,6 +1277,48 @@ void vfio_file_set_kvm(struct file *file, struct kvm *kvm) } EXPORT_SYMBOL_GPL(vfio_file_set_kvm); +/** + * vfio_file_has_device_access - True if the file has opened device + * @file: VFIO device file + */ +bool vfio_file_has_device_access(struct file *file) +{ + struct vfio_device_file *df; + + if (vfio_group_from_file(file) || + !vfio_device_from_file(file)) + return false; + + df = file->private_data; + + return READ_ONCE(df->access_granted); +} +EXPORT_SYMBOL_GPL(vfio_file_has_device_access); + +/** + * vfio_file_has_dev - True if the VFIO file is a handle for device + * @file: VFIO file to check + * @device: Device that must be part of the file + * + * Returns true if given file has permission to manipulate the given device. + */ +bool vfio_file_has_dev(struct file *file, struct vfio_device *device) +{ + struct vfio_group *group; + struct vfio_device *vdev; + + group = vfio_group_from_file(file); + if (group) + return vfio_group_has_dev(group, device); + + vdev = vfio_device_from_file(file); + if (device) + return vdev == device; + + return false; +} +EXPORT_SYMBOL_GPL(vfio_file_has_dev); + /* * Sub-module support */ diff --git a/include/linux/vfio.h b/include/linux/vfio.h index b14dcdd0b71f..1c69be2d687e 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -248,6 +248,7 @@ bool vfio_file_is_group(struct file *file); bool vfio_file_is_valid(struct file *file); bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); +bool vfio_file_has_device_access(struct file *file); bool vfio_file_has_dev(struct file *file, struct vfio_device *device); #define VFIO_PIN_PAGES_MAX_ENTRIES (PAGE_SIZE/sizeof(unsigned long)) diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index f96e5689cffc..d80141969cd1 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -679,7 +679,9 @@ struct vfio_pci_hot_reset_info { * the calling user must ensure all affected devices, if opened, are * owned by itself. * - * The ownership is proved by an array of group fds. + * The ownership can be proved by: + * - An array of group fds + * - An array of device fds * * Return: 0 on success, -errno on failure. */ @@ -687,7 +689,7 @@ struct vfio_pci_hot_reset { __u32 argsz; __u32 flags; __u32 count; - __s32 group_fds[]; + __s32 fds[]; }; #define VFIO_DEVICE_PCI_HOT_RESET _IO(VFIO_TYPE, VFIO_BASE + 13) From patchwork Wed Mar 8 13:28:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165809 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 260B2C6FD20 for ; Wed, 8 Mar 2023 13:29:34 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8260410E615; Wed, 8 Mar 2023 13:29:31 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 70C4910E607; Wed, 8 Mar 2023 13:29:27 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282167; x=1709818167; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+Espd273RqmEh9OcwiFTjRTtZDCKwnO6GhCUnMZWHQA=; b=mKn0plrX1takf8S98UE8P1O+ZAA3fuzDAi73tKIf8ZjMwQHHqPjZBuOE MfsIPBAFP7q6UGFAX4gCfQOHv4eJcRNCBXEkX2Iv2m58YAhg5OXvyhHBk SdSx7ooXFWu9V71UGhufDM8L/3MoxBfIfdHTBS/KMmrEVeuNh6tHsFCE1 HjSoAkzBnFEG8wy4sCc2dOV9zQHeHb/N8rgCwPZn1lMSqCLfcclXaSDr+ 0E/INhzGjGVrsDfmXFEbLZL3fyy6/gSdWjuxGUHIQizOH84Nget81UWjP QFZgrNgFeNNHKbYPKtNFiuWcZ8uc+ys3ji+iej+UZjHtD0Nj5G3IcxBIA Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165227" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165227" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:27 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789384" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789384" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:26 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:51 -0800 Message-Id: <20230308132903.465159-13-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 12/24] vfio/pci: Allow passing zero-length fd array in VFIO_DEVICE_PCI_HOT_RESET X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This is another method to issue PCI hot reset for the users that bounds device to a positive iommufd value. In such case, iommufd is a proof of device ownership. By passing a zero-length fd array, user indicates kernel to do ownership check with the bound iommufd. All the opened devices within the affected dev_set should have been bound to the same iommufd. This is simpler and faster as user does not need to pass a set of fds and kernel no need to search the device within the given fds. Suggested-by: Jason Gunthorpe Signed-off-by: Yi Liu Signed-off-by: Yi Liu --- drivers/iommu/iommufd/device.c | 6 +++ drivers/vfio/iommufd.c | 9 ++++ drivers/vfio/pci/vfio_pci_core.c | 92 ++++++++++++++++++++++---------- include/linux/iommufd.h | 1 + include/linux/vfio.h | 3 ++ include/uapi/linux/vfio.h | 5 ++ 6 files changed, 89 insertions(+), 27 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 9087cd8ed3ea..dbcee0d38a48 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -131,6 +131,12 @@ void iommufd_device_unbind(struct iommufd_device *idev) } EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD); +struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev) +{ + return idev->ictx; +} +EXPORT_SYMBOL_NS_GPL(iommufd_device_to_ictx, IOMMUFD); + static int iommufd_device_setup_msi(struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt, phys_addr_t sw_msi_start) diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 768d353cb6fa..30c0da2e11f9 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -69,6 +69,15 @@ void vfio_iommufd_unbind(struct vfio_device *vdev) vdev->ops->unbind_iommufd(vdev); } +struct iommufd_ctx *vfio_iommufd_physical_ctx(struct vfio_device *vdev) +{ + /* Only serve for physical device */ + if (!vdev->iommufd_device) + return NULL; + return iommufd_device_to_ictx(vdev->iommufd_device); +} +EXPORT_SYMBOL_GPL(vfio_iommufd_physical_ctx); + /* * The physical standard ops mean that the iommufd_device is bound to the * physical device vdev->dev that was provided to vfio_init_group_dev(). Drivers diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 123b468ead73..b039fbd5c656 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -180,7 +180,8 @@ static void vfio_pci_probe_mmaps(struct vfio_pci_core_device *vdev) struct vfio_pci_user_file_info; static void vfio_pci_dev_set_try_reset(struct vfio_device_set *dev_set); static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, - struct vfio_pci_user_file_info *user_info); + struct vfio_pci_user_file_info *user_info, + struct iommufd_ctx *iommufd_ctx); /* * INTx masking requires the ability to disable INTx signaling via PCI_COMMAND @@ -1255,29 +1256,17 @@ static int vfio_pci_ioctl_get_pci_hot_reset_info( return ret; } -static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, +static int +vfio_pci_ioctl_pci_hot_reset_user_files(struct vfio_pci_core_device *vdev, + struct vfio_pci_hot_reset *hdr, + bool slot, struct vfio_pci_hot_reset __user *arg) { - unsigned long minsz = offsetofend(struct vfio_pci_hot_reset, count); - struct vfio_pci_hot_reset hdr; int32_t *user_fds; struct file **files; struct vfio_pci_user_file_info info; - bool slot = false; int file_idx, count = 0, ret = 0; - if (copy_from_user(&hdr, arg, minsz)) - return -EFAULT; - - if (hdr.argsz < minsz || hdr.flags) - return -EINVAL; - - /* Can we do a slot or bus reset or neither? */ - if (!pci_probe_reset_slot(vdev->pdev->slot)) - slot = true; - else if (pci_probe_reset_bus(vdev->pdev->bus)) - return -ENODEV; - /* * We can't let userspace give us an arbitrarily large buffer to copy, * so verify how many we think there could be. Note groups can have @@ -1289,11 +1278,11 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, return ret; /* Somewhere between 1 and count is OK */ - if (!hdr.count || hdr.count > count) + if (hdr->count > count) return -EINVAL; - user_fds = kcalloc(hdr.count, sizeof(*user_fds), GFP_KERNEL); - files = kcalloc(hdr.count, sizeof(*files), GFP_KERNEL); + user_fds = kcalloc(hdr->count, sizeof(*user_fds), GFP_KERNEL); + files = kcalloc(hdr->count, sizeof(*files), GFP_KERNEL); if (!user_fds || !files) { kfree(user_fds); kfree(files); @@ -1301,7 +1290,7 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, } if (copy_from_user(user_fds, arg->fds, - hdr.count * sizeof(*user_fds))) { + hdr->count * sizeof(*user_fds))) { kfree(user_fds); kfree(files); return -EFAULT; @@ -1311,7 +1300,7 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, * Get the file for each fd to ensure the group/device file * is held across the reset */ - for (file_idx = 0; file_idx < hdr.count; file_idx++) { + for (file_idx = 0; file_idx < hdr->count; file_idx++) { struct file *file = fget(user_fds[file_idx]); if (!file) { @@ -1341,10 +1330,10 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, if (ret) goto hot_reset_release; - info.count = hdr.count; + info.count = hdr->count; info.files = files; - ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info); + ret = vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, &info, NULL); hot_reset_release: for (file_idx--; file_idx >= 0; file_idx--) @@ -1354,6 +1343,36 @@ static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, return ret; } +static int vfio_pci_ioctl_pci_hot_reset(struct vfio_pci_core_device *vdev, + struct vfio_pci_hot_reset __user *arg) +{ + unsigned long minsz = offsetofend(struct vfio_pci_hot_reset, count); + struct vfio_pci_hot_reset hdr; + struct iommufd_ctx *iommufd; + bool slot = false; + + if (copy_from_user(&hdr, arg, minsz)) + return -EFAULT; + + if (hdr.argsz < minsz || hdr.flags) + return -EINVAL; + + /* Can we do a slot or bus reset or neither? */ + if (!pci_probe_reset_slot(vdev->pdev->slot)) + slot = true; + else if (pci_probe_reset_bus(vdev->pdev->bus)) + return -ENODEV; + + if (hdr.count) + return vfio_pci_ioctl_pci_hot_reset_user_files(vdev, &hdr, slot, arg); + + iommufd = vfio_iommufd_physical_ctx(&vdev->vdev); + if (!iommufd) + return -EINVAL; + + return vfio_pci_dev_set_hot_reset(vdev->vdev.dev_set, NULL, iommufd); +} + static int vfio_pci_ioctl_ioeventfd(struct vfio_pci_core_device *vdev, struct vfio_device_ioeventfd __user *arg) { @@ -2323,6 +2342,9 @@ static bool vfio_dev_in_user_fds(struct vfio_pci_core_device *vdev, { unsigned int i; + if (!user_info) + return false; + for (i = 0; i < user_info->count; i++) if (vfio_file_has_dev(user_info->files[i], &vdev->vdev)) return true; @@ -2398,13 +2420,25 @@ static int vfio_pci_dev_set_pm_runtime_get(struct vfio_device_set *dev_set) return ret; } +static bool vfio_dev_in_iommufd_ctx(struct vfio_pci_core_device *vdev, + struct iommufd_ctx *iommufd_ctx) +{ + struct iommufd_ctx *iommufd = vfio_iommufd_physical_ctx(&vdev->vdev); + + if (!iommufd) + return false; + + return iommufd == iommufd_ctx; +} + /* * We need to get memory_lock for each device, but devices can share mmap_lock, * therefore we need to zap and hold the vma_lock for each device, and only then * get each memory_lock. */ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, - struct vfio_pci_user_file_info *user_info) + struct vfio_pci_user_file_info *user_info, + struct iommufd_ctx *iommufd_ctx) { struct vfio_pci_core_device *cur_mem; struct vfio_pci_core_device *cur_vma; @@ -2448,10 +2482,14 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, * For the devices that have been opened, needs to check the * ownership. If the user provides a set of group/device * fds, test whether all the opened devices are contained - * by the set of groups/devices provided by the user. + * by the set of groups/devices provided by the user. If + * user provides a zero-length array, the ownerhsip check + * is done by checking if all the opened devices are bound + * to the same iommufd_ctx. */ if (cur_vma->vdev.open_count && - !vfio_dev_in_user_fds(cur_vma, user_info)) { + !vfio_dev_in_user_fds(cur_vma, user_info) && + !vfio_dev_in_iommufd_ctx(cur_vma, iommufd_ctx)) { ret = -EINVAL; goto err_undo; } diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 365f11e8e615..7a0d7f2c4237 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -20,6 +20,7 @@ struct file; struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx, struct device *dev, u32 *id); void iommufd_device_unbind(struct iommufd_device *idev); +struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev); int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id); void iommufd_device_detach(struct iommufd_device *idev); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 1c69be2d687e..fc14f8430a10 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -116,6 +116,7 @@ struct vfio_device_ops { int vfio_iommufd_physical_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_physical_unbind(struct vfio_device *vdev); +struct iommufd_ctx *vfio_iommufd_physical_ctx(struct vfio_device *vdev); int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id); int vfio_iommufd_emulated_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); @@ -127,6 +128,8 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id); u32 *out_device_id)) NULL) #define vfio_iommufd_physical_unbind \ ((void (*)(struct vfio_device *vdev)) NULL) +#define vfio_iommufd_physical_ctx \ + ((struct iommufd_ctx * (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_physical_attach_ioas \ ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) #define vfio_iommufd_emulated_bind \ diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index d80141969cd1..382d95455f89 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -682,6 +682,11 @@ struct vfio_pci_hot_reset_info { * The ownership can be proved by: * - An array of group fds * - An array of device fds + * - A zero-length array + * + * In the last case all affected devices which are opened by this user + * must have been bound to a same iommufd_ctx. This approach is only + * available for devices bound to positive iommufd. * * Return: 0 on success, -errno on failure. */ From patchwork Wed Mar 8 13:28:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165820 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4F514C678D5 for ; Wed, 8 Mar 2023 13:29:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 8EB8A10E681; Wed, 8 Mar 2023 13:29:49 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 6B82D10E5E2; Wed, 8 Mar 2023 13:29:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282169; x=1709818169; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2NxJ2YuHrmW/4nZRdgi+pj1OdZn9UVNaWl2X9mKXSHs=; b=YERnDN+piEZIdsAov5hcJXvjIznXMEXly4AX55EZHDoPc93s49VxrG6G P58c6sbVEolr4IkeMf+7FwlTS62E6oRPX41tqUL8BKDLgYNEWSkkhZZpO E0J6TXOMjMO9518hmExxQN1Vt4PK1yXSlG302SxX1cjKgs+bA1C+S/3j8 tdqAb4EUa7v+NXRwSQA7kMSpo/GPGFwQoBZW3n5eZFc30F6YXfg0SzvRw ZSfRy0mW0wiP6NowR/+oLtE7/onTW8rkod3uhYLXrV8j014lys1PM61GY Jqr2e4ieSDQ2GeNDqjdKpAEkl4mdGnaSUY0MImTM0keNRFpuKPyOosp45 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165237" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165237" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:29 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789391" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789391" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:28 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:52 -0800 Message-Id: <20230308132903.465159-14-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 13/24] vfio/iommufd: Split the compat_ioas attach out from vfio_iommufd_bind() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This makes the group code call .bind_iommufd and .attach_ioas in two steps instead of in a single step. This prepares the bind_iommufd and attach_ioas support in the coming cdev path. Signed-off-by: Yi Liu --- drivers/vfio/group.c | 26 ++++++++++----- drivers/vfio/iommufd.c | 75 ++++++++++++++++++++++++++---------------- drivers/vfio/vfio.h | 8 +++++ 3 files changed, 73 insertions(+), 36 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 6280368eb0bd..555d68aefa71 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -177,7 +177,7 @@ static int vfio_device_group_open(struct vfio_device_file *df) mutex_lock(&device->group->group_lock); if (!vfio_group_has_iommu(device->group)) { ret = -EINVAL; - goto out_unlock; + goto err_unlock; } mutex_lock(&device->dev_set->lock); @@ -194,9 +194,14 @@ static int vfio_device_group_open(struct vfio_device_file *df) df->iommufd = device->group->iommufd; ret = vfio_device_open(df); - if (ret) { - df->iommufd = NULL; - goto out_put_kvm; + if (ret) + goto err_put_kvm; + + if (device->group->iommufd) { + ret = vfio_iommufd_attach_compat_ioas(device, + device->group->iommufd); + if (ret) + goto err_close_device; } /* @@ -205,13 +210,18 @@ static int vfio_device_group_open(struct vfio_device_file *df) */ smp_store_release(&df->access_granted, true); -out_put_kvm: + mutex_unlock(&device->dev_set->lock); + mutex_unlock(&device->group->group_lock); + return 0; + +err_close_device: + vfio_device_close(df); +err_put_kvm: + df->iommufd = NULL; if (device->open_count == 0) vfio_device_put_kvm(device); - mutex_unlock(&device->dev_set->lock); - -out_unlock: +err_unlock: mutex_unlock(&device->group->group_lock); return ret; } diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 30c0da2e11f9..8c518f8bd39a 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -10,52 +10,71 @@ MODULE_IMPORT_NS(IOMMUFD); MODULE_IMPORT_NS(IOMMUFD_VFIO); -int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) +static int vfio_iommufd_device_probe_comapt_noiommu(struct vfio_device *vdev, + struct iommufd_ctx *ictx) { u32 ioas_id; + + if (!capable(CAP_SYS_RAWIO)) + return -EPERM; + + /* + * Require no compat ioas to be assigned to proceed. The basic + * statement is that the user cannot have done something that + * implies they expected translation to exist + */ + if (!iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id)) + return -EPERM; + return 0; +} + +int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) +{ u32 device_id; int ret; lockdep_assert_held(&vdev->dev_set->lock); if (vfio_device_is_noiommu(vdev)) { - if (!capable(CAP_SYS_RAWIO)) - return -EPERM; - - /* - * Require no compat ioas to be assigned to proceed. The basic - * statement is that the user cannot have done something that - * implies they expected translation to exist - */ - if (!iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id)) - return -EPERM; - return 0; + ret = vfio_iommufd_device_probe_comapt_noiommu(vdev, ictx); + if (ret) + return ret; } if (WARN_ON(!vdev->ops->bind_iommufd)) return -ENODEV; - ret = vdev->ops->bind_iommufd(vdev, ictx, &device_id); - if (ret) - return ret; + /* The legacy path has no way to return the device id */ + return vdev->ops->bind_iommufd(vdev, ictx, &device_id); +} - ret = iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id); - if (ret) - goto err_unbind; - ret = vdev->ops->attach_ioas(vdev, &ioas_id); - if (ret) - goto err_unbind; +int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, + struct iommufd_ctx *ictx) +{ + u32 ioas_id; + int ret; + + lockdep_assert_held(&vdev->dev_set->lock); /* - * The legacy path has no way to return the device id or the selected - * pt_id + * If the driver doesn't provide this op then it means the device does + * not do DMA at all. So nothing to do. */ - return 0; + if (WARN_ON(!vdev->ops->bind_iommufd)) + return -ENODEV; -err_unbind: - if (vdev->ops->unbind_iommufd) - vdev->ops->unbind_iommufd(vdev); - return ret; + if (vfio_device_is_noiommu(vdev)) { + if (WARN_ON(vfio_iommufd_device_probe_comapt_noiommu(vdev, ictx))) + return -EINVAL; + return 0; + } + + ret = iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id); + if (ret) + return ret; + + /* The legacy path has no way to return the selected pt_id */ + return vdev->ops->attach_ioas(vdev, &ioas_id); } void vfio_iommufd_unbind(struct vfio_device *vdev) diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 464263288d16..3356321805e9 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -232,6 +232,8 @@ static inline void vfio_container_cleanup(void) #if IS_ENABLED(CONFIG_IOMMUFD) int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx); void vfio_iommufd_unbind(struct vfio_device *device); +int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, + struct iommufd_ctx *ictx); #else static inline int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx) @@ -242,6 +244,12 @@ static inline int vfio_iommufd_bind(struct vfio_device *device, static inline void vfio_iommufd_unbind(struct vfio_device *device) { } + +static inline int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, + struct iommufd_ctx *ictx) +{ + return -EOPNOTSUPP; +} #endif #if IS_ENABLED(CONFIG_VFIO_VIRQFD) From patchwork Wed Mar 8 13:28:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165810 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 21CEDC64EC4 for ; Wed, 8 Mar 2023 13:29:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 57E5810E610; Wed, 8 Mar 2023 13:29:37 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 29C7E10E610; Wed, 8 Mar 2023 13:29:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282171; x=1709818171; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p0+sdiJRsT1CFvbgXCytRESDpwjpnnBYhpkQpIa3C7k=; b=GEnKuiKHV06HxxHZA4xWnCr14sqisaQEPtSRsCJaNDuyrg8WNmnoU6PL W6hbWS6B+wIVliGqBu6n9poGf3BdvmfPp5k9nrIulRIWUDZCkHSBnQ67e AMYN4ubm+F++H6k3hs8S/S2n57na9tXQhelDmfmlFPpyplNoaeHGl2HKB Cy3qn3VbcFEvP2kfu/HEz+GL7G8ie2IL48Y5AYu6lOjNovcr+f57Pjfkn IQuZjGPhwYCCmwShNfdRyTT6OukXrB5ZBZHBLI6+YE1ciKc22zuMdWhh2 TvKvxx3qKoVsvDkQdadivF7Stgbw8psNW+xQTfFTN5s1AMMFr+ND83foj g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165247" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165247" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:30 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789400" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789400" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:29 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:53 -0800 Message-Id: <20230308132903.465159-15-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 14/24] vfio: Add cdev_device_open_cnt to vfio_group X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" for counting the devices that are opened via the cdev path. This count is increased and decreased by the cdev path. The group path checks it to achieve exclusion with the cdev path. With this, only one path (group path or cdev path) will claim DMA ownership. This avoids scenarios in which devices within the same group may be opened via different paths. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/vfio/group.c | 33 +++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 3 +++ 2 files changed, 36 insertions(+) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 555d68aefa71..63aafd59278d 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -392,6 +392,33 @@ static long vfio_group_fops_unl_ioctl(struct file *filep, } } +int vfio_device_block_group(struct vfio_device *device) +{ + struct vfio_group *group = device->group; + int ret = 0; + + mutex_lock(&group->group_lock); + if (group->opened_file) { + ret = -EBUSY; + goto out_unlock; + } + + group->cdev_device_open_cnt++; + +out_unlock: + mutex_unlock(&group->group_lock); + return ret; +} + +void vfio_device_unblock_group(struct vfio_device *device) +{ + struct vfio_group *group = device->group; + + mutex_lock(&group->group_lock); + group->cdev_device_open_cnt--; + mutex_unlock(&group->group_lock); +} + static int vfio_group_fops_open(struct inode *inode, struct file *filep) { struct vfio_group *group = @@ -414,6 +441,11 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep) goto out_unlock; } + if (group->cdev_device_open_cnt) { + ret = -EBUSY; + goto out_unlock; + } + /* * Do we need multiple instances of the group open? Seems not. */ @@ -488,6 +520,7 @@ static void vfio_group_release(struct device *dev) mutex_destroy(&group->device_lock); mutex_destroy(&group->group_lock); WARN_ON(group->iommu_group); + WARN_ON(group->cdev_device_open_cnt); ida_free(&vfio.group_ida, MINOR(group->dev.devt)); kfree(group); } diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 3356321805e9..e1b8763e443e 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -83,8 +83,11 @@ struct vfio_group { struct blocking_notifier_head notifier; struct iommufd_ctx *iommufd; spinlock_t kvm_ref_lock; + unsigned int cdev_device_open_cnt; }; +int vfio_device_block_group(struct vfio_device *device); +void vfio_device_unblock_group(struct vfio_device *device); int vfio_device_set_group(struct vfio_device *device, enum vfio_group_type type); void vfio_device_remove_group(struct vfio_device *device); From patchwork Wed Mar 8 13:28:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165812 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 238B8C64EC4 for ; Wed, 8 Mar 2023 13:29:41 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 00A2D10E603; Wed, 8 Mar 2023 13:29:40 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3670710E61C; Wed, 8 Mar 2023 13:29:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282173; x=1709818173; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=voMqIPw9OlY9tCQvPK26CPNnMsuw9NBE4JXA1W35rMo=; b=jCTXLuqXtU5JO2KCkyjrXen/3V15/csr+hfXc8riHKPUGbEUAWm2ku53 7zc6vYmBdonE3W2RUnHwsBeKOBv3rGAK8jCrxYTXC3cMZXcfrT78tBrLq +QRlRk4C+/er12QcWCdyvKLhoJb24P1nrjb2R+IJ0okm4rgcO3hnk8YGS jckzrde820ZbBTFKkjcCM7eRfQ5xi/ne5cFyIKrjv4Tyv1Gariu8T0cJ/ IIKYh+pH2y9CwofOhNfIKKeB2fyjSqpD8L8JEVc/IwHhoeot7cTUjhEfl mNv5LDCfUSwcbW/VexELN3v6pgc5VA0UQGoq9BMIBtZuN4GCldbjm6dRN g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165266" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165266" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:32 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789410" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789410" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:31 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:54 -0800 Message-Id: <20230308132903.465159-16-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 15/24] vfio: Make vfio_device_open() single open for device cdev path X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" VFIO group has historically allowed multi-open of the device FD. This was made secure because the "open" was executed via an ioctl to the group FD which is itself only single open. However, no known use of multiple device FDs today. It is kind of a strange thing to do because new device FDs can naturally be created via dup(). When we implement the new device uAPI (only used in cdev path) there is no natural way to allow the device itself from being multi-opened in a secure manner. Without the group FD we cannot prove the security context of the opener. Thus, when moving to the new uAPI we block the ability to multi-open the device. Old group path still allows it. vfio_device_open() needs to sustain both the legacy behavior i.e. multi-open in the group path and the new behavior i.e. single-open in the cdev path. This mixture leads to storing a vfio_group pointer in struct vfio_device_file. Only the group path would set it, cdev path never sets it. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian --- drivers/vfio/group.c | 2 ++ drivers/vfio/vfio.h | 2 ++ drivers/vfio/vfio_main.c | 7 +++++++ 3 files changed, 11 insertions(+) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 63aafd59278d..f067a6a7bbd2 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -254,6 +254,8 @@ static struct file *vfio_device_open_file(struct vfio_device *device) goto err_out; } + df->group = device->group; + ret = vfio_device_group_open(df); if (ret) goto err_free; diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index e1b8763e443e..11397cc95e0b 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -18,6 +18,8 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + struct vfio_group *group; + bool access_granted; spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index cf9994a65df3..aa31aae33407 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -477,6 +477,13 @@ int vfio_device_open(struct vfio_device_file *df) lockdep_assert_held(&device->dev_set->lock); + /* + * Only group path supports multiple device open. cdev path + * doesn't have a secure way for it. + */ + if (device->open_count != 0 && !df->group) + return -EINVAL; + device->open_count++; if (device->open_count == 1) { ret = vfio_device_first_open(df); From patchwork Wed Mar 8 13:28:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165811 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E828CC742A7 for ; Wed, 8 Mar 2023 13:29:38 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E7F1210E628; Wed, 8 Mar 2023 13:29:37 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id CA22810E607; Wed, 8 Mar 2023 13:29:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282174; x=1709818174; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XeiR8haIZIHcl7wUZyWGU+A/6RExf6z0n7ntzoVlizo=; b=hIcBhEX1wI1kCmMZxBg0I0GatyFdx1mlDqCsLQcUE5icSYuTkQRH4Ofo IDZLpMBx2zmqDSkbN2nOzeFq1CL8yAK5b3+qwF2DtJdoI1pM19Vj5W+gU 1U4OmUsdcTdiVhZdltDszxrlL7uG/SSaaZPQ4it3kyDu7yY6JNmE0oP2D 9+Myo3hUFYQQZEH8x0hwtMUCMBCjvB7b4se8Uc1MJgQuuV+fVSFQ0Qi/o g4QUA4m1lrmPqukAcL+f7XQ0GYvi4Xhb2r5cqL8kW7dsisQJ5eQkEAj2R FTQ7x0INZEyUkRg7r8xkZq50xf1nnKtFN7p3dvROjh6gg3XSYADUIqmFb Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165279" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165279" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:34 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789419" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789419" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:33 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:55 -0800 Message-Id: <20230308132903.465159-17-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 16/24] vfio: Make vfio_device_first_open() to cover the noiommu mode in cdev path X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" vfio_device_first_open() now covers the below two cases: 1) user uses iommufd (e.g. the group path in iommufd compat mode); 2) user uses container (e.g. the group path in legacy mode); The above two paths have their own noiommu mode support accordingly. The cdev path also uses iommufd, so for the case user provides a valid iommufd, this helper is able to support it. But for noiommu mode, the cdev path just provides a NULL iommufd. So this needs to be able to cover it. As there is no special things to do for the cdev path in noiommu mode, it can be covered by simply differentiate it from the container case. If user is not using iommufd nor container, it is the noiommu mode. Signed-off-by: Yi Liu --- drivers/vfio/group.c | 15 +++++++++++++++ drivers/vfio/vfio.h | 1 + drivers/vfio/vfio_main.c | 21 +++++++++++++++++---- 3 files changed, 33 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index f067a6a7bbd2..51c027134814 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -780,6 +780,21 @@ void vfio_device_group_unregister(struct vfio_device *device) mutex_unlock(&device->group->device_lock); } +/* + * This shall be used without group lock as group and group->container + * should be fixed before group is set to df->group. + */ +bool vfio_device_group_uses_container(struct vfio_device_file *df) +{ + /* + * Use the df->group instead of the df->device->group as no + * lock is acquired here. + */ + if (WARN_ON(!df->group)) + return false; + return READ_ONCE(df->group->container); +} + int vfio_device_group_use_iommu(struct vfio_device *device) { struct vfio_group *group = device->group; diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 11397cc95e0b..615ffd58562b 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -95,6 +95,7 @@ int vfio_device_set_group(struct vfio_device *device, void vfio_device_remove_group(struct vfio_device *device); void vfio_device_group_register(struct vfio_device *device); void vfio_device_group_unregister(struct vfio_device *device); +bool vfio_device_group_uses_container(struct vfio_device_file *df); int vfio_device_group_use_iommu(struct vfio_device *device); void vfio_device_group_unuse_iommu(struct vfio_device *device); void vfio_device_group_close(struct vfio_device_file *df); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index aa31aae33407..8c73df1a400e 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -423,16 +423,29 @@ static int vfio_device_first_open(struct vfio_device_file *df) { struct vfio_device *device = df->device; struct iommufd_ctx *iommufd = df->iommufd; - int ret; + int ret = 0; lockdep_assert_held(&device->dev_set->lock); if (!try_module_get(device->dev->driver->owner)) return -ENODEV; + /* + * The handling here depends on what the user is using. + * + * If user uses iommufd in the group compat mode or the + * cdev path, call vfio_iommufd_bind(). + * + * If user uses container in the group legacy mode, call + * vfio_device_group_use_iommu(). + * + * If user doesn't use iommufd nor container, this is + * the noiommufd mode in the cdev path, nothing needs + * to be done here just go ahead to open device. + */ if (iommufd) ret = vfio_iommufd_bind(device, iommufd); - else + else if (vfio_device_group_uses_container(df)) ret = vfio_device_group_use_iommu(device); if (ret) goto err_module_put; @@ -447,7 +460,7 @@ static int vfio_device_first_open(struct vfio_device_file *df) err_unuse_iommu: if (iommufd) vfio_iommufd_unbind(device); - else + else if (vfio_device_group_uses_container(df)) vfio_device_group_unuse_iommu(device); err_module_put: module_put(device->dev->driver->owner); @@ -465,7 +478,7 @@ static void vfio_device_last_close(struct vfio_device_file *df) device->ops->close_device(device); if (iommufd) vfio_iommufd_unbind(device); - else + else if (vfio_device_group_uses_container(df)) vfio_device_group_unuse_iommu(device); module_put(device->dev->driver->owner); } From patchwork Wed Mar 8 13:28:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165813 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CEF7DC64EC4 for ; Wed, 8 Mar 2023 13:29:43 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4006910E630; Wed, 8 Mar 2023 13:29:43 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5A25D10E624; Wed, 8 Mar 2023 13:29:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282176; x=1709818176; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VuqafYByKK/jWkKRq463YXQFQwOvpfEb1z0GI9PVpxY=; b=Dh5NfmHFutfwxWeEwtLnvQ/95VB75apgQtO1vnWaoW7RyzVm7YbORkvp zh/mtOYWkh4ZSTM0fpEoZVEoU/iZx3x/HHwxdwXKfgyqaD+ZecsyVLPYQ BNrQSgwj9cbGZLWMr9lJIrDyzl6Po+5kdJkJ5gCvfjvDy4Xnc1Z01sQIv njpOJwm3aHnTuJa+GesVLpwAb7rAihZsmrVtUoHqRI48Mz/vFM6oSyITK Jok2BUA8/7C5ubh+CH9Bam2D89EuTbbLDew0RlcgqwawclbBXnvqxV2tT WK/3ydefx8P7fFfly1lA5QMFIEEhfWXSpstJoWa1qvmTxHQydIY3UVL7w w==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165295" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165295" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:36 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789423" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789423" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:35 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:56 -0800 Message-Id: <20230308132903.465159-18-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 17/24] vfio-iommufd: Make vfio_iommufd_bind() selectively return devid X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" bind_iommufd() will generate an ID to represent this bond, it is needed by userspace for further usage. devid is stored in vfio_device_file to avoid passing devid pointer in multiple places. Signed-off-by: Yi Liu --- drivers/vfio/iommufd.c | 13 ++++++++++--- drivers/vfio/vfio.h | 6 ++++-- drivers/vfio/vfio_main.c | 8 +++++--- include/linux/iommufd.h | 2 ++ 4 files changed, 21 insertions(+), 8 deletions(-) diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 8c518f8bd39a..b2cdb6b2b37f 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -28,7 +28,8 @@ static int vfio_iommufd_device_probe_comapt_noiommu(struct vfio_device *vdev, return 0; } -int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) +int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, + u32 *devid) { u32 device_id; int ret; @@ -44,8 +45,14 @@ int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) if (WARN_ON(!vdev->ops->bind_iommufd)) return -ENODEV; - /* The legacy path has no way to return the device id */ - return vdev->ops->bind_iommufd(vdev, ictx, &device_id); + ret = vdev->ops->bind_iommufd(vdev, ictx, &device_id); + if (ret) + return ret; + + if (devid) + *devid = device_id; + + return 0; } int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 615ffd58562b..98cee2f765e9 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -24,6 +24,7 @@ struct vfio_device_file { spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ + u32 devid; /* only valid when iommufd is valid */ }; void vfio_device_put_registration(struct vfio_device *device); @@ -236,13 +237,14 @@ static inline void vfio_container_cleanup(void) #endif #if IS_ENABLED(CONFIG_IOMMUFD) -int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx); +int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx, + u32 *devid); void vfio_iommufd_unbind(struct vfio_device *device); int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, struct iommufd_ctx *ictx); #else static inline int vfio_iommufd_bind(struct vfio_device *device, - struct iommufd_ctx *ictx) + struct iommufd_ctx *ictx, u32 *devid) { return -EOPNOTSUPP; } diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 8c73df1a400e..a66ca138059b 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -444,7 +444,7 @@ static int vfio_device_first_open(struct vfio_device_file *df) * to be done here just go ahead to open device. */ if (iommufd) - ret = vfio_iommufd_bind(device, iommufd); + ret = vfio_iommufd_bind(device, iommufd, &df->devid); else if (vfio_device_group_uses_container(df)) ret = vfio_device_group_use_iommu(device); if (ret) @@ -476,10 +476,12 @@ static void vfio_device_last_close(struct vfio_device_file *df) if (device->ops->close_device) device->ops->close_device(device); - if (iommufd) + if (iommufd) { vfio_iommufd_unbind(device); - else if (vfio_device_group_uses_container(df)) + df->devid = IOMMUFD_INVALID_ID; + } else if (vfio_device_group_uses_container(df)) { vfio_device_group_unuse_iommu(device); + } module_put(device->dev->driver->owner); } diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 7a0d7f2c4237..48b9bfab9891 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -10,6 +10,8 @@ #include #include +#define IOMMUFD_INVALID_ID 0 + struct device; struct iommufd_device; struct page; From patchwork Wed Mar 8 13:28:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165815 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 83EDFC742A7 for ; Wed, 8 Mar 2023 13:29:45 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C941710E677; Wed, 8 Mar 2023 13:29:44 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id EBCFE10E62E; Wed, 8 Mar 2023 13:29:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282178; x=1709818178; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pRPKEgYL1tnVGtbrz2UmjuZ3sJu66ZSJl+fj1oA+j90=; b=iybV04bGAZ4xFzu9hOt9fSoSoXQIKaHBMhEh3EKVpHntLWkPoDmicgtE 3v4sFUK72+7p/acmOAH1mj0o4NnlR1kxGdCEbaAtrr0HZs9WAco3+oIIb 0Cy+jQ00n/Yh69kisj0na2E1+xoHZEsfIlkuYRm4S+Xy61GrS/VhbvXzu hF6RwOoX9vNMDaY0SO3PuzhLjkqswBbuBF7uxPjitsnQrq0pimmE6bkD9 LRuhGkMKK0I/oICs1HOhm1ahkoewYwCCd0qDXbSH4vfeHi6CqGUqJfjYK d7RkVa45OYXYIy+DwzswZd/BTD/OoHAL2f/OqdecvMnZOyZEftPHB7IfU Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165302" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165302" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:37 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789431" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789431" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:36 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:57 -0800 Message-Id: <20230308132903.465159-19-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 18/24] vfio-iommufd: Add detach_ioas support for physical VFIO devices X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" this prepares for adding DETACH ioctl for physical VFIO devices. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- Documentation/driver-api/vfio.rst | 8 +++++--- drivers/vfio/fsl-mc/vfio_fsl_mc.c | 1 + drivers/vfio/iommufd.c | 20 +++++++++++++++++++ .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 2 ++ drivers/vfio/pci/mlx5/main.c | 1 + drivers/vfio/pci/vfio_pci.c | 1 + drivers/vfio/platform/vfio_amba.c | 1 + drivers/vfio/platform/vfio_platform.c | 1 + drivers/vfio/vfio_main.c | 3 ++- include/linux/vfio.h | 8 +++++++- 10 files changed, 41 insertions(+), 5 deletions(-) diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst index 50b690f7f663..44527420f20d 100644 --- a/Documentation/driver-api/vfio.rst +++ b/Documentation/driver-api/vfio.rst @@ -279,6 +279,7 @@ similar to a file operations structure:: struct iommufd_ctx *ictx, u32 *out_device_id); void (*unbind_iommufd)(struct vfio_device *vdev); int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id); + void (*detach_ioas)(struct vfio_device *vdev); int (*open_device)(struct vfio_device *vdev); void (*close_device)(struct vfio_device *vdev); ssize_t (*read)(struct vfio_device *vdev, char __user *buf, @@ -315,9 +316,10 @@ container_of(). - The [un]bind_iommufd callbacks are issued when the device is bound to and unbound from iommufd. - - The attach_ioas callback is issued when the device is attached to an - IOAS managed by the bound iommufd. The attached IOAS is automatically - detached when the device is unbound from iommufd. + - The [de]attach_ioas callback is issued when the device is attached to + and detached from an IOAS managed by the bound iommufd. However, the + attached IOAS can also be automatically detached when the device is + unbound from iommufd. - The read/write/mmap callbacks implement the device region access defined by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl. diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c index c89a047a4cd8..d540cf683d93 100644 --- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c @@ -594,6 +594,7 @@ static const struct vfio_device_ops vfio_fsl_mc_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static struct fsl_mc_driver vfio_fsl_mc_driver = { diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index b2cdb6b2b37f..c06494e322f9 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -139,6 +139,14 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id) { int rc; + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_device)) + return -EINVAL; + + if (vdev->iommufd_attached) + return -EBUSY; + rc = iommufd_device_attach(vdev->iommufd_device, pt_id); if (rc) return rc; @@ -147,6 +155,18 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id) } EXPORT_SYMBOL_GPL(vfio_iommufd_physical_attach_ioas); +void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev) +{ + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_device || !vdev->iommufd_attached)) + return; + + iommufd_device_detach(vdev->iommufd_device); + vdev->iommufd_attached = false; +} +EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas); + /* * The emulated standard ops mean that vfio_device is going to use the * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c index a117eaf21c14..b2f9778c8366 100644 --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c @@ -1373,6 +1373,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static const struct vfio_device_ops hisi_acc_vfio_pci_ops = { @@ -1391,6 +1392,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c index e897537a9e8a..6fc3410989eb 100644 --- a/drivers/vfio/pci/mlx5/main.c +++ b/drivers/vfio/pci/mlx5/main.c @@ -1326,6 +1326,7 @@ static const struct vfio_device_ops mlx5vf_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int mlx5vf_pci_probe(struct pci_dev *pdev, diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 29091ee2e984..cb5b7f865d58 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -141,6 +141,7 @@ static const struct vfio_device_ops vfio_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c index 83fe54015595..6464b3939ebc 100644 --- a/drivers/vfio/platform/vfio_amba.c +++ b/drivers/vfio/platform/vfio_amba.c @@ -119,6 +119,7 @@ static const struct vfio_device_ops vfio_amba_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static const struct amba_id pl330_ids[] = { diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c index 22a1efca32a8..8cf22fa65baa 100644 --- a/drivers/vfio/platform/vfio_platform.c +++ b/drivers/vfio/platform/vfio_platform.c @@ -108,6 +108,7 @@ static const struct vfio_device_ops vfio_platform_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static struct platform_driver vfio_platform_driver = { diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index a66ca138059b..da5cc4c5c63f 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -258,7 +258,8 @@ static int __vfio_register_dev(struct vfio_device *device, if (WARN_ON(IS_ENABLED(CONFIG_IOMMUFD) && (!device->ops->bind_iommufd || !device->ops->unbind_iommufd || - !device->ops->attach_ioas))) + !device->ops->attach_ioas || + !device->ops->detach_ioas))) return -EINVAL; /* diff --git a/include/linux/vfio.h b/include/linux/vfio.h index fc14f8430a10..8982e48a30e2 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -72,7 +72,9 @@ struct vfio_device { * @bind_iommufd: Called when binding the device to an iommufd * @unbind_iommufd: Opposite of bind_iommufd * @attach_ioas: Called when attaching device to an IOAS/HWPT managed by the - * bound iommufd. Undo in unbind_iommufd. + * bound iommufd. Undo in unbind_iommufd if @detach_ioas is not + * called. + * @detach_ioas: Opposite of attach_ioas * @open_device: Called when the first file descriptor is opened for this device * @close_device: Opposite of open_device * @read: Perform read(2) on device file descriptor @@ -96,6 +98,7 @@ struct vfio_device_ops { struct iommufd_ctx *ictx, u32 *out_device_id); void (*unbind_iommufd)(struct vfio_device *vdev); int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id); + void (*detach_ioas)(struct vfio_device *vdev); int (*open_device)(struct vfio_device *vdev); void (*close_device)(struct vfio_device *vdev); ssize_t (*read)(struct vfio_device *vdev, char __user *buf, @@ -118,6 +121,7 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev, void vfio_iommufd_physical_unbind(struct vfio_device *vdev); struct iommufd_ctx *vfio_iommufd_physical_ctx(struct vfio_device *vdev); int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id); +void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev); int vfio_iommufd_emulated_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_emulated_unbind(struct vfio_device *vdev); @@ -132,6 +136,8 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id); ((struct iommufd_ctx * (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_physical_attach_ioas \ ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) +#define vfio_iommufd_physical_detach_ioas \ + ((void (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_emulated_bind \ ((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \ u32 *out_device_id)) NULL) From patchwork Wed Mar 8 13:28:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165814 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D3016C678D5 for ; Wed, 8 Mar 2023 13:29:44 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1DCE610E679; Wed, 8 Mar 2023 13:29:44 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5EB5610E62E; Wed, 8 Mar 2023 13:29:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282179; x=1709818179; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fGDMbqi2Ir5gzZxjKitGWx0WJmF41WECcS8Va9zoPN4=; b=f8fdtnZYO30wn0ACrMgxDvKJ+Dz6MGfBIgSD+AROxCXXhK0DsGjxQD4G 1I270Y4DjhjPKuBglmW8vDa/+/QGQkBd4Yco3HhulPNrP0My9QA1DJsJZ uA95ELHCgX1f8D7BBBss+mqPMsL6Fv3/hafc1WWRMe8vKCeaWS5D0aoKM PZVX2eYB44EplZILNxvx4EtsZSRQ7RvxklK64jBNo/1R6t5DivEzjqapD x/Glp43kBFpQbzN4WDCU6p3cixnTtEqcTuup6qY4CQSXjhwfbfyC0cdQQ sUkWHfCqFbM9BPaEv0MZcF9W9nDUjVHhVAgA11yLurgmmztZK7FZZCGm7 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165316" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165316" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:39 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789440" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789440" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:38 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:58 -0800 Message-Id: <20230308132903.465159-20-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 19/24] vfio-iommufd: Add detach_ioas support for emulated VFIO devices X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" this prepares for adding DETACH ioctl for emulated VFIO devices. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/gpu/drm/i915/gvt/kvmgt.c | 1 + drivers/s390/cio/vfio_ccw_ops.c | 1 + drivers/s390/crypto/vfio_ap_ops.c | 1 + drivers/vfio/iommufd.c | 14 +++++++++++++- include/linux/vfio.h | 3 +++ 5 files changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index de675d799c7d..9cd9e9da60dd 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1474,6 +1474,7 @@ static const struct vfio_device_ops intel_vgpu_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static int intel_vgpu_probe(struct mdev_device *mdev) diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c index 5b53b94f13c7..cba4971618ff 100644 --- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -632,6 +632,7 @@ static const struct vfio_device_ops vfio_ccw_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; struct mdev_driver vfio_ccw_mdev_driver = { diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 72e10abb103a..9902e62e7a17 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1844,6 +1844,7 @@ static const struct vfio_device_ops vfio_ap_matrix_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static struct mdev_driver vfio_ap_matrix_driver = { diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index c06494e322f9..8a9457d0a33c 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -218,8 +218,20 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id) { lockdep_assert_held(&vdev->dev_set->lock); - if (!vdev->iommufd_access) + if (WARN_ON(!vdev->iommufd_access)) return -ENOENT; return iommufd_access_set_ioas(vdev->iommufd_access, *pt_id); } EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_attach_ioas); + +void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev) +{ + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_access)) + return; + + iommufd_access_destroy(vdev->iommufd_access); + vdev->iommufd_access = NULL; +} +EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_detach_ioas); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 8982e48a30e2..5a1c5b6f1a63 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -126,6 +126,7 @@ int vfio_iommufd_emulated_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_emulated_unbind(struct vfio_device *vdev); int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id); +void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev); #else #define vfio_iommufd_physical_bind \ ((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \ @@ -145,6 +146,8 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id); ((void (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_emulated_attach_ioas \ ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) +#define vfio_iommufd_emulated_detach_ioas \ + ((void (*)(struct vfio_device *vdev)) NULL) #endif /** From patchwork Wed Mar 8 13:28:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165818 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D8F50C6FD20 for ; Wed, 8 Mar 2023 13:29:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6914D10E678; Wed, 8 Mar 2023 13:29:49 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0C35F10E62E; Wed, 8 Mar 2023 13:29:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282181; x=1709818181; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dzJYS+VtE+jQFaZ2CI9x4Pdab307ftN0Unh+/w+lcnc=; b=ORRKautpLbQA6YiQpBNDDAxMubXrTQaomh8f13lG4OJxct9+JYFnnUsc vRIHTa/5+EAT7alOaVfpahWzOaujWlyUd0aXqo+3yKqQNY5qJlUIwGZ31 IoQMx3+sFkKuIs8zvNxggd2rWNVaH1WhIIPJ2/49ml7QIjOs71FTfzU9t yY60mqKdPLEWQByKD7s/UwEwUpu1Huho66hyB1/LnvJhBD+HDzSUbhp5m qYdIaQ/b1c0YgO1ZK49jSmV9xGUC7CpEc+JD1/6Pl1wsi9K03+oSw5Pp6 pJfVe8fzjCjv/Wziegv9jSp0esPGfZNfJOeUBZoU0rnuYIwI50WxsrL+Y g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165326" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165326" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:40 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789449" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789449" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:39 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:28:59 -0800 Message-Id: <20230308132903.465159-21-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 20/24] vfio: Add cdev for vfio_device X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This allows user to directly open a vfio device w/o using the legacy container/group interface, as a prerequisite for supporting new iommu features like nested translation. The device fd opened in this manner doesn't have the capability to access the device as the fops open() doesn't open the device until the successful BIND_IOMMUFD which be added in next patch. With this patch, devices registered to vfio core have both group and device interface created. - group interface : /dev/vfio/$groupID - device interface: /dev/vfio/devices/vfioX (X is the minor number and is unique across devices) Given a vfio device the user can identify the matching vfioX by checking the sysfs path of the device. Take PCI device (0000:6a:01.0) for example, /sys/bus/pci/devices/0000\:6a\:01.0/vfio-dev/vfio0/dev contains the major:minor of the matching vfioX. Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat that the major:minor matches. The vfio_device cdev logic in this patch: *) __vfio_register_dev() path ends up doing cdev_device_add() for each vfio_device if VFIO_DEVICE_CDEV configured. *) vfio_unregister_group_dev() path does cdev_device_del(); Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/vfio/Kconfig | 11 +++++++ drivers/vfio/Makefile | 1 + drivers/vfio/device_cdev.c | 62 ++++++++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 46 ++++++++++++++++++++++++++++ drivers/vfio/vfio_main.c | 34 ++++++++++++++++----- include/linux/vfio.h | 4 +++ 6 files changed, 151 insertions(+), 7 deletions(-) create mode 100644 drivers/vfio/device_cdev.c diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index 89e06c981e43..e2105b4dac2d 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -12,6 +12,17 @@ menuconfig VFIO If you don't know what to do here, say N. if VFIO +config VFIO_DEVICE_CDEV + bool "Support for the VFIO cdev /dev/vfio/devices/vfioX" + depends on IOMMUFD + help + The VFIO device cdev is another way for userspace to get device + access. Userspace gets device fd by opening device cdev under + /dev/vfio/devices/vfioX, and then bind the device fd with an iommufd + to set up secure DMA context for device access. + + If you don't know what to do here, say N. + config VFIO_CONTAINER bool "Support for the VFIO container /dev/vfio/vfio" select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64) diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 70e7dcb302ef..245394aeb94b 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_VFIO) += vfio.o vfio-y += vfio_main.o \ group.o \ iova_bitmap.o +vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o vfio-$(CONFIG_IOMMUFD) += iommufd.o vfio-$(CONFIG_VFIO_CONTAINER) += container.o vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c new file mode 100644 index 000000000000..1c640016a824 --- /dev/null +++ b/drivers/vfio/device_cdev.c @@ -0,0 +1,62 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2023 Intel Corporation. + */ +#include + +#include "vfio.h" + +static dev_t device_devt; + +void vfio_init_device_cdev(struct vfio_device *device) +{ + device->device.devt = MKDEV(MAJOR(device_devt), device->index); + cdev_init(&device->cdev, &vfio_device_fops); + device->cdev.owner = THIS_MODULE; +} + +/* + * device access via the fd opened by this function is blocked until + * .open_device() is called successfully during BIND_IOMMUFD. + */ +int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep) +{ + struct vfio_device *device = container_of(inode->i_cdev, + struct vfio_device, cdev); + struct vfio_device_file *df; + int ret; + + if (!vfio_device_try_get_registration(device)) + return -ENODEV; + + df = vfio_allocate_device_file(device); + if (IS_ERR(df)) { + ret = PTR_ERR(df); + goto err_put_registration; + } + + filep->private_data = df; + + return 0; + +err_put_registration: + vfio_device_put_registration(device); + return ret; +} + +static char *vfio_device_devnode(const struct device *dev, umode_t *mode) +{ + return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); +} + +int vfio_cdev_init(struct class *device_class) +{ + device_class->devnode = vfio_device_devnode; + return alloc_chrdev_region(&device_devt, 0, + MINORMASK + 1, "vfio-dev"); +} + +void vfio_cdev_cleanup(void) +{ + unregister_chrdev_region(device_devt, MINORMASK + 1); +} diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 98cee2f765e9..3f359f04b754 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -260,6 +260,52 @@ static inline int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, } #endif +#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) +static inline int vfio_device_add(struct vfio_device *device) +{ + return cdev_device_add(&device->cdev, &device->device); +} + +static inline void vfio_device_del(struct vfio_device *device) +{ + cdev_device_del(&device->cdev, &device->device); +} + +void vfio_init_device_cdev(struct vfio_device *device); +int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep); +int vfio_cdev_init(struct class *device_class); +void vfio_cdev_cleanup(void); +#else +static inline int vfio_device_add(struct vfio_device *device) +{ + return device_add(&device->device); +} + +static inline void vfio_device_del(struct vfio_device *device) +{ + device_del(&device->device); +} + +static inline void vfio_init_device_cdev(struct vfio_device *device) +{ +} + +static inline int vfio_device_fops_cdev_open(struct inode *inode, + struct file *filep) +{ + return 0; +} + +static inline int vfio_cdev_init(struct class *device_class) +{ + return 0; +} + +static inline void vfio_cdev_cleanup(void) +{ +} +#endif /* CONFIG_VFIO_DEVICE_CDEV */ + #if IS_ENABLED(CONFIG_VFIO_VIRQFD) int __init vfio_virqfd_init(void); void vfio_virqfd_exit(void); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index da5cc4c5c63f..b0c2a7544524 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -242,6 +242,7 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev, device->device.release = vfio_device_release; device->device.class = vfio.device_class; device->device.parent = device->dev; + vfio_init_device_cdev(device); return 0; out_uninit: @@ -277,7 +278,7 @@ static int __vfio_register_dev(struct vfio_device *device, if (ret) return ret; - ret = device_add(&device->device); + ret = vfio_device_add(device); if (ret) goto err_out; @@ -317,6 +318,20 @@ void vfio_unregister_group_dev(struct vfio_device *device) bool interrupted = false; long rc; + /* + * Placing it before vfio_device_put_registration() to prevent + * new registration refcount increment by VFIO_GROUP_GET_DEVICE_FD + * during the unregister time. + */ + vfio_device_group_unregister(device); + + /* + * Balances vfio_device_add() in the register path. Placing it before + * vfio_device_put_registration() to prevent new registration refcount + * increment by the device cdev open during the unregister time. + */ + vfio_device_del(device); + vfio_device_put_registration(device); rc = try_wait_for_completion(&device->comp); while (rc <= 0) { @@ -340,11 +355,6 @@ void vfio_unregister_group_dev(struct vfio_device *device) } } - vfio_device_group_unregister(device); - - /* Balances device_add in register path */ - device_del(&device->device); - /* Balances vfio_device_set_group in register path */ vfio_device_remove_group(device); } @@ -563,7 +573,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - vfio_device_group_close(df); + if (df->group) + vfio_device_group_close(df); vfio_device_put_registration(device); @@ -1212,6 +1223,7 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) const struct file_operations vfio_device_fops = { .owner = THIS_MODULE, + .open = vfio_device_fops_cdev_open, .release = vfio_device_fops_release, .read = vfio_device_fops_read, .write = vfio_device_fops_write, @@ -1603,9 +1615,16 @@ static int __init vfio_init(void) goto err_dev_class; } + ret = vfio_cdev_init(vfio.device_class); + if (ret) + goto err_alloc_dev_chrdev; + pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); return 0; +err_alloc_dev_chrdev: + class_destroy(vfio.device_class); + vfio.device_class = NULL; err_dev_class: vfio_virqfd_exit(); err_virqfd: @@ -1616,6 +1635,7 @@ static int __init vfio_init(void) static void __exit vfio_cleanup(void) { ida_destroy(&vfio.device_ida); + vfio_cdev_cleanup(); class_destroy(vfio.device_class); vfio.device_class = NULL; vfio_virqfd_exit(); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 5a1c5b6f1a63..34adfcb5b7bd 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -13,6 +13,7 @@ #include #include #include +#include #include #include @@ -51,6 +52,9 @@ struct vfio_device { /* Members below here are private, not for driver use */ unsigned int index; struct device device; /* device.kref covers object life circle */ +#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) + struct cdev cdev; +#endif refcount_t refcount; /* user count on registered device*/ unsigned int open_count; struct completion comp; From patchwork Wed Mar 8 13:29:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165821 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5496AC64EC4 for ; Wed, 8 Mar 2023 13:29:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4F55A10E689; Wed, 8 Mar 2023 13:29:50 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7D4D210E630; Wed, 8 Mar 2023 13:29:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282182; x=1709818182; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=oI7un8hHPHJZ9QJcGeUUyGy3MTiXa7ydO6MKY5W5FZo=; b=OU2QwZqi9qHmlm28SOzBjdb6r9f2gx+ejaBDWlz2uY+aHVcmU7qiZZgd JwEbVbBj9jybWSXZm5/DeVE1yQ4iUlecmenvqz5AzzCvmB+WIIbSgc5TJ Td7n+8N0DCv46mPXBEZpv4PcfOEbVK9UgFGu4MzSNlj8D5cbeUowzSN8j Wz9xGIkdUEwRdNcakYgp2VSD5tkGrmOHI3b+0OAXL+RLRtYwtLDAq0XIH 0UHTVNofq81MJaRx7906zeAO3jgUYSI8IdZspHojU0Tsjr6BD9qyPL3S6 prihev0MNPvHkwDMNmWFXlFl3TT3WBkNf2HK5F5bxC1tIQSI1R6kWU2KA Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165336" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165336" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789456" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789456" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:41 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:29:00 -0800 Message-Id: <20230308132903.465159-22-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 21/24] vfio: Add VFIO_DEVICE_BIND_IOMMUFD X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This adds ioctl for userspace to bind device cdev fd to iommufd. VFIO_DEVICE_BIND_IOMMUFD: bind device to an iommufd, hence gain DMA control provided by the iommufd. open_device op is called after bind_iommufd op. VFIO no iommu mode is indicated by passing a negative iommufd value. Signed-off-by: Yi Liu --- drivers/vfio/device_cdev.c | 166 +++++++++++++++++++++++++++++++++++++ drivers/vfio/group.c | 15 ++++ drivers/vfio/vfio.h | 15 ++++ drivers/vfio/vfio_main.c | 29 ++++++- include/uapi/linux/vfio.h | 37 +++++++++ 5 files changed, 258 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c index 1c640016a824..568cc9da16c7 100644 --- a/drivers/vfio/device_cdev.c +++ b/drivers/vfio/device_cdev.c @@ -3,6 +3,7 @@ * Copyright (c) 2023 Intel Corporation. */ #include +#include #include "vfio.h" @@ -44,6 +45,171 @@ int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep) return ret; } +static void vfio_device_get_kvm_safe(struct vfio_device_file *df) +{ + spin_lock(&df->kvm_ref_lock); + if (df->kvm) + _vfio_device_get_kvm_safe(df->device, df->kvm); + spin_unlock(&df->kvm_ref_lock); +} + +void vfio_device_cdev_close(struct vfio_device_file *df) +{ + struct vfio_device *device = df->device; + + /* + * As df->access_granted writer is under dev_set->lock as well, + * so this read no need to use smp_load_acquire() to pair with + * smp_store_release() in the caller of vfio_device_open(). + */ + if (!df->access_granted) + return; + + mutex_lock(&device->dev_set->lock); + vfio_device_close(df); + vfio_device_put_kvm(device); + if (df->iommufd) + iommufd_ctx_put(df->iommufd); + mutex_unlock(&device->dev_set->lock); + vfio_device_unblock_group(device); +} + +static int vfio_device_cdev_probe_noiommu(struct vfio_device *device) +{ + struct iommu_group *iommu_group; + int ret = 0; + + if (!IS_ENABLED(CONFIG_VFIO_NOIOMMU) || !vfio_noiommu) + return -EINVAL; + + if (!capable(CAP_SYS_RAWIO)) + return -EPERM; + + iommu_group = iommu_group_get(device->dev); + if (!iommu_group) + return 0; + + /* + * We cannot support noiommu mode for devices that are protected + * by IOMMU. So check the iommu_group, if it is a no-iommu group + * created by VFIO, we support. If not, we refuse. + */ + if (!vfio_group_find_noiommu_group_from_iommu(iommu_group)) + ret = -EINVAL; + iommu_group_put(iommu_group); + return ret; +} + +static struct iommufd_ctx *vfio_get_iommufd_from_fd(int fd) +{ + struct fd f; + struct iommufd_ctx *iommufd; + + f = fdget(fd); + if (!f.file) + return ERR_PTR(-EBADF); + + iommufd = iommufd_ctx_from_file(f.file); + + fdput(f); + return iommufd; +} + +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_bind_iommufd bind; + struct iommufd_ctx *iommufd = NULL; + unsigned long minsz; + int ret; + + static_assert(__same_type(arg->out_devid, bind.out_devid)); + + minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid); + + if (copy_from_user(&bind, arg, minsz)) + return -EFAULT; + + if (bind.argsz < minsz || bind.flags) + return -EINVAL; + + if (!device->ops->bind_iommufd) + return -ENODEV; + + ret = vfio_device_block_group(device); + if (ret) + return ret; + + mutex_lock(&device->dev_set->lock); + /* If already got access, should fail it. */ + if (df->access_granted) { + ret = -EINVAL; + goto out_unlock; + } + + /* iommufd < 0 means noiommu mode */ + if (bind.iommufd < 0) { + ret = vfio_device_cdev_probe_noiommu(device); + if (ret) + goto out_unlock; + } else { + iommufd = vfio_get_iommufd_from_fd(bind.iommufd); + if (IS_ERR(iommufd)) { + ret = PTR_ERR(iommufd); + goto out_unlock; + } + } + + /* + * Before the device open, get the KVM pointer currently + * associated with the device file (if there is) and obtain + * a reference. This reference is held until device closed. + * Save the pointer in the device for use by drivers. + */ + vfio_device_get_kvm_safe(df); + + df->iommufd = iommufd; + ret = vfio_device_open(df); + if (ret) + goto out_put_kvm; + + if (df->iommufd) + bind.out_devid = df->devid; + else + bind.out_devid = IOMMUFD_INVALID_ID; + + ret = copy_to_user(&arg->out_devid, &bind.out_devid, + sizeof(bind.out_devid)) ? -EFAULT : 0; + if (ret) + goto out_close_device; + + if (bind.iommufd < 0) + dev_warn(device->dev, "device is bound to vfio-noiommu by user " + "(%s:%d)\n", current->comm, task_pid_nr(current)); + + /* + * Paired with smp_load_acquire() in vfio_device_fops::ioctl/ + * read/write/mmap + */ + smp_store_release(&df->access_granted, true); + mutex_unlock(&device->dev_set->lock); + + return 0; + +out_close_device: + vfio_device_close(df); +out_put_kvm: + df->iommufd = NULL; + vfio_device_put_kvm(device); + if (iommufd) + iommufd_ctx_put(iommufd); +out_unlock: + mutex_unlock(&device->dev_set->lock); + vfio_device_unblock_group(device); + return ret; +} + static char *vfio_device_devnode(const struct device *dev, umode_t *mode) { return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 51c027134814..fc49f2459b1a 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -701,6 +701,21 @@ static struct vfio_group *vfio_group_find_or_alloc(struct device *dev) return group; } +struct vfio_group * +vfio_group_find_noiommu_group_from_iommu(struct iommu_group *iommu_group) +{ + struct vfio_group *group; + bool found = false; + + mutex_lock(&vfio.group_lock); + group = vfio_group_find_from_iommu(iommu_group); + if (group && group->type == VFIO_NO_IOMMU) + found = true; + mutex_unlock(&vfio.group_lock); + + return found ? group : NULL; +} + int vfio_device_set_group(struct vfio_device *device, enum vfio_group_type type) { diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 3f359f04b754..5df737b24102 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -91,6 +91,8 @@ struct vfio_group { int vfio_device_block_group(struct vfio_device *device); void vfio_device_unblock_group(struct vfio_device *device); +struct vfio_group * +vfio_group_find_noiommu_group_from_iommu(struct iommu_group *iommu_group); int vfio_device_set_group(struct vfio_device *device, enum vfio_group_type type); void vfio_device_remove_group(struct vfio_device *device); @@ -273,6 +275,9 @@ static inline void vfio_device_del(struct vfio_device *device) void vfio_init_device_cdev(struct vfio_device *device); int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep); +void vfio_device_cdev_close(struct vfio_device_file *df); +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg); int vfio_cdev_init(struct class *device_class); void vfio_cdev_cleanup(void); #else @@ -296,6 +301,16 @@ static inline int vfio_device_fops_cdev_open(struct inode *inode, return 0; } +static inline void vfio_device_cdev_close(struct vfio_device_file *df) +{ +} + +static inline long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg) +{ + return -EOPNOTSUPP; +} + static inline int vfio_cdev_init(struct class *device_class) { return 0; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index b0c2a7544524..08bb1705d02d 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -575,6 +575,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) if (df->group) vfio_device_group_close(df); + else + vfio_device_cdev_close(df); vfio_device_put_registration(device); @@ -1148,7 +1150,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, struct vfio_device *device = df->device; int ret; - /* Paired with smp_store_release() in vfio_device_group_open() */ + if (cmd == VFIO_DEVICE_BIND_IOMMUFD) + return vfio_device_ioctl_bind_iommufd(df, (void __user *)arg); + + /* + * Paired with smp_store_release() in the caller of + * vfio_device_open(). e.g. vfio_device_group_open() + * and vfio_device_ioctl_bind_iommufd() + */ if (!smp_load_acquire(&df->access_granted)) return -EINVAL; @@ -1179,7 +1188,11 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - /* Paired with smp_store_release() in vfio_device_group_open() */ + /* + * Paired with smp_store_release() in the caller of + * vfio_device_open(). e.g. vfio_device_group_open() + * and vfio_device_ioctl_bind_iommufd() + */ if (!smp_load_acquire(&df->access_granted)) return -EINVAL; @@ -1196,7 +1209,11 @@ static ssize_t vfio_device_fops_write(struct file *filep, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - /* Paired with smp_store_release() in vfio_device_group_open() */ + /* + * Paired with smp_store_release() in the caller of + * vfio_device_open(). e.g. vfio_device_group_open() + * and vfio_device_ioctl_bind_iommufd() + */ if (!smp_load_acquire(&df->access_granted)) return -EINVAL; @@ -1211,7 +1228,11 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - /* Paired with smp_store_release() in vfio_device_group_open() */ + /* + * Paired with smp_store_release() in the caller of + * vfio_device_open(). e.g. vfio_device_group_open() + * and vfio_device_ioctl_bind_iommufd() + */ if (!smp_load_acquire(&df->access_granted)) return -EINVAL; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 382d95455f89..a53afe349a34 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -194,6 +194,43 @@ struct vfio_group_status { /* --------------- IOCTLs for DEVICE file descriptors --------------- */ +/* + * VFIO_DEVICE_BIND_IOMMUFD - _IOR(VFIO_TYPE, VFIO_BASE + 19, + * struct vfio_device_bind_iommufd) + * + * Bind a vfio_device to the specified iommufd. + * + * The user should provide a device cookie when calling this ioctl. The + * cookie is carried only in event e.g. I/O fault reported to userspace + * via iommufd. The user should use devid returned by this ioctl to mark + * the target device in other ioctls (e.g. iommu hardware infomration query + * via iommufd, and etc.). + * + * User is not allowed to access the device before the binding operation + * is completed. + * + * Unbind is automatically conducted when device fd is closed. + * + * @argsz: user filled size of this data. + * @flags: reserved for future extension. + * @dev_cookie: a per device cookie provided by userspace. + * @iommufd: iommufd to bind. a negative value means noiommu. + * @out_devid: the device id generated by this bind. This field is valid + * as long as the input @iommufd is valid. Otherwise, it is + * meaningless. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_bind_iommufd { + __u32 argsz; + __u32 flags; + __aligned_u64 dev_cookie; + __s32 iommufd; + __u32 out_devid; +}; + +#define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE + 19) + /** * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 7, * struct vfio_device_info) From patchwork Wed Mar 8 13:29:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165816 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5CB5AC678D5 for ; Wed, 8 Mar 2023 13:29:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DD9FD10E62E; Wed, 8 Mar 2023 13:29:47 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 08C4C10E678; Wed, 8 Mar 2023 13:29:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282184; x=1709818184; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jQI11V9I2VfPt6XpbiUYEXw6DQ12QXvh5vFb+3F7BPM=; b=HpUak7Jh0C7eWAygDfuV/zISL4BLjtksSAopS8P9Ow9GnATyHUZJ7EFJ h7K6R6Lef9lKyiZliSu4mZ4SLC053UO3eDubWquTl4N1hbKlgzaOtymWl X7fjTX8U9C25KYPBxtNoeJP2lU8qHdc/N++aZyACYmSuJX6x2aIY37bwE Ux2siKt632T9onvtmkS2nOgzLu0LyiWK4zG/X7AVmvHVOxR38Em7tgrj4 rVJcQYx4gnXlMp6/gdN349AKUhWa15X4YoJKOO8iQr3LaVMxxiGxQ3e+q wc58/1y1Tw1A7Lh7QfrO9fJPMkiicvMr5FF3mMIINjA82mKcrthjqdARv Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165349" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165349" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789460" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789460" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:42 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:29:01 -0800 Message-Id: <20230308132903.465159-23-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 22/24] vfio: Add VFIO_DEVICE_AT[DE]TACH_IOMMUFD_PT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This adds ioctl for userspace to attach device cdev fd to and detach from IOAS/hw_pagetable managed by iommufd. VFIO_DEVICE_ATTACH_IOMMUFD_PT: attach vfio device to IOAS, hw_pagetable managed by iommufd. Attach can be undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. VFIO_DEVICE_DETACH_IOMMUFD_PT: detach vfio device from the current attached IOAS or hw_pagetable managed by iommufd. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato --- drivers/vfio/device_cdev.c | 85 ++++++++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 16 +++++++ drivers/vfio/vfio_main.c | 8 ++++ include/uapi/linux/vfio.h | 52 +++++++++++++++++++++++ 4 files changed, 161 insertions(+) diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c index 568cc9da16c7..77e0f34f6bc7 100644 --- a/drivers/vfio/device_cdev.c +++ b/drivers/vfio/device_cdev.c @@ -210,6 +210,91 @@ long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, return ret; } +int vfio_ioctl_device_attach(struct vfio_device_file *df, + struct vfio_device_attach_iommufd_pt __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_attach_iommufd_pt attach; + unsigned long minsz; + int ret; + + static_assert(__same_type(arg->pt_id, attach.pt_id)); + + minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id); + + if (copy_from_user(&attach, arg, minsz)) + return -EFAULT; + + if (attach.argsz < minsz || attach.flags) + return -EINVAL; + + if (!device->ops->bind_iommufd) + return -ENODEV; + + /* ATTACH only allowed for cdev fds */ + if (df->group) + return -EINVAL; + + mutex_lock(&device->dev_set->lock); + /* noiommufd mode doesn't allow attach */ + if (!df->iommufd) { + ret = -EOPNOTSUPP; + goto out_unlock; + } + + ret = device->ops->attach_ioas(device, &attach.pt_id); + if (ret) + goto out_unlock; + + ret = copy_to_user(&arg->pt_id, &attach.pt_id, + sizeof(attach.pt_id)) ? -EFAULT : 0; + if (ret) + goto out_detach; + mutex_unlock(&device->dev_set->lock); + + return 0; + +out_detach: + device->ops->detach_ioas(device); +out_unlock: + mutex_unlock(&device->dev_set->lock); + return ret; +} + +int vfio_ioctl_device_detach(struct vfio_device_file *df, + struct vfio_device_detach_iommufd_pt __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_detach_iommufd_pt detach; + unsigned long minsz; + + minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags); + + if (copy_from_user(&detach, arg, minsz)) + return -EFAULT; + + if (detach.argsz < minsz || detach.flags) + return -EINVAL; + + if (!device->ops->bind_iommufd) + return -ENODEV; + + /* DETACH only allowed for cdev fds */ + if (df->group) + return -EINVAL; + + mutex_lock(&device->dev_set->lock); + /* noiommufd mode doesn't support detach */ + if (!df->iommufd) { + mutex_unlock(&device->dev_set->lock); + return -EOPNOTSUPP; + } + device->ops->detach_ioas(device); + mutex_unlock(&device->dev_set->lock); + + return 0; +} + static char *vfio_device_devnode(const struct device *dev, umode_t *mode) { return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 5df737b24102..8b70e2af4ece 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -278,6 +278,10 @@ int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep); void vfio_device_cdev_close(struct vfio_device_file *df); long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, struct vfio_device_bind_iommufd __user *arg); +int vfio_ioctl_device_attach(struct vfio_device_file *df, + struct vfio_device_attach_iommufd_pt __user *arg); +int vfio_ioctl_device_detach(struct vfio_device_file *df, + struct vfio_device_detach_iommufd_pt __user *arg); int vfio_cdev_init(struct class *device_class); void vfio_cdev_cleanup(void); #else @@ -311,6 +315,18 @@ static inline long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, return -EOPNOTSUPP; } +static inline int vfio_ioctl_device_attach(struct vfio_device_file *df, + struct vfio_device_attach_iommufd_pt __user *arg) +{ + return -EOPNOTSUPP; +} + +static inline int vfio_ioctl_device_detach(struct vfio_device_file *df, + struct vfio_device_detach_iommufd_pt __user *arg) +{ + return -EOPNOTSUPP; +} + static inline int vfio_cdev_init(struct class *device_class) { return 0; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 08bb1705d02d..e5f4738692ac 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1170,6 +1170,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, ret = vfio_ioctl_device_feature(device, (void __user *)arg); break; + case VFIO_DEVICE_ATTACH_IOMMUFD_PT: + ret = vfio_ioctl_device_attach(df, (void __user *)arg); + break; + + case VFIO_DEVICE_DETACH_IOMMUFD_PT: + ret = vfio_ioctl_device_detach(df, (void __user *)arg); + break; + default: if (unlikely(!device->ops->ioctl)) ret = -EINVAL; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index a53afe349a34..692156a708bb 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -231,6 +231,58 @@ struct vfio_device_bind_iommufd { #define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE + 19) +/* + * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20, + * struct vfio_device_attach_iommufd_pt) + * + * Attach a vfio device to an iommufd address space specified by IOAS + * id or hw_pagetable (hwpt) id. + * + * Available only after a device has been bound to iommufd via + * VFIO_DEVICE_BIND_IOMMUFD + * + * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. + * + * @argsz: user filled size of this data. + * @flags: must be 0. + * @pt_id: Input the target id which can represent an ioas or a hwpt + * allocated via iommufd subsystem. + * Output the attached hwpt id which could be the specified + * hwpt itself or a hwpt automatically created for the + * specified ioas by kernel during the attachment. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_attach_iommufd_pt { + __u32 argsz; + __u32 flags; + __u32 pt_id; +}; + +#define VFIO_DEVICE_ATTACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 20) + +/* + * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 21, + * struct vfio_device_detach_iommufd_pt) + * + * Detach a vfio device from the iommufd address space it has been + * attached to. After it, device should be in a blocking DMA state. + * + * Available only after a device has been bound to iommufd via + * VFIO_DEVICE_BIND_IOMMUFD. + * + * @argsz: user filled size of this data. + * @flags: must be 0. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_detach_iommufd_pt { + __u32 argsz; + __u32 flags; +}; + +#define VFIO_DEVICE_DETACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 21) + /** * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 7, * struct vfio_device_info) From patchwork Wed Mar 8 13:29:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165817 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 279A0C64EC4 for ; Wed, 8 Mar 2023 13:29:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 64CD110E62F; Wed, 8 Mar 2023 13:29:49 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id A76AC10E67E; Wed, 8 Mar 2023 13:29:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282185; x=1709818185; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TWaG9EJbT12VI3MGD+8NaVpVpmNynSY3YllYqysnzLk=; b=ei0t0zTX7ebzAAWeUTz1R7szXCSXXj/Ht0Jz65mOB7i+l8Uoj0FZJ89z Rb87Vx73Zkbi9p9N43FrqPnqX4goUOFOAOsUt7vGPS15BGlAVBUm1wBKh Z0j3H5TG80hiTWXv14Y4VB4RH4ceocJhoNftC8fHZ2ZWzaZjQ4F1rDmY/ 5PuU0yZ3xB3OkVq4pbKFiyRWLLX3jrLG3yzBARAWZnqvCixr9lRA954NI cMR63y6Vhg8aTWkjMA+QVpouIMvUr5Q0m/X0k+3/bmCc81CWMVtc31VCg BIRxQZe9F6qKm4ywNDDtw8qrH7qYPvbjLIha4nlNjpKPo+NuEheY9bdr4 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165362" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165362" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:45 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789465" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789465" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:44 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:29:02 -0800 Message-Id: <20230308132903.465159-24-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 23/24] vfio: Compile group optionally X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" group code is not needed for vfio device cdev, so with vfio device cdev introduced, the group infrastructures can be compiled out if only cdev is needed. Signed-off-by: Yi Liu Reviewed-by: Kevin Tian --- drivers/vfio/Kconfig | 16 +++++++- drivers/vfio/Makefile | 2 +- drivers/vfio/vfio.h | 94 +++++++++++++++++++++++++++++++++++++++++++ include/linux/vfio.h | 18 ++++++++- 4 files changed, 126 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index e2105b4dac2d..0942a19601a2 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -4,7 +4,9 @@ menuconfig VFIO select IOMMU_API depends on IOMMUFD || !IOMMUFD select INTERVAL_TREE - select VFIO_CONTAINER if IOMMUFD=n + select VFIO_GROUP if SPAPR_TCE_IOMMU || !IOMMUFD + select VFIO_DEVICE_CDEV if !VFIO_GROUP + select VFIO_CONTAINER if IOMMUFD=n && VFIO_GROUP help VFIO provides a framework for secure userspace device drivers. See Documentation/driver-api/vfio.rst for more details. @@ -15,6 +17,7 @@ if VFIO config VFIO_DEVICE_CDEV bool "Support for the VFIO cdev /dev/vfio/devices/vfioX" depends on IOMMUFD + default !VFIO_GROUP help The VFIO device cdev is another way for userspace to get device access. Userspace gets device fd by opening device cdev under @@ -23,9 +26,20 @@ config VFIO_DEVICE_CDEV If you don't know what to do here, say N. +config VFIO_GROUP + bool "Support for the VFIO group /dev/vfio/$group_id" + default y + help + VFIO group support provides the traditional model for accessing + devices through VFIO and is used by the majority of userspace + applications and drivers making use of VFIO. + + If you don't know what to do here, say Y. + config VFIO_CONTAINER bool "Support for the VFIO container /dev/vfio/vfio" select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64) + depends on VFIO_GROUP default y help The VFIO container is the classic interface to VFIO for establishing diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 245394aeb94b..57c3515af606 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -2,9 +2,9 @@ obj-$(CONFIG_VFIO) += vfio.o vfio-y += vfio_main.o \ - group.o \ iova_bitmap.o vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o +vfio-$(CONFIG_VFIO_GROUP) += group.o vfio-$(CONFIG_IOMMUFD) += iommufd.o vfio-$(CONFIG_VFIO_CONTAINER) += container.o vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 8b70e2af4ece..d5dfacf11265 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -60,6 +60,7 @@ enum vfio_group_type { VFIO_NO_IOMMU, }; +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct vfio_group { struct device dev; struct cdev cdev; @@ -115,6 +116,99 @@ static inline bool vfio_device_is_noiommu(struct vfio_device *vdev) return IS_ENABLED(CONFIG_VFIO_NOIOMMU) && vdev->group->type == VFIO_NO_IOMMU; } +#else +struct vfio_group; + +static inline int vfio_device_block_group(struct vfio_device *device) +{ + return 0; +} + +static inline void vfio_device_unblock_group(struct vfio_device *device) +{ +} + +static inline struct vfio_group * +vfio_group_find_noiommu_group_from_iommu(struct iommu_group *iommu_group) +{ + return NULL; +} + +static inline int vfio_device_set_group(struct vfio_device *device, + enum vfio_group_type type) +{ + return 0; +} + +static inline void vfio_device_remove_group(struct vfio_device *device) +{ +} + +static inline void vfio_device_group_register(struct vfio_device *device) +{ +} + +static inline void vfio_device_group_unregister(struct vfio_device *device) +{ +} + +static inline bool vfio_device_group_uses_container(struct vfio_device_file *df) +{ + return false; +} + +static inline int vfio_device_group_use_iommu(struct vfio_device *device) +{ + return -EOPNOTSUPP; +} + +static inline void vfio_device_group_unuse_iommu(struct vfio_device *device) +{ +} + +static inline void vfio_device_group_close(struct vfio_device_file *df) +{ +} + +static inline struct vfio_group *vfio_group_from_file(struct file *file) +{ + return NULL; +} + +static inline bool vfio_group_enforced_coherent(struct vfio_group *group) +{ + return true; +} + +static inline void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm) +{ +} + +static inline bool vfio_group_has_dev(struct vfio_group *group, + struct vfio_device *device) +{ + return false; +} + +static inline bool vfio_device_has_container(struct vfio_device *device) +{ + return false; +} + +static inline int __init vfio_group_init(void) +{ + return 0; +} + +static inline void vfio_group_cleanup(void) +{ +} + +static inline bool vfio_device_is_noiommu(struct vfio_device *vdev) +{ + return false; +} +#endif /* CONFIG_VFIO_GROUP */ #if IS_ENABLED(CONFIG_VFIO_CONTAINER) /** diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 34adfcb5b7bd..e7fc1de35acf 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -43,7 +43,11 @@ struct vfio_device { */ const struct vfio_migration_ops *mig_ops; const struct vfio_log_ops *log_ops; +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct vfio_group *group; + struct list_head group_next; + struct list_head iommu_entry; +#endif struct vfio_device_set *dev_set; struct list_head dev_set_list; unsigned int migration_flags; @@ -58,8 +62,6 @@ struct vfio_device { refcount_t refcount; /* user count on registered device*/ unsigned int open_count; struct completion comp; - struct list_head group_next; - struct list_head iommu_entry; struct iommufd_access *iommufd_access; void (*put_kvm)(struct kvm *kvm); #if IS_ENABLED(CONFIG_IOMMUFD) @@ -259,8 +261,20 @@ int vfio_mig_get_next_state(struct vfio_device *device, /* * External user API */ +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct iommu_group *vfio_file_iommu_group(struct file *file); bool vfio_file_is_group(struct file *file); +#else +static inline struct iommu_group *vfio_file_iommu_group(struct file *file) +{ + return NULL; +} + +static inline bool vfio_file_is_group(struct file *file) +{ + return false; +} +#endif bool vfio_file_is_valid(struct file *file); bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); From patchwork Wed Mar 8 13:29:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13165822 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 18E13C74A4B for ; Wed, 8 Mar 2023 13:29:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 71FE310E68A; Wed, 8 Mar 2023 13:29:50 +0000 (UTC) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3D59B10E62F; Wed, 8 Mar 2023 13:29:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1678282188; x=1709818188; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=wbdrYtzM9mN4gwBAKIFy0X9kmMhAGDOr+1bYvr2M9k8=; b=kwOvAstFdXiXm3oFu4sEV3pzW1jsAp71TI0AkeQTxcOc1KbiMC2QuzXo sy/vsUnB5QqpfoVfkDLsYCp2ccZ7a1/1KsZZY+CXb8TMo9aYfi7hycAyf DEZXjblpClhxW+lX7hc76zKgWTApgexVe3K0FktLIPJFAShzVp8u9eQbs JOl9cRsi3XdO106Hq1y8Fnb2IXX8aBSURHFXUJhPg1dGve0PD9A90y7gE WEmmX6eOflouAvG3SiAprSn5oSoTI7fvXeWFgqzaLi403dVPs0UJlW4eI 4/BnzwJaRbBlpe/8XHFpMLOnNr/BDWZjLceQG6vC3VoKd9rbKy0ucpzGi g==; X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="336165375" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="336165375" Received: from fmsmga006.fm.intel.com ([10.253.24.20]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Mar 2023 05:29:47 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10642"; a="922789474" X-IronPort-AV: E=Sophos;i="5.98,244,1673942400"; d="scan'208";a="922789474" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by fmsmga006.fm.intel.com with ESMTP; 08 Mar 2023 05:29:46 -0800 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Wed, 8 Mar 2023 05:29:03 -0800 Message-Id: <20230308132903.465159-25-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230308132903.465159-1-yi.l.liu@intel.com> References: <20230308132903.465159-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v6 24/24] docs: vfio: Add vfio device cdev description X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-s390@vger.kernel.org, yi.l.liu@intel.com, yi.y.sun@linux.intel.com, mjrosato@linux.ibm.com, kvm@vger.kernel.org, intel-gvt-dev@lists.freedesktop.org, joro@8bytes.org, cohuck@redhat.com, xudong.hao@intel.com, peterx@redhat.com, yan.y.zhao@intel.com, eric.auger@redhat.com, terrence.xu@intel.com, nicolinc@nvidia.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, intel-gfx@lists.freedesktop.org, chao.p.peng@linux.intel.com, lulu@redhat.com, robin.murphy@arm.com, jasowang@redhat.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This gives notes for userspace applications on device cdev usage. Signed-off-by: Yi Liu --- Documentation/driver-api/vfio.rst | 125 ++++++++++++++++++++++++++++++ 1 file changed, 125 insertions(+) diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst index 44527420f20d..227940e5224f 100644 --- a/Documentation/driver-api/vfio.rst +++ b/Documentation/driver-api/vfio.rst @@ -239,6 +239,123 @@ group and can access them as follows:: /* Gratuitous device reset and go... */ ioctl(device, VFIO_DEVICE_RESET); +IOMMUFD and vfio_iommu_type1 +---------------------------- + +IOMMUFD is the new user API to manage I/O page tables from userspace. +It intends to be the portal of delivering advanced userspace DMA +features (nested translation [5], PASID [6], etc.) and backward +compatible with the vfio_iommu_type1 driver. Eventually vfio_iommu_type1 +will be deprecated. + +With the backward compatibility, no change is required for legacy VFIO +drivers or applications to connect a VFIO device to IOMMUFD. + + When CONFIG_IOMMUFD_VFIO_CONTAINER=n, VFIO container still provides + /dev/vfio/vfio which connects to vfio_iommu_type1. To disable VFIO + container and vfio_iommu_type1, the administrator could symbol link + /dev/vfio/vfio to /dev/iommu to enable VFIO container emulation + in IOMMUFD. + + When CONFIG_IOMMUFD_VFIO_CONTAINER=y, IOMMUFD directly provides + /dev/vfio/vfio while the VFIO container and vfio_iommu_type1 are + explicitly disabled. + +VFIO Device cdev +---------------- + +Traditionally user acquires a device fd via VFIO_GROUP_GET_DEVICE_FD +in a VFIO group. + +With CONFIG_VFIO_DEVICE_CDEV=y the user can now acquire a device fd +by directly opening a character device /dev/vfio/devices/vfioX where +"X" is the number allocated uniquely by VFIO for registered devices. + +The cdev only works with IOMMUFD. Both VFIO drivers and applications +must adapt to the new cdev security model which requires using +VFIO_DEVICE_BIND_IOMMUFD to claim DMA ownership before starting to +actually use the device. Once bind succeeds then a VFIO device can +be fully accessed by the user. + +VFIO device cdev doesn't rely on VFIO group/container/iommu drivers. +Hence those modules can be fully compiled out in an environment +where no legacy VFIO application exists. + +So far SPAPR does not support IOMMUFD yet. So it cannot support device +cdev either. + +Device cdev Example +------------------- + +Assume user wants to access PCI device 0000:6a:01.0:: + + $ ls /sys/bus/pci/devices/0000:6a:01.0/vfio-dev/ + vfio0 + +This device is therefore represented as vfio0. The user can verify +its existence:: + + $ ls -l /dev/vfio/devices/vfio0 + crw------- 1 root root 511, 0 Feb 16 01:22 /dev/vfio/devices/vfio0 + $ cat /sys/bus/pci/devices/0000:6a:01.0/vfio-dev/vfio0/dev + 511:0 + $ ls -l /dev/char/511\:0 + lrwxrwxrwx 1 root root 21 Feb 16 01:22 /dev/char/511:0 -> ../vfio/devices/vfio0 + +Then provide the user with access to the device if unprivileged +operation is desired:: + + $ chown user:user /dev/vfio/devices/vfio0 + +Finally the user could get cdev fd by:: + + cdev_fd = open("/dev/vfio/devices/vfio0", O_RDWR); + +An opened cdev_fd doesn't give the user any permission of accessing +the device except binding the cdev_fd to an iommufd. After that point +then the device is fully accessible including attaching it to an +IOMMUFD IOAS/HWPT to enable userspace DMA:: + + struct vfio_device_bind_iommufd bind = { + .argsz = sizeof(bind), + .flags = 0, + }; + struct iommu_ioas_alloc alloc_data = { + .size = sizeof(alloc_data), + .flags = 0, + }; + struct vfio_device_attach_iommufd_pt attach_data = { + .argsz = sizeof(attach_data), + .flags = 0, + }; + struct iommu_ioas_map map = { + .size = sizeof(map), + .flags = IOMMU_IOAS_MAP_READABLE | + IOMMU_IOAS_MAP_WRITEABLE | + IOMMU_IOAS_MAP_FIXED_IOVA, + .__reserved = 0, + }; + + iommufd = open("/dev/iommu", O_RDWR); + + bind.iommufd = iommufd; // negative iommufd means vfio-noiommu mode + ioctl(cdev_fd, VFIO_DEVICE_BIND_IOMMUFD, &bind); + + ioctl(iommufd, IOMMU_IOAS_ALLOC, &alloc_data); + attach_data.pt_id = alloc_data.out_ioas_id; + ioctl(cdev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data); + + /* Allocate some space and setup a DMA mapping */ + map.user_va = (int64_t)mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, 0, 0); + map.iova = 0; /* 1MB starting at 0x0 from device view */ + map.length = 1024 * 1024; + map.ioas_id = alloc_data.out_ioas_id;; + + ioctl(iommufd, IOMMU_IOAS_MAP, &map); + + /* Other device operations as stated in "VFIO Usage Example" */ + VFIO User API ------------------------------------------------------------------------------- @@ -566,3 +683,11 @@ This implementation has some specifics: \-0d.1 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90) + +.. [5] Nested translation is an IOMMU feature which supports two stage + address translations. This improves the address translation efficiency + in IOMMU virtualization. + +.. [6] PASID stands for Process Address Space ID, introduced by PCI + Express. It is a prerequisite for Shared Virtual Addressing (SVA) + and Scalable I/O Virtualization (Scalable IOV).