From patchwork Sat Apr 1 15:18:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197122 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A3FE7C77B6E for ; Sat, 1 Apr 2023 15:19:17 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DFD2810E2AD; Sat, 1 Apr 2023 15:19:16 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id DDEF310E105; Sat, 1 Apr 2023 15:18:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362316; x=1711898316; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hUwYSsLdqR/fklvxwBo5aRySGIYs2Vl7KYRkZwHwtPA=; b=fE8r5ER9BTNrgUEdDGb0tG6tu/nRPJc95O1cGVgElrEONyzRu2ddwCy1 wG5eWYyNC4pfGVUwIfgd5yFj1/MBsKROBMkaoOoHYchEC9j33XOKrnUj2 S7c3QL1xIuMhc0OTFhOFqD2ytJQYwadqO8BRrSEvO17cW4bkzE3qaDwmN jSnHmp4gpA9/qKxLoZtBzcUTbdSQGQGZZKaLn28aHdKWywF9ja/cD83uG LIvQSYynW4y1OfI+nbDaRJ4m9cn/VxNHxEadZP7Q8rAVWdNg+/n+g4YEr rxA+Na34+4M7Cz1Rjb7lHX8HetPKpSvmPCOhpPbqGV4Xz39X+Hwd44W01 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411175" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411175" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:36 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937147" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937147" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:35 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:09 -0700 Message-Id: <20230401151833.124749-2-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 01/25] vfio: Allocate per device file structure X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This is preparation for adding vfio device cdev support. vfio device cdev requires: 1) A per device file memory to store the kvm pointer set by KVM. It will be propagated to vfio_device:kvm after the device cdev file is bound to an iommufd. 2) A mechanism to block device access through device cdev fd before it is bound to an iommufd. To address above requirements, this adds a per device file structure named vfio_device_file. For now, it's only a wrapper of struct vfio_device pointer. Other fields will be added to this per file structure in future commits. Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/group.c | 13 +++++++++++-- drivers/vfio/vfio.h | 6 ++++++ drivers/vfio/vfio_main.c | 35 ++++++++++++++++++++++++++++------- 3 files changed, 45 insertions(+), 9 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index d0c95d033605..8a13cea43f49 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -218,19 +218,26 @@ void vfio_device_group_close(struct vfio_device *device) static struct file *vfio_device_open_file(struct vfio_device *device) { + struct vfio_device_file *df; struct file *filep; int ret; + df = vfio_allocate_device_file(device); + if (IS_ERR(df)) { + ret = PTR_ERR(df); + goto err_out; + } + ret = vfio_device_group_open(device); if (ret) - goto err_out; + goto err_free; /* * We can't use anon_inode_getfd() because we need to modify * the f_mode flags directly to allow more than just ioctls */ filep = anon_inode_getfile("[vfio-device]", &vfio_device_fops, - device, O_RDWR); + df, O_RDWR); if (IS_ERR(filep)) { ret = PTR_ERR(filep); goto err_close_device; @@ -254,6 +261,8 @@ static struct file *vfio_device_open_file(struct vfio_device *device) err_close_device: vfio_device_group_close(device); +err_free: + kfree(df); err_out: return ERR_PTR(ret); } diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index c0aeea24fbd6..250fbd3786c5 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -16,11 +16,17 @@ struct iommufd_ctx; struct iommu_group; struct vfio_container; +struct vfio_device_file { + struct vfio_device *device; +}; + void vfio_device_put_registration(struct vfio_device *device); bool vfio_device_try_get_registration(struct vfio_device *device); int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd); void vfio_device_close(struct vfio_device *device, struct iommufd_ctx *iommufd); +struct vfio_device_file * +vfio_allocate_device_file(struct vfio_device *device); extern const struct file_operations vfio_device_fops; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index ebbb6b91a498..89722bf87edc 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -404,6 +404,20 @@ static bool vfio_assert_device_open(struct vfio_device *device) return !WARN_ON_ONCE(!READ_ONCE(device->open_count)); } +struct vfio_device_file * +vfio_allocate_device_file(struct vfio_device *device) +{ + struct vfio_device_file *df; + + df = kzalloc(sizeof(*df), GFP_KERNEL_ACCOUNT); + if (!df) + return ERR_PTR(-ENOMEM); + + df->device = device; + + return df; +} + static int vfio_device_first_open(struct vfio_device *device, struct iommufd_ctx *iommufd) { @@ -517,12 +531,15 @@ static inline void vfio_device_pm_runtime_put(struct vfio_device *device) */ static int vfio_device_fops_release(struct inode *inode, struct file *filep) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; vfio_device_group_close(device); vfio_device_put_registration(device); + kfree(df); + return 0; } @@ -1087,7 +1104,8 @@ static int vfio_ioctl_device_feature(struct vfio_device *device, static long vfio_device_fops_unl_ioctl(struct file *filep, unsigned int cmd, unsigned long arg) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; int ret; ret = vfio_device_pm_runtime_get(device); @@ -1114,7 +1132,8 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf, size_t count, loff_t *ppos) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->read)) return -EINVAL; @@ -1126,7 +1145,8 @@ static ssize_t vfio_device_fops_write(struct file *filep, const char __user *buf, size_t count, loff_t *ppos) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->write)) return -EINVAL; @@ -1136,7 +1156,8 @@ static ssize_t vfio_device_fops_write(struct file *filep, static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) { - struct vfio_device *device = filep->private_data; + struct vfio_device_file *df = filep->private_data; + struct vfio_device *device = df->device; if (unlikely(!device->ops->mmap)) return -EINVAL; @@ -1156,11 +1177,11 @@ const struct file_operations vfio_device_fops = { static struct vfio_device *vfio_device_from_file(struct file *file) { - struct vfio_device *device = file->private_data; + struct vfio_device_file *df = file->private_data; if (file->f_op != &vfio_device_fops) return NULL; - return device; + return df->device; } /** From patchwork Sat Apr 1 15:18:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197116 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 549C6C77B62 for ; Sat, 1 Apr 2023 15:19:00 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 51F6810E24F; Sat, 1 Apr 2023 15:18:59 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 81F2610E0F6; Sat, 1 Apr 2023 15:18:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362317; x=1711898317; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1DgT9S3o7/JtJhQIxwcH47vx3PZvAm56cQLXBR4VdLE=; b=Ury9VIY1If+t49reg8aXy5Bdfbco2zXqHdonGpSRq2eq6/AAv5VgjtMq cd2rzIwp3P97znF8VarLGUbTlkgZS+QmViHs9K49FOI0ybkRNM78qV2MV TD/e4V5PF6mtP1//DZDQAfTrv2PREZbRTVuc5UpWHRy1AC7nzQetj55hO v/Dbfj2CdRL7V37cbOJd7L0FR8m+Up/51SeD1cM/Wnbar8GcZik9RCwGK KalRSXGPAB9OcUpQE4L+YJ7kPqJfpN1XGAvcshFxpOwKNffo+JFohQbNL b5DFnvfkhYFayJJ+aZWfVF1A/3PCWyN2rt9zsDAu055AZW5SKxRIO0C4L g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411187" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411187" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937152" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937152" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:36 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:10 -0700 Message-Id: <20230401151833.124749-3-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 02/25] vfio: Refine vfio file kAPIs for KVM X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This prepares for making the below kAPIs to accept both group file and device file instead of only vfio group file. bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/group.c | 36 ++++++--------------------------- drivers/vfio/vfio.h | 2 ++ drivers/vfio/vfio_main.c | 43 ++++++++++++++++++++++++++++++++++++++++ virt/kvm/vfio.c | 10 +++++----- 4 files changed, 56 insertions(+), 35 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 8a13cea43f49..ede4723c5f72 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -802,24 +802,11 @@ bool vfio_file_is_group(struct file *file) } EXPORT_SYMBOL_GPL(vfio_file_is_group); -/** - * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file - * is always CPU cache coherent - * @file: VFIO group file - * - * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop - * bit in DMA transactions. A return of false indicates that the user has - * rights to access additional instructions such as wbinvd on x86. - */ -bool vfio_file_enforced_coherent(struct file *file) +bool vfio_group_enforced_coherent(struct vfio_group *group) { - struct vfio_group *group = file->private_data; struct vfio_device *device; bool ret = true; - if (!vfio_file_is_group(file)) - return true; - /* * If the device does not have IOMMU_CAP_ENFORCE_CACHE_COHERENCY then * any domain later attached to it will also not support it. If the cap @@ -837,28 +824,17 @@ bool vfio_file_enforced_coherent(struct file *file) mutex_unlock(&group->device_lock); return ret; } -EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); -/** - * vfio_file_set_kvm - Link a kvm with VFIO drivers - * @file: VFIO group file - * @kvm: KVM to link - * - * When a VFIO device is first opened the KVM will be available in - * device->kvm if one was associated with the group. - */ -void vfio_file_set_kvm(struct file *file, struct kvm *kvm) +void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm) { - struct vfio_group *group = file->private_data; - - if (!vfio_file_is_group(file)) - return; - + /* + * When a VFIO device is first opened the KVM will be available in + * device->kvm if one was associated with the group. + */ spin_lock(&group->kvm_ref_lock); group->kvm = kvm; spin_unlock(&group->kvm_ref_lock); } -EXPORT_SYMBOL_GPL(vfio_file_set_kvm); bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device) { diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 250fbd3786c5..56ad127ac618 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -92,6 +92,8 @@ void vfio_device_group_unuse_iommu(struct vfio_device *device); void vfio_device_group_close(struct vfio_device *device); struct vfio_group *vfio_group_from_file(struct file *file); bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device); +bool vfio_group_enforced_coherent(struct vfio_group *group); +void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm); bool vfio_device_has_container(struct vfio_device *device); int __init vfio_group_init(void); void vfio_group_cleanup(void); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 89722bf87edc..748bde4d74d9 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1219,6 +1219,49 @@ bool vfio_file_has_dev(struct file *file, struct vfio_device *device) } EXPORT_SYMBOL_GPL(vfio_file_has_dev); +/** + * vfio_file_enforced_coherent - True if the DMA associated with the VFIO file + * is always CPU cache coherent + * @file: VFIO group file or VFIO device file + * + * Enforced coherency means that the IOMMU ignores things like the PCIe no-snoop + * bit in DMA transactions. A return of false indicates that the user has + * rights to access additional instructions such as wbinvd on x86. + */ +bool vfio_file_enforced_coherent(struct file *file) +{ + struct vfio_group *group; + struct vfio_device *device; + + group = vfio_group_from_file(file); + if (group) + return vfio_group_enforced_coherent(group); + + device = vfio_device_from_file(file); + if (device) + return device_iommu_capable(device->dev, + IOMMU_CAP_ENFORCE_CACHE_COHERENCY); + + return true; +} +EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); + +/** + * vfio_file_set_kvm - Link a kvm with VFIO drivers + * @file: VFIO group file or VFIO device file + * @kvm: KVM to link + * + */ +void vfio_file_set_kvm(struct file *file, struct kvm *kvm) +{ + struct vfio_group *group; + + group = vfio_group_from_file(file); + if (group) + vfio_group_set_kvm(group, kvm); +} +EXPORT_SYMBOL_GPL(vfio_file_set_kvm); + /* * Sub-module support */ diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 9584eb57e0ed..8bac308ba630 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -64,18 +64,18 @@ static bool kvm_vfio_file_enforced_coherent(struct file *file) return ret; } -static bool kvm_vfio_file_is_group(struct file *file) +static bool kvm_vfio_file_is_valid(struct file *file) { bool (*fn)(struct file *file); bool ret; - fn = symbol_get(vfio_file_is_group); + fn = symbol_get(vfio_file_is_valid); if (!fn) return false; ret = fn(file); - symbol_put(vfio_file_is_group); + symbol_put(vfio_file_is_valid); return ret; } @@ -154,8 +154,8 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) if (!filp) return -EBADF; - /* Ensure the FD is a vfio group FD.*/ - if (!kvm_vfio_file_is_group(filp)) { + /* Ensure the FD is a vfio FD.*/ + if (!kvm_vfio_file_is_valid(filp)) { ret = -EINVAL; goto err_fput; } From patchwork Sat Apr 1 15:18:11 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197111 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 81C41C77B6D for ; Sat, 1 Apr 2023 15:18:53 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 1AF1D10E239; Sat, 1 Apr 2023 15:18:49 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7B67610E10B; Sat, 1 Apr 2023 15:18:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362318; x=1711898318; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GLz5NXI+1KYeMmg5ytUfNGqA8vEqwpVcpTZviemQX7A=; b=JgoyUmFQKcln+u79l2MRh9GGhDxkRwxu+aVfPYaWNUH0sQGtjyaof87c pRthtC+wiE/QpAbhET9DUKflxaUDADov2g8A558MPQwO8jR+Odo9LJxlC nQp9ONU1GPTQAluu2tYJVxEpruYStVhllzuP44WX9OESRqaxlGiawbKc2 uEyYnakim37V+a7eVfbHazOO45ABs5zU3YZQuEq2QDScHnrGnI+R0NleE XpwkR8/XldqYcrd189qFscZqdX0kkSR1hKIkiRLZxw1u2kzwjsD2Lovoi vBjnsudNpD0D7iv0kfnCD0B7EotpTviN1KDJCwvaS62byIB6Ho4Ot3oMz g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411196" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411196" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937156" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937156" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:37 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:11 -0700 Message-Id: <20230401151833.124749-4-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 03/25] vfio: Remove vfio_file_is_group() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" since no user of vfio_file_is_group() now. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu Reviewed-by: Eric Auger --- drivers/vfio/group.c | 10 ---------- include/linux/vfio.h | 1 - 2 files changed, 11 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index ede4723c5f72..4f937ebaf6f7 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -792,16 +792,6 @@ struct iommu_group *vfio_file_iommu_group(struct file *file) } EXPORT_SYMBOL_GPL(vfio_file_iommu_group); -/** - * vfio_file_is_group - True if the file is a vfio group file - * @file: VFIO group file - */ -bool vfio_file_is_group(struct file *file) -{ - return vfio_group_from_file(file); -} -EXPORT_SYMBOL_GPL(vfio_file_is_group); - bool vfio_group_enforced_coherent(struct vfio_group *group) { struct vfio_device *device; diff --git a/include/linux/vfio.h b/include/linux/vfio.h index d9a0770e5fc1..7519ae89fcd6 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -264,7 +264,6 @@ int vfio_mig_get_next_state(struct vfio_device *device, * External user API */ struct iommu_group *vfio_file_iommu_group(struct file *file); -bool vfio_file_is_group(struct file *file); bool vfio_file_is_valid(struct file *file); bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); From patchwork Sat Apr 1 15:18:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197108 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 77FC3C6FD1D for ; Sat, 1 Apr 2023 15:18:48 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id E2D2210E0F6; Sat, 1 Apr 2023 15:18:42 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id BE0C410E10C; Sat, 1 Apr 2023 15:18:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362318; x=1711898318; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hjzCvpHXOHHDiAL4SB0c4+VEVNfU428UuCpNOuzRJ9U=; b=cJfL3jyuR0qu6PHOPIE146kJ94IbTw7PW5A9V6qntRFLfGLphKGQ5xD+ pyQE2/L5VNpGpPNZ5KkEUV6MJsOOT+bMEhQqJysE4vgi6hIrX8MX/2jPX t6oLry4/JuOmv4j14kNeroR22SaRnAncEmSfhICIcCVLeKrc0iFVvQ5mn YNwlGReriu40MExVTKM8zWduxKyiVEwrrYArywGyVMsKxKQT7QmNjNqCE NznsjWjZ844HcMVWVD2l8CJ7UcwHNr2jt3Itv2Qyc1+EId+ShVNjCoiJS CJli5MnMPX9qnIbF0FHfcAjfWa21d2Tt8y6iAuu6KB/7uDdbA42+AwrWu Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411206" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411206" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937161" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937161" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:38 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:12 -0700 Message-Id: <20230401151833.124749-5-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 04/25] vfio: Accept vfio device file in the KVM facing kAPI X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This makes the vfio file kAPIs to accept vfio device files, also a preparation for vfio device cdev support. For the kvm set with vfio device file, kvm pointer is stored in struct vfio_device_file, and use kvm_ref_lock to protect kvm set and kvm pointer usage within VFIO. This kvm pointer will be set to vfio_device after device file is bound to iommufd in the cdev path. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu Reviewed-by: Eric Auger --- drivers/vfio/vfio.h | 2 ++ drivers/vfio/vfio_main.c | 18 ++++++++++++++++++ 2 files changed, 20 insertions(+) diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 56ad127ac618..e4672d91a6f7 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -18,6 +18,8 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + spinlock_t kvm_ref_lock; /* protect kvm field */ + struct kvm *kvm; }; void vfio_device_put_registration(struct vfio_device *device); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 748bde4d74d9..cb543791b28b 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -414,6 +414,7 @@ vfio_allocate_device_file(struct vfio_device *device) return ERR_PTR(-ENOMEM); df->device = device; + spin_lock_init(&df->kvm_ref_lock); return df; } @@ -1246,6 +1247,20 @@ bool vfio_file_enforced_coherent(struct file *file) } EXPORT_SYMBOL_GPL(vfio_file_enforced_coherent); +static void vfio_device_file_set_kvm(struct file *file, struct kvm *kvm) +{ + struct vfio_device_file *df = file->private_data; + + /* + * The kvm is first recorded in the vfio_device_file, and will + * be propagated to vfio_device::kvm when the file is bound to + * iommufd successfully in the vfio device cdev path. + */ + spin_lock(&df->kvm_ref_lock); + df->kvm = kvm; + spin_unlock(&df->kvm_ref_lock); +} + /** * vfio_file_set_kvm - Link a kvm with VFIO drivers * @file: VFIO group file or VFIO device file @@ -1259,6 +1274,9 @@ void vfio_file_set_kvm(struct file *file, struct kvm *kvm) group = vfio_group_from_file(file); if (group) vfio_group_set_kvm(group, kvm); + + if (vfio_device_from_file(file)) + vfio_device_file_set_kvm(file, kvm); } EXPORT_SYMBOL_GPL(vfio_file_set_kvm); From patchwork Sat Apr 1 15:18:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197112 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EEC0FC77B6E for ; Sat, 1 Apr 2023 15:18:54 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3EC8610E375; Sat, 1 Apr 2023 15:18:53 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id AC0D610E112; Sat, 1 Apr 2023 15:18:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362319; x=1711898319; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=mXnPFJrCBMATGLXlH86lTWzSOuOhc1Us+a7uh/qUhFA=; b=nBOHTj1t/PJ//bomSGbjN5IUbAyP/vmvRtJD2hCHZcao8swJIuESmdKR Ub+yFFXlh9p3v4tUvCSNxO05KEQADgriZXj0a+/ZwbT3lXxpVHchJHf7c a6VnKffMYaycRwB3WHOW0c1G23pDw1ZMc59XVW0Q+mXJUnB0cNZVk8vLh rQy9C2Pd3sgIR76Mq3UfXztW3xXac4gK/jBthVlahV+bhA3nKZ55cfC2w Vmd3SufN8C0T3Fz52MTra9ZAcNG6Lajrc3nXQpCsa7MexUbYdqnOU9mzd lvT5gwrs8tj2+JFzf/4UOGlpCAxN33cFnahSirLLCgtPwc8vRfJMU0qiY g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411216" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411216" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937164" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937164" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:38 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:13 -0700 Message-Id: <20230401151833.124749-6-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 05/25] kvm/vfio: Rename kvm_vfio_group to prepare for accepting vfio device fd X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Meanwhile, rename related helpers. No functional change is intended. Reviewed-by: Kevin Tian Reviewed-by: Eric Auger Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- virt/kvm/vfio.c | 115 ++++++++++++++++++++++++------------------------ 1 file changed, 58 insertions(+), 57 deletions(-) diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 8bac308ba630..857d6ba349e1 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -21,7 +21,7 @@ #include #endif -struct kvm_vfio_group { +struct kvm_vfio_file { struct list_head node; struct file *file; #ifdef CONFIG_SPAPR_TCE_IOMMU @@ -30,7 +30,7 @@ struct kvm_vfio_group { }; struct kvm_vfio { - struct list_head group_list; + struct list_head file_list; struct mutex lock; bool noncoherent; }; @@ -98,34 +98,35 @@ static struct iommu_group *kvm_vfio_file_iommu_group(struct file *file) } static void kvm_spapr_tce_release_vfio_group(struct kvm *kvm, - struct kvm_vfio_group *kvg) + struct kvm_vfio_file *kvf) { - if (WARN_ON_ONCE(!kvg->iommu_group)) + if (WARN_ON_ONCE(!kvf->iommu_group)) return; - kvm_spapr_tce_release_iommu_group(kvm, kvg->iommu_group); - iommu_group_put(kvg->iommu_group); - kvg->iommu_group = NULL; + kvm_spapr_tce_release_iommu_group(kvm, kvf->iommu_group); + iommu_group_put(kvf->iommu_group); + kvf->iommu_group = NULL; } #endif /* - * Groups can use the same or different IOMMU domains. If the same then - * adding a new group may change the coherency of groups we've previously - * been told about. We don't want to care about any of that so we retest - * each group and bail as soon as we find one that's noncoherent. This - * means we only ever [un]register_noncoherent_dma once for the whole device. + * Groups/devices can use the same or different IOMMU domains. If the same + * then adding a new group/device may change the coherency of groups/devices + * we've previously been told about. We don't want to care about any of + * that so we retest each group/device and bail as soon as we find one that's + * noncoherent. This means we only ever [un]register_noncoherent_dma once + * for the whole device. */ static void kvm_vfio_update_coherency(struct kvm_device *dev) { struct kvm_vfio *kv = dev->private; bool noncoherent = false; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (!kvm_vfio_file_enforced_coherent(kvg->file)) { + list_for_each_entry(kvf, &kv->file_list, node) { + if (!kvm_vfio_file_enforced_coherent(kvf->file)) { noncoherent = true; break; } @@ -143,10 +144,10 @@ static void kvm_vfio_update_coherency(struct kvm_device *dev) mutex_unlock(&kv->lock); } -static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) +static int kvm_vfio_file_add(struct kvm_device *dev, unsigned int fd) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct file *filp; int ret; @@ -162,27 +163,27 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file == filp) { + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file == filp) { ret = -EEXIST; goto err_unlock; } } - kvg = kzalloc(sizeof(*kvg), GFP_KERNEL_ACCOUNT); - if (!kvg) { + kvf = kzalloc(sizeof(*kvf), GFP_KERNEL_ACCOUNT); + if (!kvf) { ret = -ENOMEM; goto err_unlock; } - kvg->file = filp; - list_add_tail(&kvg->node, &kv->group_list); + kvf->file = filp; + list_add_tail(&kvf->node, &kv->file_list); kvm_arch_start_assignment(dev->kvm); mutex_unlock(&kv->lock); - kvm_vfio_file_set_kvm(kvg->file, dev->kvm); + kvm_vfio_file_set_kvm(kvf->file, dev->kvm); kvm_vfio_update_coherency(dev); return 0; @@ -193,10 +194,10 @@ static int kvm_vfio_group_add(struct kvm_device *dev, unsigned int fd) return ret; } -static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) +static int kvm_vfio_file_del(struct kvm_device *dev, unsigned int fd) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct fd f; int ret; @@ -208,18 +209,18 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file != f.file) + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file != f.file) continue; - list_del(&kvg->node); + list_del(&kvf->node); kvm_arch_end_assignment(dev->kvm); #ifdef CONFIG_SPAPR_TCE_IOMMU - kvm_spapr_tce_release_vfio_group(dev->kvm, kvg); + kvm_spapr_tce_release_vfio_group(dev->kvm, kvf); #endif - kvm_vfio_file_set_kvm(kvg->file, NULL); - fput(kvg->file); - kfree(kvg); + kvm_vfio_file_set_kvm(kvf->file, NULL); + fput(kvf->file); + kfree(kvf); ret = 0; break; } @@ -234,12 +235,12 @@ static int kvm_vfio_group_del(struct kvm_device *dev, unsigned int fd) } #ifdef CONFIG_SPAPR_TCE_IOMMU -static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, - void __user *arg) +static int kvm_vfio_file_set_spapr_tce(struct kvm_device *dev, + void __user *arg) { struct kvm_vfio_spapr_tce param; struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg; + struct kvm_vfio_file *kvf; struct fd f; int ret; @@ -254,20 +255,20 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, mutex_lock(&kv->lock); - list_for_each_entry(kvg, &kv->group_list, node) { - if (kvg->file != f.file) + list_for_each_entry(kvf, &kv->file_list, node) { + if (kvf->file != f.file) continue; - if (!kvg->iommu_group) { - kvg->iommu_group = kvm_vfio_file_iommu_group(kvg->file); - if (WARN_ON_ONCE(!kvg->iommu_group)) { + if (!kvf->iommu_group) { + kvf->iommu_group = kvm_vfio_file_iommu_group(kvf->file); + if (WARN_ON_ONCE(!kvf->iommu_group)) { ret = -EIO; goto err_fdput; } } ret = kvm_spapr_tce_attach_iommu_group(dev->kvm, param.tablefd, - kvg->iommu_group); + kvf->iommu_group); break; } @@ -278,8 +279,8 @@ static int kvm_vfio_group_set_spapr_tce(struct kvm_device *dev, } #endif -static int kvm_vfio_set_group(struct kvm_device *dev, long attr, - void __user *arg) +static int kvm_vfio_set_file(struct kvm_device *dev, long attr, + void __user *arg) { int32_t __user *argp = arg; int32_t fd; @@ -288,16 +289,16 @@ static int kvm_vfio_set_group(struct kvm_device *dev, long attr, case KVM_DEV_VFIO_GROUP_ADD: if (get_user(fd, argp)) return -EFAULT; - return kvm_vfio_group_add(dev, fd); + return kvm_vfio_file_add(dev, fd); case KVM_DEV_VFIO_GROUP_DEL: if (get_user(fd, argp)) return -EFAULT; - return kvm_vfio_group_del(dev, fd); + return kvm_vfio_file_del(dev, fd); #ifdef CONFIG_SPAPR_TCE_IOMMU case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: - return kvm_vfio_group_set_spapr_tce(dev, arg); + return kvm_vfio_file_set_spapr_tce(dev, arg); #endif } @@ -309,8 +310,8 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, { switch (attr->group) { case KVM_DEV_VFIO_GROUP: - return kvm_vfio_set_group(dev, attr->attr, - u64_to_user_ptr(attr->addr)); + return kvm_vfio_set_file(dev, attr->attr, + u64_to_user_ptr(attr->addr)); } return -ENXIO; @@ -339,16 +340,16 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, static void kvm_vfio_release(struct kvm_device *dev) { struct kvm_vfio *kv = dev->private; - struct kvm_vfio_group *kvg, *tmp; + struct kvm_vfio_file *kvf, *tmp; - list_for_each_entry_safe(kvg, tmp, &kv->group_list, node) { + list_for_each_entry_safe(kvf, tmp, &kv->file_list, node) { #ifdef CONFIG_SPAPR_TCE_IOMMU - kvm_spapr_tce_release_vfio_group(dev->kvm, kvg); + kvm_spapr_tce_release_vfio_group(dev->kvm, kvf); #endif - kvm_vfio_file_set_kvm(kvg->file, NULL); - fput(kvg->file); - list_del(&kvg->node); - kfree(kvg); + kvm_vfio_file_set_kvm(kvf->file, NULL); + fput(kvf->file); + list_del(&kvf->node); + kfree(kvf); kvm_arch_end_assignment(dev->kvm); } @@ -382,7 +383,7 @@ static int kvm_vfio_create(struct kvm_device *dev, u32 type) if (!kv) return -ENOMEM; - INIT_LIST_HEAD(&kv->group_list); + INIT_LIST_HEAD(&kv->file_list); mutex_init(&kv->lock); dev->private = kv; From patchwork Sat Apr 1 15:18:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197114 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 333F0C77B71 for ; Sat, 1 Apr 2023 15:18:57 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 198DC10E112; Sat, 1 Apr 2023 15:18:55 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7BE0010E113; Sat, 1 Apr 2023 15:18:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362320; x=1711898320; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VkJtqC84dVSmUDf6T6T0tY6zgWPAkWQawl0k0UKZ2ZE=; b=ldr3iSPcBlexnf88etXxWtvvuJB59hGp3cTzmd2dekXwBtMCTXSKxsof uTiMHHFs6RxhZ82ul2ZJpSr5elHVFl52tpuJO4BlYt3w20gyccuc7Ye3t IbG+AMKvOfELPK0usOfCfFk2b3ovBKnHoLvk5zll0vTfjAwF3MOX+38ls IWtdXWLCgsa7lj7pVdKgKqnGeLgwzdDgWayq1pfmlibGF5ix7PXPOw3Kr QT6Pcd/k5w1KHRI1cctsvcwFyPbBqIR/0BqOL589lATKHQDX0+pX0/Q6l uB97Bpvc3/D+IG0oXsFhebaBTxXgeSsyVTFAN7+W+39J1FPiPEV3G7iM4 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411226" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411226" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937168" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937168" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:39 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:14 -0700 Message-Id: <20230401151833.124749-7-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 06/25] kvm/vfio: Accept vfio device file from userspace X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This defines KVM_DEV_VFIO_FILE* and make alias with KVM_DEV_VFIO_GROUP*. Old userspace uses KVM_DEV_VFIO_GROUP* works as well. Reviewed-by: Jason Gunthorpe Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- Documentation/virt/kvm/devices/vfio.rst | 53 +++++++++++++++++-------- include/uapi/linux/kvm.h | 16 ++++++-- virt/kvm/vfio.c | 16 ++++---- 3 files changed, 56 insertions(+), 29 deletions(-) diff --git a/Documentation/virt/kvm/devices/vfio.rst b/Documentation/virt/kvm/devices/vfio.rst index 79b6811bb4f3..277d727ec1a2 100644 --- a/Documentation/virt/kvm/devices/vfio.rst +++ b/Documentation/virt/kvm/devices/vfio.rst @@ -9,24 +9,38 @@ Device types supported: - KVM_DEV_TYPE_VFIO Only one VFIO instance may be created per VM. The created device -tracks VFIO groups in use by the VM and features of those groups -important to the correctness and acceleration of the VM. As groups -are enabled and disabled for use by the VM, KVM should be updated -about their presence. When registered with KVM, a reference to the -VFIO-group is held by KVM. +tracks VFIO files (group or device) in use by the VM and features +of those groups/devices important to the correctness and acceleration +of the VM. As groups/devices are enabled and disabled for use by the +VM, KVM should be updated about their presence. When registered with +KVM, a reference to the VFIO file is held by KVM. Groups: - KVM_DEV_VFIO_GROUP - -KVM_DEV_VFIO_GROUP attributes: - KVM_DEV_VFIO_GROUP_ADD: Add a VFIO group to VFIO-KVM device tracking - kvm_device_attr.addr points to an int32_t file descriptor - for the VFIO group. - KVM_DEV_VFIO_GROUP_DEL: Remove a VFIO group from VFIO-KVM device tracking - kvm_device_attr.addr points to an int32_t file descriptor - for the VFIO group. - KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: attaches a guest visible TCE table + KVM_DEV_VFIO_FILE + alias: KVM_DEV_VFIO_GROUP + +KVM_DEV_VFIO_FILE attributes: + KVM_DEV_VFIO_FILE_ADD: Add a VFIO file (group/device) to VFIO-KVM device + tracking + + alias: KVM_DEV_VFIO_GROUP_ADD + + kvm_device_attr.addr points to an int32_t file descriptor for the + VFIO file. + + KVM_DEV_VFIO_FILE_DEL: Remove a VFIO file (group/device) from VFIO-KVM + device tracking + + alias: KVM_DEV_VFIO_GROUP_DEL + + kvm_device_attr.addr points to an int32_t file descriptor for the + VFIO file. + + KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: attaches a guest visible TCE table allocated by sPAPR KVM. + + alias: KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE + kvm_device_attr.addr points to a struct:: struct kvm_vfio_spapr_tce { @@ -40,9 +54,14 @@ KVM_DEV_VFIO_GROUP attributes: - @tablefd is a file descriptor for a TCE table allocated via KVM_CREATE_SPAPR_TCE. + only accepts vfio group file as SPAPR has no iommufd support + :: -The GROUP_ADD operation above should be invoked prior to accessing the +The FILE/GROUP_ADD operation above should be invoked prior to accessing the device file descriptor via VFIO_GROUP_GET_DEVICE_FD in order to support drivers which require a kvm pointer to be set in their .open_device() -callback. +callback. It is the same for device file descriptor via character device +open which gets device access via VFIO_DEVICE_BIND_IOMMUFD. For such file +descriptors, FILE_ADD should be invoked before VFIO_DEVICE_BIND_IOMMUFD +to support the drivers mentioned in prior sentence as well. diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index d77aef872a0a..a8eeca70a498 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1410,10 +1410,18 @@ struct kvm_device_attr { __u64 addr; /* userspace address of attr data */ }; -#define KVM_DEV_VFIO_GROUP 1 -#define KVM_DEV_VFIO_GROUP_ADD 1 -#define KVM_DEV_VFIO_GROUP_DEL 2 -#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE 3 +#define KVM_DEV_VFIO_FILE 1 + +#define KVM_DEV_VFIO_FILE_ADD 1 +#define KVM_DEV_VFIO_FILE_DEL 2 +#define KVM_DEV_VFIO_FILE_SET_SPAPR_TCE 3 + +/* KVM_DEV_VFIO_GROUP aliases are for compile time uapi compatibility */ +#define KVM_DEV_VFIO_GROUP KVM_DEV_VFIO_FILE + +#define KVM_DEV_VFIO_GROUP_ADD KVM_DEV_VFIO_FILE_ADD +#define KVM_DEV_VFIO_GROUP_DEL KVM_DEV_VFIO_FILE_DEL +#define KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE KVM_DEV_VFIO_FILE_SET_SPAPR_TCE enum kvm_device_type { KVM_DEV_TYPE_FSL_MPIC_20 = 1, diff --git a/virt/kvm/vfio.c b/virt/kvm/vfio.c index 857d6ba349e1..d869913baafd 100644 --- a/virt/kvm/vfio.c +++ b/virt/kvm/vfio.c @@ -286,18 +286,18 @@ static int kvm_vfio_set_file(struct kvm_device *dev, long attr, int32_t fd; switch (attr) { - case KVM_DEV_VFIO_GROUP_ADD: + case KVM_DEV_VFIO_FILE_ADD: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_add(dev, fd); - case KVM_DEV_VFIO_GROUP_DEL: + case KVM_DEV_VFIO_FILE_DEL: if (get_user(fd, argp)) return -EFAULT; return kvm_vfio_file_del(dev, fd); #ifdef CONFIG_SPAPR_TCE_IOMMU - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: return kvm_vfio_file_set_spapr_tce(dev, arg); #endif } @@ -309,7 +309,7 @@ static int kvm_vfio_set_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { - case KVM_DEV_VFIO_GROUP: + case KVM_DEV_VFIO_FILE: return kvm_vfio_set_file(dev, attr->attr, u64_to_user_ptr(attr->addr)); } @@ -321,12 +321,12 @@ static int kvm_vfio_has_attr(struct kvm_device *dev, struct kvm_device_attr *attr) { switch (attr->group) { - case KVM_DEV_VFIO_GROUP: + case KVM_DEV_VFIO_FILE: switch (attr->attr) { - case KVM_DEV_VFIO_GROUP_ADD: - case KVM_DEV_VFIO_GROUP_DEL: + case KVM_DEV_VFIO_FILE_ADD: + case KVM_DEV_VFIO_FILE_DEL: #ifdef CONFIG_SPAPR_TCE_IOMMU - case KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE: + case KVM_DEV_VFIO_FILE_SET_SPAPR_TCE: #endif return 0; } From patchwork Sat Apr 1 15:18:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197129 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5B383C76196 for ; Sat, 1 Apr 2023 15:19:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id DDBF010E286; Sat, 1 Apr 2023 15:19:30 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id D057110E0F6; Sat, 1 Apr 2023 15:18:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362320; x=1711898320; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TM2VBxl68DMup2WqMbkWDMx3S0SElcLwR7pPs6tgu68=; b=gNAUlQClgaS6+PHBa4N/pQJyK0d+W9W9esX6/x9PRAGbvwrECWBpPce/ umNtLrEUpV6sGRjOWXCjCywzOXcKmV2+uqWzicFxXRoZWgWee7HiUoYeJ N+bOj4mAlHxZVu3HjA8l0KnmNtdWfWSjq9MkwYqraZedFCcYi/arn/n7c QWTQ54r66qcTEUaAbda/CdXjkLx6adEhAwOMcXp5wcSMVp/C7oSncTAfF e52beRxxLEEaDe3bgkO4z/NyrNCZcahWsyHOmFuksHM8kU+1jRcyVnqxS o85d6GHoT44ci7qVzEpKEPmPZVb2iBvYs38ASe6uQ4fiBQkbCwdgZxYj4 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411237" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411237" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937176" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937176" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:39 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:15 -0700 Message-Id: <20230401151833.124749-8-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 07/25] vfio: Pass struct vfio_device_file * to vfio_device_open/close() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This avoids passing too much parameters in multiple functions. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu Reviewed-by: Eric Auger --- drivers/vfio/group.c | 20 ++++++++++++++------ drivers/vfio/vfio.h | 8 ++++---- drivers/vfio/vfio_main.c | 25 +++++++++++++++---------- 3 files changed, 33 insertions(+), 20 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 4f937ebaf6f7..9a7b2765eef6 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -169,8 +169,9 @@ static void vfio_device_group_get_kvm_safe(struct vfio_device *device) spin_unlock(&device->group->kvm_ref_lock); } -static int vfio_device_group_open(struct vfio_device *device) +static int vfio_device_group_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; int ret; mutex_lock(&device->group->group_lock); @@ -190,7 +191,11 @@ static int vfio_device_group_open(struct vfio_device *device) if (device->open_count == 0) vfio_device_group_get_kvm_safe(device); - ret = vfio_device_open(device, device->group->iommufd); + df->iommufd = device->group->iommufd; + + ret = vfio_device_open(df); + if (ret) + df->iommufd = NULL; if (device->open_count == 0) vfio_device_put_kvm(device); @@ -202,12 +207,15 @@ static int vfio_device_group_open(struct vfio_device *device) return ret; } -void vfio_device_group_close(struct vfio_device *device) +void vfio_device_group_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + mutex_lock(&device->group->group_lock); mutex_lock(&device->dev_set->lock); - vfio_device_close(device, device->group->iommufd); + vfio_device_close(df); + df->iommufd = NULL; if (device->open_count == 0) vfio_device_put_kvm(device); @@ -228,7 +236,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device) goto err_out; } - ret = vfio_device_group_open(device); + ret = vfio_device_group_open(df); if (ret) goto err_free; @@ -260,7 +268,7 @@ static struct file *vfio_device_open_file(struct vfio_device *device) return filep; err_close_device: - vfio_device_group_close(device); + vfio_device_group_close(df); err_free: kfree(df); err_out: diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index e4672d91a6f7..cffc08f5a6f1 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -20,13 +20,13 @@ struct vfio_device_file { struct vfio_device *device; spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; + struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ }; void vfio_device_put_registration(struct vfio_device *device); bool vfio_device_try_get_registration(struct vfio_device *device); -int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd); -void vfio_device_close(struct vfio_device *device, - struct iommufd_ctx *iommufd); +int vfio_device_open(struct vfio_device_file *df); +void vfio_device_close(struct vfio_device_file *df); struct vfio_device_file * vfio_allocate_device_file(struct vfio_device *device); @@ -91,7 +91,7 @@ void vfio_device_group_register(struct vfio_device *device); void vfio_device_group_unregister(struct vfio_device *device); int vfio_device_group_use_iommu(struct vfio_device *device); void vfio_device_group_unuse_iommu(struct vfio_device *device); -void vfio_device_group_close(struct vfio_device *device); +void vfio_device_group_close(struct vfio_device_file *df); struct vfio_group *vfio_group_from_file(struct file *file); bool vfio_group_has_dev(struct vfio_group *group, struct vfio_device *device); bool vfio_group_enforced_coherent(struct vfio_group *group); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index cb543791b28b..2ea6cb6d03c7 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -419,9 +419,10 @@ vfio_allocate_device_file(struct vfio_device *device) return df; } -static int vfio_device_first_open(struct vfio_device *device, - struct iommufd_ctx *iommufd) +static int vfio_device_first_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + struct iommufd_ctx *iommufd = df->iommufd; int ret; lockdep_assert_held(&device->dev_set->lock); @@ -453,9 +454,11 @@ static int vfio_device_first_open(struct vfio_device *device, return ret; } -static void vfio_device_last_close(struct vfio_device *device, - struct iommufd_ctx *iommufd) +static void vfio_device_last_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + struct iommufd_ctx *iommufd = df->iommufd; + lockdep_assert_held(&device->dev_set->lock); if (device->ops->close_device) @@ -467,15 +470,16 @@ static void vfio_device_last_close(struct vfio_device *device, module_put(device->dev->driver->owner); } -int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd) +int vfio_device_open(struct vfio_device_file *df) { + struct vfio_device *device = df->device; int ret = 0; lockdep_assert_held(&device->dev_set->lock); device->open_count++; if (device->open_count == 1) { - ret = vfio_device_first_open(device, iommufd); + ret = vfio_device_first_open(df); if (ret) device->open_count--; } @@ -483,14 +487,15 @@ int vfio_device_open(struct vfio_device *device, struct iommufd_ctx *iommufd) return ret; } -void vfio_device_close(struct vfio_device *device, - struct iommufd_ctx *iommufd) +void vfio_device_close(struct vfio_device_file *df) { + struct vfio_device *device = df->device; + lockdep_assert_held(&device->dev_set->lock); vfio_assert_device_open(device); if (device->open_count == 1) - vfio_device_last_close(device, iommufd); + vfio_device_last_close(df); device->open_count--; } @@ -535,7 +540,7 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - vfio_device_group_close(device); + vfio_device_group_close(df); vfio_device_put_registration(device); From patchwork Sat Apr 1 15:18:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197109 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 95650C76196 for ; Sat, 1 Apr 2023 15:18:50 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 64A5710E122; Sat, 1 Apr 2023 15:18:44 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id F292810E105; Sat, 1 Apr 2023 15:18:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362321; x=1711898321; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3oyfU6r73oJut6IgAnxvBAymWEcYdGn9fFE+eYdombQ=; b=gtPSAg8DlbDriLQiFUMw9fOCU5G8Ix1xNwP0lkKiapuc5NVjsRe2k2ZB OC7X1uShUbst+CeS972g1Ih+YNKkU6jMMXBA0uT/8bbNipTerzUtSZoJc WJpJW70hN8yAI4MQfv1yjqMwSRwZgEruHFHATaGObXDfeMLtTThq2rpNi 4cpmIzF5gsToV7FmWqfqPu8FCOSQ0XbsbgvOS4QpqQ4f/L4RY/9xjAlk3 xeWtSNqoElacfra9IiG4THNpu5D6MHX/h8Yi4QV07Qdygo8kc6oXAwYVQ r9VpU7Dgl5pV91neWUn17d/dOhv2JMUV5IwsMOrDstuAdjd4VPumXj7OD g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411248" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411248" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937182" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937182" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:40 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:16 -0700 Message-Id: <20230401151833.124749-9-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 08/25] vfio: Block device access via device fd until device is opened X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" Allow the vfio_device file to be in a state where the device FD is opened but the device cannot be used by userspace (i.e. its .open_device() hasn't been called). This inbetween state is not used when the device FD is spawned from the group FD, however when we create the device FD directly by opening a cdev it will be opened in the blocked state. The reason for the inbetween state is that userspace only gets a FD but doesn't gain access permission until binding the FD to an iommufd. So in the blocked state, only the bind operation is allowed. Completing bind will allow user to further access the device. This is implemented by adding a flag in struct vfio_device_file to mark the blocked state and using a simple smp_load_acquire() to obtain the flag value and serialize all the device setup with the thread accessing this device. Following this lockless scheme, it can safely handle the device FD unbound->bound but it cannot handle bound->unbound. To allow this we'd need to add a lock on all the vfio ioctls which seems costly. So once device FD is bound, it remains bound until the FD is closed. Suggested-by: Jason Gunthorpe Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu Reviewed-by: Eric Auger --- drivers/vfio/group.c | 11 ++++++++++- drivers/vfio/vfio.h | 1 + drivers/vfio/vfio_main.c | 42 ++++++++++++++++++++++++++++++++++------ 3 files changed, 47 insertions(+), 7 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 9a7b2765eef6..71f0a9a4016e 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -194,9 +194,18 @@ static int vfio_device_group_open(struct vfio_device_file *df) df->iommufd = device->group->iommufd; ret = vfio_device_open(df); - if (ret) + if (ret) { df->iommufd = NULL; + goto out_put_kvm; + } + + /* + * Paired with smp_load_acquire() in vfio_device_fops::ioctl/ + * read/write/mmap and vfio_file_has_device_access() + */ + smp_store_release(&df->access_granted, true); +out_put_kvm: if (device->open_count == 0) vfio_device_put_kvm(device); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index cffc08f5a6f1..854f2c97cb9a 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -18,6 +18,7 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + bool access_granted; spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 2ea6cb6d03c7..6d5d3c2180c8 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1114,6 +1114,10 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, struct vfio_device *device = df->device; int ret; + /* Paired with smp_store_release() following vfio_device_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + ret = vfio_device_pm_runtime_get(device); if (ret) return ret; @@ -1141,6 +1145,10 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_device_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->read)) return -EINVAL; @@ -1154,6 +1162,10 @@ static ssize_t vfio_device_fops_write(struct file *filep, struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_device_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->write)) return -EINVAL; @@ -1165,6 +1177,10 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; + /* Paired with smp_store_release() following vfio_device_open() */ + if (!smp_load_acquire(&df->access_granted)) + return -EINVAL; + if (unlikely(!device->ops->mmap)) return -EINVAL; @@ -1201,6 +1217,25 @@ bool vfio_file_is_valid(struct file *file) } EXPORT_SYMBOL_GPL(vfio_file_is_valid); +/* + * Return true if the input file is a vfio device file and has opened + * the input device. Otherwise, return false. + */ +static bool vfio_file_has_device_access(struct file *file, + struct vfio_device *device) +{ + struct vfio_device *vdev = vfio_device_from_file(file); + struct vfio_device_file *df; + + if (!vdev || vdev != device) + return false; + + df = file->private_data; + + /* Paired with smp_store_release() following vfio_device_open() */ + return smp_load_acquire(&df->access_granted); +} + /** * vfio_file_has_dev - True if the VFIO file is a handle for device * @file: VFIO file to check @@ -1211,17 +1246,12 @@ EXPORT_SYMBOL_GPL(vfio_file_is_valid); bool vfio_file_has_dev(struct file *file, struct vfio_device *device) { struct vfio_group *group; - struct vfio_device *vdev; group = vfio_group_from_file(file); if (group) return vfio_group_has_dev(group, device); - vdev = vfio_device_from_file(file); - if (vdev) - return vdev == device; - - return false; + return vfio_file_has_device_access(file, device); } EXPORT_SYMBOL_GPL(vfio_file_has_dev); From patchwork Sat Apr 1 15:18:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197126 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 05F97C77B6E for ; Sat, 1 Apr 2023 15:19:26 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id EDC5010E2FB; Sat, 1 Apr 2023 15:19:24 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id A9E8810E10B; Sat, 1 Apr 2023 15:18:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362321; x=1711898321; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DUxZkfCi6OQ0x/puexaodV/sQy1kiF7rc8iDtDgi5ZI=; b=aBmZ9m7EvlfO5PNQ2K59qywNsWleQWyxmdniNhkjO91epuDEjl9d3Q0j 3e9RAH5MGi7hpKhX9yMuutPL7XxZtB0RlyeIUwGKVVuJ/BJRjODKedjUQ mD/JfrCUZxLdcO3Gp1btLpnhuEvY5nSssUkrcLrkpejUCS30sF4oIgWK0 rN35nfo3QNhgpz/fV13A8rZBm6tPmgRDSdcI9ApymWeUkyyzHvwM8VT3L sUwSRSV5+1sG8KibatweBTsPQ1FrQEVvoLeG5AtOZdkPqu8XqvRfATja9 cm//UdmJt7xE/eENraPpU+h2Otc5uIZBLh6gyAek2gk3mvRMzX0/rLSVT A==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411257" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411257" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937186" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937186" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:40 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:17 -0700 Message-Id: <20230401151833.124749-10-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 09/25] vfio: Add cdev_device_open_cnt to vfio_group X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" for counting the devices that are opened via the cdev path. This count is increased and decreased by the cdev path. The group path checks it to achieve exclusion with the cdev path. With this, only one path (group path or cdev path) will claim DMA ownership. This avoids scenarios in which devices within the same group may be opened via different paths. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu Reviewed-by: Eric Auger --- drivers/vfio/group.c | 33 +++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 3 +++ 2 files changed, 36 insertions(+) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 71f0a9a4016e..d55ce3ca44b7 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -383,6 +383,33 @@ static long vfio_group_fops_unl_ioctl(struct file *filep, } } +int vfio_device_block_group(struct vfio_device *device) +{ + struct vfio_group *group = device->group; + int ret = 0; + + mutex_lock(&group->group_lock); + if (group->opened_file) { + ret = -EBUSY; + goto out_unlock; + } + + group->cdev_device_open_cnt++; + +out_unlock: + mutex_unlock(&group->group_lock); + return ret; +} + +void vfio_device_unblock_group(struct vfio_device *device) +{ + struct vfio_group *group = device->group; + + mutex_lock(&group->group_lock); + group->cdev_device_open_cnt--; + mutex_unlock(&group->group_lock); +} + static int vfio_group_fops_open(struct inode *inode, struct file *filep) { struct vfio_group *group = @@ -405,6 +432,11 @@ static int vfio_group_fops_open(struct inode *inode, struct file *filep) goto out_unlock; } + if (group->cdev_device_open_cnt) { + ret = -EBUSY; + goto out_unlock; + } + /* * Do we need multiple instances of the group open? Seems not. */ @@ -479,6 +511,7 @@ static void vfio_group_release(struct device *dev) mutex_destroy(&group->device_lock); mutex_destroy(&group->group_lock); WARN_ON(group->iommu_group); + WARN_ON(group->cdev_device_open_cnt); ida_free(&vfio.group_ida, MINOR(group->dev.devt)); kfree(group); } diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 854f2c97cb9a..b2f20b78a707 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -83,8 +83,11 @@ struct vfio_group { struct blocking_notifier_head notifier; struct iommufd_ctx *iommufd; spinlock_t kvm_ref_lock; + unsigned int cdev_device_open_cnt; }; +int vfio_device_block_group(struct vfio_device *device); +void vfio_device_unblock_group(struct vfio_device *device); int vfio_device_set_group(struct vfio_device *device, enum vfio_group_type type); void vfio_device_remove_group(struct vfio_device *device); From patchwork Sat Apr 1 15:18:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197110 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E80FDC761AF for ; Sat, 1 Apr 2023 15:18:51 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 77E5C10E1BE; Sat, 1 Apr 2023 15:18:45 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2361C10E0F6; Sat, 1 Apr 2023 15:18:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362322; x=1711898322; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XUStM8T02ZrjFFH8fi8mvUfbLcVIWY8ixRTVzSOKiDE=; b=IcAtNifk/cCZKFL6V75Hpn44nmYK80EqVcWgnt0ihWho5qmy4g6gRDQl jtB/yBgzHfaJ6Uqz4rF+QDmmpwfQVnD4ERPd6vEQBQz3Uvf3zy5T4gi0B CL6wFMsw7u0Li9i85Z0j3od0oY5Ci3zNfpsenEmYdOddBhsA+uYPbosym hSH8MjvtvoNjJ8KFVqvlm8wqA6iw7tylUGTzsUM4wze0Zl7t+45cAh+AZ 0NyRDiHauE5e+rBn7FRkQFZ3p1WqKWnS7DW9/eCEc3nhcoEfQym9CIbT/ iriLigrsdjAPd7779pjJdxrj/S06BGfBEVrJu10k2L7TL4jGk6j9yUVnZ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411268" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411268" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:41 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937190" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937190" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:41 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:18 -0700 Message-Id: <20230401151833.124749-11-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 10/25] vfio: Make vfio_device_open() single open for device cdev path X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" VFIO group has historically allowed multi-open of the device FD. This was made secure because the "open" was executed via an ioctl to the group FD which is itself only single open. However, no known use of multiple device FDs today. It is kind of a strange thing to do because new device FDs can naturally be created via dup(). When we implement the new device uAPI (only used in cdev path) there is no natural way to allow the device itself from being multi-opened in a secure manner. Without the group FD we cannot prove the security context of the opener. Thus, when moving to the new uAPI we block the ability of opening a device multiple times. Given old group path still allows it we store a vfio_group pointer in struct vfio_device_file to differentiate. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu Reviewed-by: Eric Auger --- drivers/vfio/group.c | 2 ++ drivers/vfio/vfio.h | 2 ++ drivers/vfio/vfio_main.c | 7 +++++++ 3 files changed, 11 insertions(+) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index d55ce3ca44b7..1af4b9e012a7 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -245,6 +245,8 @@ static struct file *vfio_device_open_file(struct vfio_device *device) goto err_out; } + df->group = device->group; + ret = vfio_device_group_open(df); if (ret) goto err_free; diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index b2f20b78a707..f1a448f9d067 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -18,6 +18,8 @@ struct vfio_container; struct vfio_device_file { struct vfio_device *device; + struct vfio_group *group; + bool access_granted; spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 6d5d3c2180c8..c8721d5d05fa 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -477,6 +477,13 @@ int vfio_device_open(struct vfio_device_file *df) lockdep_assert_held(&device->dev_set->lock); + /* + * Only the group path allows the device opened multiple times. + * The device cdev path doesn't have a secure way for it. + */ + if (device->open_count != 0 && !df->group) + return -EINVAL; + device->open_count++; if (device->open_count == 1) { ret = vfio_device_first_open(df); From patchwork Sat Apr 1 15:18:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197121 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 86508C77B6D for ; Sat, 1 Apr 2023 15:19:16 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 955C710E10C; Sat, 1 Apr 2023 15:19:15 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 86D5110E105; Sat, 1 Apr 2023 15:18:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362322; x=1711898322; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UokUYeM4rR64gCGp21F4Z3uW+9LSpKJiXFcdGNm8PF0=; b=iIulnkIFkdD5sH7VF005j8wzsH+j9exewqwN4KQueiCtoke+oZOVqNmu W+1MiaoAjNwwi0BCb4zv+Cw28yW00YX/c+4oYrmUDGoVAxRgWPdTUMtbP QznHblLfcj/7EXgQx8omcTtYFkzAUQcpZxfshctq1MSNfXOzJAqikIF2m 5EzXA9u+VZTJhhomX3v7u/xAk3cKXZ9sbnNXcDzSTMDPvmFyZt9y2Kjhl ch05g95Somb3y1mAYcln4JvmBDOJ3lqVzkiEW0Rz2ANmhOsKmj3oTx3IN qwi6LBmC0rp1tzIenqwJ7M/ZJ+rXRma5unsxtx3Ks5xvhIHcCH3ofi1Fz Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411280" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411280" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937193" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937193" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:41 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:19 -0700 Message-Id: <20230401151833.124749-12-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 11/25] vfio: Make vfio_device_first_open() to accept NULL iommufd for noiommu X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" vfio_device_first_open() requires the caller to provide either a valid iommufd (the group path in iommufd compat mode) or a valid container (the group path in legacy container mode). As preparation for noiommu support in device cdev path it's extended to allow both being NULL. The caller is expected to verify noiommu permission before passing NULL to this function. Reviewed-by: Kevin Tian Reviewed-by: Jason Gunthorpe Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/group.c | 6 ++++++ drivers/vfio/vfio.h | 1 + drivers/vfio/vfio_main.c | 12 ++++++++---- 3 files changed, 15 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 1af4b9e012a7..63a4bf06ab9f 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -771,6 +771,12 @@ void vfio_device_group_unregister(struct vfio_device *device) mutex_unlock(&device->group->device_lock); } +/* No group lock since df->group and df->group->container cannot change */ +bool vfio_device_group_uses_container(struct vfio_device_file *df) +{ + return df->group && READ_ONCE(df->group->container); +} + int vfio_device_group_use_iommu(struct vfio_device *device) { struct vfio_group *group = device->group; diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index f1a448f9d067..7d4108cbc185 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -95,6 +95,7 @@ int vfio_device_set_group(struct vfio_device *device, void vfio_device_remove_group(struct vfio_device *device); void vfio_device_group_register(struct vfio_device *device); void vfio_device_group_unregister(struct vfio_device *device); +bool vfio_device_group_uses_container(struct vfio_device_file *df); int vfio_device_group_use_iommu(struct vfio_device *device); void vfio_device_group_unuse_iommu(struct vfio_device *device); void vfio_device_group_close(struct vfio_device_file *df); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index c8721d5d05fa..f4c9c27c7d74 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -423,16 +423,20 @@ static int vfio_device_first_open(struct vfio_device_file *df) { struct vfio_device *device = df->device; struct iommufd_ctx *iommufd = df->iommufd; - int ret; + int ret = 0; lockdep_assert_held(&device->dev_set->lock); if (!try_module_get(device->dev->driver->owner)) return -ENODEV; + /* + * if neither iommufd nor container is used the device is in + * noiommu mode then just go ahead to open it. + */ if (iommufd) ret = vfio_iommufd_bind(device, iommufd); - else + else if (vfio_device_group_uses_container(df)) ret = vfio_device_group_use_iommu(device); if (ret) goto err_module_put; @@ -447,7 +451,7 @@ static int vfio_device_first_open(struct vfio_device_file *df) err_unuse_iommu: if (iommufd) vfio_iommufd_unbind(device); - else + else if (vfio_device_group_uses_container(df)) vfio_device_group_unuse_iommu(device); err_module_put: module_put(device->dev->driver->owner); @@ -465,7 +469,7 @@ static void vfio_device_last_close(struct vfio_device_file *df) device->ops->close_device(device); if (iommufd) vfio_iommufd_unbind(device); - else + else if (vfio_device_group_uses_container(df)) vfio_device_group_unuse_iommu(device); module_put(device->dev->driver->owner); } From patchwork Sat Apr 1 15:18:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197127 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1A574C76196 for ; Sat, 1 Apr 2023 15:19:27 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 3CEE710E330; Sat, 1 Apr 2023 15:19:26 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 5C82610E122; Sat, 1 Apr 2023 15:18:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362323; x=1711898323; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=PirptMYKJTaup2Gvl84lhjR2Vgw3w06sDHqdMR/RpfQ=; b=BAUOgqLDOGF631JAqoEE1qnGUPXOSONP+/OBcYUW5hN53OZiyO5FbXup 0y7vghwjzVJZIaUaYik/j+HnEDdhTY7uVfVKhg6ElNCKES81br/d6onaV XSg1AZ/ioQTwRdUed5WIFpHwT33pZAQtsFRBxxAMM6HwfSQuFgzRORb+7 YGCK4kwP1E3qDlmYJDcWoGDB5ABLvSRl6LbKGtbf6303dTgmafQYa3iIo tInZeNsRtP7xl2YK5UuEzpZUAptg2dozaS9GdHcuxdyjA2XqfRGIaxQZC 4z5PIEocrHwHDJL0JP585A1RvYkXVFPH/rWWI72uh+E8yj9KMDd3rybjF g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411290" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411290" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937197" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937197" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:42 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:20 -0700 Message-Id: <20230401151833.124749-13-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 12/25] vfio-iommufd: Move noiommu support out of vfio_iommufd_bind() X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" into vfio_device_group_open(). This is also more consistent with what will be done in vfio device cdev path. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/group.c | 9 +++++++++ drivers/vfio/iommufd.c | 35 ++++++++++++++++++----------------- drivers/vfio/vfio.h | 9 +++++++++ 3 files changed, 36 insertions(+), 17 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 63a4bf06ab9f..6178ac5d0b10 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -192,6 +192,15 @@ static int vfio_device_group_open(struct vfio_device_file *df) vfio_device_group_get_kvm_safe(device); df->iommufd = device->group->iommufd; + if (df->iommufd && vfio_device_is_noiommu(device)) { + if (device->open_count == 0) { + ret = vfio_iommufd_enable_noiommu_compat(device, + df->iommufd); + if (ret) + goto out_put_kvm; + } + df->iommufd = NULL; + } ret = vfio_device_open(df); if (ret) { diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 809f2dd73b9e..11f5148b446e 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -10,6 +10,24 @@ MODULE_IMPORT_NS(IOMMUFD); MODULE_IMPORT_NS(IOMMUFD_VFIO); +int vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, + struct iommufd_ctx *ictx) +{ + u32 ioas_id; + + if (!capable(CAP_SYS_RAWIO)) + return -EPERM; + + /* + * Require no compat ioas to be assigned to proceed. The basic + * statement is that the user cannot have done something that + * implies they expected translation to exist + */ + if (!iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id)) + return -EPERM; + return 0; +} + int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) { u32 ioas_id; @@ -18,20 +36,6 @@ int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) lockdep_assert_held(&vdev->dev_set->lock); - if (vfio_device_is_noiommu(vdev)) { - if (!capable(CAP_SYS_RAWIO)) - return -EPERM; - - /* - * Require no compat ioas to be assigned to proceed. The basic - * statement is that the user cannot have done something that - * implies they expected translation to exist - */ - if (!iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id)) - return -EPERM; - return 0; - } - ret = vdev->ops->bind_iommufd(vdev, ictx, &device_id); if (ret) return ret; @@ -59,9 +63,6 @@ void vfio_iommufd_unbind(struct vfio_device *vdev) { lockdep_assert_held(&vdev->dev_set->lock); - if (vfio_device_is_noiommu(vdev)) - return; - if (vdev->ops->unbind_iommufd) vdev->ops->unbind_iommufd(vdev); } diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 7d4108cbc185..136137b8618d 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -236,9 +236,18 @@ static inline void vfio_container_cleanup(void) #endif #if IS_ENABLED(CONFIG_IOMMUFD) +int vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, + struct iommufd_ctx *ictx); int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx); void vfio_iommufd_unbind(struct vfio_device *device); #else +static inline int +vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, + struct iommufd_ctx *ictx) +{ + return -EOPNOTSUPP; +} + static inline int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx) { From patchwork Sat Apr 1 15:18:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197113 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5A35BC77B62 for ; Sat, 1 Apr 2023 15:18:56 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 561D010E224; Sat, 1 Apr 2023 15:18:55 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 2706D10E105; Sat, 1 Apr 2023 15:18:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362324; x=1711898324; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=l6GAZxH9ilEoIOsYSsTBZ0K+NVpQqN2Ka1sb2xlz99E=; b=cwCWuOYA2t8kP1Kj5ldYOVKqnaGEw0iFrrG0w3RG5c/CukzlkjI6efLd VpujGlA0SOYmB4USh4AUS2YeOMmPAnCs+LOPM9W7r2fObWmy1VaRj3kUT 5VjamnMnR5s900IIP5xYGpBWtYk4KtgNizzxZ/pz5W+5M9Ku4F160EZdz S39O+tLrBCd9/zJjtkJAxE5MohH3O+UMqMy+uctB3W1skI0iGTCKxzLRx /VUo16RScfFI9BvokWn9glqi3LdAkOtQWRrR61MaN5okdSwYTigEb7xew V909bgjrzwIKQFxWmtcRYh1KpaLjBOtd+4AicdfN+VTsLhcG/Hy+nwyW4 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411301" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411301" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937202" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937202" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:42 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:21 -0700 Message-Id: <20230401151833.124749-14-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 13/25] vfio-iommufd: Split bind/attach into two steps X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" to align with the coming vfio device cdev support. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/group.c | 18 ++++++++++++++---- drivers/vfio/iommufd.c | 33 ++++++++++++++------------------- drivers/vfio/vfio.h | 9 +++++++++ 3 files changed, 37 insertions(+), 23 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index 6178ac5d0b10..de9f09c7cf00 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -203,9 +203,14 @@ static int vfio_device_group_open(struct vfio_device_file *df) } ret = vfio_device_open(df); - if (ret) { - df->iommufd = NULL; + if (ret) goto out_put_kvm; + + if (df->iommufd) { + ret = vfio_iommufd_attach_compat_ioas(device, + df->iommufd); + if (ret) + goto out_close_device; } /* @@ -214,12 +219,17 @@ static int vfio_device_group_open(struct vfio_device_file *df) */ smp_store_release(&df->access_granted, true); + mutex_unlock(&device->dev_set->lock); + mutex_unlock(&device->group->group_lock); + return 0; + +out_close_device: + vfio_device_close(df); out_put_kvm: + df->iommufd = NULL; if (device->open_count == 0) vfio_device_put_kvm(device); - mutex_unlock(&device->dev_set->lock); - out_unlock: mutex_unlock(&device->group->group_lock); return ret; diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 11f5148b446e..20fd250c247c 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -30,33 +30,28 @@ int vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) { - u32 ioas_id; u32 device_id; - int ret; lockdep_assert_held(&vdev->dev_set->lock); - ret = vdev->ops->bind_iommufd(vdev, ictx, &device_id); - if (ret) - return ret; + /* The legacy path has no way to return the device id */ + return vdev->ops->bind_iommufd(vdev, ictx, &device_id); +} + +int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, + struct iommufd_ctx *ictx) +{ + u32 ioas_id; + int ret; + + lockdep_assert_held(&vdev->dev_set->lock); ret = iommufd_vfio_compat_ioas_get_id(ictx, &ioas_id); if (ret) - goto err_unbind; - ret = vdev->ops->attach_ioas(vdev, &ioas_id); - if (ret) - goto err_unbind; - - /* - * The legacy path has no way to return the device id or the selected - * pt_id - */ - return 0; + return ret; -err_unbind: - if (vdev->ops->unbind_iommufd) - vdev->ops->unbind_iommufd(vdev); - return ret; + /* The legacy path has no way to return the selected pt_id */ + return vdev->ops->attach_ioas(vdev, &ioas_id); } void vfio_iommufd_unbind(struct vfio_device *vdev) diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 136137b8618d..abfaf85cc266 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -238,6 +238,8 @@ static inline void vfio_container_cleanup(void) #if IS_ENABLED(CONFIG_IOMMUFD) int vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, struct iommufd_ctx *ictx); +int vfio_iommufd_attach_compat_ioas(struct vfio_device *device, + struct iommufd_ctx *ictx); int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx); void vfio_iommufd_unbind(struct vfio_device *device); #else @@ -248,6 +250,13 @@ vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, return -EOPNOTSUPP; } +static inline int +vfio_iommufd_attach_compat_ioas(struct vfio_device *device, + struct iommufd_ctx *ictx) +{ + return -EOPNOTSUPP; +} + static inline int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx) { From patchwork Sat Apr 1 15:18:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197115 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0DAD4C77B6D for ; Sat, 1 Apr 2023 15:18:59 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 6040A10E23B; Sat, 1 Apr 2023 15:18:58 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 4A98B10E139; Sat, 1 Apr 2023 15:18:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362325; x=1711898325; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jhycbiJB2gQIjPxNEaOUCxxLoPupQjQJSQK0WKWK3hQ=; b=TyN6UWl3XcVQ+199sHHZRF5Ht2vqBWc3dU8PnAVVjZt8tOCNgQdH1PGl E4IQLoCqVDg8tGgR+MzLOzm2k4sr8C50uMBhfDrSnppRvE0eH4R93e1J3 31weyeKHa4jSxT4jw7bWYdnL/sqfHs+uAXssO8Xj9VDGpbWU90JFP0UQo dvBt6CSldof1L7tH9sjLm3Y7LI49B/Udtr0W0/HWHa5EZJclC4ZZ8j3xM q0uLtOSZ6UnKAlffAVHgSe6EkcLe/g5k2KvXnVYRFNwNb/Q/rF6Y7DIvi b92o3PghO/yXjruEgE5ZZVtOMSqHXN1jULgTl5354yMuuqZDn8vmJ9JDi Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411312" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411312" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937205" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937205" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:43 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:22 -0700 Message-Id: <20230401151833.124749-15-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 14/25] vfio: Record devid in vfio_device_file X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" .bind_iommufd() will generate an ID to represent this bond, which is needed by userspace for further usage. Store devid in vfio_device_file to avoid passing the pointer in multiple places. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/iommufd.c | 12 +++++++----- drivers/vfio/vfio.h | 10 +++++----- drivers/vfio/vfio_main.c | 6 +++--- 3 files changed, 15 insertions(+), 13 deletions(-) diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 20fd250c247c..6ef9316a16a5 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -28,14 +28,14 @@ int vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, return 0; } -int vfio_iommufd_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx) +int vfio_iommufd_bind(struct vfio_device_file *df) { - u32 device_id; + struct vfio_device *vdev = df->device; + struct iommufd_ctx *ictx = df->iommufd; lockdep_assert_held(&vdev->dev_set->lock); - /* The legacy path has no way to return the device id */ - return vdev->ops->bind_iommufd(vdev, ictx, &device_id); + return vdev->ops->bind_iommufd(vdev, ictx, &df->devid); } int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, @@ -54,8 +54,10 @@ int vfio_iommufd_attach_compat_ioas(struct vfio_device *vdev, return vdev->ops->attach_ioas(vdev, &ioas_id); } -void vfio_iommufd_unbind(struct vfio_device *vdev) +void vfio_iommufd_unbind(struct vfio_device_file *df) { + struct vfio_device *vdev = df->device; + lockdep_assert_held(&vdev->dev_set->lock); if (vdev->ops->unbind_iommufd) diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index abfaf85cc266..b47b186573ac 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -24,6 +24,7 @@ struct vfio_device_file { spinlock_t kvm_ref_lock; /* protect kvm field */ struct kvm *kvm; struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */ + u32 devid; /* only valid when iommufd is valid */ }; void vfio_device_put_registration(struct vfio_device *device); @@ -240,8 +241,8 @@ int vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, struct iommufd_ctx *ictx); int vfio_iommufd_attach_compat_ioas(struct vfio_device *device, struct iommufd_ctx *ictx); -int vfio_iommufd_bind(struct vfio_device *device, struct iommufd_ctx *ictx); -void vfio_iommufd_unbind(struct vfio_device *device); +int vfio_iommufd_bind(struct vfio_device_file *df); +void vfio_iommufd_unbind(struct vfio_device_file *df); #else static inline int vfio_iommufd_enable_noiommu_compat(struct vfio_device *device, @@ -257,13 +258,12 @@ vfio_iommufd_attach_compat_ioas(struct vfio_device *device, return -EOPNOTSUPP; } -static inline int vfio_iommufd_bind(struct vfio_device *device, - struct iommufd_ctx *ictx) +static inline int vfio_iommufd_bind(struct vfio_device_file *df) { return -EOPNOTSUPP; } -static inline void vfio_iommufd_unbind(struct vfio_device *device) +static inline void vfio_iommufd_unbind(struct vfio_device_file *df) { } #endif diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index f4c9c27c7d74..74da44973594 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -435,7 +435,7 @@ static int vfio_device_first_open(struct vfio_device_file *df) * noiommu mode then just go ahead to open it. */ if (iommufd) - ret = vfio_iommufd_bind(device, iommufd); + ret = vfio_iommufd_bind(df); else if (vfio_device_group_uses_container(df)) ret = vfio_device_group_use_iommu(device); if (ret) @@ -450,7 +450,7 @@ static int vfio_device_first_open(struct vfio_device_file *df) err_unuse_iommu: if (iommufd) - vfio_iommufd_unbind(device); + vfio_iommufd_unbind(df); else if (vfio_device_group_uses_container(df)) vfio_device_group_unuse_iommu(device); err_module_put: @@ -468,7 +468,7 @@ static void vfio_device_last_close(struct vfio_device_file *df) if (device->ops->close_device) device->ops->close_device(device); if (iommufd) - vfio_iommufd_unbind(device); + vfio_iommufd_unbind(df); else if (vfio_device_group_uses_container(df)) vfio_device_group_unuse_iommu(device); module_put(device->dev->driver->owner); From patchwork Sat Apr 1 15:18:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197123 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 26BCCC6FD1D for ; Sat, 1 Apr 2023 15:19:21 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 238A310E26F; Sat, 1 Apr 2023 15:19:20 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id BA2E910E227; Sat, 1 Apr 2023 15:18:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362325; x=1711898325; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=HzjuymGSErhNPthZuWLtoEmoDeXjg21nddMtrQuexjk=; b=hEU2yIPgSY3KxRmfuQRQlWEVZjMP2DHF99CCClG32BTMiNstzUVOG2Oz cYuEeJIsgJvSKetJnFCd8aOUn89D3Hyd7zqJcMroRxaDDjF4busNb4FAA OOQpKIJJF+OGgIWnesuxGqwXxIQ0YtfIMTqfWtAoNWzK4H6uRIm2uOLyj p/5ZGYPkjAJf9UkgP8Cq6p+Stg31AhlwNrEAmAM7O2tqWxdCU6CaNZvhQ 6T0ootQrXMg29KYoqGGA3AgjEHLLK82qHZReU2Du9GWaazDjHrVu6jlRa 0maXlNhHfKhbHS8EsGrHRg+IOktY6/OvrJeAu4cx6P6yvRmEyDk7ilcCC g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411321" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411321" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937208" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937208" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:44 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:23 -0700 Message-Id: <20230401151833.124749-16-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 15/25] vfio-iommufd: Add detach_ioas support for physical VFIO devices X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" this prepares for adding DETACH ioctl for physical VFIO devices. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- Documentation/driver-api/vfio.rst | 8 +++++--- drivers/vfio/fsl-mc/vfio_fsl_mc.c | 1 + drivers/vfio/iommufd.c | 20 +++++++++++++++++++ .../vfio/pci/hisilicon/hisi_acc_vfio_pci.c | 2 ++ drivers/vfio/pci/mlx5/main.c | 1 + drivers/vfio/pci/vfio_pci.c | 1 + drivers/vfio/platform/vfio_amba.c | 1 + drivers/vfio/platform/vfio_platform.c | 1 + drivers/vfio/vfio_main.c | 3 ++- include/linux/vfio.h | 8 +++++++- 10 files changed, 41 insertions(+), 5 deletions(-) diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst index 68abc089d6dd..363e12c90b87 100644 --- a/Documentation/driver-api/vfio.rst +++ b/Documentation/driver-api/vfio.rst @@ -279,6 +279,7 @@ similar to a file operations structure:: struct iommufd_ctx *ictx, u32 *out_device_id); void (*unbind_iommufd)(struct vfio_device *vdev); int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id); + void (*detach_ioas)(struct vfio_device *vdev); int (*open_device)(struct vfio_device *vdev); void (*close_device)(struct vfio_device *vdev); ssize_t (*read)(struct vfio_device *vdev, char __user *buf, @@ -315,9 +316,10 @@ container_of(). - The [un]bind_iommufd callbacks are issued when the device is bound to and unbound from iommufd. - - The attach_ioas callback is issued when the device is attached to an - IOAS managed by the bound iommufd. The attached IOAS is automatically - detached when the device is unbound from iommufd. + - The [de]attach_ioas callback is issued when the device is attached to + and detached from an IOAS managed by the bound iommufd. However, the + attached IOAS can also be automatically detached when the device is + unbound from iommufd. - The read/write/mmap callbacks implement the device region access defined by the device's own VFIO_DEVICE_GET_REGION_INFO ioctl. diff --git a/drivers/vfio/fsl-mc/vfio_fsl_mc.c b/drivers/vfio/fsl-mc/vfio_fsl_mc.c index c89a047a4cd8..d540cf683d93 100644 --- a/drivers/vfio/fsl-mc/vfio_fsl_mc.c +++ b/drivers/vfio/fsl-mc/vfio_fsl_mc.c @@ -594,6 +594,7 @@ static const struct vfio_device_ops vfio_fsl_mc_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static struct fsl_mc_driver vfio_fsl_mc_driver = { diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 6ef9316a16a5..1c08aa29397a 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -113,6 +113,14 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id) { int rc; + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_device)) + return -EINVAL; + + if (vdev->iommufd_attached) + return -EBUSY; + rc = iommufd_device_attach(vdev->iommufd_device, pt_id); if (rc) return rc; @@ -121,6 +129,18 @@ int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id) } EXPORT_SYMBOL_GPL(vfio_iommufd_physical_attach_ioas); +void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev) +{ + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_device) || !vdev->iommufd_attached) + return; + + iommufd_device_detach(vdev->iommufd_device); + vdev->iommufd_attached = false; +} +EXPORT_SYMBOL_GPL(vfio_iommufd_physical_detach_ioas); + /* * The emulated standard ops mean that vfio_device is going to use the * "mdev path" and will call vfio_pin_pages()/vfio_dma_rw(). Drivers using this diff --git a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c index a117eaf21c14..b2f9778c8366 100644 --- a/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c +++ b/drivers/vfio/pci/hisilicon/hisi_acc_vfio_pci.c @@ -1373,6 +1373,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_migrn_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static const struct vfio_device_ops hisi_acc_vfio_pci_ops = { @@ -1391,6 +1392,7 @@ static const struct vfio_device_ops hisi_acc_vfio_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int hisi_acc_vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) diff --git a/drivers/vfio/pci/mlx5/main.c b/drivers/vfio/pci/mlx5/main.c index d95fd382814c..42ec574a8622 100644 --- a/drivers/vfio/pci/mlx5/main.c +++ b/drivers/vfio/pci/mlx5/main.c @@ -1320,6 +1320,7 @@ static const struct vfio_device_ops mlx5vf_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int mlx5vf_pci_probe(struct pci_dev *pdev, diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c index 29091ee2e984..cb5b7f865d58 100644 --- a/drivers/vfio/pci/vfio_pci.c +++ b/drivers/vfio/pci/vfio_pci.c @@ -141,6 +141,7 @@ static const struct vfio_device_ops vfio_pci_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static int vfio_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) diff --git a/drivers/vfio/platform/vfio_amba.c b/drivers/vfio/platform/vfio_amba.c index 83fe54015595..6464b3939ebc 100644 --- a/drivers/vfio/platform/vfio_amba.c +++ b/drivers/vfio/platform/vfio_amba.c @@ -119,6 +119,7 @@ static const struct vfio_device_ops vfio_amba_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static const struct amba_id pl330_ids[] = { diff --git a/drivers/vfio/platform/vfio_platform.c b/drivers/vfio/platform/vfio_platform.c index 22a1efca32a8..8cf22fa65baa 100644 --- a/drivers/vfio/platform/vfio_platform.c +++ b/drivers/vfio/platform/vfio_platform.c @@ -108,6 +108,7 @@ static const struct vfio_device_ops vfio_platform_ops = { .bind_iommufd = vfio_iommufd_physical_bind, .unbind_iommufd = vfio_iommufd_physical_unbind, .attach_ioas = vfio_iommufd_physical_attach_ioas, + .detach_ioas = vfio_iommufd_physical_detach_ioas, }; static struct platform_driver vfio_platform_driver = { diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 74da44973594..bc2ecbb7c22a 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -258,7 +258,8 @@ static int __vfio_register_dev(struct vfio_device *device, if (WARN_ON(IS_ENABLED(CONFIG_IOMMUFD) && (!device->ops->bind_iommufd || !device->ops->unbind_iommufd || - !device->ops->attach_ioas))) + !device->ops->attach_ioas || + !device->ops->detach_ioas))) return -EINVAL; /* diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 7519ae89fcd6..bf41b9471c28 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -73,7 +73,9 @@ struct vfio_device { * @bind_iommufd: Called when binding the device to an iommufd * @unbind_iommufd: Opposite of bind_iommufd * @attach_ioas: Called when attaching device to an IOAS/HWPT managed by the - * bound iommufd. Undo in unbind_iommufd. + * bound iommufd. Undo in unbind_iommufd if @detach_ioas is not + * called. + * @detach_ioas: Opposite of attach_ioas * @open_device: Called when the first file descriptor is opened for this device * @close_device: Opposite of open_device * @read: Perform read(2) on device file descriptor @@ -97,6 +99,7 @@ struct vfio_device_ops { struct iommufd_ctx *ictx, u32 *out_device_id); void (*unbind_iommufd)(struct vfio_device *vdev); int (*attach_ioas)(struct vfio_device *vdev, u32 *pt_id); + void (*detach_ioas)(struct vfio_device *vdev); int (*open_device)(struct vfio_device *vdev); void (*close_device)(struct vfio_device *vdev); ssize_t (*read)(struct vfio_device *vdev, char __user *buf, @@ -120,6 +123,7 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_physical_unbind(struct vfio_device *vdev); int vfio_iommufd_physical_attach_ioas(struct vfio_device *vdev, u32 *pt_id); +void vfio_iommufd_physical_detach_ioas(struct vfio_device *vdev); int vfio_iommufd_emulated_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_emulated_unbind(struct vfio_device *vdev); @@ -143,6 +147,8 @@ vfio_iommufd_physical_devid(struct vfio_device *vdev, u32 *id) ((void (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_physical_attach_ioas \ ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) +#define vfio_iommufd_physical_detach_ioas \ + ((void (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_emulated_bind \ ((int (*)(struct vfio_device *vdev, struct iommufd_ctx *ictx, \ u32 *out_device_id)) NULL) From patchwork Sat Apr 1 15:18:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197124 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6932EC761AF for ; Sat, 1 Apr 2023 15:19:23 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id A3ABC10E2A2; Sat, 1 Apr 2023 15:19:22 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id C048510E239; Sat, 1 Apr 2023 15:18:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362325; x=1711898325; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=z0B7Z80Gkacpxp7MPZnlPz3GQ79FTfsJtRsfLA1adjc=; b=TRo2VJoN8a0myem1ksJXVRlYGH187iRU0hh2cc6nK1+OAyStve1j1/TC 0EI+PgWCgT/YddAbhFmUEXV3Gbj9J+vJBwq3PRgxDfaXT9MYK+MQzpOR1 bklDGYa5bVMzeQuQFQykkp3sxfWElAp0Eh1KE2iKsQHu8fHHFD0BL+4qF jjM1qrs9YhhwoWHSfFEle54TCx0JSlFWutGDx5yXfKnxsj2WFfjU5rYiS N0I1YcnqNR3i5TaevK13AWkoQHA9wlF/798obBvCajzR7a6QrAyUpH96q YcJBPV+x5MyUbIaJjA2JhaYLNU4zyC1ojhSXW1lGDkwhcPbZMhwfxPwUF Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411330" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411330" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:44 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937212" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937212" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:44 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:24 -0700 Message-Id: <20230401151833.124749-17-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 16/25] iommufd/device: Add iommufd_access_detach() API X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" From: Nicolin Chen Previously, the detach routine is only done by the destroy(). And it was called by vfio_iommufd_emulated_unbind() when the device runs close(), so all the mappings in iopt were cleaned in that setup, when the call trace reaches this detach() routine. Now, there's a need of a detach uAPI, meaning that it does not only need a new iommufd_access_detach() API, but also requires access->ops->unmap() call as a cleanup. So add one. However, leaving that unprotected can introduce some potential of a race condition during the pin_/unpin_pages() call, where access->ioas->iopt is getting referenced. So, add an ioas_lock to protect the context of iopt referencings. Also, to allow the iommufd_access_unpin_pages() callback to happen via this unmap() call, add an ioas_unpin pointer, so the unpin routine won't be affected by the "access->ioas = NULL" trick. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Yanting Jiang Signed-off-by: Nicolin Chen Signed-off-by: Yi Liu --- drivers/iommu/iommufd/device.c | 76 +++++++++++++++++++++++-- drivers/iommu/iommufd/iommufd_private.h | 2 + include/linux/iommufd.h | 1 + 3 files changed, 74 insertions(+), 5 deletions(-) diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 04a57aa1ae2c..0eaae60f3537 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -474,6 +474,7 @@ iommufd_access_create(struct iommufd_ctx *ictx, iommufd_ctx_get(ictx); iommufd_object_finalize(ictx, &access->obj); *id = access->obj.id; + mutex_init(&access->ioas_lock); return access; } EXPORT_SYMBOL_NS_GPL(iommufd_access_create, IOMMUFD); @@ -493,26 +494,66 @@ void iommufd_access_destroy(struct iommufd_access *access) } EXPORT_SYMBOL_NS_GPL(iommufd_access_destroy, IOMMUFD); +static void __iommufd_access_detach(struct iommufd_access *access) +{ + struct iommufd_ioas *cur_ioas = access->ioas; + + lockdep_assert_held(&access->ioas_lock); + /* + * Set ioas to NULL to block any further iommufd_access_pin_pages(). + * iommufd_access_unpin_pages() can continue using access->ioas_unpin. + */ + access->ioas = NULL; + + if (access->ops->unmap) { + mutex_unlock(&access->ioas_lock); + access->ops->unmap(access->data, 0, ULONG_MAX); + mutex_lock(&access->ioas_lock); + } + iopt_remove_access(&cur_ioas->iopt, access); + refcount_dec(&cur_ioas->obj.users); +} + +void iommufd_access_detach(struct iommufd_access *access) +{ + mutex_lock(&access->ioas_lock); + if (WARN_ON(!access->ioas)) + goto out; + __iommufd_access_detach(access); +out: + access->ioas_unpin = NULL; + mutex_unlock(&access->ioas_lock); +} +EXPORT_SYMBOL_NS_GPL(iommufd_access_detach, IOMMUFD); + int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id) { struct iommufd_ioas *new_ioas; int rc = 0; - if (access->ioas != NULL && access->ioas->obj.id != ioas_id) + mutex_lock(&access->ioas_lock); + if (access->ioas != NULL && access->ioas->obj.id != ioas_id) { + mutex_unlock(&access->ioas_lock); return -EINVAL; + } new_ioas = iommufd_get_ioas(access->ictx, ioas_id); - if (IS_ERR(new_ioas)) + if (IS_ERR(new_ioas)) { + mutex_unlock(&access->ioas_lock); return PTR_ERR(new_ioas); + } rc = iopt_add_access(&new_ioas->iopt, access); if (rc) { + mutex_unlock(&access->ioas_lock); iommufd_put_object(&new_ioas->obj); return rc; } iommufd_ref_to_users(&new_ioas->obj); access->ioas = new_ioas; + access->ioas_unpin = new_ioas; + mutex_unlock(&access->ioas_lock); return 0; } EXPORT_SYMBOL_NS_GPL(iommufd_access_attach, IOMMUFD); @@ -567,8 +608,8 @@ void iommufd_access_notify_unmap(struct io_pagetable *iopt, unsigned long iova, void iommufd_access_unpin_pages(struct iommufd_access *access, unsigned long iova, unsigned long length) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; unsigned long last_iova; struct iopt_area *area; @@ -576,6 +617,13 @@ void iommufd_access_unpin_pages(struct iommufd_access *access, WARN_ON(check_add_overflow(iova, length - 1, &last_iova))) return; + mutex_lock(&access->ioas_lock); + if (!access->ioas_unpin) { + mutex_unlock(&access->ioas_lock); + return; + } + iopt = &access->ioas_unpin->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) iopt_area_remove_access( @@ -585,6 +633,7 @@ void iommufd_access_unpin_pages(struct iommufd_access *access, min(last_iova, iopt_area_last_iova(area)))); up_read(&iopt->iova_rwsem); WARN_ON(!iopt_area_contig_done(&iter)); + mutex_unlock(&access->ioas_lock); } EXPORT_SYMBOL_NS_GPL(iommufd_access_unpin_pages, IOMMUFD); @@ -630,8 +679,8 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, unsigned long length, struct page **out_pages, unsigned int flags) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; unsigned long last_iova; struct iopt_area *area; int rc; @@ -646,6 +695,13 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, if (check_add_overflow(iova, length - 1, &last_iova)) return -EOVERFLOW; + mutex_lock(&access->ioas_lock); + if (!access->ioas) { + mutex_unlock(&access->ioas_lock); + return -ENOENT; + } + iopt = &access->ioas->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); @@ -676,6 +732,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, } up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return 0; err_remove: @@ -690,6 +747,7 @@ int iommufd_access_pin_pages(struct iommufd_access *access, unsigned long iova, iopt_area_last_iova(area)))); } up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD); @@ -709,8 +767,8 @@ EXPORT_SYMBOL_NS_GPL(iommufd_access_pin_pages, IOMMUFD); int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, void *data, size_t length, unsigned int flags) { - struct io_pagetable *iopt = &access->ioas->iopt; struct iopt_area_contig_iter iter; + struct io_pagetable *iopt; struct iopt_area *area; unsigned long last_iova; int rc; @@ -720,6 +778,13 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, if (check_add_overflow(iova, length - 1, &last_iova)) return -EOVERFLOW; + mutex_lock(&access->ioas_lock); + if (!access->ioas) { + mutex_unlock(&access->ioas_lock); + return -ENOENT; + } + iopt = &access->ioas->iopt; + down_read(&iopt->iova_rwsem); iopt_for_each_contig_area(&iter, area, iopt, iova, last_iova) { unsigned long last = min(last_iova, iopt_area_last_iova(area)); @@ -746,6 +811,7 @@ int iommufd_access_rw(struct iommufd_access *access, unsigned long iova, rc = -ENOENT; err_out: up_read(&iopt->iova_rwsem); + mutex_unlock(&access->ioas_lock); return rc; } EXPORT_SYMBOL_NS_GPL(iommufd_access_rw, IOMMUFD); diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 2e6e8e217cce..ec2ce3ef187d 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -263,6 +263,8 @@ struct iommufd_access { struct iommufd_object obj; struct iommufd_ctx *ictx; struct iommufd_ioas *ioas; + struct iommufd_ioas *ioas_unpin; + struct mutex ioas_lock; const struct iommufd_access_ops *ops; void *data; unsigned long iova_alignment; diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index ac96df406833..9e0e8894dacc 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -47,6 +47,7 @@ iommufd_access_create(struct iommufd_ctx *ictx, const struct iommufd_access_ops *ops, void *data, u32 *id); void iommufd_access_destroy(struct iommufd_access *access); int iommufd_access_attach(struct iommufd_access *access, u32 ioas_id); +void iommufd_access_detach(struct iommufd_access *access); void iommufd_ctx_get(struct iommufd_ctx *ictx); From patchwork Sat Apr 1 15:18:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197131 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B4318C761AF for ; Sat, 1 Apr 2023 15:19:35 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id F25C210E227; Sat, 1 Apr 2023 15:19:33 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 3287410E23B; Sat, 1 Apr 2023 15:18:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362326; x=1711898326; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0KCtkdFxhMztcP2i9cCvjn0bpuK7nlIb71bQ3qptwPY=; b=nrz1E7+vyW1SERUEDyOCjPjPasBrk2dsMN4geNbifttS1u1ybPWodNj0 sAcECUwg3kbEOo/T6LQXw/vBhQS20vq/lD7ASanAF1UslMdLupmcnr1E/ HHKiTKMly2vGX7QFwJeisYKjkF4xxWI4fYILIRJjyqGvcQ7WEFe7XNV1s 935zU6p8pOrxiJ/l4Nlrta/eNd6kYYUJ8LbIvOUd/UrN3WSkpjulqJKJo /074S8wnHoJK0fyCel/yMJLKHMuifXizVtRW1TV7dlGTLIgY2XN38pru0 Cc9DmQsP03Xs5jjkbW8h5xgSI5Rg3OEZ6UOFw2DSOD7g3Hlav2hEvXpOs Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411339" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411339" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937215" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937215" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:45 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:25 -0700 Message-Id: <20230401151833.124749-18-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 17/25] vfio-iommufd: Add detach_ioas support for emulated VFIO devices X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" this prepares for adding DETACH ioctl for emulated VFIO devices. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/gpu/drm/i915/gvt/kvmgt.c | 1 + drivers/s390/cio/vfio_ccw_ops.c | 1 + drivers/s390/crypto/vfio_ap_ops.c | 1 + drivers/vfio/iommufd.c | 12 ++++++++++++ include/linux/vfio.h | 3 +++ samples/vfio-mdev/mbochs.c | 1 + samples/vfio-mdev/mdpy.c | 1 + samples/vfio-mdev/mtty.c | 1 + 8 files changed, 21 insertions(+) diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index de675d799c7d..9cd9e9da60dd 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -1474,6 +1474,7 @@ static const struct vfio_device_ops intel_vgpu_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static int intel_vgpu_probe(struct mdev_device *mdev) diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c index 5b53b94f13c7..cba4971618ff 100644 --- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -632,6 +632,7 @@ static const struct vfio_device_ops vfio_ccw_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; struct mdev_driver vfio_ccw_mdev_driver = { diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 72e10abb103a..9902e62e7a17 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -1844,6 +1844,7 @@ static const struct vfio_device_ops vfio_ap_matrix_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static struct mdev_driver vfio_ap_matrix_driver = { diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c index 1c08aa29397a..fb1c7fc3781e 100644 --- a/drivers/vfio/iommufd.c +++ b/drivers/vfio/iommufd.c @@ -204,3 +204,15 @@ int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id) return 0; } EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_attach_ioas); + +void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev) +{ + lockdep_assert_held(&vdev->dev_set->lock); + + if (WARN_ON(!vdev->iommufd_access) || !vdev->iommufd_attached) + return; + + iommufd_access_detach(vdev->iommufd_access); + vdev->iommufd_attached = false; +} +EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_detach_ioas); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index bf41b9471c28..1445eb185121 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -128,6 +128,7 @@ int vfio_iommufd_emulated_bind(struct vfio_device *vdev, struct iommufd_ctx *ictx, u32 *out_device_id); void vfio_iommufd_emulated_unbind(struct vfio_device *vdev); int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id); +void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev); #else static inline struct iommufd_ctx * vfio_iommufd_physical_ictx(struct vfio_device *vdev) @@ -156,6 +157,8 @@ vfio_iommufd_physical_devid(struct vfio_device *vdev, u32 *id) ((void (*)(struct vfio_device *vdev)) NULL) #define vfio_iommufd_emulated_attach_ioas \ ((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL) +#define vfio_iommufd_emulated_detach_ioas \ + ((void (*)(struct vfio_device *vdev)) NULL) #endif static inline bool vfio_device_cdev_opened(struct vfio_device *device) diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c index 19391dda5fba..47a2914b63d9 100644 --- a/samples/vfio-mdev/mbochs.c +++ b/samples/vfio-mdev/mbochs.c @@ -1377,6 +1377,7 @@ static const struct vfio_device_ops mbochs_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static struct mdev_driver mbochs_driver = { diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c index 5f48aef36995..ce0e67f37406 100644 --- a/samples/vfio-mdev/mdpy.c +++ b/samples/vfio-mdev/mdpy.c @@ -666,6 +666,7 @@ static const struct vfio_device_ops mdpy_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static struct mdev_driver mdpy_driver = { diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c index 35460901b9f7..5069aef3c2a2 100644 --- a/samples/vfio-mdev/mtty.c +++ b/samples/vfio-mdev/mtty.c @@ -1272,6 +1272,7 @@ static const struct vfio_device_ops mtty_dev_ops = { .bind_iommufd = vfio_iommufd_emulated_bind, .unbind_iommufd = vfio_iommufd_emulated_unbind, .attach_ioas = vfio_iommufd_emulated_attach_ioas, + .detach_ioas = vfio_iommufd_emulated_detach_ioas, }; static struct mdev_driver mtty_driver = { From patchwork Sat Apr 1 15:18:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197118 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 34F4BC77B72 for ; Sat, 1 Apr 2023 15:19:03 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 4F41310E113; Sat, 1 Apr 2023 15:19:02 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 38DED10E24B; Sat, 1 Apr 2023 15:18:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362326; x=1711898326; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=opGN95o7pF1df9Ozf4TXyBS5az+9mr8bUEPUq1TuJqs=; b=QRRSgHjAPNVtiPxODl2L+TkzFKIbZvCarKzcB5cTZXW3LriuBNL0qQW8 PICW+pn5vj9cUlBhphnsFXh+p4NwNYBR6pSpm0qC0Ma5hlQ9PqXPlaI5Y Sody69iL2EoBzfhyPX+7LzZ0YtNascdlWMBwixmNXDKsVMUviDLHOQHTn GqcRPSP024Ju57yoVLn7VmEAPQLlqRunsad9MHT+PtxC6M7PACt6Tlvum EfSAEsy0qXe5E2W5XWlYKimaoFW3cbxk9IwJwNacEymVuJ0Kz37luYxMg FQKb+fj4xtEl06jcN77tmlPB0EIxf3ewj6aTkm4IuYtGKB7QRy3IMpF8Y Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411349" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411349" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937218" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937218" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:45 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:26 -0700 Message-Id: <20230401151833.124749-19-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 18/25] vfio: Determine noiommu in vfio_device registration X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This adds a noiommu flag in vfio_device, hence caller of the vfio_device_is_noiommu() just refers to the flag for noiommu check. Reviewed-by: Kevin Tian Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/group.c | 2 +- drivers/vfio/vfio.h | 6 +++--- drivers/vfio/vfio_main.c | 2 ++ include/linux/vfio.h | 1 + 4 files changed, 7 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c index de9f09c7cf00..8c88bff0fc59 100644 --- a/drivers/vfio/group.c +++ b/drivers/vfio/group.c @@ -192,7 +192,7 @@ static int vfio_device_group_open(struct vfio_device_file *df) vfio_device_group_get_kvm_safe(device); df->iommufd = device->group->iommufd; - if (df->iommufd && vfio_device_is_noiommu(device)) { + if (df->iommufd && device->noiommu) { if (device->open_count == 0) { ret = vfio_iommufd_enable_noiommu_compat(device, df->iommufd); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index b47b186573ac..41dfc9d5205a 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -108,10 +108,10 @@ bool vfio_device_has_container(struct vfio_device *device); int __init vfio_group_init(void); void vfio_group_cleanup(void); -static inline bool vfio_device_is_noiommu(struct vfio_device *vdev) +static inline void vfio_device_set_noiommu(struct vfio_device *device) { - return IS_ENABLED(CONFIG_VFIO_NOIOMMU) && - vdev->group->type == VFIO_NO_IOMMU; + device->noiommu = IS_ENABLED(CONFIG_VFIO_NOIOMMU) && + device->group->type == VFIO_NO_IOMMU; } #if IS_ENABLED(CONFIG_VFIO_CONTAINER) diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index bc2ecbb7c22a..989c40a49171 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -277,6 +277,8 @@ static int __vfio_register_dev(struct vfio_device *device, if (ret) return ret; + vfio_device_set_noiommu(device); + ret = device_add(&device->device); if (ret) goto err_out; diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 1445eb185121..672c0d9ac3fa 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -63,6 +63,7 @@ struct vfio_device { bool iommufd_attached; #endif bool cdev_opened; + bool noiommu; }; /** From patchwork Sat Apr 1 15:18:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197130 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6FDAAC77B62 for ; Sat, 1 Apr 2023 15:19:33 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 84D0510E0F6; Sat, 1 Apr 2023 15:19:32 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 1723310E139; Sat, 1 Apr 2023 15:18:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362327; x=1711898327; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=aQ3o8IsEThomvWbYBo9On/FLVyvK6wQgWjHFyMOy86I=; b=EBN4iovjJ1J+HQNYl1XZq6vCwO1xykWmUC+7Ii3xiOjzl9HLcyHpbzGe GzHItZCNysX/gyrfs5ICycKwDvD3dc4pOIlzwiMrCTaoTge8mpTpPrvJi Vu29aUXX5BdWoazNWUYmeY2Ij59PIiwjVVZRFvLpd/tm5QHHwe4vWpzOM ZFJ8r08SxKVqNyF97im3zfQrGJT0myLUNjn/WkRFGbNRqhgQytL486WTp 3s75vjfs1oiFgi0MbVzmXx1m7aGF40DPoTjRkddsObEXTlgUZUdWRcAPb mdV49bDy7VeOsEPVPuSS/+WCz/AetBezxpGBMNi4ZDIRkiM/6VW6t762/ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411360" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411360" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:46 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937221" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937221" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:46 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:27 -0700 Message-Id: <20230401151833.124749-20-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 19/25] vfio: Name noiommu vfio_device with "noiommu-" prefix X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" For noiommu device, vfio core names the cdev node with prefix "noiommu-". Reviewed-by: Kevin Tian Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/vfio_main.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 989c40a49171..0337d1ace716 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -269,16 +269,17 @@ static int __vfio_register_dev(struct vfio_device *device, if (!device->dev_set) vfio_assign_device_set(device, device); - ret = dev_set_name(&device->device, "vfio%d", device->index); - if (ret) - return ret; - ret = vfio_device_set_group(device, type); if (ret) return ret; vfio_device_set_noiommu(device); + ret = dev_set_name(&device->device, "%svfio%d", + device->noiommu ? "noiommu-" : "", device->index); + if (ret) + goto err_out; + ret = device_add(&device->device); if (ret) goto err_out; From patchwork Sat Apr 1 15:18:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197117 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BB70EC77B77 for ; Sat, 1 Apr 2023 15:19:03 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id C78F010E105; Sat, 1 Apr 2023 15:19:02 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 8ECE210E105; Sat, 1 Apr 2023 15:18:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362327; x=1711898327; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=UGMAQSzJDGu2jFNVvQm9xR08gZhYc2KjsQ3tE+TSPbo=; b=Pl+pPJabRkYLQPS3A1BCS4/iHzIe/wh8GMZcaX20U0nGAtmw7miiaAlV 3CG+Q0AMd8HwURVsDuJIrhAYD+N3fy/H1XGWi+EArA+WqcCiyuPhZzyly zXjcyHhR1jtMtjgZTyfK2Ent8uv1q2McQvAuoxeN/3fy7tbe+8i8O2kxm 4gTNqPe3cZ5ima0XII3TPPNPYsZ5Glh3WLRYeTMhEwAsS2RcOEviwHUSS LJ7L+enlZu4pc0gwwKHJon4XrY/69b53jT5uQ5SNXWmmADqr7fvrl5DaD cLQ41TSlMZLZ8ZJrvuTACKaVPXM+TBgnzYq6K1o1/nN3Tgu1UObHuNz6n g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411370" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411370" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937225" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937225" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:46 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:28 -0700 Message-Id: <20230401151833.124749-21-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 20/25] vfio: Move vfio_device_group_unregister() to be the first operation in unregister X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This can avoid endless vfio_device refcount increasement by userspace, which would keep blocking the vfio_unregister_group_dev(). Signed-off-by: Yi Liu --- drivers/vfio/vfio_main.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 0337d1ace716..6c31212740fa 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -320,6 +320,12 @@ void vfio_unregister_group_dev(struct vfio_device *device) bool interrupted = false; long rc; + /* + * Prevent new device opened by userspace via the + * VFIO_GROUP_GET_DEVICE_FD in the group path. + */ + vfio_device_group_unregister(device); + vfio_device_put_registration(device); rc = try_wait_for_completion(&device->comp); while (rc <= 0) { @@ -343,8 +349,6 @@ void vfio_unregister_group_dev(struct vfio_device *device) } } - vfio_device_group_unregister(device); - /* Balances device_add in register path */ device_del(&device->device); From patchwork Sat Apr 1 15:18:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197128 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5B3BAC761AF for ; Sat, 1 Apr 2023 15:19:32 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 0D54310E2DE; Sat, 1 Apr 2023 15:19:31 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 0A37210E112; Sat, 1 Apr 2023 15:18:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362328; x=1711898328; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3vQRbagVKoRT/eiSayKC3gqFvQz28lpHtWybHUvXfAI=; b=Wb1HSnUt7YaJYCSVFMCUPA/84LyfPohv5egttpT9ag2/HT7CFk/+r1MO LegIsTDsDl118OJQ7hcESv3N8Tl6pHY7LyS0xftahidt5j/5T4KPOVaIt cSmqQoaqC0UXewX9LBjdE7W5hwqBKys3subSMeUcEgFS5kwpjwdXM+M71 z3ZGOxJNtNeEEaFgi+dmBaeRPwyRYVz6r1H4jivyT6WZKGY14bhz8GSJ0 eZ1z2nuSerjiTznHLf/BTfyO3hqDLiwI7UllpdBDqW9KKMNtR2Gslig3w 0C+Y/pu+H9lTl9Ge6JCXINifLw/KMBJ6eYXkyGm1wDp8wmVeXnpc9iwNn A==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411383" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411383" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:47 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937228" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937228" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:47 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:29 -0700 Message-Id: <20230401151833.124749-22-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 21/25] vfio: Add cdev for vfio_device X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This allows user to directly open a vfio device w/o using the legacy container/group interface, as a prerequisite for supporting new iommu features like nested translation. The device fd opened in this manner doesn't have the capability to access the device as the fops open() doesn't open the device until the successful BIND_IOMMUFD which be added in next patch. With this patch, devices registered to vfio core have both group and device interface created. - group interface : /dev/vfio/$groupID - device interface: /dev/vfio/devices/vfioX - normal device /dev/vfio/devices/noiommu-vfioX - noiommu device ("X" is the minor number and is unique across devices) Given a vfio device the user can identify the matching vfioX by checking the sysfs path of the device. Take PCI device (0000:6a:01.0) for example, /sys/bus/pci/devices/0000\:6a\:01.0/vfio-dev/vfio0/dev contains the major:minor of the matching vfioX. Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat that the major:minor matches. The vfio_device cdev logic in this patch: *) __vfio_register_dev() path ends up doing cdev_device_add() for each vfio_device if VFIO_DEVICE_CDEV configured. *) vfio_unregister_group_dev() path does cdev_device_del(); Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/Kconfig | 11 +++++++ drivers/vfio/Makefile | 1 + drivers/vfio/device_cdev.c | 62 ++++++++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 46 ++++++++++++++++++++++++++++ drivers/vfio/vfio_main.c | 24 ++++++++++++--- include/linux/vfio.h | 13 +++++++- 6 files changed, 151 insertions(+), 6 deletions(-) create mode 100644 drivers/vfio/device_cdev.c diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index 89e06c981e43..e2105b4dac2d 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -12,6 +12,17 @@ menuconfig VFIO If you don't know what to do here, say N. if VFIO +config VFIO_DEVICE_CDEV + bool "Support for the VFIO cdev /dev/vfio/devices/vfioX" + depends on IOMMUFD + help + The VFIO device cdev is another way for userspace to get device + access. Userspace gets device fd by opening device cdev under + /dev/vfio/devices/vfioX, and then bind the device fd with an iommufd + to set up secure DMA context for device access. + + If you don't know what to do here, say N. + config VFIO_CONTAINER bool "Support for the VFIO container /dev/vfio/vfio" select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64) diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 70e7dcb302ef..245394aeb94b 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -4,6 +4,7 @@ obj-$(CONFIG_VFIO) += vfio.o vfio-y += vfio_main.o \ group.o \ iova_bitmap.o +vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o vfio-$(CONFIG_IOMMUFD) += iommufd.o vfio-$(CONFIG_VFIO_CONTAINER) += container.o vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c new file mode 100644 index 000000000000..1c640016a824 --- /dev/null +++ b/drivers/vfio/device_cdev.c @@ -0,0 +1,62 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Copyright (c) 2023 Intel Corporation. + */ +#include + +#include "vfio.h" + +static dev_t device_devt; + +void vfio_init_device_cdev(struct vfio_device *device) +{ + device->device.devt = MKDEV(MAJOR(device_devt), device->index); + cdev_init(&device->cdev, &vfio_device_fops); + device->cdev.owner = THIS_MODULE; +} + +/* + * device access via the fd opened by this function is blocked until + * .open_device() is called successfully during BIND_IOMMUFD. + */ +int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep) +{ + struct vfio_device *device = container_of(inode->i_cdev, + struct vfio_device, cdev); + struct vfio_device_file *df; + int ret; + + if (!vfio_device_try_get_registration(device)) + return -ENODEV; + + df = vfio_allocate_device_file(device); + if (IS_ERR(df)) { + ret = PTR_ERR(df); + goto err_put_registration; + } + + filep->private_data = df; + + return 0; + +err_put_registration: + vfio_device_put_registration(device); + return ret; +} + +static char *vfio_device_devnode(const struct device *dev, umode_t *mode) +{ + return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); +} + +int vfio_cdev_init(struct class *device_class) +{ + device_class->devnode = vfio_device_devnode; + return alloc_chrdev_region(&device_devt, 0, + MINORMASK + 1, "vfio-dev"); +} + +void vfio_cdev_cleanup(void) +{ + unregister_chrdev_region(device_devt, MINORMASK + 1); +} diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 41dfc9d5205a..3a8fd0e32f59 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -268,6 +268,52 @@ static inline void vfio_iommufd_unbind(struct vfio_device_file *df) } #endif +#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) +static inline int vfio_device_add(struct vfio_device *device) +{ + return cdev_device_add(&device->cdev, &device->device); +} + +static inline void vfio_device_del(struct vfio_device *device) +{ + cdev_device_del(&device->cdev, &device->device); +} + +void vfio_init_device_cdev(struct vfio_device *device); +int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep); +int vfio_cdev_init(struct class *device_class); +void vfio_cdev_cleanup(void); +#else +static inline int vfio_device_add(struct vfio_device *device) +{ + return device_add(&device->device); +} + +static inline void vfio_device_del(struct vfio_device *device) +{ + device_del(&device->device); +} + +static inline void vfio_init_device_cdev(struct vfio_device *device) +{ +} + +static inline int vfio_device_fops_cdev_open(struct inode *inode, + struct file *filep) +{ + return 0; +} + +static inline int vfio_cdev_init(struct class *device_class) +{ + return 0; +} + +static inline void vfio_cdev_cleanup(void) +{ +} +#endif /* CONFIG_VFIO_DEVICE_CDEV */ + #if IS_ENABLED(CONFIG_VFIO_VIRQFD) int __init vfio_virqfd_init(void); void vfio_virqfd_exit(void); diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 6c31212740fa..61f56d9b5e5f 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -242,6 +242,7 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev, device->device.release = vfio_device_release; device->device.class = vfio.device_class; device->device.parent = device->dev; + vfio_init_device_cdev(device); return 0; out_uninit: @@ -280,7 +281,7 @@ static int __vfio_register_dev(struct vfio_device *device, if (ret) goto err_out; - ret = device_add(&device->device); + ret = vfio_device_add(device); if (ret) goto err_out; @@ -326,6 +327,12 @@ void vfio_unregister_group_dev(struct vfio_device *device) */ vfio_device_group_unregister(device); + /* + * Balances vfio_device_add() in register path, also prevents + * new device opened by userspace in the cdev path. + */ + vfio_device_del(device); + vfio_device_put_registration(device); rc = try_wait_for_completion(&device->comp); while (rc <= 0) { @@ -349,9 +356,6 @@ void vfio_unregister_group_dev(struct vfio_device *device) } } - /* Balances device_add in register path */ - device_del(&device->device); - /* Balances vfio_device_set_group in register path */ vfio_device_remove_group(device); } @@ -559,7 +563,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) struct vfio_device_file *df = filep->private_data; struct vfio_device *device = df->device; - vfio_device_group_close(df); + if (df->group) + vfio_device_group_close(df); vfio_device_put_registration(device); @@ -1208,6 +1213,7 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma) const struct file_operations vfio_device_fops = { .owner = THIS_MODULE, + .open = vfio_device_fops_cdev_open, .release = vfio_device_fops_release, .read = vfio_device_fops_read, .write = vfio_device_fops_write, @@ -1595,9 +1601,16 @@ static int __init vfio_init(void) goto err_dev_class; } + ret = vfio_cdev_init(vfio.device_class); + if (ret) + goto err_alloc_dev_chrdev; + pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n"); return 0; +err_alloc_dev_chrdev: + class_destroy(vfio.device_class); + vfio.device_class = NULL; err_dev_class: vfio_virqfd_exit(); err_virqfd: @@ -1608,6 +1621,7 @@ static int __init vfio_init(void) static void __exit vfio_cleanup(void) { ida_destroy(&vfio.device_ida); + vfio_cdev_cleanup(); class_destroy(vfio.device_class); vfio.device_class = NULL; vfio_virqfd_exit(); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 672c0d9ac3fa..db717a8f4440 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -13,6 +13,7 @@ #include #include #include +#include #include #include @@ -51,6 +52,10 @@ struct vfio_device { /* Members below here are private, not for driver use */ unsigned int index; struct device device; /* device.kref covers object life circle */ +#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) + struct cdev cdev; + bool cdev_opened; +#endif refcount_t refcount; /* user count on registered device*/ unsigned int open_count; struct completion comp; @@ -62,7 +67,6 @@ struct vfio_device { struct iommufd_device *iommufd_device; bool iommufd_attached; #endif - bool cdev_opened; bool noiommu; }; @@ -162,11 +166,18 @@ vfio_iommufd_physical_devid(struct vfio_device *vdev, u32 *id) ((void (*)(struct vfio_device *vdev)) NULL) #endif +#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV) static inline bool vfio_device_cdev_opened(struct vfio_device *device) { lockdep_assert_held(&device->dev_set->lock); return device->cdev_opened; } +#else +static inline bool vfio_device_cdev_opened(struct vfio_device *device) +{ + return false; +} +#endif /** * @migration_set_state: Optional callback to change the migration state for From patchwork Sat Apr 1 15:18:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197119 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 085DDC76196 for ; Sat, 1 Apr 2023 15:19:13 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 7096310E27A; Sat, 1 Apr 2023 15:19:12 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 95F0D10E227; Sat, 1 Apr 2023 15:18:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362328; x=1711898328; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Nxxu8mv/ExSN3oQSYj+ulTQ2pvW0BPiusuSRvWs4H04=; b=OUjjXckp/EgBLgFwhSxlUuRKIriUYfsJBtE03gaI1ASfreYiCpz3mlNU NDYYto2JmawctwlhHhvekaKjdK1OaBv7zJY3ZOmhBteDR9luGswuV058V q7XKrsQaGuxFOoVFRsBMxEwC4SNpQjnO4HPBqJOswjrCctbN8qqJttkB6 P4+mxiooH4Ju+xSFDQ6zX1dJl8xleYcQlE9cMrH2JO4C1aW08HvRcBWb5 hTDhgPVLyShkP3Q+w4dNp1MyAHsc5I7LF1y1UfcrGp3uP3x1Y+csX9KFU LiYxnCA03Pn9vlaJf2VoSbdA20PN0JgsFKcE8EJZsPDalPIqZbWKfvSwr g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411393" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411393" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:48 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937232" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937232" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:47 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:30 -0700 Message-Id: <20230401151833.124749-23-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 22/25] vfio: Add VFIO_DEVICE_BIND_IOMMUFD X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This adds ioctl for userspace to bind device cdev fd to iommufd. VFIO_DEVICE_BIND_IOMMUFD: bind device to an iommufd, hence gain DMA control provided by the iommufd. open_device op is called after bind_iommufd op. VFIO no iommu mode is indicated by passing VFIO_NOIOMMU_FD (-1) as iommufd value. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/device_cdev.c | 152 +++++++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 13 ++++ drivers/vfio/vfio_main.c | 5 ++ include/uapi/linux/vfio.h | 31 ++++++++ 4 files changed, 201 insertions(+) diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c index 1c640016a824..e3c51d2f5732 100644 --- a/drivers/vfio/device_cdev.c +++ b/drivers/vfio/device_cdev.c @@ -3,6 +3,7 @@ * Copyright (c) 2023 Intel Corporation. */ #include +#include #include "vfio.h" @@ -44,6 +45,157 @@ int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep) return ret; } +static void vfio_device_get_kvm_safe(struct vfio_device_file *df) +{ + spin_lock(&df->kvm_ref_lock); + if (df->kvm) + _vfio_device_get_kvm_safe(df->device, df->kvm); + spin_unlock(&df->kvm_ref_lock); +} + +void vfio_device_cdev_close(struct vfio_device_file *df) +{ + struct vfio_device *device = df->device; + + /* + * In the time of close, there is no contention with another one + * changing this flag. So read df->access_granted without lock + * and no smp_load_acquire() is ok. + */ + if (!df->access_granted) + return; + + mutex_lock(&device->dev_set->lock); + vfio_device_close(df); + vfio_device_put_kvm(device); + if (df->iommufd) + iommufd_ctx_put(df->iommufd); + device->cdev_opened = false; + mutex_unlock(&device->dev_set->lock); + vfio_device_unblock_group(device); +} + +static int vfio_device_cdev_probe_noiommu(struct vfio_device *device) +{ + if (!capable(CAP_SYS_RAWIO)) + return -EPERM; + + if (!device->noiommu) + return -EINVAL; + + return 0; +} + +static struct iommufd_ctx *vfio_get_iommufd_from_fd(int fd) +{ + struct iommufd_ctx *iommufd; + struct fd f; + + f = fdget(fd); + if (!f.file) + return ERR_PTR(-EBADF); + + iommufd = iommufd_ctx_from_file(f.file); + + fdput(f); + return iommufd; +} + +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_bind_iommufd bind; + unsigned long minsz; + bool is_noiommu; + int ret; + + static_assert(__same_type(arg->out_devid, df->devid)); + + minsz = offsetofend(struct vfio_device_bind_iommufd, out_devid); + + if (copy_from_user(&bind, arg, minsz)) + return -EFAULT; + + if (bind.argsz < minsz || bind.flags) + return -EINVAL; + + /* BIND_IOMMUFD only allowed for cdev fds */ + if (df->group) + return -EINVAL; + + ret = vfio_device_block_group(device); + if (ret) + return ret; + + is_noiommu = bind.iommufd == VFIO_NOIOMMU_FD; + + mutex_lock(&device->dev_set->lock); + /* one device cannot be bound twice */ + if (df->access_granted) { + ret = -EINVAL; + goto out_unlock; + } + + if (is_noiommu) { + ret = vfio_device_cdev_probe_noiommu(device); + if (ret) + goto out_unlock; + } else { + df->iommufd = vfio_get_iommufd_from_fd(bind.iommufd); + if (IS_ERR(df->iommufd)) { + ret = PTR_ERR(df->iommufd); + df->iommufd = NULL; + goto out_unlock; + } + } + + /* + * Before the device open, get the KVM pointer currently + * associated with the device file (if there is) and obtain + * a reference. This reference is held until device closed. + * Save the pointer in the device for use by drivers. + */ + vfio_device_get_kvm_safe(df); + + ret = vfio_device_open(df); + if (ret) + goto out_put_kvm; + + if (is_noiommu) { + dev_warn(device->dev, "device is bound to vfio-noiommu by user " + "(%s:%d)\n", current->comm, task_pid_nr(current)); + } else { + ret = copy_to_user(&arg->out_devid, &df->devid, + sizeof(df->devid)) ? -EFAULT : 0; + if (ret) + goto out_close_device; + } + + /* + * Paired with smp_load_acquire() in vfio_device_fops::ioctl/ + * read/write/mmap + */ + smp_store_release(&df->access_granted, true); + device->cdev_opened = true; + mutex_unlock(&device->dev_set->lock); + + return 0; + +out_close_device: + vfio_device_close(df); +out_put_kvm: + vfio_device_put_kvm(device); + if (!is_noiommu) { + iommufd_ctx_put(df->iommufd); + df->iommufd = NULL; + } +out_unlock: + mutex_unlock(&device->dev_set->lock); + vfio_device_unblock_group(device); + return ret; +} + static char *vfio_device_devnode(const struct device *dev, umode_t *mode) { return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index 3a8fd0e32f59..ace3d52b0928 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -281,6 +281,9 @@ static inline void vfio_device_del(struct vfio_device *device) void vfio_init_device_cdev(struct vfio_device *device); int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep); +void vfio_device_cdev_close(struct vfio_device_file *df); +long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg); int vfio_cdev_init(struct class *device_class); void vfio_cdev_cleanup(void); #else @@ -304,6 +307,16 @@ static inline int vfio_device_fops_cdev_open(struct inode *inode, return 0; } +static inline void vfio_device_cdev_close(struct vfio_device_file *df) +{ +} + +static inline long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, + struct vfio_device_bind_iommufd __user *arg) +{ + return -EOPNOTSUPP; +} + static inline int vfio_cdev_init(struct class *device_class) { return 0; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 61f56d9b5e5f..0611e29e41a9 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -565,6 +565,8 @@ static int vfio_device_fops_release(struct inode *inode, struct file *filep) if (df->group) vfio_device_group_close(df); + else + vfio_device_cdev_close(df); vfio_device_put_registration(device); @@ -1138,6 +1140,9 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, struct vfio_device *device = df->device; int ret; + if (cmd == VFIO_DEVICE_BIND_IOMMUFD) + return vfio_device_ioctl_bind_iommufd(df, (void __user *)arg); + /* Paired with smp_store_release() following vfio_device_open() */ if (!smp_load_acquire(&df->access_granted)) return -EINVAL; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 5a34364e3b94..84d76630dd60 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -194,6 +194,37 @@ struct vfio_group_status { /* --------------- IOCTLs for DEVICE file descriptors --------------- */ +/* + * VFIO_DEVICE_BIND_IOMMUFD - _IOR(VFIO_TYPE, VFIO_BASE + 19, + * struct vfio_device_bind_iommufd) + * + * Bind a vfio_device to the specified iommufd. + * + * User is restricted from accessing the device before the binding operation + * is completed. + * + * Unbind is automatically conducted when device fd is closed. + * + * @argsz: user filled size of this data. + * @flags: reserved for future extension. + * @iommufd: iommufd to bind. VFIO_NOIOMMU_FD means noiommu. + * @out_devid: the device id generated by this bind. This field is valid + * as long as the input @iommufd is valid. Otherwise, it is + * meaningless. devid is a handle for this device and can be + * used in IOMMUFD commands. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_bind_iommufd { + __u32 argsz; + __u32 flags; + __s32 iommufd; +#define VFIO_NOIOMMU_FD (-1) + __u32 out_devid; +}; + +#define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE + 19) + /** * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 7, * struct vfio_device_info) From patchwork Sat Apr 1 15:18:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197132 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0AC63C6FD1D for ; Sat, 1 Apr 2023 15:19:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 447EE10E331; Sat, 1 Apr 2023 15:19:43 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 7299610E105; Sat, 1 Apr 2023 15:18:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362329; x=1711898329; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=08bjCZhxXGSjuh8rdxCrsBgcg7gZq7iG60aLor7QczQ=; b=Sw6wvESLIgKzh/9lgKSnXar4bHp+I9tVw/zYfQR2aKG6U7sIquQFXCU8 ixpoba5nb4Mm8/j45+1bG9Ip0kqLAZsZxWG7Xyh7SZscVktujJfl3duVW I0aOczjHxNfqtmICeDxmOClT9umDxpb63/ADORffvNmu+PDkhaju2nJQO sHe1H4lHwQ8aL17YvZH59cg3/uSJ2Fy3Oq2FCmQouYWC4l/Vww1f4NWWt d6/zgbSAoSrxFtO5mQzoZtp4k8SBdCRknLJriyqYKoG8Nt7MunDbCIkr1 7GhVvwcWoE+ZehlO+E5KJfPsTnQBoqE3Rgh+X1U44reemC4gc1Dhdvlv3 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411404" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411404" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937236" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937236" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:48 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:31 -0700 Message-Id: <20230401151833.124749-24-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 23/25] vfio: Add VFIO_DEVICE_[AT|DE]TACH_IOMMUFD_PT X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This adds ioctl for userspace to attach device cdev fd to and detach from IOAS/hw_pagetable managed by iommufd. VFIO_DEVICE_ATTACH_IOMMUFD_PT: attach vfio device to IOAS, hw_pagetable managed by iommufd. Attach can be undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. VFIO_DEVICE_DETACH_IOMMUFD_PT: detach vfio device from the current attached IOAS or hw_pagetable managed by iommufd. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Matthew Rosato Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/vfio/device_cdev.c | 77 ++++++++++++++++++++++++++++++++++++++ drivers/vfio/vfio.h | 16 ++++++++ drivers/vfio/vfio_main.c | 8 ++++ include/uapi/linux/vfio.h | 52 +++++++++++++++++++++++++ 4 files changed, 153 insertions(+) diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c index e3c51d2f5732..12a8435649e9 100644 --- a/drivers/vfio/device_cdev.c +++ b/drivers/vfio/device_cdev.c @@ -196,6 +196,83 @@ long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, return ret; } +int vfio_ioctl_device_attach(struct vfio_device_file *df, + struct vfio_device_attach_iommufd_pt __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_attach_iommufd_pt attach; + unsigned long minsz; + int ret; + + minsz = offsetofend(struct vfio_device_attach_iommufd_pt, pt_id); + + if (copy_from_user(&attach, arg, minsz)) + return -EFAULT; + + if (attach.argsz < minsz || attach.flags) + return -EINVAL; + + /* ATTACH only allowed for cdev fds */ + if (df->group) + return -EINVAL; + + mutex_lock(&device->dev_set->lock); + /* noiommufd mode doesn't allow attach */ + if (!df->iommufd) { + ret = -EOPNOTSUPP; + goto out_unlock; + } + + ret = device->ops->attach_ioas(device, &attach.pt_id); + if (ret) + goto out_unlock; + + ret = copy_to_user(&arg->pt_id, &attach.pt_id, + sizeof(attach.pt_id)) ? -EFAULT : 0; + if (ret) + goto out_detach; + mutex_unlock(&device->dev_set->lock); + + return 0; + +out_detach: + device->ops->detach_ioas(device); +out_unlock: + mutex_unlock(&device->dev_set->lock); + return ret; +} + +int vfio_ioctl_device_detach(struct vfio_device_file *df, + struct vfio_device_detach_iommufd_pt __user *arg) +{ + struct vfio_device *device = df->device; + struct vfio_device_detach_iommufd_pt detach; + unsigned long minsz; + + minsz = offsetofend(struct vfio_device_detach_iommufd_pt, flags); + + if (copy_from_user(&detach, arg, minsz)) + return -EFAULT; + + if (detach.argsz < minsz || detach.flags) + return -EINVAL; + + /* DETACH only allowed for cdev fds */ + if (df->group) + return -EINVAL; + + mutex_lock(&device->dev_set->lock); + /* noiommufd mode doesn't support detach */ + if (!df->iommufd) { + mutex_unlock(&device->dev_set->lock); + return -EOPNOTSUPP; + } + device->ops->detach_ioas(device); + mutex_unlock(&device->dev_set->lock); + + return 0; +} + static char *vfio_device_devnode(const struct device *dev, umode_t *mode) { return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev)); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index ace3d52b0928..c199e410db18 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -284,6 +284,10 @@ int vfio_device_fops_cdev_open(struct inode *inode, struct file *filep); void vfio_device_cdev_close(struct vfio_device_file *df); long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, struct vfio_device_bind_iommufd __user *arg); +int vfio_ioctl_device_attach(struct vfio_device_file *df, + struct vfio_device_attach_iommufd_pt __user *arg); +int vfio_ioctl_device_detach(struct vfio_device_file *df, + struct vfio_device_detach_iommufd_pt __user *arg); int vfio_cdev_init(struct class *device_class); void vfio_cdev_cleanup(void); #else @@ -317,6 +321,18 @@ static inline long vfio_device_ioctl_bind_iommufd(struct vfio_device_file *df, return -EOPNOTSUPP; } +static inline int vfio_ioctl_device_attach(struct vfio_device_file *df, + struct vfio_device_attach_iommufd_pt __user *arg) +{ + return -EOPNOTSUPP; +} + +static inline int vfio_ioctl_device_detach(struct vfio_device_file *df, + struct vfio_device_detach_iommufd_pt __user *arg) +{ + return -EOPNOTSUPP; +} + static inline int vfio_cdev_init(struct class *device_class) { return 0; diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 0611e29e41a9..9e2f2daf056a 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -1156,6 +1156,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep, ret = vfio_ioctl_device_feature(device, (void __user *)arg); break; + case VFIO_DEVICE_ATTACH_IOMMUFD_PT: + ret = vfio_ioctl_device_attach(df, (void __user *)arg); + break; + + case VFIO_DEVICE_DETACH_IOMMUFD_PT: + ret = vfio_ioctl_device_detach(df, (void __user *)arg); + break; + default: if (unlikely(!device->ops->ioctl)) ret = -EINVAL; diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 84d76630dd60..a356596f882d 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -225,6 +225,58 @@ struct vfio_device_bind_iommufd { #define VFIO_DEVICE_BIND_IOMMUFD _IO(VFIO_TYPE, VFIO_BASE + 19) +/* + * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20, + * struct vfio_device_attach_iommufd_pt) + * + * Attach a vfio device to an iommufd address space specified by IOAS + * id or hw_pagetable (hwpt) id. + * + * Available only after a device has been bound to iommufd via + * VFIO_DEVICE_BIND_IOMMUFD + * + * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. + * + * @argsz: user filled size of this data. + * @flags: must be 0. + * @pt_id: Input the target id which can represent an ioas or a hwpt + * allocated via iommufd subsystem. + * Output the attached hwpt id which could be the specified + * hwpt itself or a hwpt automatically created for the + * specified ioas by kernel during the attachment. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_attach_iommufd_pt { + __u32 argsz; + __u32 flags; + __u32 pt_id; +}; + +#define VFIO_DEVICE_ATTACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 20) + +/* + * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 21, + * struct vfio_device_detach_iommufd_pt) + * + * Detach a vfio device from the iommufd address space it has been + * attached to. After it, device should be in a blocking DMA state. + * + * Available only after a device has been bound to iommufd via + * VFIO_DEVICE_BIND_IOMMUFD. + * + * @argsz: user filled size of this data. + * @flags: must be 0. + * + * Return: 0 on success, -errno on failure. + */ +struct vfio_device_detach_iommufd_pt { + __u32 argsz; + __u32 flags; +}; + +#define VFIO_DEVICE_DETACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 21) + /** * VFIO_DEVICE_GET_INFO - _IOR(VFIO_TYPE, VFIO_BASE + 7, * struct vfio_device_info) From patchwork Sat Apr 1 15:18:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197125 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E5F6AC6FD1D for ; Sat, 1 Apr 2023 15:19:24 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id D607E10E2D1; Sat, 1 Apr 2023 15:19:23 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 054C010E23B; Sat, 1 Apr 2023 15:18:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362330; x=1711898330; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=B99jEbLR0tNkpukHQr/yNPobSRf3EINVeCE8niNa5rI=; b=jzm9oxzXpIU85cxVi18qukTqEkQaO3wacwIB50gLyGlk4Ifh+w8KSw0O oLu3+mLsLiXbXdQiCpK1geskG13WDYPrQoVg4GiyAAcCE40Xir4t708DU 2B83yZnKxT8pElZy9HSO3hyaFubj88iDN7xm9HJw3kKbH/sHlZPyDDmcu 9dFvVMj/qKAeEcCTUUmPLElaXOFVd5el42IfZpKz22+/1Ni5tuWjU1IHq s9mTORKKT0XFUKYEYi16quDKnAibOZt2Z6PfkgPwu9CCcsGZvOMdX2Jpr XvSteJp5HkbDa7DVKipE29rbkmTDL8cVgxYxssu72C6zyCUOTB/uVetnS g==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411413" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411413" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937239" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937239" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:49 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:32 -0700 Message-Id: <20230401151833.124749-25-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 24/25] vfio: Compile group optionally X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" group code is not needed for vfio device cdev, so with vfio device cdev introduced, the group infrastructures can be compiled out if only cdev is needed. Reviewed-by: Kevin Tian Tested-by: Terrence Xu Tested-by: Nicolin Chen Tested-by: Yanting Jiang Signed-off-by: Yi Liu --- drivers/iommu/iommufd/Kconfig | 4 +- drivers/vfio/Kconfig | 14 +++++ drivers/vfio/Makefile | 2 +- drivers/vfio/vfio.h | 111 ++++++++++++++++++++++++++++++++-- include/linux/vfio.h | 13 +++- 5 files changed, 133 insertions(+), 11 deletions(-) diff --git a/drivers/iommu/iommufd/Kconfig b/drivers/iommu/iommufd/Kconfig index ada693ea51a7..99d4b075df49 100644 --- a/drivers/iommu/iommufd/Kconfig +++ b/drivers/iommu/iommufd/Kconfig @@ -14,8 +14,8 @@ config IOMMUFD if IOMMUFD config IOMMUFD_VFIO_CONTAINER bool "IOMMUFD provides the VFIO container /dev/vfio/vfio" - depends on VFIO && !VFIO_CONTAINER - default VFIO && !VFIO_CONTAINER + depends on VFIO_GROUP && !VFIO_CONTAINER + default VFIO_GROUP && !VFIO_CONTAINER help IOMMUFD will provide /dev/vfio/vfio instead of VFIO. This relies on IOMMUFD providing compatibility emulation to give the same ioctls. diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig index e2105b4dac2d..62c58830a8d5 100644 --- a/drivers/vfio/Kconfig +++ b/drivers/vfio/Kconfig @@ -4,6 +4,8 @@ menuconfig VFIO select IOMMU_API depends on IOMMUFD || !IOMMUFD select INTERVAL_TREE + select VFIO_GROUP if SPAPR_TCE_IOMMU || IOMMUFD=n + select VFIO_DEVICE_CDEV if !VFIO_GROUP select VFIO_CONTAINER if IOMMUFD=n help VFIO provides a framework for secure userspace device drivers. @@ -15,6 +17,7 @@ if VFIO config VFIO_DEVICE_CDEV bool "Support for the VFIO cdev /dev/vfio/devices/vfioX" depends on IOMMUFD + default !VFIO_GROUP help The VFIO device cdev is another way for userspace to get device access. Userspace gets device fd by opening device cdev under @@ -23,9 +26,20 @@ config VFIO_DEVICE_CDEV If you don't know what to do here, say N. +config VFIO_GROUP + bool "Support for the VFIO group /dev/vfio/$group_id" + default y + help + VFIO group support provides the traditional model for accessing + devices through VFIO and is used by the majority of userspace + applications and drivers making use of VFIO. + + If you don't know what to do here, say Y. + config VFIO_CONTAINER bool "Support for the VFIO container /dev/vfio/vfio" select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64) + depends on VFIO_GROUP default y help The VFIO container is the classic interface to VFIO for establishing diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile index 245394aeb94b..57c3515af606 100644 --- a/drivers/vfio/Makefile +++ b/drivers/vfio/Makefile @@ -2,9 +2,9 @@ obj-$(CONFIG_VFIO) += vfio.o vfio-y += vfio_main.o \ - group.o \ iova_bitmap.o vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o +vfio-$(CONFIG_VFIO_GROUP) += group.o vfio-$(CONFIG_IOMMUFD) += iommufd.o vfio-$(CONFIG_VFIO_CONTAINER) += container.o vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index c199e410db18..9c7a238ec8dd 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -36,6 +36,12 @@ vfio_allocate_device_file(struct vfio_device *device); extern const struct file_operations vfio_device_fops; +#ifdef CONFIG_VFIO_NOIOMMU +extern bool vfio_noiommu __read_mostly; +#else +enum { vfio_noiommu = false }; +#endif + enum vfio_group_type { /* * Physical device with IOMMU backing. @@ -60,6 +66,7 @@ enum vfio_group_type { VFIO_NO_IOMMU, }; +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct vfio_group { struct device dev; struct cdev cdev; @@ -113,6 +120,104 @@ static inline void vfio_device_set_noiommu(struct vfio_device *device) device->noiommu = IS_ENABLED(CONFIG_VFIO_NOIOMMU) && device->group->type == VFIO_NO_IOMMU; } +#else +struct vfio_group; + +static inline int vfio_device_block_group(struct vfio_device *device) +{ + return 0; +} + +static inline void vfio_device_unblock_group(struct vfio_device *device) +{ +} + +static inline int vfio_device_set_group(struct vfio_device *device, + enum vfio_group_type type) +{ + return 0; +} + +static inline void vfio_device_remove_group(struct vfio_device *device) +{ +} + +static inline void vfio_device_group_register(struct vfio_device *device) +{ +} + +static inline void vfio_device_group_unregister(struct vfio_device *device) +{ +} + +static inline bool vfio_device_group_uses_container(struct vfio_device_file *df) +{ + return false; +} + +static inline int vfio_device_group_use_iommu(struct vfio_device *device) +{ + return -EOPNOTSUPP; +} + +static inline void vfio_device_group_unuse_iommu(struct vfio_device *device) +{ +} + +static inline void vfio_device_group_close(struct vfio_device_file *df) +{ +} + +static inline struct vfio_group *vfio_group_from_file(struct file *file) +{ + return NULL; +} + +static inline bool vfio_group_has_dev(struct vfio_group *group, + struct vfio_device *device) +{ + return false; +} + +static inline bool vfio_group_enforced_coherent(struct vfio_group *group) +{ + return true; +} + +static inline void vfio_group_set_kvm(struct vfio_group *group, struct kvm *kvm) +{ +} + +static inline bool vfio_device_has_container(struct vfio_device *device) +{ + return false; +} + +static inline int __init vfio_group_init(void) +{ + return 0; +} + +static inline void vfio_group_cleanup(void) +{ +} + +static inline void vfio_device_set_noiommu(struct vfio_device *device) +{ + struct iommu_group *iommu_group; + + device->noiommu = false; + + if (!IS_ENABLED(CONFIG_VFIO_NOIOMMU) || !vfio_noiommu) + return; + + iommu_group = iommu_group_get(device->dev); + if (iommu_group) + iommu_group_put(iommu_group); + else + device->noiommu = true; +} +#endif /* CONFIG_VFIO_GROUP */ #if IS_ENABLED(CONFIG_VFIO_CONTAINER) /** @@ -356,12 +461,6 @@ static inline void vfio_virqfd_exit(void) } #endif -#ifdef CONFIG_VFIO_NOIOMMU -extern bool vfio_noiommu __read_mostly; -#else -enum { vfio_noiommu = false }; -#endif - #ifdef CONFIG_HAVE_KVM void _vfio_device_get_kvm_safe(struct vfio_device *device, struct kvm *kvm); void vfio_device_put_kvm(struct vfio_device *device); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index db717a8f4440..46b313f8bfaf 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -43,7 +43,11 @@ struct vfio_device { */ const struct vfio_migration_ops *mig_ops; const struct vfio_log_ops *log_ops; +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct vfio_group *group; + struct list_head group_next; + struct list_head iommu_entry; +#endif struct vfio_device_set *dev_set; struct list_head dev_set_list; unsigned int migration_flags; @@ -59,8 +63,6 @@ struct vfio_device { refcount_t refcount; /* user count on registered device*/ unsigned int open_count; struct completion comp; - struct list_head group_next; - struct list_head iommu_entry; struct iommufd_access *iommufd_access; void (*put_kvm)(struct kvm *kvm); #if IS_ENABLED(CONFIG_IOMMUFD) @@ -284,7 +286,14 @@ int vfio_mig_get_next_state(struct vfio_device *device, /* * External user API */ +#if IS_ENABLED(CONFIG_VFIO_GROUP) struct iommu_group *vfio_file_iommu_group(struct file *file); +#else +static inline struct iommu_group *vfio_file_iommu_group(struct file *file) +{ + return NULL; +} +#endif bool vfio_file_is_valid(struct file *file); bool vfio_file_enforced_coherent(struct file *file); void vfio_file_set_kvm(struct file *file, struct kvm *kvm); From patchwork Sat Apr 1 15:18:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yi Liu X-Patchwork-Id: 13197120 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D11D7C761AF for ; Sat, 1 Apr 2023 15:19:15 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 02C9310E26C; Sat, 1 Apr 2023 15:19:15 +0000 (UTC) Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by gabe.freedesktop.org (Postfix) with ESMTPS id 119EC10E24B; Sat, 1 Apr 2023 15:18:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1680362331; x=1711898331; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XTW5OvP/lz6wAuWYHWZPf0Ckv407oLnpgloaD6rISDo=; b=EcaqeAekCBGlxD+c2Y6aFEuZc5Z+h6SrY82PcWLqjTJQhJ0A8eiJIhER WQzhpaTqsPpKHKpkWNWtsbQyK7L1WC8/nyzq6XCIILTBE8WquXoU3nWdu w+d1h+YU9hD5SZ+S81vMK8T75j3hQgCh2e2z4RiM71lZQ2D4K3G//PXBA wvyh1J5nz39RbkpRMICFUxWs0bG11fDMuvun709RsJZpk0IEXdW15pQnH Y/2ZXov+I9HfRfqnORCfz9k7ivHV/noFOfoviryRSinYAg16hrhqTV4mb e/Wf8w7CDA41luhRXqWlbqFSnaAHe0xw3AC6rcymUdtz6gf1DScBpNHlf A==; X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="404411425" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="404411425" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 01 Apr 2023 08:18:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10667"; a="678937242" X-IronPort-AV: E=Sophos;i="5.98,310,1673942400"; d="scan'208";a="678937242" Received: from 984fee00a4c6.jf.intel.com ([10.165.58.231]) by orsmga007.jf.intel.com with ESMTP; 01 Apr 2023 08:18:50 -0700 From: Yi Liu To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com Date: Sat, 1 Apr 2023 08:18:33 -0700 Message-Id: <20230401151833.124749-26-yi.l.liu@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230401151833.124749-1-yi.l.liu@intel.com> References: <20230401151833.124749-1-yi.l.liu@intel.com> MIME-Version: 1.0 Subject: [Intel-gfx] [PATCH v9 25/25] docs: vfio: Add vfio device cdev description X-BeenThere: intel-gfx@lists.freedesktop.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Intel graphics driver community testing & development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mjrosato@linux.ibm.com, jasowang@redhat.com, xudong.hao@intel.com, peterx@redhat.com, terrence.xu@intel.com, chao.p.peng@linux.intel.com, linux-s390@vger.kernel.org, yi.l.liu@intel.com, kvm@vger.kernel.org, lulu@redhat.com, yanting.jiang@intel.com, joro@8bytes.org, nicolinc@nvidia.com, yan.y.zhao@intel.com, intel-gfx@lists.freedesktop.org, eric.auger@redhat.com, intel-gvt-dev@lists.freedesktop.org, yi.y.sun@linux.intel.com, cohuck@redhat.com, shameerali.kolothum.thodi@huawei.com, suravee.suthikulpanit@amd.com, robin.murphy@arm.com Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" This gives notes for userspace applications on device cdev usage. Reviewed-by: Kevin Tian Signed-off-by: Yi Liu --- Documentation/driver-api/vfio.rst | 132 ++++++++++++++++++++++++++++++ 1 file changed, 132 insertions(+) diff --git a/Documentation/driver-api/vfio.rst b/Documentation/driver-api/vfio.rst index 363e12c90b87..4f21be7bda8a 100644 --- a/Documentation/driver-api/vfio.rst +++ b/Documentation/driver-api/vfio.rst @@ -239,6 +239,130 @@ group and can access them as follows:: /* Gratuitous device reset and go... */ ioctl(device, VFIO_DEVICE_RESET); +IOMMUFD and vfio_iommu_type1 +---------------------------- + +IOMMUFD is the new user API to manage I/O page tables from userspace. +It intends to be the portal of delivering advanced userspace DMA +features (nested translation [5], PASID [6], etc.) while also providing +a backwards compatibility interface for existing VFIO_TYPE1v2_IOMMU use +cases. Eventually the vfio_iommu_type1 driver, as well as the legacy +vfio container and group model is intended to be deprecated. + +The IOMMUFD backwards compatibility interface can be enabled two ways. +In the first method, the kernel can be configured with +CONFIG_IOMMUFD_VFIO_CONTAINER, in which case the IOMMUFD subsystem +transparently provides the entire infrastructure for the VFIO +container and IOMMU backend interfaces. The compatibility mode can +also be accessed if the VFIO container interface, ie. /dev/vfio/vfio is +simply symlink'd to /dev/iommu. Note that at the time of writing, the +compatibility mode is not entirely feature complete relative to +VFIO_TYPE1v2_IOMMU (ex. DMA mapping MMIO) and does not attempt to +provide compatibility to the VFIO_SPAPR_TCE_IOMMU interface. Therefore +it is not generally advisable at this time to switch from native VFIO +implementations to the IOMMUFD compatibility interfaces. + +Long term, VFIO users should migrate to device access through the cdev +interface described below, and native access through the IOMMUFD +provided interfaces. + +VFIO Device cdev +---------------- + +Traditionally user acquires a device fd via VFIO_GROUP_GET_DEVICE_FD +in a VFIO group. + +With CONFIG_VFIO_DEVICE_CDEV=y the user can now acquire a device fd +by directly opening a character device /dev/vfio/devices/vfioX where +"X" is the number allocated uniquely by VFIO for registered devices. +For noiommu devices, the character device would be named with "noiommu-" +prefix. e.g. /dev/vfio/devices/noiommu-vfioX. + +The cdev only works with IOMMUFD. Both VFIO drivers and applications +must adapt to the new cdev security model which requires using +VFIO_DEVICE_BIND_IOMMUFD to claim DMA ownership before starting to +actually use the device. Once BIND succeeds then a VFIO device can +be fully accessed by the user. + +VFIO device cdev doesn't rely on VFIO group/container/iommu drivers. +Hence those modules can be fully compiled out in an environment +where no legacy VFIO application exists. + +So far SPAPR does not support IOMMUFD yet. So it cannot support device +cdev neither. + +Device cdev Example +------------------- + +Assume user wants to access PCI device 0000:6a:01.0:: + + $ ls /sys/bus/pci/devices/0000:6a:01.0/vfio-dev/ + vfio0 + +This device is therefore represented as vfio0. The user can verify +its existence:: + + $ ls -l /dev/vfio/devices/vfio0 + crw------- 1 root root 511, 0 Feb 16 01:22 /dev/vfio/devices/vfio0 + $ cat /sys/bus/pci/devices/0000:6a:01.0/vfio-dev/vfio0/dev + 511:0 + $ ls -l /dev/char/511\:0 + lrwxrwxrwx 1 root root 21 Feb 16 01:22 /dev/char/511:0 -> ../vfio/devices/vfio0 + +Then provide the user with access to the device if unprivileged +operation is desired:: + + $ chown user:user /dev/vfio/devices/vfio0 + +Finally the user could get cdev fd by:: + + cdev_fd = open("/dev/vfio/devices/vfio0", O_RDWR); + +An opened cdev_fd doesn't give the user any permission of accessing +the device except binding the cdev_fd to an iommufd. After that point +then the device is fully accessible including attaching it to an +IOMMUFD IOAS/HWPT to enable userspace DMA:: + + struct vfio_device_bind_iommufd bind = { + .argsz = sizeof(bind), + .flags = 0, + }; + struct iommu_ioas_alloc alloc_data = { + .size = sizeof(alloc_data), + .flags = 0, + }; + struct vfio_device_attach_iommufd_pt attach_data = { + .argsz = sizeof(attach_data), + .flags = 0, + }; + struct iommu_ioas_map map = { + .size = sizeof(map), + .flags = IOMMU_IOAS_MAP_READABLE | + IOMMU_IOAS_MAP_WRITEABLE | + IOMMU_IOAS_MAP_FIXED_IOVA, + .__reserved = 0, + }; + + iommufd = open("/dev/iommu", O_RDWR); + + bind.iommufd = iommufd; // negative value means vfio-noiommu mode + ioctl(cdev_fd, VFIO_DEVICE_BIND_IOMMUFD, &bind); + + ioctl(iommufd, IOMMU_IOAS_ALLOC, &alloc_data); + attach_data.pt_id = alloc_data.out_ioas_id; + ioctl(cdev_fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data); + + /* Allocate some space and setup a DMA mapping */ + map.user_va = (int64_t)mmap(0, 1024 * 1024, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, 0, 0); + map.iova = 0; /* 1MB starting at 0x0 from device view */ + map.length = 1024 * 1024; + map.ioas_id = alloc_data.out_ioas_id;; + + ioctl(iommufd, IOMMU_IOAS_MAP, &map); + + /* Other device operations as stated in "VFIO Usage Example" */ + VFIO User API ------------------------------------------------------------------------------- @@ -566,3 +690,11 @@ This implementation has some specifics: \-0d.1 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 90) + +.. [5] Nested translation is an IOMMU feature which supports two stage + address translations. This improves the address translation efficiency + in IOMMU virtualization. + +.. [6] PASID stands for Process Address Space ID, introduced by PCI + Express. It is a prerequisite for Shared Virtual Addressing (SVA) + and Scalable I/O Virtualization (Scalable IOV).