From patchwork Thu Oct 26 10:30:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437505 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F1BCFC25B48 for ; Thu, 26 Oct 2023 10:53:08 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxsa-0007Jw-Oj; Thu, 26 Oct 2023 06:46:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsZ-0007Ja-IJ; Thu, 26 Oct 2023 06:46:19 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsX-0001HP-C7; Thu, 26 Oct 2023 06:46:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317177; x=1729853177; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=J++MbWPmiluLQiN3hmBvc0spY8b3xhfCQjaAo9rHYLI=; b=mPNnGPnT6V0WNCUbMGzIqXXF+h+WnVXGrhrODekqnnT53SvxeAUTgIf0 a9sD9TrPhgiVe/BE7r/9MTPPdZDnd80jx3ZJJqTWuJch9jt0vwdgsyupy jH1ifHFlCDTKDbZuzmthDFAGKxDQk3YKbBly8/9VFA5Shas2uwrOpmbDW 9/r6RPuB4xSofrzJJ91Ctam0ysSrOa7K5Mt05xgVL+NaT5kpxpaw5Fvv1 IaSeIbWz0tT04nVL8w8mcU43fnV/Z0MmQVN0rTmHHdU1lLY01LcahrtKp 06YgUGot6duzGLUR11Wm03VN9nwK3HwgpZVyHLHSDPpz37VA9kuxu1qU5 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563177" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563177" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:12 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463561" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:45:56 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , David Gibson , Harsh Prateek Bora , Thomas Huth , Tony Krowiak , Halil Pasic , Jason Herne , Eric Farman , Matthew Rosato , qemu-ppc@nongnu.org (open list:sPAPR (pseries)), qemu-s390x@nongnu.org (open list:S390 general arch...) Subject: [PATCH v3 01/37] vfio/container: Move IBM EEH related functions into spapr_pci_vfio.c Date: Thu, 26 Oct 2023 18:30:28 +0800 Message-Id: <20231026103104.1686921-2-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org With vfio_eeh_as_ok/vfio_eeh_as_op moved and made static, vfio.h becomes empty and is deleted. No functional changes intended. Suggested-by: Cédric Le Goater Signed-off-by: Zhenzhong Duan Acked-by: Eric Farman Reviewed-by: Cédric Le Goater --- include/hw/vfio/vfio.h | 7 --- hw/ppc/spapr_pci_vfio.c | 100 +++++++++++++++++++++++++++++++++++++++- hw/vfio/ap.c | 1 - hw/vfio/ccw.c | 1 - hw/vfio/common.c | 1 - hw/vfio/container.c | 98 --------------------------------------- hw/vfio/helpers.c | 1 - 7 files changed, 99 insertions(+), 110 deletions(-) delete mode 100644 include/hw/vfio/vfio.h diff --git a/include/hw/vfio/vfio.h b/include/hw/vfio/vfio.h deleted file mode 100644 index 86248f5436..0000000000 --- a/include/hw/vfio/vfio.h +++ /dev/null @@ -1,7 +0,0 @@ -#ifndef HW_VFIO_H -#define HW_VFIO_H - -bool vfio_eeh_as_ok(AddressSpace *as); -int vfio_eeh_as_op(AddressSpace *as, uint32_t op); - -#endif diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index 9016720547..f283f7e38d 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -18,14 +18,112 @@ */ #include "qemu/osdep.h" +#include #include #include "hw/ppc/spapr.h" #include "hw/pci-host/spapr.h" #include "hw/pci/msix.h" #include "hw/pci/pci_device.h" -#include "hw/vfio/vfio.h" +#include "hw/vfio/vfio-common.h" #include "qemu/error-report.h" +/* + * Interfaces for IBM EEH (Enhanced Error Handling) + */ +static bool vfio_eeh_container_ok(VFIOContainer *container) +{ + /* + * As of 2016-03-04 (linux-4.5) the host kernel EEH/VFIO + * implementation is broken if there are multiple groups in a + * container. The hardware works in units of Partitionable + * Endpoints (== IOMMU groups) and the EEH operations naively + * iterate across all groups in the container, without any logic + * to make sure the groups have their state synchronized. For + * certain operations (ENABLE) that might be ok, until an error + * occurs, but for others (GET_STATE) it's clearly broken. + */ + + /* + * XXX Once fixed kernels exist, test for them here + */ + + if (QLIST_EMPTY(&container->group_list)) { + return false; + } + + if (QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) { + return false; + } + + return true; +} + +static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op) +{ + struct vfio_eeh_pe_op pe_op = { + .argsz = sizeof(pe_op), + .op = op, + }; + int ret; + + if (!vfio_eeh_container_ok(container)) { + error_report("vfio/eeh: EEH_PE_OP 0x%x: " + "kernel requires a container with exactly one group", op); + return -EPERM; + } + + ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op); + if (ret < 0) { + error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op); + return -errno; + } + + return ret; +} + +static VFIOContainer *vfio_eeh_as_container(AddressSpace *as) +{ + VFIOAddressSpace *space = vfio_get_address_space(as); + VFIOContainer *container = NULL; + + if (QLIST_EMPTY(&space->containers)) { + /* No containers to act on */ + goto out; + } + + container = QLIST_FIRST(&space->containers); + + if (QLIST_NEXT(container, next)) { + /* + * We don't yet have logic to synchronize EEH state across + * multiple containers + */ + container = NULL; + goto out; + } + +out: + vfio_put_address_space(space); + return container; +} + +static bool vfio_eeh_as_ok(AddressSpace *as) +{ + VFIOContainer *container = vfio_eeh_as_container(as); + + return (container != NULL) && vfio_eeh_container_ok(container); +} + +static int vfio_eeh_as_op(AddressSpace *as, uint32_t op) +{ + VFIOContainer *container = vfio_eeh_as_container(as); + + if (!container) { + return -ENODEV; + } + return vfio_eeh_container_op(container, op); +} + bool spapr_phb_eeh_available(SpaprPhbState *sphb) { return vfio_eeh_as_ok(&sphb->iommu_as); diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c index 5f257bffb9..bbf69ff55a 100644 --- a/hw/vfio/ap.c +++ b/hw/vfio/ap.c @@ -14,7 +14,6 @@ #include #include #include "qapi/error.h" -#include "hw/vfio/vfio.h" #include "hw/vfio/vfio-common.h" #include "hw/s390x/ap-device.h" #include "qemu/error-report.h" diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 6623ae237b..d857bb8d0f 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -20,7 +20,6 @@ #include #include "qapi/error.h" -#include "hw/vfio/vfio.h" #include "hw/vfio/vfio-common.h" #include "hw/s390x/s390-ccw.h" #include "hw/s390x/vfio-ccw.h" diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 9c5c6433f2..e72055e752 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -26,7 +26,6 @@ #include #include "hw/vfio/vfio-common.h" -#include "hw/vfio/vfio.h" #include "hw/vfio/pci.h" #include "exec/address-spaces.h" #include "exec/memory.h" diff --git a/hw/vfio/container.c b/hw/vfio/container.c index fc88222377..83c0f05bba 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -26,7 +26,6 @@ #include #include "hw/vfio/vfio-common.h" -#include "hw/vfio/vfio.h" #include "exec/address-spaces.h" #include "exec/memory.h" #include "exec/ram_addr.h" @@ -1011,103 +1010,6 @@ static void vfio_put_base_device(VFIODevice *vbasedev) close(vbasedev->fd); } -/* - * Interfaces for IBM EEH (Enhanced Error Handling) - */ -static bool vfio_eeh_container_ok(VFIOContainer *container) -{ - /* - * As of 2016-03-04 (linux-4.5) the host kernel EEH/VFIO - * implementation is broken if there are multiple groups in a - * container. The hardware works in units of Partitionable - * Endpoints (== IOMMU groups) and the EEH operations naively - * iterate across all groups in the container, without any logic - * to make sure the groups have their state synchronized. For - * certain operations (ENABLE) that might be ok, until an error - * occurs, but for others (GET_STATE) it's clearly broken. - */ - - /* - * XXX Once fixed kernels exist, test for them here - */ - - if (QLIST_EMPTY(&container->group_list)) { - return false; - } - - if (QLIST_NEXT(QLIST_FIRST(&container->group_list), container_next)) { - return false; - } - - return true; -} - -static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op) -{ - struct vfio_eeh_pe_op pe_op = { - .argsz = sizeof(pe_op), - .op = op, - }; - int ret; - - if (!vfio_eeh_container_ok(container)) { - error_report("vfio/eeh: EEH_PE_OP 0x%x: " - "kernel requires a container with exactly one group", op); - return -EPERM; - } - - ret = ioctl(container->fd, VFIO_EEH_PE_OP, &pe_op); - if (ret < 0) { - error_report("vfio/eeh: EEH_PE_OP 0x%x failed: %m", op); - return -errno; - } - - return ret; -} - -static VFIOContainer *vfio_eeh_as_container(AddressSpace *as) -{ - VFIOAddressSpace *space = vfio_get_address_space(as); - VFIOContainer *container = NULL; - - if (QLIST_EMPTY(&space->containers)) { - /* No containers to act on */ - goto out; - } - - container = QLIST_FIRST(&space->containers); - - if (QLIST_NEXT(container, next)) { - /* - * We don't yet have logic to synchronize EEH state across - * multiple containers - */ - container = NULL; - goto out; - } - -out: - vfio_put_address_space(space); - return container; -} - -bool vfio_eeh_as_ok(AddressSpace *as) -{ - VFIOContainer *container = vfio_eeh_as_container(as); - - return (container != NULL) && vfio_eeh_container_ok(container); -} - -int vfio_eeh_as_op(AddressSpace *as, uint32_t op) -{ - VFIOContainer *container = vfio_eeh_as_container(as); - - if (!container) { - return -ENODEV; - } - return vfio_eeh_container_op(container, op); -} - static int vfio_device_groupid(VFIODevice *vbasedev, Error **errp) { char *tmp, group_path[PATH_MAX], *group_name; diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index 7e5da21b31..168847e7c5 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -23,7 +23,6 @@ #include #include "hw/vfio/vfio-common.h" -#include "hw/vfio/vfio.h" #include "hw/hw.h" #include "trace.h" #include "qapi/error.h" From patchwork Thu Oct 26 10:30:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437472 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0488EC25B6B for ; Thu, 26 Oct 2023 10:47:57 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxsc-0007Km-3S; Thu, 26 Oct 2023 06:46:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsa-0007Jy-PH; Thu, 26 Oct 2023 06:46:20 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsY-0001Gp-Ip; Thu, 26 Oct 2023 06:46:20 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317178; x=1729853178; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=FWH8NzIQfnAtMON1iP6JV05ZRsxBefa3j3QkoTLfZ+0=; b=eNZVFh1L8leq/V4FJzpIupYRz5y/vUuyXr9ynL3DKQlf1B7rlq5e1umr 0KaRQPSEaIvVE2pK0s3wb5PHhznQ2wa0+2AwKrrfLhr+lCi2iK7v1csgY lqFKrPP1d9G99JXhzk9fbS5PDfImRdKmIqCFyNqKo/mH1exIDq8yDNQLp 9osnsVafzDCCmUQHWNTXCHeCUK2mnjPfNxuzDbPkus47rQg/6pl9X6jKp nmRssGst8WWE3H7qWezeCRbj+gyxdtS+jBk0aPfm1H1wIweKo9fH0Pn3y vFasx2uvR7KocYo2aF7l8M/ydZUMT2gyXEDtVZ56MGXbmRzdVcWrSkzZa A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563193" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563193" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:17 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463567" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:02 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 02/37] vfio/container: Move vfio_container_add/del_section_window into spapr.c Date: Thu, 26 Oct 2023 18:30:29 +0800 Message-Id: <20231026103104.1686921-3-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org vfio_container_add/del_section_window are spapr specific functions, so move them into spapr.c to make container.c cleaner. No functional changes intended. Suggested-by: Cédric Le Goater Signed-off-by: Zhenzhong Duan Reviewed-by: Cédric Le Goater --- hw/vfio/container.c | 90 --------------------------------------------- hw/vfio/spapr.c | 90 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 90 insertions(+), 90 deletions(-) diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 83c0f05bba..7a3f005d1b 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -20,9 +20,6 @@ #include "qemu/osdep.h" #include -#ifdef CONFIG_KVM -#include -#endif #include #include "hw/vfio/vfio-common.h" @@ -32,7 +29,6 @@ #include "hw/hw.h" #include "qemu/error-report.h" #include "qemu/range.h" -#include "sysemu/kvm.h" #include "sysemu/reset.h" #include "trace.h" #include "qapi/error.h" @@ -204,92 +200,6 @@ int vfio_dma_map(VFIOContainer *container, hwaddr iova, return -errno; } -int vfio_container_add_section_window(VFIOContainer *container, - MemoryRegionSection *section, - Error **errp) -{ - VFIOHostDMAWindow *hostwin; - hwaddr pgsize = 0; - int ret; - - if (container->iommu_type != VFIO_SPAPR_TCE_v2_IOMMU) { - return 0; - } - - /* For now intersections are not allowed, we may relax this later */ - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { - if (ranges_overlap(hostwin->min_iova, - hostwin->max_iova - hostwin->min_iova + 1, - section->offset_within_address_space, - int128_get64(section->size))) { - error_setg(errp, - "region [0x%"PRIx64",0x%"PRIx64"] overlaps with existing" - "host DMA window [0x%"PRIx64",0x%"PRIx64"]", - section->offset_within_address_space, - section->offset_within_address_space + - int128_get64(section->size) - 1, - hostwin->min_iova, hostwin->max_iova); - return -EINVAL; - } - } - - ret = vfio_spapr_create_window(container, section, &pgsize); - if (ret) { - error_setg_errno(errp, -ret, "Failed to create SPAPR window"); - return ret; - } - - vfio_host_win_add(container, section->offset_within_address_space, - section->offset_within_address_space + - int128_get64(section->size) - 1, pgsize); -#ifdef CONFIG_KVM - if (kvm_enabled()) { - VFIOGroup *group; - IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr); - struct kvm_vfio_spapr_tce param; - struct kvm_device_attr attr = { - .group = KVM_DEV_VFIO_GROUP, - .attr = KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE, - .addr = (uint64_t)(unsigned long)¶m, - }; - - if (!memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_SPAPR_TCE_FD, - ¶m.tablefd)) { - QLIST_FOREACH(group, &container->group_list, container_next) { - param.groupfd = group->fd; - if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) { - error_setg_errno(errp, errno, - "vfio: failed GROUP_SET_SPAPR_TCE for " - "KVM VFIO device %d and group fd %d", - param.tablefd, param.groupfd); - return -errno; - } - trace_vfio_spapr_group_attach(param.groupfd, param.tablefd); - } - } - } -#endif - return 0; -} - -void vfio_container_del_section_window(VFIOContainer *container, - MemoryRegionSection *section) -{ - if (container->iommu_type != VFIO_SPAPR_TCE_v2_IOMMU) { - return; - } - - vfio_spapr_remove_window(container, - section->offset_within_address_space); - if (vfio_host_win_del(container, - section->offset_within_address_space, - section->offset_within_address_space + - int128_get64(section->size) - 1) < 0) { - hw_error("%s: Cannot delete missing window at %"HWADDR_PRIx, - __func__, section->offset_within_address_space); - } -} - int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start) { int ret; diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 9ec1e95f6d..9a7517c042 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -11,6 +11,10 @@ #include "qemu/osdep.h" #include #include +#ifdef CONFIG_KVM +#include +#endif +#include "sysemu/kvm.h" #include "hw/vfio/vfio-common.h" #include "hw/hw.h" @@ -253,3 +257,89 @@ int vfio_spapr_remove_window(VFIOContainer *container, return 0; } + +int vfio_container_add_section_window(VFIOContainer *container, + MemoryRegionSection *section, + Error **errp) +{ + VFIOHostDMAWindow *hostwin; + hwaddr pgsize = 0; + int ret; + + if (container->iommu_type != VFIO_SPAPR_TCE_v2_IOMMU) { + return 0; + } + + /* For now intersections are not allowed, we may relax this later */ + QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + if (ranges_overlap(hostwin->min_iova, + hostwin->max_iova - hostwin->min_iova + 1, + section->offset_within_address_space, + int128_get64(section->size))) { + error_setg(errp, + "region [0x%"PRIx64",0x%"PRIx64"] overlaps with existing" + "host DMA window [0x%"PRIx64",0x%"PRIx64"]", + section->offset_within_address_space, + section->offset_within_address_space + + int128_get64(section->size) - 1, + hostwin->min_iova, hostwin->max_iova); + return -EINVAL; + } + } + + ret = vfio_spapr_create_window(container, section, &pgsize); + if (ret) { + error_setg_errno(errp, -ret, "Failed to create SPAPR window"); + return ret; + } + + vfio_host_win_add(container, section->offset_within_address_space, + section->offset_within_address_space + + int128_get64(section->size) - 1, pgsize); +#ifdef CONFIG_KVM + if (kvm_enabled()) { + VFIOGroup *group; + IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr); + struct kvm_vfio_spapr_tce param; + struct kvm_device_attr attr = { + .group = KVM_DEV_VFIO_GROUP, + .attr = KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE, + .addr = (uint64_t)(unsigned long)¶m, + }; + + if (!memory_region_iommu_get_attr(iommu_mr, IOMMU_ATTR_SPAPR_TCE_FD, + ¶m.tablefd)) { + QLIST_FOREACH(group, &container->group_list, container_next) { + param.groupfd = group->fd; + if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) { + error_setg_errno(errp, errno, + "vfio: failed GROUP_SET_SPAPR_TCE for " + "KVM VFIO device %d and group fd %d", + param.tablefd, param.groupfd); + return -errno; + } + trace_vfio_spapr_group_attach(param.groupfd, param.tablefd); + } + } + } +#endif + return 0; +} + +void vfio_container_del_section_window(VFIOContainer *container, + MemoryRegionSection *section) +{ + if (container->iommu_type != VFIO_SPAPR_TCE_v2_IOMMU) { + return; + } + + vfio_spapr_remove_window(container, + section->offset_within_address_space); + if (vfio_host_win_del(container, + section->offset_within_address_space, + section->offset_within_address_space + + int128_get64(section->size) - 1) < 0) { + hw_error("%s: Cannot delete missing window at %"HWADDR_PRIx, + __func__, section->offset_within_address_space); + } +} From patchwork Thu Oct 26 10:30:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437498 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 15973C25B6B for ; Thu, 26 Oct 2023 10:51:43 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxsh-0007M3-93; Thu, 26 Oct 2023 06:46:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsf-0007Lc-SE; Thu, 26 Oct 2023 06:46:26 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsd-0001Hy-SM; Thu, 26 Oct 2023 06:46:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317183; x=1729853183; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VfIkbI1fdwCJdZ1kaD6+6suI4Zq2iPxTLHayCXaq2U0=; b=ig+Hg5zo2CV5S5WsnAKsdj5RkND7LgQQxrCGiOo3gCb6qkxBgosy0z6d 0y0tBDgemgDR134sfS6OWTV0CmALijdJQhZr6+j2dsF/2h9qa+WNl3l8e 2NHCNdkkp9t5fvdK2hWzOn0B6WI48J3SAje/jbXRraXRfugQRjAzEjmW1 uQw/Zi+zss7BnZ9ONEqEp6Us6QWzW0J9MpceT0f+2A8bRoLn400re1GZp 2yGf/cE0bm9AZDT5xyDKKMT8pjTYbN08MjI0h4XBXCbF5TBSg40Dw77dL P2r0MFaqz3crWIGynsHNvDGjvErXDiTfpsqw8KtfrjKlYAxU5thCjLoqh Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563207" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563207" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:21 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463580" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:07 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 03/37] vfio/container: Move spapr specific init/deinit into spapr.c Date: Thu, 26 Oct 2023 18:30:30 +0800 Message-Id: <20231026103104.1686921-4-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Move spapr specific init/deinit code into spapr.c and wrap them with vfio_spapr_container_init/deinit, this way footprint of spapr is further reduced, vfio_prereg_listener could also be made static. vfio_listener_release is unnecessary when prereg_listener is moved out, so have it removed. No functional changes intended. Suggested-by: Cédric Le Goater Signed-off-by: Zhenzhong Duan Reviewed-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 4 +- hw/vfio/container.c | 82 +++++------------------------------ hw/vfio/spapr.c | 80 +++++++++++++++++++++++++++++++++- 3 files changed, 94 insertions(+), 72 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 0c3d390e8b..8d1d4f0a89 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -225,11 +225,14 @@ int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start); int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size); +/* SPAPR specific */ int vfio_container_add_section_window(VFIOContainer *container, MemoryRegionSection *section, Error **errp); void vfio_container_del_section_window(VFIOContainer *container, MemoryRegionSection *section); +bool vfio_spapr_container_init(VFIOContainer *container, Error **errp); +void vfio_spapr_container_deinit(VFIOContainer *container); void vfio_disable_irqindex(VFIODevice *vbasedev, int index); void vfio_unmask_single_irqindex(VFIODevice *vbasedev, int index); @@ -289,7 +292,6 @@ vfio_get_device_info_cap(struct vfio_device_info *info, uint16_t id); struct vfio_info_cap_header * vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id); #endif -extern const MemoryListener vfio_prereg_listener; int vfio_spapr_create_window(VFIOContainer *container, MemoryRegionSection *section, diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 7a3f005d1b..204b244b11 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -264,14 +264,6 @@ int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, return ret; } -static void vfio_listener_release(VFIOContainer *container) -{ - memory_listener_unregister(&container->listener); - if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) { - memory_listener_unregister(&container->prereg_listener); - } -} - static struct vfio_info_cap_header * vfio_get_iommu_type1_info_cap(struct vfio_iommu_type1_info *info, uint16_t id) { @@ -612,69 +604,11 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, case VFIO_SPAPR_TCE_v2_IOMMU: case VFIO_SPAPR_TCE_IOMMU: { - struct vfio_iommu_spapr_tce_info info; - bool v2 = container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU; - - /* - * The host kernel code implementing VFIO_IOMMU_DISABLE is called - * when container fd is closed so we do not call it explicitly - * in this file. - */ - if (!v2) { - ret = ioctl(fd, VFIO_IOMMU_ENABLE); - if (ret) { - error_setg_errno(errp, errno, "failed to enable container"); - ret = -errno; - goto enable_discards_exit; - } - } else { - container->prereg_listener = vfio_prereg_listener; - - memory_listener_register(&container->prereg_listener, - &address_space_memory); - if (container->error) { - memory_listener_unregister(&container->prereg_listener); - ret = -1; - error_propagate_prepend(errp, container->error, - "RAM memory listener initialization failed: "); - goto enable_discards_exit; - } - } - - info.argsz = sizeof(info); - ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info); + ret = vfio_spapr_container_init(container, errp); if (ret) { - error_setg_errno(errp, errno, - "VFIO_IOMMU_SPAPR_TCE_GET_INFO failed"); - ret = -errno; - if (v2) { - memory_listener_unregister(&container->prereg_listener); - } goto enable_discards_exit; } - - if (v2) { - container->pgsizes = info.ddw.pgsizes; - /* - * There is a default window in just created container. - * To make region_add/del simpler, we better remove this - * window now and let those iommu_listener callbacks - * create/remove them when needed. - */ - ret = vfio_spapr_remove_window(container, info.dma32_window_start); - if (ret) { - error_setg_errno(errp, -ret, - "failed to remove existing window"); - goto enable_discards_exit; - } - } else { - /* The default table uses 4K pages */ - container->pgsizes = 0x1000; - vfio_host_win_add(container, info.dma32_window_start, - info.dma32_window_start + - info.dma32_window_size - 1, - 0x1000); - } + break; } } @@ -704,7 +638,11 @@ listener_release_exit: QLIST_REMOVE(group, container_next); QLIST_REMOVE(container, next); vfio_kvm_device_del_group(group); - vfio_listener_release(container); + memory_listener_unregister(&container->listener); + if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU || + container->iommu_type == VFIO_SPAPR_TCE_IOMMU) { + vfio_spapr_container_deinit(container); + } enable_discards_exit: vfio_ram_block_discard_disable(container, false); @@ -734,7 +672,11 @@ static void vfio_disconnect_container(VFIOGroup *group) * group. */ if (QLIST_EMPTY(&container->group_list)) { - vfio_listener_release(container); + memory_listener_unregister(&container->listener); + if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU || + container->iommu_type == VFIO_SPAPR_TCE_IOMMU) { + vfio_spapr_container_deinit(container); + } } if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) { diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 9a7517c042..fd0613dfde 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -15,6 +15,7 @@ #include #endif #include "sysemu/kvm.h" +#include "exec/address-spaces.h" #include "hw/vfio/vfio-common.h" #include "hw/hw.h" @@ -139,7 +140,7 @@ static void vfio_prereg_listener_region_del(MemoryListener *listener, trace_vfio_prereg_unregister(reg.vaddr, reg.size, ret ? -errno : 0); } -const MemoryListener vfio_prereg_listener = { +static const MemoryListener vfio_prereg_listener = { .name = "vfio-pre-reg", .region_add = vfio_prereg_listener_region_add, .region_del = vfio_prereg_listener_region_del, @@ -343,3 +344,80 @@ void vfio_container_del_section_window(VFIOContainer *container, __func__, section->offset_within_address_space); } } + +bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) +{ + + struct vfio_iommu_spapr_tce_info info; + bool v2 = container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU; + int ret, fd = container->fd; + + /* + * The host kernel code implementing VFIO_IOMMU_DISABLE is called + * when container fd is closed so we do not call it explicitly + * in this file. + */ + if (!v2) { + ret = ioctl(fd, VFIO_IOMMU_ENABLE); + if (ret) { + error_setg_errno(errp, errno, "failed to enable container"); + return -errno; + } + } else { + container->prereg_listener = vfio_prereg_listener; + + memory_listener_register(&container->prereg_listener, + &address_space_memory); + if (container->error) { + ret = -1; + error_propagate_prepend(errp, container->error, + "RAM memory listener initialization failed: "); + goto listener_unregister_exit; + } + } + + info.argsz = sizeof(info); + ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info); + if (ret) { + error_setg_errno(errp, errno, + "VFIO_IOMMU_SPAPR_TCE_GET_INFO failed"); + ret = -errno; + goto listener_unregister_exit; + } + + if (v2) { + container->pgsizes = info.ddw.pgsizes; + /* + * There is a default window in just created container. + * To make region_add/del simpler, we better remove this + * window now and let those iommu_listener callbacks + * create/remove them when needed. + */ + ret = vfio_spapr_remove_window(container, info.dma32_window_start); + if (ret) { + error_setg_errno(errp, -ret, + "failed to remove existing window"); + goto listener_unregister_exit; + } + } else { + /* The default table uses 4K pages */ + container->pgsizes = 0x1000; + vfio_host_win_add(container, info.dma32_window_start, + info.dma32_window_start + + info.dma32_window_size - 1, + 0x1000); + } + +listener_unregister_exit: + if (v2) { + memory_listener_unregister(&container->prereg_listener); + } + return ret; +} + +void vfio_spapr_container_deinit(VFIOContainer *container) +{ + if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) { + memory_listener_unregister(&container->prereg_listener); + } +} From patchwork Thu Oct 26 10:30:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437480 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 680ADC25B70 for ; Thu, 26 Oct 2023 10:49:45 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxsl-0007ND-NR; Thu, 26 Oct 2023 06:46:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsj-0007Mn-Fz; Thu, 26 Oct 2023 06:46:29 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsh-0001Hy-Jy; Thu, 26 Oct 2023 06:46:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317187; x=1729853187; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yuZLpoamk7ehr3AZoYC8ULdgvhr7U0wkHNbZXdurKlo=; b=c2/94Af81RdT8DDxuqEwKgMTJvPcfLvRMQckZyOpwj/c1U7ol5uWmIdC 5iEcNpZseTMQ8tMzU9syUd3Rgd2m1oMPM9PuF3n3aXJV7FhBbquKy/SDe qKcur0lWWPvjeSilyYFkFIJb+/1fQJkbxQKODRNjuN+yXoV5mOgbzQ/SG DOGKsVXdKJwWEIHTXsSlgLwemXbstdUjpucbNy6ZFVU/PqjCpDpln7THx 65Mj7v0ljvEvGsacFI9TCsBdylRaFHZaPJ3kJBCh4nzn74cVUKBo+9VLh Ct+e6oNnuBPpuPE2aj7aQkUQzFgZYAn5eU3+8c0PVYfWw0qTEbnpxnLDo A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563219" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563219" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463588" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:11 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 04/37] vfio/spapr: Make vfio_spapr_create/remove_window static Date: Thu, 26 Oct 2023 18:30:31 +0800 Message-Id: <20231026103104.1686921-5-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org vfio_spapr_create_window calls vfio_spapr_remove_window, With reoder of definition of the two, we can make vfio_spapr_create/remove_window static. No functional changes intended. Signed-off-by: Zhenzhong Duan Reviewed-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 6 ----- hw/vfio/spapr.c | 48 +++++++++++++++++------------------ 2 files changed, 24 insertions(+), 30 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 8d1d4f0a89..f3174b3c5c 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -293,12 +293,6 @@ struct vfio_info_cap_header * vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id); #endif -int vfio_spapr_create_window(VFIOContainer *container, - MemoryRegionSection *section, - hwaddr *pgsize); -int vfio_spapr_remove_window(VFIOContainer *container, - hwaddr offset_within_address_space); - bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp); void vfio_migration_exit(VFIODevice *vbasedev); diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index fd0613dfde..43adbdb7b3 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -146,9 +146,30 @@ static const MemoryListener vfio_prereg_listener = { .region_del = vfio_prereg_listener_region_del, }; -int vfio_spapr_create_window(VFIOContainer *container, - MemoryRegionSection *section, - hwaddr *pgsize) +static int vfio_spapr_remove_window(VFIOContainer *container, + hwaddr offset_within_address_space) +{ + struct vfio_iommu_spapr_tce_remove remove = { + .argsz = sizeof(remove), + .start_addr = offset_within_address_space, + }; + int ret; + + ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove); + if (ret) { + error_report("Failed to remove window at %"PRIx64, + (uint64_t)remove.start_addr); + return -errno; + } + + trace_vfio_spapr_remove_window(offset_within_address_space); + + return 0; +} + +static int vfio_spapr_create_window(VFIOContainer *container, + MemoryRegionSection *section, + hwaddr *pgsize) { int ret = 0; IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr); @@ -238,27 +259,6 @@ int vfio_spapr_create_window(VFIOContainer *container, return 0; } -int vfio_spapr_remove_window(VFIOContainer *container, - hwaddr offset_within_address_space) -{ - struct vfio_iommu_spapr_tce_remove remove = { - .argsz = sizeof(remove), - .start_addr = offset_within_address_space, - }; - int ret; - - ret = ioctl(container->fd, VFIO_IOMMU_SPAPR_TCE_REMOVE, &remove); - if (ret) { - error_report("Failed to remove window at %"PRIx64, - (uint64_t)remove.start_addr); - return -errno; - } - - trace_vfio_spapr_remove_window(offset_within_address_space); - - return 0; -} - int vfio_container_add_section_window(VFIOContainer *container, MemoryRegionSection *section, Error **errp) From patchwork Thu Oct 26 10:30:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437478 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0C29BC25B6B for ; Thu, 26 Oct 2023 10:49:44 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxsr-0007O3-6B; Thu, 26 Oct 2023 06:46:37 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxso-0007Ng-Dv; Thu, 26 Oct 2023 06:46:35 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsm-0001Hy-Dh; Thu, 26 Oct 2023 06:46:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317192; x=1729853192; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=iIgRmxMm0dvXKkWlz6MJBc7IUSupn8oIFecS49huiLs=; b=K0qrTX9xeIt/pc3XX/4NU/N77VTXbeQ4aeDJsyKUEnE+7xbUG215RnsE 9SJADqFsUobPffK7mOK0qru7zRP0vmmTHppIc/x8m4jwN14IAdEBKwYP2 f15XCO5AQKhz25NhNt93x3/9IzPWauCTPljz1nikLDH+yq8xatQQX0XzZ 9/uqPrSGL0iN5XQU4fyKBsbKCApibTPNCMdMJNdiOJRbrhPn+VY9dZhcd 1avpOTxl/HiNJYsDe9MC6N8orHXawe3Y02L1xNFoiASHehGELSSfLLfpR IPQHvWFEpsdeHymA8989AujbUfT461211GNT7hAmpXn0R7JC57b76pk+m Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563235" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563235" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:30 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463591" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:16 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 05/37] vfio/common: Move vfio_host_win_add/del into spapr.c Date: Thu, 26 Oct 2023 18:30:32 +0800 Message-Id: <20231026103104.1686921-6-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Only spapr supports a customed host window list, other vfio driver assume 64bit host window. So remove the check in listener callback and move vfio_host_win_add/del into spapr.c and make static. Suggested-by: Alex Williamson Signed-off-by: Zhenzhong Duan Reviewed-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 5 --- hw/vfio/common.c | 64 ----------------------------------- hw/vfio/container.c | 16 --------- hw/vfio/spapr.c | 38 +++++++++++++++++++++ 4 files changed, 38 insertions(+), 85 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index f3174b3c5c..b9c7a7e588 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -207,11 +207,6 @@ typedef struct { hwaddr pages; } VFIOBitmap; -void vfio_host_win_add(VFIOContainer *container, - hwaddr min_iova, hwaddr max_iova, - uint64_t iova_pgsizes); -int vfio_host_win_del(VFIOContainer *container, hwaddr min_iova, - hwaddr max_iova); VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); bool vfio_devices_all_running_and_saving(VFIOContainer *container); diff --git a/hw/vfio/common.c b/hw/vfio/common.c index e72055e752..0ebf4d9256 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -245,44 +245,6 @@ bool vfio_devices_all_running_and_mig_active(VFIOContainer *container) return true; } -void vfio_host_win_add(VFIOContainer *container, hwaddr min_iova, - hwaddr max_iova, uint64_t iova_pgsizes) -{ - VFIOHostDMAWindow *hostwin; - - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { - if (ranges_overlap(hostwin->min_iova, - hostwin->max_iova - hostwin->min_iova + 1, - min_iova, - max_iova - min_iova + 1)) { - hw_error("%s: Overlapped IOMMU are not enabled", __func__); - } - } - - hostwin = g_malloc0(sizeof(*hostwin)); - - hostwin->min_iova = min_iova; - hostwin->max_iova = max_iova; - hostwin->iova_pgsizes = iova_pgsizes; - QLIST_INSERT_HEAD(&container->hostwin_list, hostwin, hostwin_next); -} - -int vfio_host_win_del(VFIOContainer *container, - hwaddr min_iova, hwaddr max_iova) -{ - VFIOHostDMAWindow *hostwin; - - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { - if (hostwin->min_iova == min_iova && hostwin->max_iova == max_iova) { - QLIST_REMOVE(hostwin, hostwin_next); - g_free(hostwin); - return 0; - } - } - - return -1; -} - static bool vfio_listener_skipped_section(MemoryRegionSection *section) { return (!memory_region_is_ram(section->mr) && @@ -531,22 +493,6 @@ static void vfio_unregister_ram_discard_listener(VFIOContainer *container, g_free(vrdl); } -static VFIOHostDMAWindow *vfio_find_hostwin(VFIOContainer *container, - hwaddr iova, hwaddr end) -{ - VFIOHostDMAWindow *hostwin; - bool hostwin_found = false; - - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { - if (hostwin->min_iova <= iova && end <= hostwin->max_iova) { - hostwin_found = true; - break; - } - } - - return hostwin_found ? hostwin : NULL; -} - static bool vfio_known_safe_misalignment(MemoryRegionSection *section) { MemoryRegion *mr = section->mr; @@ -647,13 +593,6 @@ static void vfio_listener_region_add(MemoryListener *listener, goto fail; } - hostwin = vfio_find_hostwin(container, iova, end); - if (!hostwin) { - error_setg(&err, "Container %p can't map guest IOVA region" - " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx, container, iova, end); - goto fail; - } - memory_region_ref(section->mr); if (memory_region_is_iommu(section->mr)) { @@ -835,9 +774,6 @@ static void vfio_listener_region_del(MemoryListener *listener, hwaddr pgmask; VFIOHostDMAWindow *hostwin; - hostwin = vfio_find_hostwin(container, iova, end); - assert(hostwin); /* or region_add() would have failed */ - pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1; try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask)); } else if (memory_region_has_ram_discard_manager(section->mr)) { diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 204b244b11..242010036a 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -551,7 +551,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->dma_max_mappings = 0; container->iova_ranges = NULL; QLIST_INIT(&container->giommu_list); - QLIST_INIT(&container->hostwin_list); QLIST_INIT(&container->vrdl_list); ret = vfio_init_container(container, group->fd, errp); @@ -591,14 +590,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, vfio_get_iommu_info_migration(container, info); g_free(info); - - /* - * FIXME: We should parse VFIO_IOMMU_TYPE1_INFO_CAP_IOVA_RANGE - * information to get the actual window extent rather than assume - * a 64-bit IOVA address space. - */ - vfio_host_win_add(container, 0, (hwaddr)-1, container->pgsizes); - break; } case VFIO_SPAPR_TCE_v2_IOMMU: @@ -687,7 +678,6 @@ static void vfio_disconnect_container(VFIOGroup *group) if (QLIST_EMPTY(&container->group_list)) { VFIOAddressSpace *space = container->space; VFIOGuestIOMMU *giommu, *tmp; - VFIOHostDMAWindow *hostwin, *next; QLIST_REMOVE(container, next); @@ -698,12 +688,6 @@ static void vfio_disconnect_container(VFIOGroup *group) g_free(giommu); } - QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next, - next) { - QLIST_REMOVE(hostwin, hostwin_next); - g_free(hostwin); - } - trace_vfio_disconnect_container(container->fd); close(container->fd); vfio_free_container(container); diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 43adbdb7b3..3495737ab2 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -146,6 +146,44 @@ static const MemoryListener vfio_prereg_listener = { .region_del = vfio_prereg_listener_region_del, }; +static void vfio_host_win_add(VFIOContainer *container, hwaddr min_iova, + hwaddr max_iova, uint64_t iova_pgsizes) +{ + VFIOHostDMAWindow *hostwin; + + QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + if (ranges_overlap(hostwin->min_iova, + hostwin->max_iova - hostwin->min_iova + 1, + min_iova, + max_iova - min_iova + 1)) { + hw_error("%s: Overlapped IOMMU are not enabled", __func__); + } + } + + hostwin = g_malloc0(sizeof(*hostwin)); + + hostwin->min_iova = min_iova; + hostwin->max_iova = max_iova; + hostwin->iova_pgsizes = iova_pgsizes; + QLIST_INSERT_HEAD(&container->hostwin_list, hostwin, hostwin_next); +} + +static int vfio_host_win_del(VFIOContainer *container, + hwaddr min_iova, hwaddr max_iova) +{ + VFIOHostDMAWindow *hostwin; + + QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + if (hostwin->min_iova == min_iova && hostwin->max_iova == max_iova) { + QLIST_REMOVE(hostwin, hostwin_next); + g_free(hostwin); + return 0; + } + } + + return -1; +} + static int vfio_spapr_remove_window(VFIOContainer *container, hwaddr offset_within_address_space) { From patchwork Thu Oct 26 10:30:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437491 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7A400C25B48 for ; Thu, 26 Oct 2023 10:51:32 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxt1-0007VF-NA; Thu, 26 Oct 2023 06:46:48 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxss-0007ON-Dd for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:38 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsp-0001Hy-D0 for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:38 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317195; x=1729853195; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=sshorHfi04uXwQnAgCP731sQIeQ8jgd42t56+O1ah78=; b=OxKupBQn/q6GG0Uk2nES6V2zHyVpqK5vi68VS5agpP35+xWYXo//DQCG zzHEIioGgEzp+JgFLr1sU1B3uc8zC9G3tU4kdyYTRllApfN9dl6UMwDT3 2DB7lpdvijYovUW4RoTk6+ThKbLchLPEod89DmwObbC8WA1IH3nLqvN66 JR0sIMvdj3GXsjPCzvLlI7P0/q9L2onpDrw2OLNQRY+6sbzVQKTzcwomC MSEDTQJgHqDnnd4uFcrWgVgTg7O7qUI85XHr/Fppjp3N8fuaI5og+89MJ TCr/kE31tqviSsBeNpo/1j+JjiGtzGzbARKWhoQGqnwqOJ5O/bLuAkoCo Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563243" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563243" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:34 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463597" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:20 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Yi Sun Subject: [PATCH v3 06/37] vfio: Introduce base object for VFIOContainer and targetted interface Date: Thu, 26 Oct 2023 18:30:33 +0800 Message-Id: <20231026103104.1686921-7-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Introduce a dumb VFIOContainer base object and its targetted interface. This is willingly not a QOM object because we don't want it to be visible from the user interface. The VFIOContainer will be smoothly populated in subsequent patches as well as interfaces. No fucntional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 8 +--- include/hw/vfio/vfio-container-base.h | 64 +++++++++++++++++++++++++++ 2 files changed, 66 insertions(+), 6 deletions(-) create mode 100644 include/hw/vfio/vfio-container-base.h diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index b9c7a7e588..d8f293cb57 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -30,6 +30,7 @@ #include #endif #include "sysemu/sysemu.h" +#include "hw/vfio/vfio-container-base.h" #define VFIO_MSG_PREFIX "vfio %s: " @@ -81,6 +82,7 @@ typedef struct VFIOAddressSpace { struct VFIOGroup; typedef struct VFIOContainer { + VFIOContainerBase bcontainer; VFIOAddressSpace *space; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ MemoryListener listener; @@ -201,12 +203,6 @@ typedef struct VFIODisplay { } dmabuf; } VFIODisplay; -typedef struct { - unsigned long *bitmap; - hwaddr size; - hwaddr pages; -} VFIOBitmap; - VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); bool vfio_devices_all_running_and_saving(VFIOContainer *container); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h new file mode 100644 index 0000000000..5becbd51a7 --- /dev/null +++ b/include/hw/vfio/vfio-container-base.h @@ -0,0 +1,64 @@ +/* + * VFIO BASE CONTAINER + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . + */ + +#ifndef HW_VFIO_VFIO_BASE_CONTAINER_H +#define HW_VFIO_VFIO_BASE_CONTAINER_H + +#include "exec/memory.h" +#ifndef CONFIG_USER_ONLY +#include "exec/hwaddr.h" +#endif + +typedef struct VFIODevice VFIODevice; +typedef struct VFIOIOMMUOps VFIOIOMMUOps; + +typedef struct { + unsigned long *bitmap; + hwaddr size; + hwaddr pages; +} VFIOBitmap; + +/* + * This is the base object for vfio container backends + */ +typedef struct VFIOContainerBase { + const VFIOIOMMUOps *ops; +} VFIOContainerBase; + +struct VFIOIOMMUOps { + /* basic feature */ + int (*dma_map)(VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + void *vaddr, bool readonly); + int (*dma_unmap)(VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb); + int (*attach_device)(char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp); + void (*detach_device)(VFIODevice *vbasedev); + /* migration feature */ + int (*set_dirty_page_tracking)(VFIOContainerBase *bcontainer, bool start); + int (*query_dirty_bitmap)(VFIOContainerBase *bcontainer, VFIOBitmap *vbmap, + hwaddr iova, hwaddr size); +}; +#endif /* HW_VFIO_VFIO_BASE_CONTAINER_H */ From patchwork Thu Oct 26 10:30:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437475 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E9274C25B6B for ; Thu, 26 Oct 2023 10:48:47 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxt0-0007SF-7g; Thu, 26 Oct 2023 06:46:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsv-0007Qg-Dv for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:42 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxst-0001Hy-3L for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:40 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317199; x=1729853199; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Qxw0A9Qxj6KixE8WbPl0VC7CbcnlGe2+ICTzKRPe+3s=; b=aNa/M5IeLbq8EC17uZnVjBAyQYkijfugRTQZj3dUanktwB3SGN6Pt+4h h6LtNNIztJMzg9DFae0TD6RrsegdomHKgQazbR6m3WLDKrb+5GKIo4vu3 yIl2D1iLn60zYvBIQWjSzUhWzjOkChBYlfVy3SrhQgRLpuIZjzKmdEtGY mGajGHUZCFTH/x5zf+IQ3Yjr1jiksMY2uLGSjAjlPZ8D61dclciZJpgYV 4frTwl9bnsFgJnVIizpo/2kJrSWAqpm/w+qvJDreeTS6IqYVWO4fAs6io BdzM+1CDOufsH6fbQtWNc0TuOKMtYjVVJl0GiXhozUCQ/tJim4RvRuGqk w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563250" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563250" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:38 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463603" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:24 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 07/37] vfio/container: Introduce a empty VFIOIOMMUOps Date: Thu, 26 Oct 2023 18:30:34 +0800 Message-Id: <20231026103104.1686921-8-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This empty VFIOIOMMUOps named vfio_legacy_ops will hold all general IOMMU ops of legacy container. Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 2 +- hw/vfio/container.c | 5 +++++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index d8f293cb57..8ded5cd8e4 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -255,7 +255,7 @@ typedef QLIST_HEAD(VFIOGroupList, VFIOGroup) VFIOGroupList; typedef QLIST_HEAD(VFIODeviceList, VFIODevice) VFIODeviceList; extern VFIOGroupList vfio_group_list; extern VFIODeviceList vfio_device_list; - +extern const VFIOIOMMUOps vfio_legacy_ops; extern const MemoryListener vfio_memory_listener; extern int vfio_kvm_device_fd; diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 242010036a..4bc43ddfa4 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -472,6 +472,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, Error **errp) { VFIOContainer *container; + VFIOContainerBase *bcontainer; int ret, fd; VFIOAddressSpace *space; @@ -552,6 +553,8 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->iova_ranges = NULL; QLIST_INIT(&container->giommu_list); QLIST_INIT(&container->vrdl_list); + bcontainer = &container->bcontainer; + bcontainer->ops = &vfio_legacy_ops; ret = vfio_init_container(container, group->fd, errp); if (ret) { @@ -933,3 +936,5 @@ void vfio_detach_device(VFIODevice *vbasedev) vfio_put_base_device(vbasedev); vfio_put_group(group); } + +const VFIOIOMMUOps vfio_legacy_ops; From patchwork Thu Oct 26 10:30:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437506 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D5710C25B6B for ; Thu, 26 Oct 2023 10:53:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxt4-0007Wf-2R; Thu, 26 Oct 2023 06:46:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxt0-0007Sj-FO for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:46 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxsx-0001Hy-2b for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317203; x=1729853203; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=y3ctQRQpWde3ukjNBSm/zBw/ipeYeetUszt0vznDs/U=; b=VO99GXOCqqasHCMzLa0vYK/yF5+2OtxIWrNsvR3b/Y3SDbmoA+2FWCMW xzVyROjS8ALBseNsUv6ielZNPd7Fsw/B5wBDF1/k3oPIbHO95b376oRzC gJweGbBSKi+g/Rg7kgmGQ+bX9cejgif3dmw2iCz8E5XJ5XBEgjyn88z5I xKGvz3pdtGTiDcRhEtdkXjJicCvjU49QpIyDIb02SjEejVyOpW04z0xqs eAiKY26Z08moifJwNR0PykX2PmF1CBgcYPrbUHyR6husE8lyBqRpvWCik Pdi5y9KYrhH7Y4autnTlsEtlAgjBQ0IzsRXUAkYPxhsd2aJ87OCGT2lPG w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563256" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563256" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463607" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:28 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v3 08/37] vfio/container: Switch to dma_map|unmap API Date: Thu, 26 Oct 2023 18:30:35 +0800 Message-Id: <20231026103104.1686921-9-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger No fucntional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 4 --- include/hw/vfio/vfio-container-base.h | 7 ++++ hw/vfio/common.c | 45 +++++++++++++----------- hw/vfio/container-base.c | 49 +++++++++++++++++++++++++++ hw/vfio/container.c | 22 ++++++++---- hw/vfio/meson.build | 1 + hw/vfio/trace-events | 2 +- 7 files changed, 98 insertions(+), 32 deletions(-) create mode 100644 hw/vfio/container-base.c diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 8ded5cd8e4..97056224f4 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -208,10 +208,6 @@ void vfio_put_address_space(VFIOAddressSpace *space); bool vfio_devices_all_running_and_saving(VFIOContainer *container); /* container->fd */ -int vfio_dma_unmap(VFIOContainer *container, hwaddr iova, - ram_addr_t size, IOMMUTLBEntry *iotlb); -int vfio_dma_map(VFIOContainer *container, hwaddr iova, - ram_addr_t size, void *vaddr, bool readonly); int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start); int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 5becbd51a7..077e638ee8 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -45,6 +45,13 @@ typedef struct VFIOContainerBase { const VFIOIOMMUOps *ops; } VFIOContainerBase; +int vfio_container_dma_map(VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + void *vaddr, bool readonly); +int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb); + struct VFIOIOMMUOps { /* basic feature */ int (*dma_map)(VFIOContainerBase *bcontainer, diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 0ebf4d9256..141f2b54a4 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -292,7 +292,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) { VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n); - VFIOContainer *container = giommu->container; + VFIOContainerBase *bcontainer = &giommu->container->bcontainer; hwaddr iova = iotlb->iova + giommu->iommu_offset; void *vaddr; int ret; @@ -322,21 +322,22 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) * of vaddr will always be there, even if the memory object is * destroyed and its backing memory munmap-ed. */ - ret = vfio_dma_map(container, iova, - iotlb->addr_mask + 1, vaddr, - read_only); + ret = vfio_container_dma_map(bcontainer, iova, + iotlb->addr_mask + 1, vaddr, + read_only); if (ret) { - error_report("vfio_dma_map(%p, 0x%"HWADDR_PRIx", " + error_report("vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%s)", - container, iova, + bcontainer, iova, iotlb->addr_mask + 1, vaddr, ret, strerror(-ret)); } } else { - ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1, iotlb); + ret = vfio_container_dma_unmap(bcontainer, iova, + iotlb->addr_mask + 1, iotlb); if (ret) { - error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", " + error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", - container, iova, + bcontainer, iova, iotlb->addr_mask + 1, ret, strerror(-ret)); vfio_set_migration_error(ret); } @@ -355,9 +356,10 @@ static void vfio_ram_discard_notify_discard(RamDiscardListener *rdl, int ret; /* Unmap with a single call. */ - ret = vfio_dma_unmap(vrdl->container, iova, size , NULL); + ret = vfio_container_dma_unmap(&vrdl->container->bcontainer, + iova, size , NULL); if (ret) { - error_report("%s: vfio_dma_unmap() failed: %s", __func__, + error_report("%s: vfio_container_dma_unmap() failed: %s", __func__, strerror(-ret)); } } @@ -385,8 +387,8 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, section->offset_within_address_space; vaddr = memory_region_get_ram_ptr(section->mr) + start; - ret = vfio_dma_map(vrdl->container, iova, next - start, - vaddr, section->readonly); + ret = vfio_container_dma_map(&vrdl->container->bcontainer, iova, + next - start, vaddr, section->readonly); if (ret) { /* Rollback */ vfio_ram_discard_notify_discard(rdl, section); @@ -685,10 +687,11 @@ static void vfio_listener_region_add(MemoryListener *listener, } } - ret = vfio_dma_map(container, iova, int128_get64(llsize), - vaddr, section->readonly); + ret = vfio_container_dma_map(&container->bcontainer, + iova, int128_get64(llsize), vaddr, + section->readonly); if (ret) { - error_setg(&err, "vfio_dma_map(%p, 0x%"HWADDR_PRIx", " + error_setg(&err, "vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%s)", container, iova, int128_get64(llsize), vaddr, ret, strerror(-ret)); @@ -786,18 +789,20 @@ static void vfio_listener_region_del(MemoryListener *listener, if (int128_eq(llsize, int128_2_64())) { /* The unmap ioctl doesn't accept a full 64-bit span. */ llsize = int128_rshift(llsize, 1); - ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL); + ret = vfio_container_dma_unmap(&container->bcontainer, iova, + int128_get64(llsize), NULL); if (ret) { - error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", " + error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", container, iova, int128_get64(llsize), ret, strerror(-ret)); } iova += int128_get64(llsize); } - ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL); + ret = vfio_container_dma_unmap(&container->bcontainer, iova, + int128_get64(llsize), NULL); if (ret) { - error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", " + error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", container, iova, int128_get64(llsize), ret, strerror(-ret)); diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c new file mode 100644 index 0000000000..9db8b89b2f --- /dev/null +++ b/hw/vfio/container-base.c @@ -0,0 +1,49 @@ +/* + * VFIO BASE CONTAINER + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include "qapi/error.h" +#include "qemu/error-report.h" +#include "hw/vfio/vfio-container-base.h" + +int vfio_container_dma_map(VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + void *vaddr, bool readonly) +{ + if (!bcontainer->ops->dma_map) { + return -EINVAL; + } + + return bcontainer->ops->dma_map(bcontainer, iova, size, vaddr, readonly); +} + +int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb) +{ + if (!bcontainer->ops->dma_unmap) { + return -EINVAL; + } + + return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); +} diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 4bc43ddfa4..c04df26323 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -115,9 +115,11 @@ unmap_exit: /* * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86 */ -int vfio_dma_unmap(VFIOContainer *container, hwaddr iova, - ram_addr_t size, IOMMUTLBEntry *iotlb) +static int vfio_legacy_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, + ram_addr_t size, IOMMUTLBEntry *iotlb) { + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); struct vfio_iommu_type1_dma_unmap unmap = { .argsz = sizeof(unmap), .flags = 0, @@ -151,7 +153,7 @@ int vfio_dma_unmap(VFIOContainer *container, hwaddr iova, */ if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) && container->iommu_type == VFIO_TYPE1v2_IOMMU) { - trace_vfio_dma_unmap_overflow_workaround(); + trace_vfio_legacy_dma_unmap_overflow_workaround(); unmap.size -= 1ULL << ctz64(container->pgsizes); continue; } @@ -170,9 +172,11 @@ int vfio_dma_unmap(VFIOContainer *container, hwaddr iova, return 0; } -int vfio_dma_map(VFIOContainer *container, hwaddr iova, - ram_addr_t size, void *vaddr, bool readonly) +static int vfio_legacy_dma_map(VFIOContainerBase *bcontainer, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly) { + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); struct vfio_iommu_type1_dma_map map = { .argsz = sizeof(map), .flags = VFIO_DMA_MAP_FLAG_READ, @@ -191,7 +195,8 @@ int vfio_dma_map(VFIOContainer *container, hwaddr iova, * the VGA ROM space. */ if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 || - (errno == EBUSY && vfio_dma_unmap(container, iova, size, NULL) == 0 && + (errno == EBUSY && + vfio_legacy_dma_unmap(bcontainer, iova, size, NULL) == 0 && ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) { return 0; } @@ -937,4 +942,7 @@ void vfio_detach_device(VFIODevice *vbasedev) vfio_put_group(group); } -const VFIOIOMMUOps vfio_legacy_ops; +const VFIOIOMMUOps vfio_legacy_ops = { + .dma_map = vfio_legacy_dma_map, + .dma_unmap = vfio_legacy_dma_unmap, +}; diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index 2a6912c940..eb6ce6229d 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -2,6 +2,7 @@ vfio_ss = ss.source_set() vfio_ss.add(files( 'helpers.c', 'common.c', + 'container-base.c', 'container.c', 'spapr.c', 'migration.c', diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 0eb2387cf2..9f7fedee98 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -116,7 +116,7 @@ vfio_region_unmap(const char *name, unsigned long offset, unsigned long end) "Re vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Device %s region %d: %d sparse mmap entries" vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]" vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%08x" -vfio_dma_unmap_overflow_workaround(void) "" +vfio_legacy_dma_unmap_overflow_workaround(void) "" vfio_get_dirty_bitmap(int fd, uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start, uint64_t dirty_pages) "container fd=%d, iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64" dirty_pages=%"PRIu64 vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64 From patchwork Thu Oct 26 10:30:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437471 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 93182C25B48 for ; Thu, 26 Oct 2023 10:47:52 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxt9-0007YH-JT; Thu, 26 Oct 2023 06:46:55 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxt4-0007WT-0F for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:50 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxt0-0001Hy-Sf for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:49 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317206; x=1729853206; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9g8Rm7I4TUQ8rsrmDlXEt2QNV3IT7WAzqzvS102c2LE=; b=aJ5u8tQFkzrom8sUfoneWj7UQ+5z63Ph9XM8ilqogYPRp8b6JbR6fDj5 x6mOqWZH3tyBza2dqykKt9pKy0DV+ZtEPMCh6o+4VNlImit6KVsi3UGfh XUajlgA/xN67yLzzXewoyDvN5gufJcbSkN20v84Nea5+92tPVHZR+vCVB zJyypi12wIBcTKt3Lq3IM5rSWe6vE9VW4hQTPiYvukFPrqq2nk3ntvsOy ytXzwDYj16Kuqaw1WGsydF0/OJfBF24v/fBfB3zC9gnDqx7yeW2RLUUVB XXK2PhAQcMEOpF+Q9HGkg4XulbbQtj83TbthzLsF4qcf2ZxnNYwxqozqG A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563267" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563267" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463617" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:31 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v3 09/37] vfio/common: Move giommu_list in base container Date: Thu, 26 Oct 2023 18:30:36 +0800 Message-Id: <20231026103104.1686921-10-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move the giommu_list field in the base object and store the base container in the VFIOGuestIOMMU. We introduce vfio_container_init/destroy helper on the base container. No fucntional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan [ clg: context changes ] Signed-off-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 9 --------- include/hw/vfio/vfio-container-base.h | 13 +++++++++++++ hw/vfio/common.c | 17 +++++++++++------ hw/vfio/container-base.c | 18 ++++++++++++++++++ hw/vfio/container.c | 13 +++---------- 5 files changed, 45 insertions(+), 25 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 97056224f4..fcb4003a21 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -95,7 +95,6 @@ typedef struct VFIOContainer { uint64_t max_dirty_bitmap_size; unsigned long pgsizes; unsigned int dma_max_mappings; - QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; @@ -104,14 +103,6 @@ typedef struct VFIOContainer { GList *iova_ranges; } VFIOContainer; -typedef struct VFIOGuestIOMMU { - VFIOContainer *container; - IOMMUMemoryRegion *iommu_mr; - hwaddr iommu_offset; - IOMMUNotifier n; - QLIST_ENTRY(VFIOGuestIOMMU) giommu_next; -} VFIOGuestIOMMU; - typedef struct VFIORamDiscardListener { VFIOContainer *container; MemoryRegion *mr; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 077e638ee8..71e1e4324e 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -43,8 +43,17 @@ typedef struct { */ typedef struct VFIOContainerBase { const VFIOIOMMUOps *ops; + QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; } VFIOContainerBase; +typedef struct VFIOGuestIOMMU { + VFIOContainerBase *bcontainer; + IOMMUMemoryRegion *iommu_mr; + hwaddr iommu_offset; + IOMMUNotifier n; + QLIST_ENTRY(VFIOGuestIOMMU) giommu_next; +} VFIOGuestIOMMU; + int vfio_container_dma_map(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly); @@ -52,6 +61,10 @@ int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); +void vfio_container_init(VFIOContainerBase *bcontainer, + const VFIOIOMMUOps *ops); +void vfio_container_destroy(VFIOContainerBase *bcontainer); + struct VFIOIOMMUOps { /* basic feature */ int (*dma_map)(VFIOContainerBase *bcontainer, diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 141f2b54a4..4f130ad87c 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -292,7 +292,7 @@ static bool vfio_get_xlat_addr(IOMMUTLBEntry *iotlb, void **vaddr, static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) { VFIOGuestIOMMU *giommu = container_of(n, VFIOGuestIOMMU, n); - VFIOContainerBase *bcontainer = &giommu->container->bcontainer; + VFIOContainerBase *bcontainer = giommu->bcontainer; hwaddr iova = iotlb->iova + giommu->iommu_offset; void *vaddr; int ret; @@ -569,6 +569,7 @@ static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOContainerBase *bcontainer = &container->bcontainer; hwaddr iova, end; Int128 llend, llsize; void *vaddr; @@ -613,7 +614,7 @@ static void vfio_listener_region_add(MemoryListener *listener, giommu->iommu_mr = iommu_mr; giommu->iommu_offset = section->offset_within_address_space - section->offset_within_region; - giommu->container = container; + giommu->bcontainer = bcontainer; llend = int128_add(int128_make64(section->offset_within_region), section->size); llend = int128_sub(llend, int128_one()); @@ -648,7 +649,7 @@ static void vfio_listener_region_add(MemoryListener *listener, g_free(giommu); goto fail; } - QLIST_INSERT_HEAD(&container->giommu_list, giommu, giommu_next); + QLIST_INSERT_HEAD(&bcontainer->giommu_list, giommu, giommu_next); memory_region_iommu_replay(giommu->iommu_mr, &giommu->n); return; @@ -733,6 +734,7 @@ static void vfio_listener_region_del(MemoryListener *listener, MemoryRegionSection *section) { VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOContainerBase *bcontainer = &container->bcontainer; hwaddr iova, end; Int128 llend, llsize; int ret; @@ -745,7 +747,7 @@ static void vfio_listener_region_del(MemoryListener *listener, if (memory_region_is_iommu(section->mr)) { VFIOGuestIOMMU *giommu; - QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) { + QLIST_FOREACH(giommu, &bcontainer->giommu_list, giommu_next) { if (MEMORY_REGION(giommu->iommu_mr) == section->mr && giommu->n.start == section->offset_within_region) { memory_region_unregister_iommu_notifier(section->mr, @@ -1208,7 +1210,9 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) vfio_giommu_dirty_notifier *gdn = container_of(n, vfio_giommu_dirty_notifier, n); VFIOGuestIOMMU *giommu = gdn->giommu; - VFIOContainer *container = giommu->container; + VFIOContainerBase *bcontainer = giommu->bcontainer; + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); hwaddr iova = iotlb->iova + giommu->iommu_offset; ram_addr_t translated_addr; int ret = -EINVAL; @@ -1286,12 +1290,13 @@ static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container, static int vfio_sync_dirty_bitmap(VFIOContainer *container, MemoryRegionSection *section) { + VFIOContainerBase *bcontainer = &container->bcontainer; ram_addr_t ram_addr; if (memory_region_is_iommu(section->mr)) { VFIOGuestIOMMU *giommu; - QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) { + QLIST_FOREACH(giommu, &bcontainer->giommu_list, giommu_next) { if (MEMORY_REGION(giommu->iommu_mr) == section->mr && giommu->n.start == section->offset_within_region) { Int128 llend; diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 9db8b89b2f..6be7ce65ad 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -47,3 +47,21 @@ int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); } + +void vfio_container_init(VFIOContainerBase *bcontainer, const VFIOIOMMUOps *ops) +{ + bcontainer->ops = ops; + QLIST_INIT(&bcontainer->giommu_list); +} + +void vfio_container_destroy(VFIOContainerBase *bcontainer) +{ + VFIOGuestIOMMU *giommu, *tmp; + + QLIST_FOREACH_SAFE(giommu, &bcontainer->giommu_list, giommu_next, tmp) { + memory_region_unregister_iommu_notifier( + MEMORY_REGION(giommu->iommu_mr), &giommu->n); + QLIST_REMOVE(giommu, giommu_next); + g_free(giommu); + } +} diff --git a/hw/vfio/container.c b/hw/vfio/container.c index c04df26323..b2b6e05287 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -556,10 +556,9 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->dirty_pages_supported = false; container->dma_max_mappings = 0; container->iova_ranges = NULL; - QLIST_INIT(&container->giommu_list); QLIST_INIT(&container->vrdl_list); bcontainer = &container->bcontainer; - bcontainer->ops = &vfio_legacy_ops; + vfio_container_init(bcontainer, &vfio_legacy_ops); ret = vfio_init_container(container, group->fd, errp); if (ret) { @@ -661,6 +660,7 @@ put_space_exit: static void vfio_disconnect_container(VFIOGroup *group) { VFIOContainer *container = group->container; + VFIOContainerBase *bcontainer = &container->bcontainer; QLIST_REMOVE(group, container_next); group->container = NULL; @@ -685,16 +685,9 @@ static void vfio_disconnect_container(VFIOGroup *group) if (QLIST_EMPTY(&container->group_list)) { VFIOAddressSpace *space = container->space; - VFIOGuestIOMMU *giommu, *tmp; QLIST_REMOVE(container, next); - - QLIST_FOREACH_SAFE(giommu, &container->giommu_list, giommu_next, tmp) { - memory_region_unregister_iommu_notifier( - MEMORY_REGION(giommu->iommu_mr), &giommu->n); - QLIST_REMOVE(giommu, giommu_next); - g_free(giommu); - } + vfio_container_destroy(bcontainer); trace_vfio_disconnect_container(container->fd); close(container->fd); From patchwork Thu Oct 26 10:30:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437470 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 97148C27C46 for ; Thu, 26 Oct 2023 10:47:40 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxtB-0007Yy-Js; Thu, 26 Oct 2023 06:46:57 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxt8-0007XV-V2; Thu, 26 Oct 2023 06:46:55 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxt6-0001Kb-Ue; Thu, 26 Oct 2023 06:46:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317212; x=1729853212; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4MsmyFCLG1GX+yRd8MfR70wbqbPBpwnJkOKo9nrwlRk=; b=ZnX6wEqCd3uA+CmiCko49kLD/LQtwUqUgsuqjQEFQB4EUHDgb5BoGUrl Of0KPWv7fTSmVaAfmbK0gHQulHp53JbNZ6h5kHtHTAM0EG0eMbQqxmSUu mqVCkbNlGq4ABQEeFAyJiU5W3IqfSsCz1vcPAPDWzsuMlnblKZqTEhHu9 MIk8fcLTNseptcCCxwj4Ny5sRlFbvSngC9tENTrFYv+/1PPiD/W10Uaaq 6dBCKIz80W+/U2OzcBPzrLeGf3IOWZmAMz/VeqPCkqEAuqBZFzmqdkgST Ci0FyJbUCrFbkQNaJK4nCBMD4PwsbDcn4prKDKT4H3hr5SBqvbGcfTm9m A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563285" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563285" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:50 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463633" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:35 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 10/37] vfio/container: Move space field to base container Date: Thu, 26 Oct 2023 18:30:37 +0800 Message-Id: <20231026103104.1686921-11-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move the space field to the base object. Also the VFIOAddressSpace now contains a list of base containers. No fucntional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan [ clg: context changes ] Signed-off-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 8 -------- include/hw/vfio/vfio-container-base.h | 9 +++++++++ hw/ppc/spapr_pci_vfio.c | 10 +++++----- hw/vfio/common.c | 4 ++-- hw/vfio/container-base.c | 6 +++++- hw/vfio/container.c | 18 +++++++++--------- 6 files changed, 30 insertions(+), 25 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index fcb4003a21..857d2b8076 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -73,17 +73,10 @@ typedef struct VFIOMigration { bool initial_data_sent; } VFIOMigration; -typedef struct VFIOAddressSpace { - AddressSpace *as; - QLIST_HEAD(, VFIOContainer) containers; - QLIST_ENTRY(VFIOAddressSpace) list; -} VFIOAddressSpace; - struct VFIOGroup; typedef struct VFIOContainer { VFIOContainerBase bcontainer; - VFIOAddressSpace *space; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ MemoryListener listener; MemoryListener prereg_listener; @@ -98,7 +91,6 @@ typedef struct VFIOContainer { QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; - QLIST_ENTRY(VFIOContainer) next; QLIST_HEAD(, VFIODevice) device_list; GList *iova_ranges; } VFIOContainer; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 71e1e4324e..a5fef3e6a8 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -38,12 +38,20 @@ typedef struct { hwaddr pages; } VFIOBitmap; +typedef struct VFIOAddressSpace { + AddressSpace *as; + QLIST_HEAD(, VFIOContainerBase) containers; + QLIST_ENTRY(VFIOAddressSpace) list; +} VFIOAddressSpace; + /* * This is the base object for vfio container backends */ typedef struct VFIOContainerBase { const VFIOIOMMUOps *ops; + VFIOAddressSpace *space; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; + QLIST_ENTRY(VFIOContainerBase) next; } VFIOContainerBase; typedef struct VFIOGuestIOMMU { @@ -62,6 +70,7 @@ int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, IOMMUTLBEntry *iotlb); void vfio_container_init(VFIOContainerBase *bcontainer, + VFIOAddressSpace *space, const VFIOIOMMUOps *ops); void vfio_container_destroy(VFIOContainerBase *bcontainer); diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index f283f7e38d..d1d07bec46 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -84,27 +84,27 @@ static int vfio_eeh_container_op(VFIOContainer *container, uint32_t op) static VFIOContainer *vfio_eeh_as_container(AddressSpace *as) { VFIOAddressSpace *space = vfio_get_address_space(as); - VFIOContainer *container = NULL; + VFIOContainerBase *bcontainer = NULL; if (QLIST_EMPTY(&space->containers)) { /* No containers to act on */ goto out; } - container = QLIST_FIRST(&space->containers); + bcontainer = QLIST_FIRST(&space->containers); - if (QLIST_NEXT(container, next)) { + if (QLIST_NEXT(bcontainer, next)) { /* * We don't yet have logic to synchronize EEH state across * multiple containers */ - container = NULL; + bcontainer = NULL; goto out; } out: vfio_put_address_space(space); - return container; + return container_of(bcontainer, VFIOContainer, bcontainer); } static bool vfio_eeh_as_ok(AddressSpace *as) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 4f130ad87c..f87a0dcec3 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -145,7 +145,7 @@ void vfio_unblock_multiple_devices_migration(void) bool vfio_viommu_preset(VFIODevice *vbasedev) { - return vbasedev->container->space->as != &address_space_memory; + return vbasedev->container->bcontainer.space->as != &address_space_memory; } static void vfio_set_migration_error(int err) @@ -924,7 +924,7 @@ static void vfio_dirty_tracking_init(VFIOContainer *container, dirty.container = container; memory_listener_register(&dirty.listener, - container->space->as); + container->bcontainer.space->as); *ranges = dirty.ranges; diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 6be7ce65ad..99b2957d7b 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -48,9 +48,11 @@ int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); } -void vfio_container_init(VFIOContainerBase *bcontainer, const VFIOIOMMUOps *ops) +void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace *space, + const VFIOIOMMUOps *ops) { bcontainer->ops = ops; + bcontainer->space = space; QLIST_INIT(&bcontainer->giommu_list); } @@ -58,6 +60,8 @@ void vfio_container_destroy(VFIOContainerBase *bcontainer) { VFIOGuestIOMMU *giommu, *tmp; + QLIST_REMOVE(bcontainer, next); + QLIST_FOREACH_SAFE(giommu, &bcontainer->giommu_list, giommu_next, tmp) { memory_region_unregister_iommu_notifier( MEMORY_REGION(giommu->iommu_mr), &giommu->n); diff --git a/hw/vfio/container.c b/hw/vfio/container.c index b2b6e05287..761310fa51 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -514,7 +514,8 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, * details once we know which type of IOMMU we are using. */ - QLIST_FOREACH(container, &space->containers, next) { + QLIST_FOREACH(bcontainer, &space->containers, next) { + container = container_of(bcontainer, VFIOContainer, bcontainer); if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) { ret = vfio_ram_block_discard_disable(container, true); if (ret) { @@ -550,7 +551,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } container = g_malloc0(sizeof(*container)); - container->space = space; container->fd = fd; container->error = NULL; container->dirty_pages_supported = false; @@ -558,7 +558,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->iova_ranges = NULL; QLIST_INIT(&container->vrdl_list); bcontainer = &container->bcontainer; - vfio_container_init(bcontainer, &vfio_legacy_ops); + vfio_container_init(bcontainer, space, &vfio_legacy_ops); ret = vfio_init_container(container, group->fd, errp); if (ret) { @@ -613,14 +613,15 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, vfio_kvm_device_add_group(group); QLIST_INIT(&container->group_list); - QLIST_INSERT_HEAD(&space->containers, container, next); + QLIST_INSERT_HEAD(&space->containers, bcontainer, next); group->container = container; QLIST_INSERT_HEAD(&container->group_list, group, container_next); container->listener = vfio_memory_listener; - memory_listener_register(&container->listener, container->space->as); + memory_listener_register(&container->listener, + container->bcontainer.space->as); if (container->error) { ret = -1; @@ -634,7 +635,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, return 0; listener_release_exit: QLIST_REMOVE(group, container_next); - QLIST_REMOVE(container, next); + QLIST_REMOVE(bcontainer, next); vfio_kvm_device_del_group(group); memory_listener_unregister(&container->listener); if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU || @@ -684,9 +685,8 @@ static void vfio_disconnect_container(VFIOGroup *group) } if (QLIST_EMPTY(&container->group_list)) { - VFIOAddressSpace *space = container->space; + VFIOAddressSpace *space = container->bcontainer.space; - QLIST_REMOVE(container, next); vfio_container_destroy(bcontainer); trace_vfio_disconnect_container(container->fd); @@ -706,7 +706,7 @@ static VFIOGroup *vfio_get_group(int groupid, AddressSpace *as, Error **errp) QLIST_FOREACH(group, &vfio_group_list, next) { if (group->groupid == groupid) { /* Found it. Now is it already in the right context? */ - if (group->container->space->as == as) { + if (group->container->bcontainer.space->as == as) { return group; } else { error_setg(errp, "group %d used in multiple address spaces", From patchwork Thu Oct 26 10:30:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437469 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8C1C7C25B70 for ; Thu, 26 Oct 2023 10:47:40 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxtD-0007bN-4g; Thu, 26 Oct 2023 06:46:59 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtB-0007ZD-Bt for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:57 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxt9-0001Kb-Ag for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:46:57 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317215; x=1729853215; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GNRoFOe2mM/Er3WizCkJ+kBqmS/CXS+CIzmjKihYMio=; b=eXJ7CpOzWF1MmNN2ORNaslmS4jJxV1XZz2vtrRIUlWTZUBwMHNq815NC XW8AIwLN24AZCrtoyOl368AL4QHwNpyEg3twzqrl7mmoa9De9oYJQLWSu t4KtRrjdGAhHPNgAxcVUKfj6PcP32KDdIdcYU6TcmpA9Jp3ZTJ9lDqxMu sh6gXDFiEWfFiObFYiHKobXlvJtgwZWvTOb401Tn7rcFSwV6nExLZFnGN ELUsWprXQ3s91FmkXCtjRGrJBunqssHnN23VZMRfcrUwEewoap0W2N8tT EkXbnnVrepwsZAM/JmjIZwM/6CCUSiHbe4+5sOJzaxMU9zFxO2OAIyjmX w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563318" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563318" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:54 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463647" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:40 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v3 11/37] vfio/container: Switch to IOMMU BE set_dirty_page_tracking/query_dirty_bitmap API Date: Thu, 26 Oct 2023 18:30:38 +0800 Message-Id: <20231026103104.1686921-12-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger dirty_pages_supported field is also moved to the base container No fucntional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 6 ------ include/hw/vfio/vfio-container-base.h | 6 ++++++ hw/vfio/common.c | 12 ++++++++---- hw/vfio/container-base.c | 23 +++++++++++++++++++++++ hw/vfio/container.c | 21 ++++++++++++++------- 5 files changed, 51 insertions(+), 17 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 857d2b8076..d053c61872 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -83,7 +83,6 @@ typedef struct VFIOContainer { unsigned iommu_type; Error *error; bool initialized; - bool dirty_pages_supported; uint64_t dirty_pgsizes; uint64_t max_dirty_bitmap_size; unsigned long pgsizes; @@ -190,11 +189,6 @@ VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); bool vfio_devices_all_running_and_saving(VFIOContainer *container); -/* container->fd */ -int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start); -int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, - hwaddr iova, hwaddr size); - /* SPAPR specific */ int vfio_container_add_section_window(VFIOContainer *container, MemoryRegionSection *section, diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index a5fef3e6a8..ea8436a064 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -50,6 +50,7 @@ typedef struct VFIOAddressSpace { typedef struct VFIOContainerBase { const VFIOIOMMUOps *ops; VFIOAddressSpace *space; + bool dirty_pages_supported; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_ENTRY(VFIOContainerBase) next; } VFIOContainerBase; @@ -68,6 +69,11 @@ int vfio_container_dma_map(VFIOContainerBase *bcontainer, int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); +int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer, + bool start); +int vfio_container_query_dirty_bitmap(VFIOContainerBase *bcontainer, + VFIOBitmap *vbmap, + hwaddr iova, hwaddr size); void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace *space, diff --git a/hw/vfio/common.c b/hw/vfio/common.c index f87a0dcec3..7d9b87fc67 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1081,7 +1081,8 @@ static void vfio_listener_log_global_start(MemoryListener *listener) if (vfio_devices_all_device_dirty_tracking(container)) { ret = vfio_devices_dma_logging_start(container); } else { - ret = vfio_set_dirty_page_tracking(container, true); + ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, + true); } if (ret) { @@ -1099,7 +1100,8 @@ static void vfio_listener_log_global_stop(MemoryListener *listener) if (vfio_devices_all_device_dirty_tracking(container)) { vfio_devices_dma_logging_stop(container); } else { - ret = vfio_set_dirty_page_tracking(container, false); + ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, + false); } if (ret) { @@ -1167,7 +1169,8 @@ int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, VFIOBitmap vbmap; int ret; - if (!container->dirty_pages_supported && !all_device_dirty_tracking) { + if (!container->bcontainer.dirty_pages_supported && + !all_device_dirty_tracking) { cpu_physical_memory_set_dirty_range(ram_addr, size, tcg_enabled() ? DIRTY_CLIENTS_ALL : DIRTY_CLIENTS_NOCODE); @@ -1182,7 +1185,8 @@ int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, if (all_device_dirty_tracking) { ret = vfio_devices_query_dirty_bitmap(container, &vbmap, iova, size); } else { - ret = vfio_query_dirty_bitmap(container, &vbmap, iova, size); + ret = vfio_container_query_dirty_bitmap(&container->bcontainer, &vbmap, + iova, size); } if (ret) { diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 99b2957d7b..a7cf517dd2 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -48,11 +48,34 @@ int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); } +int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer, + bool start) +{ + /* Fallback to all pages dirty if dirty page sync isn't supported */ + if (!bcontainer->ops->set_dirty_page_tracking) { + return 0; + } + + return bcontainer->ops->set_dirty_page_tracking(bcontainer, start); +} + +int vfio_container_query_dirty_bitmap(VFIOContainerBase *bcontainer, + VFIOBitmap *vbmap, + hwaddr iova, hwaddr size) +{ + if (!bcontainer->ops->query_dirty_bitmap) { + return -EINVAL; + } + + return bcontainer->ops->query_dirty_bitmap(bcontainer, vbmap, iova, size); +} + void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace *space, const VFIOIOMMUOps *ops) { bcontainer->ops = ops; bcontainer->space = space; + bcontainer->dirty_pages_supported = false; QLIST_INIT(&bcontainer->giommu_list); } diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 761310fa51..6f02ca133e 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -131,7 +131,7 @@ static int vfio_legacy_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, if (iotlb && vfio_devices_all_running_and_mig_active(container)) { if (!vfio_devices_all_device_dirty_tracking(container) && - container->dirty_pages_supported) { + container->bcontainer.dirty_pages_supported) { return vfio_dma_unmap_bitmap(container, iova, size, iotlb); } @@ -205,14 +205,17 @@ static int vfio_legacy_dma_map(VFIOContainerBase *bcontainer, hwaddr iova, return -errno; } -int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start) +static int vfio_legacy_set_dirty_page_tracking(VFIOContainerBase *bcontainer, + bool start) { + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); int ret; struct vfio_iommu_type1_dirty_bitmap dirty = { .argsz = sizeof(dirty), }; - if (!container->dirty_pages_supported) { + if (!bcontainer->dirty_pages_supported) { return 0; } @@ -232,9 +235,12 @@ int vfio_set_dirty_page_tracking(VFIOContainer *container, bool start) return ret; } -int vfio_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, - hwaddr iova, hwaddr size) +static int vfio_legacy_query_dirty_bitmap(VFIOContainerBase *bcontainer, + VFIOBitmap *vbmap, + hwaddr iova, hwaddr size) { + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); struct vfio_iommu_type1_dirty_bitmap *dbitmap; struct vfio_iommu_type1_dirty_bitmap_get *range; int ret; @@ -461,7 +467,7 @@ static void vfio_get_iommu_info_migration(VFIOContainer *container, * qemu_real_host_page_size to mark those dirty. */ if (cap_mig->pgsize_bitmap & qemu_real_host_page_size()) { - container->dirty_pages_supported = true; + container->bcontainer.dirty_pages_supported = true; container->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size; container->dirty_pgsizes = cap_mig->pgsize_bitmap; } @@ -553,7 +559,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container = g_malloc0(sizeof(*container)); container->fd = fd; container->error = NULL; - container->dirty_pages_supported = false; container->dma_max_mappings = 0; container->iova_ranges = NULL; QLIST_INIT(&container->vrdl_list); @@ -938,4 +943,6 @@ void vfio_detach_device(VFIODevice *vbasedev) const VFIOIOMMUOps vfio_legacy_ops = { .dma_map = vfio_legacy_dma_map, .dma_unmap = vfio_legacy_dma_unmap, + .set_dirty_page_tracking = vfio_legacy_set_dirty_page_tracking, + .query_dirty_bitmap = vfio_legacy_query_dirty_bitmap, }; From patchwork Thu Oct 26 10:30:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437495 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8217EC25B6F for ; Thu, 26 Oct 2023 10:51:35 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxth-0008E5-IK; Thu, 26 Oct 2023 06:47:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtf-00088B-Pa for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:47:27 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtd-0001Nq-Ku for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:47:27 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317245; x=1729853245; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8Q9n3gWiCRR4PqU+IktZvocxRr0g9kjMU/I1dH7+vfw=; b=MYuUcIN4yfsmb4FAI6Az1VGMde8aqW6JEPgnQ589mtPdKUFxf85xB0QL q5a8OHkNfmkPqPZQ91ZNS2scRzNO6pDjnVy1e7ebB4RRyKsLqcwdJIhzU Gof9SNKzdE4CN02TYLJnlcSOC8mQC5YaVvGhUn4z9bo55v2XabNFMpgcK GGJKF4gn0iZWKapARuItOc4wuh2Kdx44fj11sa/OQBdfWb9RaEMhNmkMW RepI+RBKWIHZBXlbN+BiTKejjaN727BH0WCjh9jBG0QKoDRjI2p4buDKc 5GSykKyM2moDIKKHpUUA9jOqeKIzyYuG1OBx6wbO+66vpcuDwDPoDByqC A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563350" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563350" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:57 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463653" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:44 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 12/37] vfio/container: Move per container device list in base container Date: Thu, 26 Oct 2023 18:30:39 +0800 Message-Id: <20231026103104.1686921-13-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org VFIO Device is also changed to point to base container instead of legacy container. No fucntional change intended. Signed-off-by: Zhenzhong Duan [ clg: context changes ] Signed-off-by: Cédric Le Goater Reviewed-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 3 +-- include/hw/vfio/vfio-container-base.h | 1 + hw/vfio/common.c | 23 +++++++++++++++-------- hw/vfio/container.c | 12 ++++++------ 4 files changed, 23 insertions(+), 16 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index d053c61872..f7a84dc8d0 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -90,7 +90,6 @@ typedef struct VFIOContainer { QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; - QLIST_HEAD(, VFIODevice) device_list; GList *iova_ranges; } VFIOContainer; @@ -118,7 +117,7 @@ typedef struct VFIODevice { QLIST_ENTRY(VFIODevice) container_next; QLIST_ENTRY(VFIODevice) global_next; struct VFIOGroup *group; - VFIOContainer *container; + VFIOContainerBase *bcontainer; char *sysfsdev; char *name; DeviceState *dev; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index ea8436a064..f1de1ef120 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -53,6 +53,7 @@ typedef struct VFIOContainerBase { bool dirty_pages_supported; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_ENTRY(VFIOContainerBase) next; + QLIST_HEAD(, VFIODevice) device_list; } VFIOContainerBase; typedef struct VFIOGuestIOMMU { diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 7d9b87fc67..f8475348ad 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -145,7 +145,7 @@ void vfio_unblock_multiple_devices_migration(void) bool vfio_viommu_preset(VFIODevice *vbasedev) { - return vbasedev->container->bcontainer.space->as != &address_space_memory; + return vbasedev->bcontainer->space->as != &address_space_memory; } static void vfio_set_migration_error(int err) @@ -179,6 +179,7 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev) static bool vfio_devices_all_dirty_tracking(VFIOContainer *container) { + VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; MigrationState *ms = migrate_get_current(); @@ -187,7 +188,7 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container) return false; } - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { VFIOMigration *migration = vbasedev->migration; if (!migration) { @@ -205,9 +206,10 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container) bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container) { + VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { if (!vbasedev->dirty_pages_supported) { return false; } @@ -222,13 +224,14 @@ bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container) */ bool vfio_devices_all_running_and_mig_active(VFIOContainer *container) { + VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; if (!migration_is_active(migrate_get_current())) { return false; } - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { VFIOMigration *migration = vbasedev->migration; if (!migration) { @@ -835,12 +838,13 @@ static bool vfio_section_is_vfio_pci(MemoryRegionSection *section, VFIOContainer *container) { VFIOPCIDevice *pcidev; + VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; Object *owner; owner = memory_region_owner(section->mr); - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { if (vbasedev->type != VFIO_DEVICE_TYPE_PCI) { continue; } @@ -941,13 +945,14 @@ static void vfio_devices_dma_logging_stop(VFIOContainer *container) uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature), sizeof(uint64_t))] = {}; struct vfio_device_feature *feature = (struct vfio_device_feature *)buf; + VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; feature->argsz = sizeof(buf); feature->flags = VFIO_DEVICE_FEATURE_SET | VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP; - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { if (!vbasedev->dirty_tracking) { continue; } @@ -1038,6 +1043,7 @@ static int vfio_devices_dma_logging_start(VFIOContainer *container) { struct vfio_device_feature *feature; VFIODirtyRanges ranges; + VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; int ret = 0; @@ -1048,7 +1054,7 @@ static int vfio_devices_dma_logging_start(VFIOContainer *container) return -errno; } - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { if (vbasedev->dirty_tracking) { continue; } @@ -1141,10 +1147,11 @@ int vfio_devices_query_dirty_bitmap(VFIOContainer *container, VFIOBitmap *vbmap, hwaddr iova, hwaddr size) { + VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; int ret; - QLIST_FOREACH(vbasedev, &container->device_list, container_next) { + QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { ret = vfio_device_dma_logging_report(vbasedev, iova, size, vbmap->bitmap); if (ret) { diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 6f02ca133e..3228dd2109 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -889,7 +889,7 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev, int groupid = vfio_device_groupid(vbasedev, errp); VFIODevice *vbasedev_iter; VFIOGroup *group; - VFIOContainer *container; + VFIOContainerBase *bcontainer; int ret; if (groupid < 0) { @@ -916,9 +916,9 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev, return ret; } - container = group->container; - vbasedev->container = container; - QLIST_INSERT_HEAD(&container->device_list, vbasedev, container_next); + bcontainer = &group->container->bcontainer; + vbasedev->bcontainer = bcontainer; + QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); return ret; @@ -928,13 +928,13 @@ void vfio_detach_device(VFIODevice *vbasedev) { VFIOGroup *group = vbasedev->group; - if (!vbasedev->container) { + if (!vbasedev->bcontainer) { return; } QLIST_REMOVE(vbasedev, global_next); QLIST_REMOVE(vbasedev, container_next); - vbasedev->container = NULL; + vbasedev->bcontainer = NULL; trace_vfio_detach_device(vbasedev->name, group->groupid); vfio_put_base_device(vbasedev); vfio_put_group(group); From patchwork Thu Oct 26 10:30:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437473 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 70301C25B48 for ; Thu, 26 Oct 2023 10:48:03 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxtm-0008SA-1V; Thu, 26 Oct 2023 06:47:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxth-0008Dq-Df for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:47:29 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxte-0001Nv-Rq for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:47:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317246; x=1729853246; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zZK0COSD976kyMw31NflsdQhbowTcJEQCFzTXT4zXkw=; b=hUOQYVq56Psdp53KIEYDp+d6/hrZ3j0hWcmPM7DvCSUBEpCn+wvBffeo F6cNrYwQibmFhl5hoCT3yTj0rYHBiHJSF/9kXJDBMAdzrLTpV8DFfybap Je5psGc8PquKbAsEXSmvYj9KZ+w2VCeeXdSFRsh5VfbyXAYSJiVXSjNGI icWlowIwue/u4izybKrkCLeSvPqH9QJd3p7axP0Sdhb5h/O8D9uKfKxDF kNhKQn0YIQi7vNnEOS5eGSopVCwBsbYy/5qm/6CAo0oKOErO60sCIJxQN EtxZB53P2AdSo+Qv/xZBhLsh0Yx96eTnggCkoaAhH1zsp5olk0OV5Uv0E Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563370" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563370" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:01 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463663" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:47 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 13/37] vfio/container: Convert functions to base container Date: Thu, 26 Oct 2023 18:30:40 +0800 Message-Id: <20231026103104.1686921-14-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger In the prospect to get rid of VFIOContainer refs in common.c lets convert misc functions to use the base container object instead: vfio_devices_all_dirty_tracking vfio_devices_all_device_dirty_tracking vfio_devices_all_running_and_mig_active vfio_devices_query_dirty_bitmap vfio_get_dirty_bitmap Signed-off-by: Eric Auger Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater Reviewed-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 9 ++++---- hw/vfio/common.c | 42 +++++++++++++++-------------------- hw/vfio/container.c | 6 ++--- hw/vfio/trace-events | 2 +- 4 files changed, 26 insertions(+), 33 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index f7a84dc8d0..fb3c7aea8f 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -186,7 +186,6 @@ typedef struct VFIODisplay { VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); -bool vfio_devices_all_running_and_saving(VFIOContainer *container); /* SPAPR specific */ int vfio_container_add_section_window(VFIOContainer *container, @@ -260,11 +259,11 @@ bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp); void vfio_migration_exit(VFIODevice *vbasedev); int vfio_bitmap_alloc(VFIOBitmap *vbmap, hwaddr size); -bool vfio_devices_all_running_and_mig_active(VFIOContainer *container); -bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container); -int vfio_devices_query_dirty_bitmap(VFIOContainer *container, +bool vfio_devices_all_running_and_mig_active(VFIOContainerBase *bcontainer); +bool vfio_devices_all_device_dirty_tracking(VFIOContainerBase *bcontainer); +int vfio_devices_query_dirty_bitmap(VFIOContainerBase *bcontainer, VFIOBitmap *vbmap, hwaddr iova, hwaddr size); -int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, +int vfio_get_dirty_bitmap(VFIOContainerBase *bcontainer, uint64_t iova, uint64_t size, ram_addr_t ram_addr); #endif /* HW_VFIO_VFIO_COMMON_H */ diff --git a/hw/vfio/common.c b/hw/vfio/common.c index f8475348ad..91411d9844 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -177,9 +177,8 @@ bool vfio_device_state_is_precopy(VFIODevice *vbasedev) migration->device_state == VFIO_DEVICE_STATE_PRE_COPY_P2P; } -static bool vfio_devices_all_dirty_tracking(VFIOContainer *container) +static bool vfio_devices_all_dirty_tracking(VFIOContainerBase *bcontainer) { - VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; MigrationState *ms = migrate_get_current(); @@ -204,9 +203,8 @@ static bool vfio_devices_all_dirty_tracking(VFIOContainer *container) return true; } -bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container) +bool vfio_devices_all_device_dirty_tracking(VFIOContainerBase *bcontainer) { - VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) { @@ -222,9 +220,8 @@ bool vfio_devices_all_device_dirty_tracking(VFIOContainer *container) * Check if all VFIO devices are running and migration is active, which is * essentially equivalent to the migration being in pre-copy phase. */ -bool vfio_devices_all_running_and_mig_active(VFIOContainer *container) +bool vfio_devices_all_running_and_mig_active(VFIOContainerBase *bcontainer) { - VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; if (!migration_is_active(migrate_get_current())) { @@ -1084,7 +1081,7 @@ static void vfio_listener_log_global_start(MemoryListener *listener) VFIOContainer *container = container_of(listener, VFIOContainer, listener); int ret; - if (vfio_devices_all_device_dirty_tracking(container)) { + if (vfio_devices_all_device_dirty_tracking(&container->bcontainer)) { ret = vfio_devices_dma_logging_start(container); } else { ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, @@ -1103,7 +1100,7 @@ static void vfio_listener_log_global_stop(MemoryListener *listener) VFIOContainer *container = container_of(listener, VFIOContainer, listener); int ret = 0; - if (vfio_devices_all_device_dirty_tracking(container)) { + if (vfio_devices_all_device_dirty_tracking(&container->bcontainer)) { vfio_devices_dma_logging_stop(container); } else { ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, @@ -1143,11 +1140,10 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova, return 0; } -int vfio_devices_query_dirty_bitmap(VFIOContainer *container, +int vfio_devices_query_dirty_bitmap(VFIOContainerBase *bcontainer, VFIOBitmap *vbmap, hwaddr iova, hwaddr size) { - VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; int ret; @@ -1167,17 +1163,16 @@ int vfio_devices_query_dirty_bitmap(VFIOContainer *container, return 0; } -int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, +int vfio_get_dirty_bitmap(VFIOContainerBase *bcontainer, uint64_t iova, uint64_t size, ram_addr_t ram_addr) { bool all_device_dirty_tracking = - vfio_devices_all_device_dirty_tracking(container); + vfio_devices_all_device_dirty_tracking(bcontainer); uint64_t dirty_pages; VFIOBitmap vbmap; int ret; - if (!container->bcontainer.dirty_pages_supported && - !all_device_dirty_tracking) { + if (!bcontainer->dirty_pages_supported && !all_device_dirty_tracking) { cpu_physical_memory_set_dirty_range(ram_addr, size, tcg_enabled() ? DIRTY_CLIENTS_ALL : DIRTY_CLIENTS_NOCODE); @@ -1190,10 +1185,9 @@ int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, } if (all_device_dirty_tracking) { - ret = vfio_devices_query_dirty_bitmap(container, &vbmap, iova, size); + ret = vfio_devices_query_dirty_bitmap(bcontainer, &vbmap, iova, size); } else { - ret = vfio_container_query_dirty_bitmap(&container->bcontainer, &vbmap, - iova, size); + ret = vfio_container_query_dirty_bitmap(bcontainer, &vbmap, iova, size); } if (ret) { @@ -1203,8 +1197,7 @@ int vfio_get_dirty_bitmap(VFIOContainer *container, uint64_t iova, dirty_pages = cpu_physical_memory_set_dirty_lebitmap(vbmap.bitmap, ram_addr, vbmap.pages); - trace_vfio_get_dirty_bitmap(container->fd, iova, size, vbmap.size, - ram_addr, dirty_pages); + trace_vfio_get_dirty_bitmap(iova, size, vbmap.size, ram_addr, dirty_pages); out: g_free(vbmap.bitmap); @@ -1238,8 +1231,8 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) rcu_read_lock(); if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) { - ret = vfio_get_dirty_bitmap(container, iova, iotlb->addr_mask + 1, - translated_addr); + ret = vfio_get_dirty_bitmap(&container->bcontainer, iova, + iotlb->addr_mask + 1, translated_addr); if (ret) { error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", @@ -1268,7 +1261,8 @@ static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section, * Sync the whole mapped region (spanning multiple individual mappings) * in one go. */ - return vfio_get_dirty_bitmap(vrdl->container, iova, size, ram_addr); + return vfio_get_dirty_bitmap(&vrdl->container->bcontainer, iova, size, + ram_addr); } static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container, @@ -1337,7 +1331,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container, ram_addr = memory_region_get_ram_addr(section->mr) + section->offset_within_region; - return vfio_get_dirty_bitmap(container, + return vfio_get_dirty_bitmap(&container->bcontainer, REAL_HOST_PAGE_ALIGN(section->offset_within_address_space), int128_get64(section->size), ram_addr); } @@ -1352,7 +1346,7 @@ static void vfio_listener_log_sync(MemoryListener *listener, return; } - if (vfio_devices_all_dirty_tracking(container)) { + if (vfio_devices_all_dirty_tracking(&container->bcontainer)) { ret = vfio_sync_dirty_bitmap(container, section); if (ret) { error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret, diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 3228dd2109..8d5b408e86 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -129,8 +129,8 @@ static int vfio_legacy_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, bool need_dirty_sync = false; int ret; - if (iotlb && vfio_devices_all_running_and_mig_active(container)) { - if (!vfio_devices_all_device_dirty_tracking(container) && + if (iotlb && vfio_devices_all_running_and_mig_active(bcontainer)) { + if (!vfio_devices_all_device_dirty_tracking(bcontainer) && container->bcontainer.dirty_pages_supported) { return vfio_dma_unmap_bitmap(container, iova, size, iotlb); } @@ -162,7 +162,7 @@ static int vfio_legacy_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, } if (need_dirty_sync) { - ret = vfio_get_dirty_bitmap(container, iova, size, + ret = vfio_get_dirty_bitmap(bcontainer, iova, size, iotlb->translated_addr); if (ret) { return ret; diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 9f7fedee98..08a1f9dfa4 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -117,7 +117,7 @@ vfio_region_sparse_mmap_header(const char *name, int index, int nr_areas) "Devic vfio_region_sparse_mmap_entry(int i, unsigned long start, unsigned long end) "sparse entry %d [0x%lx - 0x%lx]" vfio_get_dev_region(const char *name, int index, uint32_t type, uint32_t subtype) "%s index %d, %08x/%08x" vfio_legacy_dma_unmap_overflow_workaround(void) "" -vfio_get_dirty_bitmap(int fd, uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start, uint64_t dirty_pages) "container fd=%d, iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64" dirty_pages=%"PRIu64 +vfio_get_dirty_bitmap(uint64_t iova, uint64_t size, uint64_t bitmap_size, uint64_t start, uint64_t dirty_pages) "iova=0x%"PRIx64" size= 0x%"PRIx64" bitmap_size=0x%"PRIx64" start=0x%"PRIx64" dirty_pages=%"PRIu64 vfio_iommu_map_dirty_notify(uint64_t iova_start, uint64_t iova_end) "iommu dirty @ 0x%"PRIx64" - 0x%"PRIx64 # platform.c From patchwork Thu Oct 26 10:30:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437499 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D6F61C25B6B for ; Thu, 26 Oct 2023 10:51:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxtq-00005z-Dh; Thu, 26 Oct 2023 06:47:39 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtj-0008Ov-Kc; Thu, 26 Oct 2023 06:47:31 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtg-0001Nq-UP; Thu, 26 Oct 2023 06:47:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317248; x=1729853248; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=/ZeplMRE3AJye1Y/KzA8yLdVbkLwLeSWTikx5OVfkwM=; b=lcXsZAijCn3Jy3BJLNX50439vbn3n4iqgMxQstmcH3V/gxpepL+zwei0 0jO+g/j2xfLyxMDxUH6pHde9bSTVf1Wxp4hGPq5hZXGHpCUsG7qhd3lKb nP0wOUPUQQjpkN2k7s6DXLQsoZay5XuAi8w/OVO3kjuXtT7hq3qk3vxe+ nXHOFTktVoPUSNMF6Pnl/4TEpd6lf9+ksZhdMV6la271Rkb9jW9Oh0Nc3 XAv3+tOO9ZIAS2o5uVUjqyWdVAKkAPqbKTaMlqRCIQzW/ccVhMDO89syj hZIBEGy/RkzmLxg50uBYbOaIOVNbc3p5EjHr1erTuS0/j25RMp1SqAXK0 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563405" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563405" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:06 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463672" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:51 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 14/37] vfio/container: Move vrdl_list, pgsizes and dma_max_mappings to base container Date: Thu, 26 Oct 2023 18:30:41 +0800 Message-Id: <20231026103104.1686921-15-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move vrdl_list, pgsizes and dma_max_mappings to the base container object No functional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan [ clg: context changes ] Signed-off-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 13 ------- include/hw/vfio/vfio-container-base.h | 13 +++++++ hw/vfio/common.c | 49 ++++++++++++++------------- hw/vfio/container-base.c | 12 +++++++ hw/vfio/container.c | 12 +++---- hw/vfio/spapr.c | 18 +++++++--- 6 files changed, 68 insertions(+), 49 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index fb3c7aea8f..65ae2d76cf 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -85,24 +85,11 @@ typedef struct VFIOContainer { bool initialized; uint64_t dirty_pgsizes; uint64_t max_dirty_bitmap_size; - unsigned long pgsizes; - unsigned int dma_max_mappings; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; - QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; GList *iova_ranges; } VFIOContainer; -typedef struct VFIORamDiscardListener { - VFIOContainer *container; - MemoryRegion *mr; - hwaddr offset_within_address_space; - hwaddr size; - uint64_t granularity; - RamDiscardListener listener; - QLIST_ENTRY(VFIORamDiscardListener) next; -} VFIORamDiscardListener; - typedef struct VFIOHostDMAWindow { hwaddr min_iova; hwaddr max_iova; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index f1de1ef120..849c8b34b2 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -50,8 +50,11 @@ typedef struct VFIOAddressSpace { typedef struct VFIOContainerBase { const VFIOIOMMUOps *ops; VFIOAddressSpace *space; + unsigned long pgsizes; + unsigned int dma_max_mappings; bool dirty_pages_supported; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; + QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; QLIST_ENTRY(VFIOContainerBase) next; QLIST_HEAD(, VFIODevice) device_list; } VFIOContainerBase; @@ -64,6 +67,16 @@ typedef struct VFIOGuestIOMMU { QLIST_ENTRY(VFIOGuestIOMMU) giommu_next; } VFIOGuestIOMMU; +typedef struct VFIORamDiscardListener { + VFIOContainerBase *bcontainer; + MemoryRegion *mr; + hwaddr offset_within_address_space; + hwaddr size; + uint64_t granularity; + RamDiscardListener listener; + QLIST_ENTRY(VFIORamDiscardListener) next; +} VFIORamDiscardListener; + int vfio_container_dma_map(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, void *vaddr, bool readonly); diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 91411d9844..9b34e7e0f8 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -351,13 +351,13 @@ static void vfio_ram_discard_notify_discard(RamDiscardListener *rdl, { VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener, listener); + VFIOContainerBase *bcontainer = vrdl->bcontainer; const hwaddr size = int128_get64(section->size); const hwaddr iova = section->offset_within_address_space; int ret; /* Unmap with a single call. */ - ret = vfio_container_dma_unmap(&vrdl->container->bcontainer, - iova, size , NULL); + ret = vfio_container_dma_unmap(bcontainer, iova, size , NULL); if (ret) { error_report("%s: vfio_container_dma_unmap() failed: %s", __func__, strerror(-ret)); @@ -369,6 +369,7 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, { VFIORamDiscardListener *vrdl = container_of(rdl, VFIORamDiscardListener, listener); + VFIOContainerBase *bcontainer = vrdl->bcontainer; const hwaddr end = section->offset_within_region + int128_get64(section->size); hwaddr start, next, iova; @@ -387,8 +388,8 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, section->offset_within_address_space; vaddr = memory_region_get_ram_ptr(section->mr) + start; - ret = vfio_container_dma_map(&vrdl->container->bcontainer, iova, - next - start, vaddr, section->readonly); + ret = vfio_container_dma_map(bcontainer, iova, next - start, + vaddr, section->readonly); if (ret) { /* Rollback */ vfio_ram_discard_notify_discard(rdl, section); @@ -398,7 +399,7 @@ static int vfio_ram_discard_notify_populate(RamDiscardListener *rdl, return 0; } -static void vfio_register_ram_discard_listener(VFIOContainer *container, +static void vfio_register_ram_discard_listener(VFIOContainerBase *bcontainer, MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); @@ -411,7 +412,7 @@ static void vfio_register_ram_discard_listener(VFIOContainer *container, g_assert(QEMU_IS_ALIGNED(int128_get64(section->size), TARGET_PAGE_SIZE)); vrdl = g_new0(VFIORamDiscardListener, 1); - vrdl->container = container; + vrdl->bcontainer = bcontainer; vrdl->mr = section->mr; vrdl->offset_within_address_space = section->offset_within_address_space; vrdl->size = int128_get64(section->size); @@ -419,14 +420,14 @@ static void vfio_register_ram_discard_listener(VFIOContainer *container, section->mr); g_assert(vrdl->granularity && is_power_of_2(vrdl->granularity)); - g_assert(container->pgsizes && - vrdl->granularity >= 1ULL << ctz64(container->pgsizes)); + g_assert(bcontainer->pgsizes && + vrdl->granularity >= 1ULL << ctz64(bcontainer->pgsizes)); ram_discard_listener_init(&vrdl->listener, vfio_ram_discard_notify_populate, vfio_ram_discard_notify_discard, true); ram_discard_manager_register_listener(rdm, &vrdl->listener, section); - QLIST_INSERT_HEAD(&container->vrdl_list, vrdl, next); + QLIST_INSERT_HEAD(&bcontainer->vrdl_list, vrdl, next); /* * Sanity-check if we have a theoretically problematic setup where we could @@ -441,7 +442,7 @@ static void vfio_register_ram_discard_listener(VFIOContainer *container, * number of sections in the address space we could have over time, * also consuming DMA mappings. */ - if (container->dma_max_mappings) { + if (bcontainer->dma_max_mappings) { unsigned int vrdl_count = 0, vrdl_mappings = 0, max_memslots = 512; #ifdef CONFIG_KVM @@ -450,7 +451,7 @@ static void vfio_register_ram_discard_listener(VFIOContainer *container, } #endif - QLIST_FOREACH(vrdl, &container->vrdl_list, next) { + QLIST_FOREACH(vrdl, &bcontainer->vrdl_list, next) { hwaddr start, end; start = QEMU_ALIGN_DOWN(vrdl->offset_within_address_space, @@ -462,23 +463,23 @@ static void vfio_register_ram_discard_listener(VFIOContainer *container, } if (vrdl_mappings + max_memslots - vrdl_count > - container->dma_max_mappings) { + bcontainer->dma_max_mappings) { warn_report("%s: possibly running out of DMA mappings. E.g., try" " increasing the 'block-size' of virtio-mem devies." " Maximum possible DMA mappings: %d, Maximum possible" - " memslots: %d", __func__, container->dma_max_mappings, + " memslots: %d", __func__, bcontainer->dma_max_mappings, max_memslots); } } } -static void vfio_unregister_ram_discard_listener(VFIOContainer *container, +static void vfio_unregister_ram_discard_listener(VFIOContainerBase *bcontainer, MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); VFIORamDiscardListener *vrdl = NULL; - QLIST_FOREACH(vrdl, &container->vrdl_list, next) { + QLIST_FOREACH(vrdl, &bcontainer->vrdl_list, next) { if (vrdl->mr == section->mr && vrdl->offset_within_address_space == section->offset_within_address_space) { @@ -627,7 +628,7 @@ static void vfio_listener_region_add(MemoryListener *listener, iommu_idx); ret = memory_region_iommu_set_page_size_mask(giommu->iommu_mr, - container->pgsizes, + bcontainer->pgsizes, &err); if (ret) { g_free(giommu); @@ -663,7 +664,7 @@ static void vfio_listener_region_add(MemoryListener *listener, * about changes. */ if (memory_region_has_ram_discard_manager(section->mr)) { - vfio_register_ram_discard_listener(container, section); + vfio_register_ram_discard_listener(bcontainer, section); return; } @@ -782,7 +783,7 @@ static void vfio_listener_region_del(MemoryListener *listener, pgmask = (1ULL << ctz64(hostwin->iova_pgsizes)) - 1; try_unmap = !((iova & pgmask) || (int128_get64(llsize) & pgmask)); } else if (memory_region_has_ram_discard_manager(section->mr)) { - vfio_unregister_ram_discard_listener(container, section); + vfio_unregister_ram_discard_listener(bcontainer, section); /* Unregistering will trigger an unmap. */ try_unmap = false; } @@ -1261,17 +1262,17 @@ static int vfio_ram_discard_get_dirty_bitmap(MemoryRegionSection *section, * Sync the whole mapped region (spanning multiple individual mappings) * in one go. */ - return vfio_get_dirty_bitmap(&vrdl->container->bcontainer, iova, size, - ram_addr); + return vfio_get_dirty_bitmap(vrdl->bcontainer, iova, size, ram_addr); } -static int vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainer *container, - MemoryRegionSection *section) +static int +vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainerBase *bcontainer, + MemoryRegionSection *section) { RamDiscardManager *rdm = memory_region_get_ram_discard_manager(section->mr); VFIORamDiscardListener *vrdl = NULL; - QLIST_FOREACH(vrdl, &container->vrdl_list, next) { + QLIST_FOREACH(vrdl, &bcontainer->vrdl_list, next) { if (vrdl->mr == section->mr && vrdl->offset_within_address_space == section->offset_within_address_space) { @@ -1325,7 +1326,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container, } return 0; } else if (memory_region_has_ram_discard_manager(section->mr)) { - return vfio_sync_ram_discard_listener_dirty_bitmap(container, section); + return vfio_sync_ram_discard_listener_dirty_bitmap(bcontainer, section); } ram_addr = memory_region_get_ram_addr(section->mr) + diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index a7cf517dd2..568f891841 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -76,15 +76,27 @@ void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace *space, bcontainer->ops = ops; bcontainer->space = space; bcontainer->dirty_pages_supported = false; + bcontainer->dma_max_mappings = 0; QLIST_INIT(&bcontainer->giommu_list); + QLIST_INIT(&bcontainer->vrdl_list); } void vfio_container_destroy(VFIOContainerBase *bcontainer) { + VFIORamDiscardListener *vrdl, *vrdl_tmp; VFIOGuestIOMMU *giommu, *tmp; QLIST_REMOVE(bcontainer, next); + QLIST_FOREACH_SAFE(vrdl, &bcontainer->vrdl_list, next, vrdl_tmp) { + RamDiscardManager *rdm; + + rdm = memory_region_get_ram_discard_manager(vrdl->mr); + ram_discard_manager_unregister_listener(rdm, &vrdl->listener); + QLIST_REMOVE(vrdl, next); + g_free(vrdl); + } + QLIST_FOREACH_SAFE(giommu, &bcontainer->giommu_list, giommu_next, tmp) { memory_region_unregister_iommu_notifier( MEMORY_REGION(giommu->iommu_mr), &giommu->n); diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 8d5b408e86..0e265ffa67 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -154,7 +154,7 @@ static int vfio_legacy_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) && container->iommu_type == VFIO_TYPE1v2_IOMMU) { trace_vfio_legacy_dma_unmap_overflow_workaround(); - unmap.size -= 1ULL << ctz64(container->pgsizes); + unmap.size -= 1ULL << ctz64(container->bcontainer.pgsizes); continue; } error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno)); @@ -559,9 +559,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container = g_malloc0(sizeof(*container)); container->fd = fd; container->error = NULL; - container->dma_max_mappings = 0; container->iova_ranges = NULL; - QLIST_INIT(&container->vrdl_list); bcontainer = &container->bcontainer; vfio_container_init(bcontainer, space, &vfio_legacy_ops); @@ -589,13 +587,13 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } if (info->flags & VFIO_IOMMU_INFO_PGSIZES) { - container->pgsizes = info->iova_pgsizes; + container->bcontainer.pgsizes = info->iova_pgsizes; } else { - container->pgsizes = qemu_real_host_page_size(); + container->bcontainer.pgsizes = qemu_real_host_page_size(); } - if (!vfio_get_info_dma_avail(info, &container->dma_max_mappings)) { - container->dma_max_mappings = 65535; + if (!vfio_get_info_dma_avail(info, &bcontainer->dma_max_mappings)) { + container->bcontainer.dma_max_mappings = 65535; } vfio_get_info_iova_range(info, container); diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 3495737ab2..dbc4c24052 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -223,13 +223,13 @@ static int vfio_spapr_create_window(VFIOContainer *container, if (pagesize > rampagesize) { pagesize = rampagesize; } - pgmask = container->pgsizes & (pagesize | (pagesize - 1)); + pgmask = container->bcontainer.pgsizes & (pagesize | (pagesize - 1)); pagesize = pgmask ? (1ULL << (63 - clz64(pgmask))) : 0; if (!pagesize) { error_report("Host doesn't support page size 0x%"PRIx64 ", the supported mask is 0x%lx", memory_region_iommu_get_min_page_size(iommu_mr), - container->pgsizes); + container->bcontainer.pgsizes); return -EINVAL; } @@ -385,7 +385,7 @@ void vfio_container_del_section_window(VFIOContainer *container, bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) { - + VFIOContainerBase *bcontainer = &container->bcontainer; struct vfio_iommu_spapr_tce_info info; bool v2 = container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU; int ret, fd = container->fd; @@ -424,7 +424,7 @@ bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) } if (v2) { - container->pgsizes = info.ddw.pgsizes; + bcontainer->pgsizes = info.ddw.pgsizes; /* * There is a default window in just created container. * To make region_add/del simpler, we better remove this @@ -439,7 +439,7 @@ bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) } } else { /* The default table uses 4K pages */ - container->pgsizes = 0x1000; + bcontainer->pgsizes = 0x1000; vfio_host_win_add(container, info.dma32_window_start, info.dma32_window_start + info.dma32_window_size - 1, @@ -455,7 +455,15 @@ listener_unregister_exit: void vfio_spapr_container_deinit(VFIOContainer *container) { + VFIOHostDMAWindow *hostwin, *next; + if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) { memory_listener_unregister(&container->prereg_listener); } + QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next, + next) { + QLIST_REMOVE(hostwin, hostwin_next); + g_free(hostwin); + } + } From patchwork Thu Oct 26 10:30:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437503 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3E831C25B48 for ; Thu, 26 Oct 2023 10:52:27 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxu2-0000XF-Bi; Thu, 26 Oct 2023 06:47:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtl-0008SE-V6; Thu, 26 Oct 2023 06:47:33 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxti-0001O3-BD; Thu, 26 Oct 2023 06:47:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317250; x=1729853250; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yy18muAc8ESiglD3tLSpAmstXS3xjXDUpKu5ZBOnvNw=; b=mWS+9crGEDV8jN0QrFhdvgyyFhT43dDI0Asy6pOUtZgpIrgxjkKjnGXs iCpOYHvbx7Bvqsaf/4NytiL6aZUVOTw79EKedVZ4TMQ2ouwdm+SlOpTbo xr1eAr6Vcx1yrOsX4owV7ELBEzPJsEdJsqZV80OBawEfBqBaAVGT6XIQP HYQ3HvJ3RIy1dq9KTEP9upy2UB97+FGQcw9oIabi5xTK+cg1bhJNTFB9y 84D8hTl2C25uViJ7KWa2nq4YebqKO/6S7k72xDj719vtk0m/vwtVCiIBk 7tZtbnw3v1bx5LkeDyyOIHnDTA9jia0iqrNrYAktwlIVb4pDi1hjqOm6b w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563442" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563442" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463684" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:46:56 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 15/37] vfio/container: Move listener to base container Date: Thu, 26 Oct 2023 18:30:42 +0800 Message-Id: <20231026103104.1686921-16-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Move listener to base container. Also error and initialized fields are moved at the same time. No functional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan [ clg: context changes ] Signed-off-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 3 - include/hw/vfio/vfio-container-base.h | 3 + hw/vfio/common.c | 110 +++++++++++++------------- hw/vfio/container-base.c | 1 + hw/vfio/container.c | 23 +++--- hw/vfio/spapr.c | 11 +-- 6 files changed, 76 insertions(+), 75 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 65ae2d76cf..56452018a9 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -78,11 +78,8 @@ struct VFIOGroup; typedef struct VFIOContainer { VFIOContainerBase bcontainer; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ - MemoryListener listener; MemoryListener prereg_listener; unsigned iommu_type; - Error *error; - bool initialized; uint64_t dirty_pgsizes; uint64_t max_dirty_bitmap_size; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 849c8b34b2..89642e6b45 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -50,6 +50,9 @@ typedef struct VFIOAddressSpace { typedef struct VFIOContainerBase { const VFIOIOMMUOps *ops; VFIOAddressSpace *space; + MemoryListener listener; + Error *error; + bool initialized; unsigned long pgsizes; unsigned int dma_max_mappings; bool dirty_pages_supported; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 9b34e7e0f8..52872f515a 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -541,7 +541,7 @@ static bool vfio_listener_valid_section(MemoryRegionSection *section, return true; } -static bool vfio_get_section_iova_range(VFIOContainer *container, +static bool vfio_get_section_iova_range(VFIOContainerBase *bcontainer, MemoryRegionSection *section, hwaddr *out_iova, hwaddr *out_end, Int128 *out_llend) @@ -569,8 +569,10 @@ static bool vfio_get_section_iova_range(VFIOContainer *container, static void vfio_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); - VFIOContainerBase *bcontainer = &container->bcontainer; + VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase, + listener); + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); hwaddr iova, end; Int128 llend, llsize; void *vaddr; @@ -582,7 +584,8 @@ static void vfio_listener_region_add(MemoryListener *listener, return; } - if (!vfio_get_section_iova_range(container, section, &iova, &end, &llend)) { + if (!vfio_get_section_iova_range(bcontainer, section, &iova, &end, + &llend)) { if (memory_region_is_ram_device(section->mr)) { trace_vfio_listener_region_add_no_dma_map( memory_region_name(section->mr), @@ -689,13 +692,12 @@ static void vfio_listener_region_add(MemoryListener *listener, } } - ret = vfio_container_dma_map(&container->bcontainer, - iova, int128_get64(llsize), vaddr, - section->readonly); + ret = vfio_container_dma_map(bcontainer, iova, int128_get64(llsize), + vaddr, section->readonly); if (ret) { error_setg(&err, "vfio_container_dma_map(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx", %p) = %d (%s)", - container, iova, int128_get64(llsize), vaddr, ret, + bcontainer, iova, int128_get64(llsize), vaddr, ret, strerror(-ret)); if (memory_region_is_ram_device(section->mr)) { /* Allow unexpected mappings not to be fatal for RAM devices */ @@ -717,9 +719,9 @@ fail: * can gracefully fail. Runtime, there's not much we can do other * than throw a hardware error. */ - if (!container->initialized) { - if (!container->error) { - error_propagate_prepend(&container->error, err, + if (!bcontainer->initialized) { + if (!bcontainer->error) { + error_propagate_prepend(&bcontainer->error, err, "Region %s: ", memory_region_name(section->mr)); } else { @@ -734,8 +736,10 @@ fail: static void vfio_listener_region_del(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); - VFIOContainerBase *bcontainer = &container->bcontainer; + VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase, + listener); + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); hwaddr iova, end; Int128 llend, llsize; int ret; @@ -768,7 +772,8 @@ static void vfio_listener_region_del(MemoryListener *listener, */ } - if (!vfio_get_section_iova_range(container, section, &iova, &end, &llend)) { + if (!vfio_get_section_iova_range(bcontainer, section, &iova, &end, + &llend)) { return; } @@ -792,22 +797,22 @@ static void vfio_listener_region_del(MemoryListener *listener, if (int128_eq(llsize, int128_2_64())) { /* The unmap ioctl doesn't accept a full 64-bit span. */ llsize = int128_rshift(llsize, 1); - ret = vfio_container_dma_unmap(&container->bcontainer, iova, + ret = vfio_container_dma_unmap(bcontainer, iova, int128_get64(llsize), NULL); if (ret) { error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", - container, iova, int128_get64(llsize), ret, + bcontainer, iova, int128_get64(llsize), ret, strerror(-ret)); } iova += int128_get64(llsize); } - ret = vfio_container_dma_unmap(&container->bcontainer, iova, + ret = vfio_container_dma_unmap(bcontainer, iova, int128_get64(llsize), NULL); if (ret) { error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", - container, iova, int128_get64(llsize), ret, + bcontainer, iova, int128_get64(llsize), ret, strerror(-ret)); } } @@ -827,16 +832,15 @@ typedef struct VFIODirtyRanges { } VFIODirtyRanges; typedef struct VFIODirtyRangesListener { - VFIOContainer *container; + VFIOContainerBase *bcontainer; VFIODirtyRanges ranges; MemoryListener listener; } VFIODirtyRangesListener; static bool vfio_section_is_vfio_pci(MemoryRegionSection *section, - VFIOContainer *container) + VFIOContainerBase *bcontainer) { VFIOPCIDevice *pcidev; - VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; Object *owner; @@ -865,7 +869,7 @@ static void vfio_dirty_tracking_update(MemoryListener *listener, hwaddr iova, end, *min, *max; if (!vfio_listener_valid_section(section, "tracking_update") || - !vfio_get_section_iova_range(dirty->container, section, + !vfio_get_section_iova_range(dirty->bcontainer, section, &iova, &end, NULL)) { return; } @@ -889,7 +893,7 @@ static void vfio_dirty_tracking_update(MemoryListener *listener, * The alternative would be an IOVATree but that has a much bigger runtime * overhead and unnecessary complexity. */ - if (vfio_section_is_vfio_pci(section, dirty->container) && + if (vfio_section_is_vfio_pci(section, dirty->bcontainer) && iova >= UINT32_MAX) { min = &range->minpci64; max = &range->maxpci64; @@ -913,7 +917,7 @@ static const MemoryListener vfio_dirty_tracking_listener = { .region_add = vfio_dirty_tracking_update, }; -static void vfio_dirty_tracking_init(VFIOContainer *container, +static void vfio_dirty_tracking_init(VFIOContainerBase *bcontainer, VFIODirtyRanges *ranges) { VFIODirtyRangesListener dirty; @@ -923,10 +927,10 @@ static void vfio_dirty_tracking_init(VFIOContainer *container, dirty.ranges.min64 = UINT64_MAX; dirty.ranges.minpci64 = UINT64_MAX; dirty.listener = vfio_dirty_tracking_listener; - dirty.container = container; + dirty.bcontainer = bcontainer; memory_listener_register(&dirty.listener, - container->bcontainer.space->as); + bcontainer->space->as); *ranges = dirty.ranges; @@ -938,12 +942,11 @@ static void vfio_dirty_tracking_init(VFIOContainer *container, memory_listener_unregister(&dirty.listener); } -static void vfio_devices_dma_logging_stop(VFIOContainer *container) +static void vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer) { uint64_t buf[DIV_ROUND_UP(sizeof(struct vfio_device_feature), sizeof(uint64_t))] = {}; struct vfio_device_feature *feature = (struct vfio_device_feature *)buf; - VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; feature->argsz = sizeof(buf); @@ -964,7 +967,7 @@ static void vfio_devices_dma_logging_stop(VFIOContainer *container) } static struct vfio_device_feature * -vfio_device_feature_dma_logging_start_create(VFIOContainer *container, +vfio_device_feature_dma_logging_start_create(VFIOContainerBase *bcontainer, VFIODirtyRanges *tracking) { struct vfio_device_feature *feature; @@ -1037,16 +1040,15 @@ static void vfio_device_feature_dma_logging_start_destroy( g_free(feature); } -static int vfio_devices_dma_logging_start(VFIOContainer *container) +static int vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer) { struct vfio_device_feature *feature; VFIODirtyRanges ranges; - VFIOContainerBase *bcontainer = &container->bcontainer; VFIODevice *vbasedev; int ret = 0; - vfio_dirty_tracking_init(container, &ranges); - feature = vfio_device_feature_dma_logging_start_create(container, + vfio_dirty_tracking_init(bcontainer, &ranges); + feature = vfio_device_feature_dma_logging_start_create(bcontainer, &ranges); if (!feature) { return -errno; @@ -1069,7 +1071,7 @@ static int vfio_devices_dma_logging_start(VFIOContainer *container) out: if (ret) { - vfio_devices_dma_logging_stop(container); + vfio_devices_dma_logging_stop(bcontainer); } vfio_device_feature_dma_logging_start_destroy(feature); @@ -1079,14 +1081,14 @@ out: static void vfio_listener_log_global_start(MemoryListener *listener) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase, + listener); int ret; - if (vfio_devices_all_device_dirty_tracking(&container->bcontainer)) { - ret = vfio_devices_dma_logging_start(container); + if (vfio_devices_all_device_dirty_tracking(bcontainer)) { + ret = vfio_devices_dma_logging_start(bcontainer); } else { - ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, - true); + ret = vfio_container_set_dirty_page_tracking(bcontainer, true); } if (ret) { @@ -1098,14 +1100,14 @@ static void vfio_listener_log_global_start(MemoryListener *listener) static void vfio_listener_log_global_stop(MemoryListener *listener) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase, + listener); int ret = 0; - if (vfio_devices_all_device_dirty_tracking(&container->bcontainer)) { - vfio_devices_dma_logging_stop(container); + if (vfio_devices_all_device_dirty_tracking(bcontainer)) { + vfio_devices_dma_logging_stop(bcontainer); } else { - ret = vfio_container_set_dirty_page_tracking(&container->bcontainer, - false); + ret = vfio_container_set_dirty_page_tracking(bcontainer, false); } if (ret) { @@ -1216,8 +1218,6 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) vfio_giommu_dirty_notifier, n); VFIOGuestIOMMU *giommu = gdn->giommu; VFIOContainerBase *bcontainer = giommu->bcontainer; - VFIOContainer *container = container_of(bcontainer, VFIOContainer, - bcontainer); hwaddr iova = iotlb->iova + giommu->iommu_offset; ram_addr_t translated_addr; int ret = -EINVAL; @@ -1232,12 +1232,12 @@ static void vfio_iommu_map_dirty_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb) rcu_read_lock(); if (vfio_get_xlat_addr(iotlb, NULL, &translated_addr, NULL)) { - ret = vfio_get_dirty_bitmap(&container->bcontainer, iova, - iotlb->addr_mask + 1, translated_addr); + ret = vfio_get_dirty_bitmap(bcontainer, iova, iotlb->addr_mask + 1, + translated_addr); if (ret) { error_report("vfio_iommu_map_dirty_notify(%p, 0x%"HWADDR_PRIx", " "0x%"HWADDR_PRIx") = %d (%s)", - container, iova, iotlb->addr_mask + 1, ret, + bcontainer, iova, iotlb->addr_mask + 1, ret, strerror(-ret)); } } @@ -1293,10 +1293,9 @@ vfio_sync_ram_discard_listener_dirty_bitmap(VFIOContainerBase *bcontainer, &vrdl); } -static int vfio_sync_dirty_bitmap(VFIOContainer *container, +static int vfio_sync_dirty_bitmap(VFIOContainerBase *bcontainer, MemoryRegionSection *section) { - VFIOContainerBase *bcontainer = &container->bcontainer; ram_addr_t ram_addr; if (memory_region_is_iommu(section->mr)) { @@ -1332,7 +1331,7 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container, ram_addr = memory_region_get_ram_addr(section->mr) + section->offset_within_region; - return vfio_get_dirty_bitmap(&container->bcontainer, + return vfio_get_dirty_bitmap(bcontainer, REAL_HOST_PAGE_ALIGN(section->offset_within_address_space), int128_get64(section->size), ram_addr); } @@ -1340,15 +1339,16 @@ static int vfio_sync_dirty_bitmap(VFIOContainer *container, static void vfio_listener_log_sync(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, listener); + VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase, + listener); int ret; if (vfio_listener_skipped_section(section)) { return; } - if (vfio_devices_all_dirty_tracking(&container->bcontainer)) { - ret = vfio_sync_dirty_bitmap(container, section); + if (vfio_devices_all_dirty_tracking(bcontainer)) { + ret = vfio_sync_dirty_bitmap(bcontainer, section); if (ret) { error_report("vfio: Failed to sync dirty bitmap, err: %d (%s)", ret, strerror(-ret)); diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 568f891841..b57fdede91 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -75,6 +75,7 @@ void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace *space, { bcontainer->ops = ops; bcontainer->space = space; + bcontainer->error = NULL; bcontainer->dirty_pages_supported = false; bcontainer->dma_max_mappings = 0; QLIST_INIT(&bcontainer->giommu_list); diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 0e265ffa67..b8f36f56d2 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -453,6 +453,7 @@ static void vfio_get_iommu_info_migration(VFIOContainer *container, { struct vfio_info_cap_header *hdr; struct vfio_iommu_type1_info_cap_migration *cap_mig; + VFIOContainerBase *bcontainer = &container->bcontainer; hdr = vfio_get_iommu_info_cap(info, VFIO_IOMMU_TYPE1_INFO_CAP_MIGRATION); if (!hdr) { @@ -467,7 +468,7 @@ static void vfio_get_iommu_info_migration(VFIOContainer *container, * qemu_real_host_page_size to mark those dirty. */ if (cap_mig->pgsize_bitmap & qemu_real_host_page_size()) { - container->bcontainer.dirty_pages_supported = true; + bcontainer->dirty_pages_supported = true; container->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size; container->dirty_pgsizes = cap_mig->pgsize_bitmap; } @@ -558,7 +559,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container = g_malloc0(sizeof(*container)); container->fd = fd; - container->error = NULL; container->iova_ranges = NULL; bcontainer = &container->bcontainer; vfio_container_init(bcontainer, space, &vfio_legacy_ops); @@ -587,9 +587,9 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, } if (info->flags & VFIO_IOMMU_INFO_PGSIZES) { - container->bcontainer.pgsizes = info->iova_pgsizes; + bcontainer->pgsizes = info->iova_pgsizes; } else { - container->bcontainer.pgsizes = qemu_real_host_page_size(); + bcontainer->pgsizes = qemu_real_host_page_size(); } if (!vfio_get_info_dma_avail(info, &bcontainer->dma_max_mappings)) { @@ -621,26 +621,25 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, group->container = container; QLIST_INSERT_HEAD(&container->group_list, group, container_next); - container->listener = vfio_memory_listener; + bcontainer->listener = vfio_memory_listener; - memory_listener_register(&container->listener, - container->bcontainer.space->as); + memory_listener_register(&bcontainer->listener, bcontainer->space->as); - if (container->error) { + if (bcontainer->error) { ret = -1; - error_propagate_prepend(errp, container->error, + error_propagate_prepend(errp, bcontainer->error, "memory listener initialization failed: "); goto listener_release_exit; } - container->initialized = true; + bcontainer->initialized = true; return 0; listener_release_exit: QLIST_REMOVE(group, container_next); QLIST_REMOVE(bcontainer, next); vfio_kvm_device_del_group(group); - memory_listener_unregister(&container->listener); + memory_listener_unregister(&bcontainer->listener); if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU || container->iommu_type == VFIO_SPAPR_TCE_IOMMU) { vfio_spapr_container_deinit(container); @@ -675,7 +674,7 @@ static void vfio_disconnect_container(VFIOGroup *group) * group. */ if (QLIST_EMPTY(&container->group_list)) { - memory_listener_unregister(&container->listener); + memory_listener_unregister(&bcontainer->listener); if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU || container->iommu_type == VFIO_SPAPR_TCE_IOMMU) { vfio_spapr_container_deinit(container); diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index dbc4c24052..5786377317 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -46,6 +46,7 @@ static void vfio_prereg_listener_region_add(MemoryListener *listener, { VFIOContainer *container = container_of(listener, VFIOContainer, prereg_listener); + VFIOContainerBase *bcontainer = &container->bcontainer; const hwaddr gpa = section->offset_within_address_space; hwaddr end; int ret; @@ -88,9 +89,9 @@ static void vfio_prereg_listener_region_add(MemoryListener *listener, * can gracefully fail. Runtime, there's not much we can do other * than throw a hardware error. */ - if (!container->initialized) { - if (!container->error) { - error_setg_errno(&container->error, -ret, + if (!bcontainer->initialized) { + if (!bcontainer->error) { + error_setg_errno(&bcontainer->error, -ret, "Memory registering failed"); } } else { @@ -406,9 +407,9 @@ bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) memory_listener_register(&container->prereg_listener, &address_space_memory); - if (container->error) { + if (bcontainer->error) { ret = -1; - error_propagate_prepend(errp, container->error, + error_propagate_prepend(errp, bcontainer->error, "RAM memory listener initialization failed: "); goto listener_unregister_exit; } From patchwork Thu Oct 26 10:30:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437484 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6705FC25B48 for ; Thu, 26 Oct 2023 10:50:07 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxtt-00009k-KB; Thu, 26 Oct 2023 06:47:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtk-0008SB-Pd for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:47:33 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxti-0001Nv-Ox for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:47:32 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317250; x=1729853250; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2xi2jTaKcJ6d+ZkjLObBf9LDtNNfaRLiLBjPlah1eRQ=; b=GMZcolFc0D7MUnemyuaJpIwZX7VFvM9PsBoohiG/5ZFdo6tMRMZ3eYaU e5eiwrDc+5IoiRh8uOmMRMcE3xp4PmbxZNUlFV6IpKDXcaBxVIi5n+ema X7kGD3r/hCG1/dIvQjxCXbDFycj+IMZ6hpNUD7kcCG2vcIJxnvEAARSn9 4diTVF6IBi/DrDtiK9hqxiodDj5sBqOddiaq+ZPe6b2s3qc/Z0MDagNP2 dQnFFNgEFcWcO6yJAVcOZmIjKj850hgViiE2QMJZh5iGLAxLjcAfAs5wS oI5C1u3TjyFsmpLEGDw1gROs4PINyF2Yr6A+TGpX62rMMZcyBqjn84BLR g==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563459" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563459" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:14 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463695" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:00 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v3 16/37] vfio/container: Move dirty_pgsizes and max_dirty_bitmap_size to base container Date: Thu, 26 Oct 2023 18:30:43 +0800 Message-Id: <20231026103104.1686921-17-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger No functional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater Reviewed-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 2 -- include/hw/vfio/vfio-container-base.h | 2 ++ hw/vfio/container.c | 11 ++++++----- 3 files changed, 8 insertions(+), 7 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 56452018a9..423ab2436c 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -80,8 +80,6 @@ typedef struct VFIOContainer { int fd; /* /dev/vfio/vfio, empowered by the attached groups */ MemoryListener prereg_listener; unsigned iommu_type; - uint64_t dirty_pgsizes; - uint64_t max_dirty_bitmap_size; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; GList *iova_ranges; diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 89642e6b45..526d23acfd 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -53,6 +53,8 @@ typedef struct VFIOContainerBase { MemoryListener listener; Error *error; bool initialized; + uint64_t dirty_pgsizes; + uint64_t max_dirty_bitmap_size; unsigned long pgsizes; unsigned int dma_max_mappings; bool dirty_pages_supported; diff --git a/hw/vfio/container.c b/hw/vfio/container.c index b8f36f56d2..68dc6d240f 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -64,6 +64,7 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb) { + VFIOContainerBase *bcontainer = &container->bcontainer; struct vfio_iommu_type1_dma_unmap *unmap; struct vfio_bitmap *bitmap; VFIOBitmap vbmap; @@ -91,7 +92,7 @@ static int vfio_dma_unmap_bitmap(VFIOContainer *container, bitmap->size = vbmap.size; bitmap->data = (__u64 *)vbmap.bitmap; - if (vbmap.size > container->max_dirty_bitmap_size) { + if (vbmap.size > bcontainer->max_dirty_bitmap_size) { error_report("UNMAP: Size of bitmap too big 0x%"PRIx64, vbmap.size); ret = -E2BIG; goto unmap_exit; @@ -131,7 +132,7 @@ static int vfio_legacy_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, if (iotlb && vfio_devices_all_running_and_mig_active(bcontainer)) { if (!vfio_devices_all_device_dirty_tracking(bcontainer) && - container->bcontainer.dirty_pages_supported) { + bcontainer->dirty_pages_supported) { return vfio_dma_unmap_bitmap(container, iova, size, iotlb); } @@ -154,7 +155,7 @@ static int vfio_legacy_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, if (errno == EINVAL && unmap.size && !(unmap.iova + unmap.size) && container->iommu_type == VFIO_TYPE1v2_IOMMU) { trace_vfio_legacy_dma_unmap_overflow_workaround(); - unmap.size -= 1ULL << ctz64(container->bcontainer.pgsizes); + unmap.size -= 1ULL << ctz64(bcontainer->pgsizes); continue; } error_report("VFIO_UNMAP_DMA failed: %s", strerror(errno)); @@ -469,8 +470,8 @@ static void vfio_get_iommu_info_migration(VFIOContainer *container, */ if (cap_mig->pgsize_bitmap & qemu_real_host_page_size()) { bcontainer->dirty_pages_supported = true; - container->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size; - container->dirty_pgsizes = cap_mig->pgsize_bitmap; + bcontainer->max_dirty_bitmap_size = cap_mig->max_dirty_bitmap_size; + bcontainer->dirty_pgsizes = cap_mig->pgsize_bitmap; } } From patchwork Thu Oct 26 10:30:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437490 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A81E4C25B6F for ; Thu, 26 Oct 2023 10:51:32 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxu2-0000Yl-KQ; Thu, 26 Oct 2023 06:47:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtl-0008SF-VA for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:47:33 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxtk-0001Nq-2H for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:47:33 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317252; x=1729853252; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pwBe+xn2AUzZ6ILX7f+ggHubuGoU7neJxXFsv65WvK4=; b=Bhef+jj1Wz/3YMGOY1e9x7JHcC53iqeFdi8bzU/Mmzy7NYmglbEq67pC 9Lugrpbdaeo82n1fEB/WFqQ/2Q0M+s7j2X0fXUrFyzATgQn19eAKtSjmt oVz5rEjl/Ik1wdIsCK0zHhsHCRQdYzLIoHvbU7Mt9Go9Kj3e6j65d02He rD4ldImp2cxalytZ5avfkjOkWdvq0P4+IfY/g5HxBdMDBWFoNenODaIjW hB5aW4Pbx1kYIYUxLBX/y4z+x/3ZYRzrDLX8bjYgLidfFrEfSjxTO4LrS IuvF4TDwXc9aAxtjWhB8p86XEp/Wb7gw/KDsNnRqIGv0fgZ3VoeRL7XBB g==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563472" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563472" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:18 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463702" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:04 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 17/37] vfio/container: Move iova_ranges to base container Date: Thu, 26 Oct 2023 18:30:44 +0800 Message-Id: <20231026103104.1686921-18-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Meanwhile remove the helper function vfio_free_container as it only calls g_free now. No functional change intended. Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 1 - include/hw/vfio/vfio-container-base.h | 1 + hw/vfio/common.c | 5 +++-- hw/vfio/container-base.c | 3 +++ hw/vfio/container.c | 19 ++++++------------- 5 files changed, 13 insertions(+), 16 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 423ab2436c..938f75e70c 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -82,7 +82,6 @@ typedef struct VFIOContainer { unsigned iommu_type; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; - GList *iova_ranges; } VFIOContainer; typedef struct VFIOHostDMAWindow { diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 526d23acfd..2ffafb0d58 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -62,6 +62,7 @@ typedef struct VFIOContainerBase { QLIST_HEAD(, VFIORamDiscardListener) vrdl_list; QLIST_ENTRY(VFIOContainerBase) next; QLIST_HEAD(, VFIODevice) device_list; + GList *iova_ranges; } VFIOContainerBase; typedef struct VFIOGuestIOMMU { diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 52872f515a..d62c815d7f 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -638,9 +638,10 @@ static void vfio_listener_region_add(MemoryListener *listener, goto fail; } - if (container->iova_ranges) { + if (bcontainer->iova_ranges) { ret = memory_region_iommu_set_iova_ranges(giommu->iommu_mr, - container->iova_ranges, &err); + bcontainer->iova_ranges, + &err); if (ret) { g_free(giommu); goto fail; diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index b57fdede91..6363ed3552 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -78,6 +78,7 @@ void vfio_container_init(VFIOContainerBase *bcontainer, VFIOAddressSpace *space, bcontainer->error = NULL; bcontainer->dirty_pages_supported = false; bcontainer->dma_max_mappings = 0; + bcontainer->iova_ranges = NULL; QLIST_INIT(&bcontainer->giommu_list); QLIST_INIT(&bcontainer->vrdl_list); } @@ -104,4 +105,6 @@ void vfio_container_destroy(VFIOContainerBase *bcontainer) QLIST_REMOVE(giommu, giommu_next); g_free(giommu); } + + g_list_free_full(bcontainer->iova_ranges, g_free); } diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 68dc6d240f..36c34683ad 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -308,7 +308,7 @@ bool vfio_get_info_dma_avail(struct vfio_iommu_type1_info *info, } static bool vfio_get_info_iova_range(struct vfio_iommu_type1_info *info, - VFIOContainer *container) + VFIOContainerBase *bcontainer) { struct vfio_info_cap_header *hdr; struct vfio_iommu_type1_info_cap_iova_range *cap; @@ -326,8 +326,8 @@ static bool vfio_get_info_iova_range(struct vfio_iommu_type1_info *info, range_set_bounds(range, cap->iova_ranges[i].start, cap->iova_ranges[i].end); - container->iova_ranges = - range_list_insert(container->iova_ranges, range); + bcontainer->iova_ranges = + range_list_insert(bcontainer->iova_ranges, range); } return true; @@ -475,12 +475,6 @@ static void vfio_get_iommu_info_migration(VFIOContainer *container, } } -static void vfio_free_container(VFIOContainer *container) -{ - g_list_free_full(container->iova_ranges, g_free); - g_free(container); -} - static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, Error **errp) { @@ -560,7 +554,6 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container = g_malloc0(sizeof(*container)); container->fd = fd; - container->iova_ranges = NULL; bcontainer = &container->bcontainer; vfio_container_init(bcontainer, space, &vfio_legacy_ops); @@ -597,7 +590,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as, container->bcontainer.dma_max_mappings = 65535; } - vfio_get_info_iova_range(info, container); + vfio_get_info_iova_range(info, bcontainer); vfio_get_iommu_info_migration(container, info); g_free(info); @@ -650,7 +643,7 @@ enable_discards_exit: vfio_ram_block_discard_disable(container, false); free_container_exit: - vfio_free_container(container); + g_free(container); close_fd_exit: close(fd); @@ -694,7 +687,7 @@ static void vfio_disconnect_container(VFIOGroup *group) trace_vfio_disconnect_container(container->fd); close(container->fd); - vfio_free_container(container); + g_free(container); vfio_put_address_space(space); } From patchwork Thu Oct 26 10:30:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437476 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3FD99C25B48 for ; Thu, 26 Oct 2023 10:49:22 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxur-0002bR-V7; Thu, 26 Oct 2023 06:48:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuM-0002Es-Vp for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:16 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxu9-0001Nv-DN for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:09 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317277; x=1729853277; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=b3YC3uNOJHAxXEHMMdFXdR2eF8wgMFjKTbsf/i6lBr0=; b=Bi/QEAZXRBb5LLumhADpp7OvbgovG4Ink3bLIytCtsqhOYK526aMetty /GstOm/uNQ8K8hlEl0qQm/cLvPyrs4Ahvpri+KeiU/YR8Fao9Nl/qKYhp 9+bIfX9shFebFYc4XFPzyuPZojLLmTsf1PqHfy22CzOFzJsia1dTue9/2 /cB2lzn7/x6WIbGYj3ZfgdoEZVqUNQbJk1SEwMrxs1ZEs5B8eVnfzA0A7 TNosIbOKGurRsa6F/FiUHqF/BBRO5cpdvbNC7zN3IhPFLtxKCEKNihuQ9 3Y7gcgzOFXBLsJVLQpbVOzttkwCZmGz7Nq9P+B6pfIlcF2aVGo4TCZw3E g==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563490" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563490" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463708" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:08 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Yi Sun , Zhenzhong Duan Subject: [PATCH v3 18/37] vfio/container: Implement attach/detach_device Date: Thu, 26 Oct 2023 18:30:45 +0800 Message-Id: <20231026103104.1686921-19-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger No fucntional change intended. Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Yi Sun Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- hw/vfio/common.c | 16 ++++++++++++++++ hw/vfio/container.c | 12 +++++------- 2 files changed, 21 insertions(+), 7 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index d62c815d7f..64565b4ae9 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1500,3 +1500,19 @@ retry: return info; } + +int vfio_attach_device(char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp) +{ + const VFIOIOMMUOps *ops = &vfio_legacy_ops; + + return ops->attach_device(name, vbasedev, as, errp); +} + +void vfio_detach_device(VFIODevice *vbasedev) +{ + if (!vbasedev->bcontainer) { + return; + } + vbasedev->bcontainer->ops->detach_device(vbasedev); +} diff --git a/hw/vfio/container.c b/hw/vfio/container.c index 36c34683ad..c8ff0f2037 100644 --- a/hw/vfio/container.c +++ b/hw/vfio/container.c @@ -874,8 +874,8 @@ static int vfio_device_groupid(VFIODevice *vbasedev, Error **errp) * @name and @vbasedev->name are likely to be different depending * on the type of the device, hence the need for passing @name */ -int vfio_attach_device(char *name, VFIODevice *vbasedev, - AddressSpace *as, Error **errp) +static int vfio_legacy_attach_device(char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp) { int groupid = vfio_device_groupid(vbasedev, errp); VFIODevice *vbasedev_iter; @@ -915,14 +915,10 @@ int vfio_attach_device(char *name, VFIODevice *vbasedev, return ret; } -void vfio_detach_device(VFIODevice *vbasedev) +static void vfio_legacy_detach_device(VFIODevice *vbasedev) { VFIOGroup *group = vbasedev->group; - if (!vbasedev->bcontainer) { - return; - } - QLIST_REMOVE(vbasedev, global_next); QLIST_REMOVE(vbasedev, container_next); vbasedev->bcontainer = NULL; @@ -934,6 +930,8 @@ void vfio_detach_device(VFIODevice *vbasedev) const VFIOIOMMUOps vfio_legacy_ops = { .dma_map = vfio_legacy_dma_map, .dma_unmap = vfio_legacy_dma_unmap, + .attach_device = vfio_legacy_attach_device, + .detach_device = vfio_legacy_detach_device, .set_dirty_page_tracking = vfio_legacy_set_dirty_page_tracking, .query_dirty_bitmap = vfio_legacy_query_dirty_bitmap, }; From patchwork Thu Oct 26 10:30:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437474 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8DEBFC25B6B for ; Thu, 26 Oct 2023 10:48:05 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxuE-0001cm-6W; Thu, 26 Oct 2023 06:48:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxu9-0001Jd-2F; Thu, 26 Oct 2023 06:47:59 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxu6-0001Nv-VE; Thu, 26 Oct 2023 06:47:56 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317274; x=1729853274; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Q4+Rw92ILOH6hUtVhfMsdOHd49usfmzFgq2SDp59M9g=; b=RB/r5AvTD1RExQBsRZh+vI3DOVQj8WsS1c075qIlEp4WnfpPyMOZTL7F UCpf6Mf7hkwmjAHPwH85a+VAj8z3DGBD4jImEmiBBFO13rmex1ns9I6C8 xWzYIrcC5M21A2qnNQZaivjJfaALXN/jmfq4kDbgtDQiqqclSo0jAFxvI ZcIV6C3TLsximAAVcna5dloEf5+iWxWIVNd0PxaOPr+hoDHjzqrE0J0nQ qaUzMZYSczT7Oiv20wccccXI44mSqWD7gmmKqUsaZhQ5pB4I+AsfNvrHs slVj7k+ki6kdV2OMQ9FmaOw3V+CoSF/EF4jyf8+BwgNMftGWTRb5q3ftv Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563522" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563522" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463718" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:11 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 19/37] vfio/spapr: Introduce spapr backend and target interface Date: Thu, 26 Oct 2023 18:30:46 +0800 Message-Id: <20231026103104.1686921-20-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Introduce an empry spapr backend which will hold spapr specific content, currently only prereg_listener and hostwin_list. Also introduce and instantiate a spapr specific target interface, currently only has add/del_window callbacks. Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 8 ++++++++ include/hw/vfio/vfio-container-base.h | 2 ++ hw/vfio/spapr.c | 8 ++++++++ 3 files changed, 18 insertions(+) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 938f75e70c..a74e60e677 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -169,6 +169,14 @@ VFIOAddressSpace *vfio_get_address_space(AddressSpace *as); void vfio_put_address_space(VFIOAddressSpace *space); /* SPAPR specific */ +typedef struct VFIOIOMMUSpaprOps { + int (*add_window)(VFIOContainerBase *bcontainer, + MemoryRegionSection *section, + Error **errp); + void (*del_window)(VFIOContainerBase *bcontainer, + MemoryRegionSection *section); +} VFIOIOMMUSpaprOps; + int vfio_container_add_section_window(VFIOContainer *container, MemoryRegionSection *section, Error **errp); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 2ffafb0d58..1e1854d24f 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -31,6 +31,7 @@ typedef struct VFIODevice VFIODevice; typedef struct VFIOIOMMUOps VFIOIOMMUOps; +typedef struct VFIOIOMMUSpaprOps VFIOIOMMUSpaprOps; typedef struct { unsigned long *bitmap; @@ -49,6 +50,7 @@ typedef struct VFIOAddressSpace { */ typedef struct VFIOContainerBase { const VFIOIOMMUOps *ops; + const VFIOIOMMUSpaprOps *spapr_ops; VFIOAddressSpace *space; MemoryListener listener; Error *error; diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 5786377317..3739004151 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -24,6 +24,10 @@ #include "qapi/error.h" #include "trace.h" +typedef struct VFIOSpaprContainer { + VFIOContainer container; +} VFIOSpaprContainer; + static bool vfio_prereg_listener_skipped_section(MemoryRegionSection *section) { if (memory_region_is_iommu(section->mr)) { @@ -384,6 +388,8 @@ void vfio_container_del_section_window(VFIOContainer *container, } } +const VFIOIOMMUSpaprOps vfio_iommu_spapr_ops; + bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) { VFIOContainerBase *bcontainer = &container->bcontainer; @@ -447,6 +453,8 @@ bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) 0x1000); } + bcontainer->spapr_ops = &vfio_iommu_spapr_ops; + listener_unregister_exit: if (v2) { memory_listener_unregister(&container->prereg_listener); From patchwork Thu Oct 26 10:30:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437488 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3555BC25B6B for ; Thu, 26 Oct 2023 10:50:25 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxus-0002dA-Sy; Thu, 26 Oct 2023 06:48:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuL-0002EB-Nw; Thu, 26 Oct 2023 06:48:16 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxu9-0001PR-Hs; Thu, 26 Oct 2023 06:48:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317277; x=1729853277; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Cgs1UlOu9KSvDtr3cTTJNPXjMQDSb8lNfbQ0g80dHeA=; b=SMg0b3+I3WpEqesLkb6Q9bYTdx8gVI0m+sqUe48xYaoz1z1JnXdSjnUa YSJeWXVvPnSvMW4bB091JiwEuHH5j9phjDnVVqjDYOcWp/OHpr4WtM9GL KMURxw9lSYkObV5Knb8EpZE1XkikVlnzooN7fELjNVGwNPNowru8Z1orR N4hdiPELfxXaMy/OYL7Fl7R+6FbQ4gwx5ClPLmxkVDYwFoKb0CL6vbZ8i cP7e3CfTlqB5Y7cyYYbTt9mhI4bEVXXEDAL4VSZOdqw9WssUjRd03vmLl 5ijVLyNvOy9rzv3syPdPfI4xcNZUR7ilU7TtFZBdtf7QBLZraOdtF1WBE w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563560" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563560" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:31 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463727" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:15 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 20/37] vfio/spapr: switch to spapr IOMMU BE add/del_section_window Date: Thu, 26 Oct 2023 18:30:47 +0800 Message-Id: <20231026103104.1686921-21-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org No fucntional change intended. Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 5 ----- include/hw/vfio/vfio-container-base.h | 5 +++++ hw/vfio/common.c | 8 ++------ hw/vfio/container-base.c | 23 ++++++++++++++++++++++- hw/vfio/spapr.c | 22 ++++++++++++++++------ 5 files changed, 45 insertions(+), 18 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index a74e60e677..1489b49d71 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -177,11 +177,6 @@ typedef struct VFIOIOMMUSpaprOps { MemoryRegionSection *section); } VFIOIOMMUSpaprOps; -int vfio_container_add_section_window(VFIOContainer *container, - MemoryRegionSection *section, - Error **errp); -void vfio_container_del_section_window(VFIOContainer *container, - MemoryRegionSection *section); bool vfio_spapr_container_init(VFIOContainer *container, Error **errp); void vfio_spapr_container_deinit(VFIOContainer *container); diff --git a/include/hw/vfio/vfio-container-base.h b/include/hw/vfio/vfio-container-base.h index 1e1854d24f..88463bcf24 100644 --- a/include/hw/vfio/vfio-container-base.h +++ b/include/hw/vfio/vfio-container-base.h @@ -91,6 +91,11 @@ int vfio_container_dma_map(VFIOContainerBase *bcontainer, int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, IOMMUTLBEntry *iotlb); +int vfio_container_add_section_window(VFIOContainerBase *bcontainer, + MemoryRegionSection *section, + Error **errp); +void vfio_container_del_section_window(VFIOContainerBase *bcontainer, + MemoryRegionSection *section); int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer, bool start); int vfio_container_query_dirty_bitmap(VFIOContainerBase *bcontainer, diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 64565b4ae9..41a5609e33 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -571,8 +571,6 @@ static void vfio_listener_region_add(MemoryListener *listener, { VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase, listener); - VFIOContainer *container = container_of(bcontainer, VFIOContainer, - bcontainer); hwaddr iova, end; Int128 llend, llsize; void *vaddr; @@ -596,7 +594,7 @@ static void vfio_listener_region_add(MemoryListener *listener, return; } - if (vfio_container_add_section_window(container, section, &err)) { + if (vfio_container_add_section_window(bcontainer, section, &err)) { goto fail; } @@ -739,8 +737,6 @@ static void vfio_listener_region_del(MemoryListener *listener, { VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase, listener); - VFIOContainer *container = container_of(bcontainer, VFIOContainer, - bcontainer); hwaddr iova, end; Int128 llend, llsize; int ret; @@ -820,7 +816,7 @@ static void vfio_listener_region_del(MemoryListener *listener, memory_region_unref(section->mr); - vfio_container_del_section_window(container, section); + vfio_container_del_section_window(bcontainer, section); } typedef struct VFIODirtyRanges { diff --git a/hw/vfio/container-base.c b/hw/vfio/container-base.c index 6363ed3552..2efb26cfe4 100644 --- a/hw/vfio/container-base.c +++ b/hw/vfio/container-base.c @@ -24,7 +24,7 @@ #include "qemu/osdep.h" #include "qapi/error.h" #include "qemu/error-report.h" -#include "hw/vfio/vfio-container-base.h" +#include "hw/vfio/vfio-common.h" int vfio_container_dma_map(VFIOContainerBase *bcontainer, hwaddr iova, ram_addr_t size, @@ -48,6 +48,27 @@ int vfio_container_dma_unmap(VFIOContainerBase *bcontainer, return bcontainer->ops->dma_unmap(bcontainer, iova, size, iotlb); } +int vfio_container_add_section_window(VFIOContainerBase *bcontainer, + MemoryRegionSection *section, + Error **errp) +{ + if (!bcontainer->spapr_ops || !bcontainer->spapr_ops->add_window) { + return 0; + } + + return bcontainer->spapr_ops->add_window(bcontainer, section, errp); +} + +void vfio_container_del_section_window(VFIOContainerBase *bcontainer, + MemoryRegionSection *section) +{ + if (!bcontainer->spapr_ops || !bcontainer->spapr_ops->del_window) { + return; + } + + return bcontainer->spapr_ops->del_window(bcontainer, section); +} + int vfio_container_set_dirty_page_tracking(VFIOContainerBase *bcontainer, bool start) { diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 3739004151..43e32e544b 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -302,10 +302,13 @@ static int vfio_spapr_create_window(VFIOContainer *container, return 0; } -int vfio_container_add_section_window(VFIOContainer *container, - MemoryRegionSection *section, - Error **errp) +static int +vfio_spapr_container_add_section_window(VFIOContainerBase *bcontainer, + MemoryRegionSection *section, + Error **errp) { + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); VFIOHostDMAWindow *hostwin; hwaddr pgsize = 0; int ret; @@ -370,9 +373,13 @@ int vfio_container_add_section_window(VFIOContainer *container, return 0; } -void vfio_container_del_section_window(VFIOContainer *container, - MemoryRegionSection *section) +static void +vfio_spapr_container_del_section_window(VFIOContainerBase *bcontainer, + MemoryRegionSection *section) { + VFIOContainer *container = container_of(bcontainer, VFIOContainer, + bcontainer); + if (container->iommu_type != VFIO_SPAPR_TCE_v2_IOMMU) { return; } @@ -388,7 +395,10 @@ void vfio_container_del_section_window(VFIOContainer *container, } } -const VFIOIOMMUSpaprOps vfio_iommu_spapr_ops; +const VFIOIOMMUSpaprOps vfio_iommu_spapr_ops = { + .add_window = vfio_spapr_container_add_section_window, + .del_window = vfio_spapr_container_del_section_window, +}; bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) { From patchwork Thu Oct 26 10:30:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437507 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8B80BC25B70 for ; Thu, 26 Oct 2023 10:53:16 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxut-0002le-Ez; Thu, 26 Oct 2023 06:48:44 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuM-0002Et-Vs; Thu, 26 Oct 2023 06:48:17 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuC-0001Q0-5E; Thu, 26 Oct 2023 06:48:09 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317280; x=1729853280; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=w5+kwyEFTNpXEaRbCnpjDpn9Oq8F5pci+Oh4DxWnBwQ=; b=d0lPCVv7Aq2eiGNgAiHSCAxUO/YdGDYvnUh6F263jADf5Yo7LMseU2Uv toDwpInTscSXLA2ILHuahd9aJE5H13Q+HJnipcx1DXemu2s+MMfR34xHH lUY8leqW82QnLqGBb6O/+GCLjxYWhiSi97sRxf+uRe9Yaa9PdhET5Um4O izld+jTcqAB93JluclstLcY+wSpULqVM8nVoVFHqmJJAdy/d4L7wdmys2 7WauZhTf4DtSscYB/GEADmcGfA5ePSUeVB3X3gPbXWuihv2N2rEr+xhJa 0Gradb5Ze8LJo75DlkdKESCelmVHYq3isxMAm+WSS/xXXY0vMnNOoDX+4 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563605" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563605" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463737" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:20 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 21/37] vfio/spapr: Move prereg_listener into spapr container Date: Thu, 26 Oct 2023 18:30:48 +0800 Message-Id: <20231026103104.1686921-22-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org No functional changes intended. Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 1 - hw/vfio/spapr.c | 24 ++++++++++++++++-------- 2 files changed, 16 insertions(+), 9 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 1489b49d71..913e9f5771 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -78,7 +78,6 @@ struct VFIOGroup; typedef struct VFIOContainer { VFIOContainerBase bcontainer; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ - MemoryListener prereg_listener; unsigned iommu_type; QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 43e32e544b..566a79906c 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -26,6 +26,7 @@ typedef struct VFIOSpaprContainer { VFIOContainer container; + MemoryListener prereg_listener; } VFIOSpaprContainer; static bool vfio_prereg_listener_skipped_section(MemoryRegionSection *section) @@ -48,8 +49,9 @@ static void *vfio_prereg_gpa_to_vaddr(MemoryRegionSection *section, hwaddr gpa) static void vfio_prereg_listener_region_add(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, - prereg_listener); + VFIOSpaprContainer *scontainer = container_of(listener, VFIOSpaprContainer, + prereg_listener); + VFIOContainer *container = &scontainer->container; VFIOContainerBase *bcontainer = &container->bcontainer; const hwaddr gpa = section->offset_within_address_space; hwaddr end; @@ -107,8 +109,9 @@ static void vfio_prereg_listener_region_add(MemoryListener *listener, static void vfio_prereg_listener_region_del(MemoryListener *listener, MemoryRegionSection *section) { - VFIOContainer *container = container_of(listener, VFIOContainer, - prereg_listener); + VFIOSpaprContainer *scontainer = container_of(listener, VFIOSpaprContainer, + prereg_listener); + VFIOContainer *container = &scontainer->container; const hwaddr gpa = section->offset_within_address_space; hwaddr end; int ret; @@ -403,6 +406,8 @@ const VFIOIOMMUSpaprOps vfio_iommu_spapr_ops = { bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) { VFIOContainerBase *bcontainer = &container->bcontainer; + VFIOSpaprContainer *scontainer = container_of(container, VFIOSpaprContainer, + container); struct vfio_iommu_spapr_tce_info info; bool v2 = container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU; int ret, fd = container->fd; @@ -419,9 +424,9 @@ bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) return -errno; } } else { - container->prereg_listener = vfio_prereg_listener; + scontainer->prereg_listener = vfio_prereg_listener; - memory_listener_register(&container->prereg_listener, + memory_listener_register(&scontainer->prereg_listener, &address_space_memory); if (bcontainer->error) { ret = -1; @@ -467,7 +472,7 @@ bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) listener_unregister_exit: if (v2) { - memory_listener_unregister(&container->prereg_listener); + memory_listener_unregister(&scontainer->prereg_listener); } return ret; } @@ -477,7 +482,10 @@ void vfio_spapr_container_deinit(VFIOContainer *container) VFIOHostDMAWindow *hostwin, *next; if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) { - memory_listener_unregister(&container->prereg_listener); + VFIOSpaprContainer *scontainer = container_of(container, + VFIOSpaprContainer, + container); + memory_listener_unregister(&scontainer->prereg_listener); } QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next, next) { From patchwork Thu Oct 26 10:30:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437479 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 39D8CC25B6F for ; Thu, 26 Oct 2023 10:49:45 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxuv-000352-IC; Thu, 26 Oct 2023 06:48:45 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuR-0002Gm-Pp; Thu, 26 Oct 2023 06:48:20 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuO-0001PR-Ew; Thu, 26 Oct 2023 06:48:15 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317292; x=1729853292; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Xq4kUGwNmzseTosgBBkQHV6Ilmdyoea1Ti8uTgY5BNs=; b=LvcFYouZi+n2Ut4z1iagcez449FqAMl42aOyvUrQ0BRF76ESM+txzU2f xWA9wNl0Auo5QdFg3Qq3iona2BE3L9UsGUvX9AAhXbPMsXIfVxLn7I2cJ A3XZL1V977WaitnzR7iKnW3LUXfKdMRp9uVluxH1NN3H/JTEWU9jYjvWO VMxCgQ75lnpLlKbHW2b8PZQkeU+W4LiL0Ms0Sskbp5m3uZ+hGO563ocH+ 35rBtDX3x4la/SEkdO86Oa+MZp+IjfwrIch7He2ZBoPfScMbCRzxj/gP5 acY8PzZ1gLjZWFhj6cdZAoHzoCCYlEpAOE7JqvItodY9DHQSU1f9ZZsZc A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563633" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563633" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463749" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:24 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 22/37] vfio/spapr: Move hostwin_list into spapr container Date: Thu, 26 Oct 2023 18:30:49 +0800 Message-Id: <20231026103104.1686921-23-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org No functional changes intended. Signed-off-by: Zhenzhong Duan --- include/hw/vfio/vfio-common.h | 1 - hw/vfio/spapr.c | 30 +++++++++++++++++------------- 2 files changed, 17 insertions(+), 14 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 913e9f5771..4f46af1ef4 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -79,7 +79,6 @@ typedef struct VFIOContainer { VFIOContainerBase bcontainer; int fd; /* /dev/vfio/vfio, empowered by the attached groups */ unsigned iommu_type; - QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; QLIST_HEAD(, VFIOGroup) group_list; } VFIOContainer; diff --git a/hw/vfio/spapr.c b/hw/vfio/spapr.c index 566a79906c..70b59e858c 100644 --- a/hw/vfio/spapr.c +++ b/hw/vfio/spapr.c @@ -27,6 +27,7 @@ typedef struct VFIOSpaprContainer { VFIOContainer container; MemoryListener prereg_listener; + QLIST_HEAD(, VFIOHostDMAWindow) hostwin_list; } VFIOSpaprContainer; static bool vfio_prereg_listener_skipped_section(MemoryRegionSection *section) @@ -154,12 +155,12 @@ static const MemoryListener vfio_prereg_listener = { .region_del = vfio_prereg_listener_region_del, }; -static void vfio_host_win_add(VFIOContainer *container, hwaddr min_iova, +static void vfio_host_win_add(VFIOSpaprContainer *scontainer, hwaddr min_iova, hwaddr max_iova, uint64_t iova_pgsizes) { VFIOHostDMAWindow *hostwin; - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + QLIST_FOREACH(hostwin, &scontainer->hostwin_list, hostwin_next) { if (ranges_overlap(hostwin->min_iova, hostwin->max_iova - hostwin->min_iova + 1, min_iova, @@ -173,15 +174,15 @@ static void vfio_host_win_add(VFIOContainer *container, hwaddr min_iova, hostwin->min_iova = min_iova; hostwin->max_iova = max_iova; hostwin->iova_pgsizes = iova_pgsizes; - QLIST_INSERT_HEAD(&container->hostwin_list, hostwin, hostwin_next); + QLIST_INSERT_HEAD(&scontainer->hostwin_list, hostwin, hostwin_next); } -static int vfio_host_win_del(VFIOContainer *container, +static int vfio_host_win_del(VFIOSpaprContainer *scontainer, hwaddr min_iova, hwaddr max_iova) { VFIOHostDMAWindow *hostwin; - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + QLIST_FOREACH(hostwin, &scontainer->hostwin_list, hostwin_next) { if (hostwin->min_iova == min_iova && hostwin->max_iova == max_iova) { QLIST_REMOVE(hostwin, hostwin_next); g_free(hostwin); @@ -312,6 +313,8 @@ vfio_spapr_container_add_section_window(VFIOContainerBase *bcontainer, { VFIOContainer *container = container_of(bcontainer, VFIOContainer, bcontainer); + VFIOSpaprContainer *scontainer = container_of(container, VFIOSpaprContainer, + container); VFIOHostDMAWindow *hostwin; hwaddr pgsize = 0; int ret; @@ -321,7 +324,7 @@ vfio_spapr_container_add_section_window(VFIOContainerBase *bcontainer, } /* For now intersections are not allowed, we may relax this later */ - QLIST_FOREACH(hostwin, &container->hostwin_list, hostwin_next) { + QLIST_FOREACH(hostwin, &scontainer->hostwin_list, hostwin_next) { if (ranges_overlap(hostwin->min_iova, hostwin->max_iova - hostwin->min_iova + 1, section->offset_within_address_space, @@ -343,7 +346,7 @@ vfio_spapr_container_add_section_window(VFIOContainerBase *bcontainer, return ret; } - vfio_host_win_add(container, section->offset_within_address_space, + vfio_host_win_add(scontainer, section->offset_within_address_space, section->offset_within_address_space + int128_get64(section->size) - 1, pgsize); #ifdef CONFIG_KVM @@ -382,6 +385,8 @@ vfio_spapr_container_del_section_window(VFIOContainerBase *bcontainer, { VFIOContainer *container = container_of(bcontainer, VFIOContainer, bcontainer); + VFIOSpaprContainer *scontainer = container_of(container, VFIOSpaprContainer, + container); if (container->iommu_type != VFIO_SPAPR_TCE_v2_IOMMU) { return; @@ -389,7 +394,7 @@ vfio_spapr_container_del_section_window(VFIOContainerBase *bcontainer, vfio_spapr_remove_window(container, section->offset_within_address_space); - if (vfio_host_win_del(container, + if (vfio_host_win_del(scontainer, section->offset_within_address_space, section->offset_within_address_space + int128_get64(section->size) - 1) < 0) { @@ -462,7 +467,7 @@ bool vfio_spapr_container_init(VFIOContainer *container, Error **errp) } else { /* The default table uses 4K pages */ bcontainer->pgsizes = 0x1000; - vfio_host_win_add(container, info.dma32_window_start, + vfio_host_win_add(scontainer, info.dma32_window_start, info.dma32_window_start + info.dma32_window_size - 1, 0x1000); @@ -479,15 +484,14 @@ listener_unregister_exit: void vfio_spapr_container_deinit(VFIOContainer *container) { + VFIOSpaprContainer *scontainer = container_of(container, VFIOSpaprContainer, + container); VFIOHostDMAWindow *hostwin, *next; if (container->iommu_type == VFIO_SPAPR_TCE_v2_IOMMU) { - VFIOSpaprContainer *scontainer = container_of(container, - VFIOSpaprContainer, - container); memory_listener_unregister(&scontainer->prereg_listener); } - QLIST_FOREACH_SAFE(hostwin, &container->hostwin_list, hostwin_next, + QLIST_FOREACH_SAFE(hostwin, &scontainer->hostwin_list, hostwin_next, next) { QLIST_REMOVE(hostwin, hostwin_next); g_free(hostwin); From patchwork Thu Oct 26 10:30:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437485 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 375D4C25B6F for ; Thu, 26 Oct 2023 10:50:18 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvC-00040w-T3; Thu, 26 Oct 2023 06:49:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuS-0002Gu-RU for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:26 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuP-0001Nv-Vi for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:16 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317293; x=1729853293; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yr99guRhnxIYZzeV6x6Oum0jvei4Z/eHaYV4XkfxTGQ=; b=oGTOMNBaQVkgm5sg/Y0V56C9sMGA5QBFnTe/890gwLaT4+OYNDiHxAYj xqX24674SIMc3VoJQYSkefSRD6onWMob2U8PzcofnDuo5A7OJ42rYcOB2 MfmcS/mhl3jxnw8Hz863BrLxR1bexVN8YJ64nrBrU0ssJ/U2xTo/sPl49 GQWPgtTehHRdPyMt6Gr0F9fFmFafmpe5Yu9YBbFmWqAUIujjZaH63taxb T/HxUIQMVcNnga3Lw4gO13H7Jhb+qlvRAcXRmOkmMn6d/dk25NUHZX2lP QeXlS+9jiEwDF5wICPVu2BmtWVnlonSnUs0lil1FrHxLNOMQYmWTGM8J2 w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563664" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563664" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:45 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463765" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:29 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Paolo Bonzini , =?utf-8?q?Marc-Andr=C3=A9_Lureau?= , =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= , Thomas Huth , =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= Subject: [PATCH v3 23/37] Add iommufd configure option Date: Thu, 26 Oct 2023 18:30:50 +0800 Message-Id: <20231026103104.1686921-24-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This adds "--enable-iommufd/--disable-iommufd" to enable or disable iommufd support, enabled by default. Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- meson.build | 6 ++++++ meson_options.txt | 2 ++ scripts/meson-buildoptions.sh | 3 +++ 3 files changed, 11 insertions(+) diff --git a/meson.build b/meson.build index 4961c82a6b..a8b0edf7b1 100644 --- a/meson.build +++ b/meson.build @@ -560,6 +560,10 @@ have_tpm = get_option('tpm') \ .require(targetos != 'windows', error_message: 'TPM emulation only available on POSIX systems') \ .allowed() +have_iommufd = get_option('iommufd') \ + .require(targetos == 'linux', error_message: 'iommufd is supported only on Linux') \ + .allowed() + # vhost have_vhost_user = get_option('vhost_user') \ .disable_auto_if(targetos != 'linux') \ @@ -2133,6 +2137,7 @@ if get_option('tcg').allowed() endif config_host_data.set('CONFIG_TPM', have_tpm) config_host_data.set('CONFIG_TSAN', get_option('tsan')) +config_host_data.set('CONFIG_IOMMUFD', have_iommufd) config_host_data.set('CONFIG_USB_LIBUSB', libusb.found()) config_host_data.set('CONFIG_VDE', vde.found()) config_host_data.set('CONFIG_VHOST_NET', have_vhost_net) @@ -4074,6 +4079,7 @@ summary_info += {'vhost-user-crypto support': have_vhost_user_crypto} summary_info += {'vhost-user-blk server support': have_vhost_user_blk_server} summary_info += {'vhost-vdpa support': have_vhost_vdpa} summary_info += {'build guest agent': have_ga} +summary_info += {'iommufd support': have_iommufd} summary(summary_info, bool_yn: true, section: 'Configurable features') # Compilation information diff --git a/meson_options.txt b/meson_options.txt index 3c7398f3c6..91bb958cae 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -109,6 +109,8 @@ option('dbus_display', type: 'feature', value: 'auto', description: '-display dbus support') option('tpm', type : 'feature', value : 'auto', description: 'TPM support') +option('iommufd', type : 'feature', value : 'auto', + description: 'iommufd support') # Do not enable it by default even for Mingw32, because it doesn't # work on Wine. diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh index 7ca4b77eae..1effc46f7d 100644 --- a/scripts/meson-buildoptions.sh +++ b/scripts/meson-buildoptions.sh @@ -125,6 +125,7 @@ meson_options_help() { printf "%s\n" ' guest-agent-msi Build MSI package for the QEMU Guest Agent' printf "%s\n" ' hvf HVF acceleration support' printf "%s\n" ' iconv Font glyph conversion support' + printf "%s\n" ' iommufd iommufd support' printf "%s\n" ' jack JACK sound support' printf "%s\n" ' keyring Linux keyring support' printf "%s\n" ' kvm KVM acceleration support' @@ -342,6 +343,8 @@ _meson_option_parse() { --enable-install-blobs) printf "%s" -Dinstall_blobs=true ;; --disable-install-blobs) printf "%s" -Dinstall_blobs=false ;; --interp-prefix=*) quote_sh "-Dinterp_prefix=$2" ;; + --enable-iommufd) printf "%s" -Diommufd=enabled ;; + --disable-iommufd) printf "%s" -Diommufd=disabled ;; --enable-jack) printf "%s" -Djack=enabled ;; --disable-jack) printf "%s" -Djack=disabled ;; --enable-keyring) printf "%s" -Dkeyring=enabled ;; From patchwork Thu Oct 26 10:30:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437501 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D5269C25B48 for ; Thu, 26 Oct 2023 10:52:03 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxv1-0003RG-8W; Thu, 26 Oct 2023 06:48:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuT-0002HE-OC for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:24 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuP-0001Q0-Vi for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:17 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317293; x=1729853293; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yZrbYCvpachzTiR3gIM6OKym7cQX7p3KP9/NGrhdsi0=; b=iV+Kay9p5rAuaJWprNXD2w0FFU6MZGYKkUGDLWRrHLmloq8uzNvkkiU6 zp4ouWjyFr4615ItotH12I6D5dKl4rOyufxdbJdyEKuPoafYlWjxtztJa 1je/4OwIx6DrB9a8h6ZURY2CvFWMVEuagptAaYVMebRDgDjdBY1O627RA HoM7hNKdk0h54SwfdXgQ365xZ+teDpGKjvKpzZo/ld9ERyQ3KVefX3rpS 3nbOn0avltCYLwfUy2d2d+4+m2W2MCu8nnP3veGJ/C/jnt95ZS5IOtD0y FYkk0GbKIodRYAA5t9Vk56Cr3uwySIZVtzMapqJks9k1BS3tcwMgPBtmK w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563690" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563690" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:49 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463777" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:33 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Paolo Bonzini , =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= , Eduardo Habkost , Eric Blake , Markus Armbruster Subject: [PATCH v3 24/37] backends/iommufd: Introduce the iommufd object Date: Thu, 26 Oct 2023 18:30:51 +0800 Message-Id: <20231026103104.1686921-25-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Introduce an iommufd object which allows the interaction with the host /dev/iommu device. The /dev/iommu can have been already pre-opened outside of qemu, in which case the fd can be passed directly along with the iommufd object: This allows the iommufd object to be shared accross several subsystems (VFIO, VDPA, ...). For example, libvirt would open the /dev/iommu once. If no fd is passed along with the iommufd object, the /dev/iommu is opened by the qemu code. The CONFIG_IOMMUFD option must be set to compile this new object. Suggested-by: Alex Williamson Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- MAINTAINERS | 7 + qapi/qom.json | 20 +++ include/sysemu/iommufd.h | 46 +++++++ backends/iommufd-stub.c | 59 +++++++++ backends/iommufd.c | 268 +++++++++++++++++++++++++++++++++++++++ backends/Kconfig | 4 + backends/meson.build | 5 + backends/trace-events | 12 ++ qemu-options.hx | 13 ++ 9 files changed, 434 insertions(+) create mode 100644 include/sysemu/iommufd.h create mode 100644 backends/iommufd-stub.c create mode 100644 backends/iommufd.c diff --git a/MAINTAINERS b/MAINTAINERS index 7f9912baa0..7aa57ab16f 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -2109,6 +2109,13 @@ F: hw/vfio/ap.c F: docs/system/s390x/vfio-ap.rst L: qemu-s390x@nongnu.org +iommufd +M: Yi Liu +M: Eric Auger +S: Supported +F: backends/iommufd.c +F: include/sysemu/iommufd.h + vhost M: Michael S. Tsirkin S: Supported diff --git a/qapi/qom.json b/qapi/qom.json index c53ef978ff..ef0b50f107 100644 --- a/qapi/qom.json +++ b/qapi/qom.json @@ -794,6 +794,22 @@ { 'struct': 'VfioUserServerProperties', 'data': { 'socket': 'SocketAddress', 'device': 'str' } } +## +# @IOMMUFDProperties: +# +# Properties for iommufd objects. +# +# @fd: file descriptor name previously passed via 'getfd' command, +# which represents a pre-opened /dev/iommu. This allows the +# iommufd object to be shared accross several subsystems +# (VFIO, VDPA, ...) and file descriptor to be shared with +# other process, e.g: DPDK. +# +# Since: 8.2 +## +{ 'struct': 'IOMMUFDProperties', + 'data': { '*fd': 'str' } } + ## # @RngProperties: # @@ -934,6 +950,8 @@ 'input-barrier', { 'name': 'input-linux', 'if': 'CONFIG_LINUX' }, + { 'name': 'iommufd', + 'if': 'CONFIG_IOMMUFD' }, 'iothread', 'main-loop', { 'name': 'memory-backend-epc', @@ -1003,6 +1021,8 @@ 'input-barrier': 'InputBarrierProperties', 'input-linux': { 'type': 'InputLinuxProperties', 'if': 'CONFIG_LINUX' }, + 'iommufd': { 'type': 'IOMMUFDProperties', + 'if': 'CONFIG_IOMMUFD' }, 'iothread': 'IothreadProperties', 'main-loop': 'MainLoopProperties', 'memory-backend-epc': { 'type': 'MemoryBackendEpcProperties', diff --git a/include/sysemu/iommufd.h b/include/sysemu/iommufd.h new file mode 100644 index 0000000000..f0e5c7eeb8 --- /dev/null +++ b/include/sysemu/iommufd.h @@ -0,0 +1,46 @@ +#ifndef SYSEMU_IOMMUFD_H +#define SYSEMU_IOMMUFD_H + +#include "qom/object.h" +#include "qemu/thread.h" +#include "exec/hwaddr.h" +#include "exec/cpu-common.h" + +#define TYPE_IOMMUFD_BACKEND "iommufd" +OBJECT_DECLARE_TYPE(IOMMUFDBackend, IOMMUFDBackendClass, + IOMMUFD_BACKEND) +#define IOMMUFD_BACKEND(obj) \ + OBJECT_CHECK(IOMMUFDBackend, (obj), TYPE_IOMMUFD_BACKEND) +#define IOMMUFD_BACKEND_GET_CLASS(obj) \ + OBJECT_GET_CLASS(IOMMUFDBackendClass, (obj), TYPE_IOMMUFD_BACKEND) +#define IOMMUFD_BACKEND_CLASS(klass) \ + OBJECT_CLASS_CHECK(IOMMUFDBackendClass, (klass), TYPE_IOMMUFD_BACKEND) +struct IOMMUFDBackendClass { + ObjectClass parent_class; +}; + +struct IOMMUFDBackend { + Object parent; + + /*< protected >*/ + int fd; /* /dev/iommu file descriptor */ + bool owned; /* is the /dev/iommu opened internally */ + QemuMutex lock; + uint32_t users; + + /*< public >*/ +}; + +int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp); +void iommufd_backend_disconnect(IOMMUFDBackend *be); + +int iommufd_backend_get_ioas(IOMMUFDBackend *be, uint32_t *ioas_id); +void iommufd_backend_put_ioas(IOMMUFDBackend *be, uint32_t ioas_id); +void iommufd_backend_free_id(int fd, uint32_t id); +int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly); +int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, + hwaddr iova, ram_addr_t size); +int iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, + uint32_t pt_id, uint32_t *out_hwpt); +#endif diff --git a/backends/iommufd-stub.c b/backends/iommufd-stub.c new file mode 100644 index 0000000000..02ac844c17 --- /dev/null +++ b/backends/iommufd-stub.c @@ -0,0 +1,59 @@ +/* + * iommufd container backend stub + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include "sysemu/iommufd.h" +#include "qemu/error-report.h" + +int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp) +{ + return 0; +} +void iommufd_backend_disconnect(IOMMUFDBackend *be) +{ +} +void iommufd_backend_free_id(int fd, uint32_t id) +{ +} +int iommufd_backend_get_ioas(IOMMUFDBackend *be, uint32_t *ioas_id) +{ + return 0; +} +void iommufd_backend_put_ioas(IOMMUFDBackend *be, uint32_t ioas_id) +{ +} +int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly) +{ + return 0; +} +int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, + hwaddr iova, ram_addr_t size) +{ + return 0; +} +int iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, + uint32_t pt_id, uint32_t *out_hwpt) +{ + return 0; +} diff --git a/backends/iommufd.c b/backends/iommufd.c new file mode 100644 index 0000000000..3f0ed37847 --- /dev/null +++ b/backends/iommufd.c @@ -0,0 +1,268 @@ +/* + * iommufd container backend + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include "sysemu/iommufd.h" +#include "qapi/error.h" +#include "qapi/qmp/qerror.h" +#include "qemu/module.h" +#include "qom/object_interfaces.h" +#include "qemu/error-report.h" +#include "monitor/monitor.h" +#include "trace.h" +#include +#include + +static void iommufd_backend_init(Object *obj) +{ + IOMMUFDBackend *be = IOMMUFD_BACKEND(obj); + + be->fd = -1; + be->users = 0; + be->owned = true; + qemu_mutex_init(&be->lock); +} + +static void iommufd_backend_finalize(Object *obj) +{ + IOMMUFDBackend *be = IOMMUFD_BACKEND(obj); + + if (be->owned) { + close(be->fd); + be->fd = -1; + } +} + +static void iommufd_backend_set_fd(Object *obj, const char *str, Error **errp) +{ + IOMMUFDBackend *be = IOMMUFD_BACKEND(obj); + int fd = -1; + + fd = monitor_fd_param(monitor_cur(), str, errp); + if (fd == -1) { + error_prepend(errp, "Could not parse remote object fd %s:", str); + return; + } + qemu_mutex_lock(&be->lock); + be->fd = fd; + be->owned = false; + qemu_mutex_unlock(&be->lock); + trace_iommu_backend_set_fd(be->fd); +} + +static void iommufd_backend_class_init(ObjectClass *oc, void *data) +{ + object_class_property_add_str(oc, "fd", NULL, iommufd_backend_set_fd); +} + +int iommufd_backend_connect(IOMMUFDBackend *be, Error **errp) +{ + int fd, ret = 0; + + qemu_mutex_lock(&be->lock); + if (be->users == UINT32_MAX) { + error_setg(errp, "too many connections"); + ret = -E2BIG; + goto out; + } + if (be->owned && !be->users) { + fd = qemu_open_old("/dev/iommu", O_RDWR); + if (fd < 0) { + error_setg_errno(errp, errno, "/dev/iommu opening failed"); + ret = fd; + goto out; + } + be->fd = fd; + } + be->users++; +out: + trace_iommufd_backend_connect(be->fd, be->owned, + be->users, ret); + qemu_mutex_unlock(&be->lock); + return ret; +} + +void iommufd_backend_disconnect(IOMMUFDBackend *be) +{ + qemu_mutex_lock(&be->lock); + if (!be->users) { + goto out; + } + be->users--; + if (!be->users && be->owned) { + close(be->fd); + be->fd = -1; + } +out: + trace_iommufd_backend_disconnect(be->fd, be->users); + qemu_mutex_unlock(&be->lock); +} + +static int iommufd_backend_alloc_ioas(int fd, uint32_t *ioas_id) +{ + int ret; + struct iommu_ioas_alloc alloc_data = { + .size = sizeof(alloc_data), + .flags = 0, + }; + + ret = ioctl(fd, IOMMU_IOAS_ALLOC, &alloc_data); + if (ret) { + error_report("Failed to allocate ioas %m"); + } + + *ioas_id = alloc_data.out_ioas_id; + trace_iommufd_backend_alloc_ioas(fd, *ioas_id, ret); + + return ret; +} + +void iommufd_backend_free_id(int fd, uint32_t id) +{ + int ret; + struct iommu_destroy des = { + .size = sizeof(des), + .id = id, + }; + + ret = ioctl(fd, IOMMU_DESTROY, &des); + trace_iommufd_backend_free_id(fd, id, ret); + if (ret) { + error_report("Failed to free id: %u %m", id); + } +} + +int iommufd_backend_get_ioas(IOMMUFDBackend *be, uint32_t *ioas_id) +{ + int ret; + + ret = iommufd_backend_alloc_ioas(be->fd, ioas_id); + trace_iommufd_backend_get_ioas(be->fd, *ioas_id, ret); + return ret; +} + +void iommufd_backend_put_ioas(IOMMUFDBackend *be, uint32_t ioas_id) +{ + iommufd_backend_free_id(be->fd, ioas_id); + trace_iommufd_backend_put_ioas(be->fd, ioas_id); +} + +int iommufd_backend_map_dma(IOMMUFDBackend *be, uint32_t ioas_id, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly) +{ + int ret; + struct iommu_ioas_map map = { + .size = sizeof(map), + .flags = IOMMU_IOAS_MAP_READABLE | + IOMMU_IOAS_MAP_FIXED_IOVA, + .ioas_id = ioas_id, + .__reserved = 0, + .user_va = (uintptr_t)vaddr, + .iova = iova, + .length = size, + }; + + if (!readonly) { + map.flags |= IOMMU_IOAS_MAP_WRITEABLE; + } + + ret = ioctl(be->fd, IOMMU_IOAS_MAP, &map); + trace_iommufd_backend_map_dma(be->fd, ioas_id, iova, size, + vaddr, readonly, ret); + if (ret) { + error_report("IOMMU_IOAS_MAP failed: %m"); + } + return !ret ? 0 : -errno; +} + +int iommufd_backend_unmap_dma(IOMMUFDBackend *be, uint32_t ioas_id, + hwaddr iova, ram_addr_t size) +{ + int ret; + struct iommu_ioas_unmap unmap = { + .size = sizeof(unmap), + .ioas_id = ioas_id, + .iova = iova, + .length = size, + }; + + ret = ioctl(be->fd, IOMMU_IOAS_UNMAP, &unmap); + trace_iommufd_backend_unmap_dma(be->fd, ioas_id, iova, size, ret); + /* + * TODO: IOMMUFD doesn't support mapping PCI BARs for now. + * It's not a problem if there is no p2p dma, relax it here + * and avoid many noisy trigger from vIOMMU side. + */ + if (ret && errno == ENOENT) { + ret = 0; + } + if (ret) { + error_report("IOMMU_IOAS_UNMAP failed: %m"); + } + return !ret ? 0 : -errno; +} + +int iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, + uint32_t pt_id, uint32_t *out_hwpt) +{ + int ret; + struct iommu_hwpt_alloc alloc_hwpt = { + .size = sizeof(struct iommu_hwpt_alloc), + .flags = 0, + .dev_id = dev_id, + .pt_id = pt_id, + .__reserved = 0, + }; + + ret = ioctl(iommufd, IOMMU_HWPT_ALLOC, &alloc_hwpt); + trace_iommufd_backend_alloc_hwpt(iommufd, dev_id, pt_id, + alloc_hwpt.out_hwpt_id, ret); + + if (ret) { + error_report("IOMMU_HWPT_ALLOC failed: %m"); + } else { + *out_hwpt = alloc_hwpt.out_hwpt_id; + } + return !ret ? 0 : -errno; +} + +static const TypeInfo iommufd_backend_info = { + .name = TYPE_IOMMUFD_BACKEND, + .parent = TYPE_OBJECT, + .instance_size = sizeof(IOMMUFDBackend), + .instance_init = iommufd_backend_init, + .instance_finalize = iommufd_backend_finalize, + .class_size = sizeof(IOMMUFDBackendClass), + .class_init = iommufd_backend_class_init, + .interfaces = (InterfaceInfo[]) { + { TYPE_USER_CREATABLE }, + { } + } +}; + +static void register_types(void) +{ + type_register_static(&iommufd_backend_info); +} + +type_init(register_types); diff --git a/backends/Kconfig b/backends/Kconfig index f35abc1609..2cb23f62fa 100644 --- a/backends/Kconfig +++ b/backends/Kconfig @@ -1 +1,5 @@ source tpm/Kconfig + +config IOMMUFD + bool + depends on VFIO diff --git a/backends/meson.build b/backends/meson.build index 914c7c4afb..05ac57ff15 100644 --- a/backends/meson.build +++ b/backends/meson.build @@ -20,6 +20,11 @@ if have_vhost_user system_ss.add(when: 'CONFIG_VIRTIO', if_true: files('vhost-user.c')) endif system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost.c')) +if have_iommufd + system_ss.add(files('iommufd.c')) +else + system_ss.add(files('iommufd-stub.c')) +endif if have_vhost_user_crypto system_ss.add(when: 'CONFIG_VIRTIO_CRYPTO', if_true: files('cryptodev-vhost-user.c')) endif diff --git a/backends/trace-events b/backends/trace-events index 652eb76a57..e5f828bca2 100644 --- a/backends/trace-events +++ b/backends/trace-events @@ -5,3 +5,15 @@ dbus_vmstate_pre_save(void) dbus_vmstate_post_load(int version_id) "version_id: %d" dbus_vmstate_loading(const char *id) "id: %s" dbus_vmstate_saving(const char *id) "id: %s" + +# iommufd.c +iommufd_backend_connect(int fd, bool owned, uint32_t users, int ret) "fd=%d owned=%d users=%d (%d)" +iommufd_backend_disconnect(int fd, uint32_t users) "fd=%d users=%d" +iommu_backend_set_fd(int fd) "pre-opened /dev/iommu fd=%d" +iommufd_backend_get_ioas(int iommufd, uint32_t ioas, int ret) " iommufd=%d ioas=%d (%d)" +iommufd_backend_put_ioas(int iommufd, uint32_t ioas) " iommufd=%d ioas=%d" +iommufd_backend_map_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, void *vaddr, bool readonly, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" addr=%p readonly=%d (%d)" +iommufd_backend_unmap_dma(int iommufd, uint32_t ioas, uint64_t iova, uint64_t size, int ret) " iommufd=%d ioas=%d iova=0x%"PRIx64" size=0x%"PRIx64" (%d)" +iommufd_backend_alloc_ioas(int iommufd, uint32_t ioas, int ret) " iommufd=%d ioas=%d (%d)" +iommufd_backend_free_id(int iommufd, uint32_t id, int ret) " iommufd=%d id=%d (%d)" +iommufd_backend_alloc_hwpt(int iommufd, uint32_t dev_id, uint32_t pt_id, uint32_t out_hwpt_id, int ret) " iommufd=%d dev_id=%u pt_id=%u out_hwpt=%u (%d)" diff --git a/qemu-options.hx b/qemu-options.hx index e26230bac5..ddfaddf8ce 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -5210,6 +5210,19 @@ SRST The ``share`` boolean option is on by default with memfd. +#ifdef CONFIG_IOMMUFD + ``-object iommufd,id=id[,fd=fd]`` + Creates an iommufd backend which allows control of DMA mapping + through the /dev/iommu device. + + The ``id`` parameter is a unique ID which frontends (such as + vfio-pci of vdpa) will use to connect with the iommufd backend. + + The ``fd`` parameter is an optional pre-opened file descriptor + resulting from /dev/iommu opening. Usually the iommufd is shared + across all subsystems, bringing the benefit of centralized + reference counting. +#endif ``-object rng-builtin,id=id`` Creates a random number generator backend which obtains entropy from QEMU builtin functions. The ``id`` parameter is a unique ID From patchwork Thu Oct 26 10:30:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437477 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4AFF0C25B6B for ; Thu, 26 Oct 2023 10:49:40 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxuz-0003M7-Lz; Thu, 26 Oct 2023 06:48:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuh-0002Rv-Ej for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:35 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuc-0001PR-Sq for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:28 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317306; x=1729853306; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=rYliQG+s2p5FvOjNi6C3BSCV4OUXauhKWAFwOAi35j0=; b=IQjFYcO9Zq0TU3BNgnDa6dcEIGHCKL0Ah3y8pJh3S5AggBjaicRfoBDs ZacSRIv59r7bg2bsfOAEmQPwS02DAZQZSKvIQdyA8bCEghS97HWOPBQe/ gwSI11cA9GLRXZ6cAy62KG3NP6yZWFALZ9ZItThccdvyUX3i5EzAO5XMv lcOs52RmmYTyLNmWEFJVG4IlPfgCQQ2D5JJU3829fnqIg286J1UhYUmI4 YdYAACiYTkHbZcN7RBUPs0uA8n65JZjQCOJMf/WxG8+F2oMVdwWBI+p1x 1di2B0VReKWSUQNUc4E8WxEhGHhOcLipvJYICY+VCpVvFaTmCMXG5Ypln w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563700" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563700" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:52 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463784" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:38 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 25/37] util/char_dev: Add open_cdev() Date: Thu, 26 Oct 2023 18:30:52 +0800 Message-Id: <20231026103104.1686921-26-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Yi Liu /dev/vfio/devices/vfioX may not exist. In that case it is still possible to open /dev/char/$major:$minor instead. Add helper function to abstract the cdev open. Suggested-by: Jason Gunthorpe Signed-off-by: Yi Liu Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- MAINTAINERS | 6 +++ include/qemu/chardev_open.h | 16 ++++++++ util/chardev_open.c | 81 +++++++++++++++++++++++++++++++++++++ util/meson.build | 1 + 4 files changed, 104 insertions(+) create mode 100644 include/qemu/chardev_open.h create mode 100644 util/chardev_open.c diff --git a/MAINTAINERS b/MAINTAINERS index 7aa57ab16f..123e999bee 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -3447,6 +3447,12 @@ S: Maintained F: include/qemu/iova-tree.h F: util/iova-tree.c +cdev Open +M: Yi Liu +S: Maintained +F: include/qemu/chardev_open.h +F: util/chardev_open.c + elf2dmp M: Viktor Prutyanov S: Maintained diff --git a/include/qemu/chardev_open.h b/include/qemu/chardev_open.h new file mode 100644 index 0000000000..6580d351c6 --- /dev/null +++ b/include/qemu/chardev_open.h @@ -0,0 +1,16 @@ +/* + * QEMU Chardev Helper + * + * Copyright (C) 2023 Intel Corporation. + * + * Authors: Yi Liu + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + */ + +#ifndef QEMU_CHARDEV_HELPERS_H +#define QEMU_CHARDEV_HELPERS_H + +int open_cdev(const char *devpath, dev_t cdev); +#endif diff --git a/util/chardev_open.c b/util/chardev_open.c new file mode 100644 index 0000000000..005d2b81bd --- /dev/null +++ b/util/chardev_open.c @@ -0,0 +1,81 @@ +/* + * Copyright (c) 2019, Mellanox Technologies. All rights reserved. + * Copyright (C) 2023 Intel Corporation. + * + * This software is available to you under a choice of one of two + * licenses. You may choose to be licensed under the terms of the GNU + * General Public License (GPL) Version 2, available from the file + * COPYING in the main directory of this source tree, or the + * OpenIB.org BSD license below: + * + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * - Redistributions of source code must retain the above + * copyright notice, this list of conditions and the following + * disclaimer. + * + * - Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS + * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN + * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN + * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + * SOFTWARE. + * + * Authors: Yi Liu + * + * Copied from + * https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c + * + */ + +#include "qemu/osdep.h" +#include "qemu/chardev_open.h" + +static int open_cdev_internal(const char *path, dev_t cdev) +{ + struct stat st; + int fd; + + fd = qemu_open_old(path, O_RDWR); + if (fd == -1) { + return -1; + } + if (fstat(fd, &st) || !S_ISCHR(st.st_mode) || + (cdev != 0 && st.st_rdev != cdev)) { + close(fd); + return -1; + } + return fd; +} + +static int open_cdev_robust(dev_t cdev) +{ + g_autofree char *devpath; + + /* + * This assumes that udev is being used and is creating the /dev/char/ + * symlinks. + */ + devpath = g_strdup_printf("/dev/char/%u:%u", major(cdev), minor(cdev)); + return open_cdev_internal(devpath, cdev); +} + +int open_cdev(const char *devpath, dev_t cdev) +{ + int fd; + + fd = open_cdev_internal(devpath, cdev); + if (fd == -1 && cdev != 0) { + return open_cdev_robust(cdev); + } + return fd; +} diff --git a/util/meson.build b/util/meson.build index eb677b40c2..eda0b06062 100644 --- a/util/meson.build +++ b/util/meson.build @@ -107,6 +107,7 @@ if have_block util_ss.add(files('filemonitor-stub.c')) endif util_ss.add(when: 'CONFIG_LINUX', if_true: files('vfio-helpers.c')) + util_ss.add(when: 'CONFIG_LINUX', if_true: files('chardev_open.c')) endif if cpu == 'aarch64' From patchwork Thu Oct 26 10:30:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437486 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 28576C25B48 for ; Thu, 26 Oct 2023 10:50:18 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvB-0003uV-5q; Thu, 26 Oct 2023 06:49:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxuq-0002bs-2Z for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:41 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxud-0001Nv-Iw for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:31 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317307; x=1729853307; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=7CdYzZVDThCqPml0yQIup1sA5iTAYraVjiotcugxb3E=; b=Bd9VdmASPeDMcmhc+xQIhb7q6wlKcP+XRPUvv8ZuGC2JGHDBRNvwOlfJ aAwxUlrrz9Bn2F2vFCYXPCOx9ThywjiYjhcv+UX+X0CyA5AIt8txfcngE 5JrxEZywrKm+LXNdsSzTd+tajNrdZYpEScw/45Cic/Tg9G0/E99czHaWm PIQ0lnE3K3nkdqLNNOqMpWDwQuKzA+zoMCFFPvZOhOdfDt7aayU8XsHvU Ze+b5Jg1hp+CL5H2x71h5R70GGhmVPdeO54cEjNBOqpYiaWTjM4+CZp1o 5B56Ks2YpEWRuAPJbPpsCCYPHn5OTKM3Mue0waJT8Z6M5+jIropje2J6A w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563708" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563708" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463795" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:42 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 26/37] vfio/iommufd: Implement the iommufd backend Date: Thu, 26 Oct 2023 18:30:53 +0800 Message-Id: <20231026103104.1686921-27-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Yi Liu Add the iommufd backend. The IOMMUFD container class is implemented based on the new /dev/iommu user API. This backend obviously depends on CONFIG_IOMMUFD. So far, the iommufd backend doesn't support dirty page sync yet due to missing support in the host kernel. Co-authored-by: Eric Auger Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 23 ++ hw/vfio/common.c | 19 +- hw/vfio/iommufd.c | 460 ++++++++++++++++++++++++++++++++++ hw/vfio/meson.build | 3 + hw/vfio/trace-events | 12 + 5 files changed, 513 insertions(+), 4 deletions(-) create mode 100644 hw/vfio/iommufd.c diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 4f46af1ef4..95afee7243 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -89,6 +89,23 @@ typedef struct VFIOHostDMAWindow { QLIST_ENTRY(VFIOHostDMAWindow) hostwin_next; } VFIOHostDMAWindow; +#ifdef CONFIG_IOMMUFD +typedef struct VFIOIOASHwpt { + uint32_t hwpt_id; + QLIST_HEAD(, VFIODevice) device_list; + QLIST_ENTRY(VFIOIOASHwpt) next; +} VFIOIOASHwpt; + +typedef struct IOMMUFDBackend IOMMUFDBackend; + +typedef struct VFIOIOMMUFDContainer { + VFIOContainerBase bcontainer; + IOMMUFDBackend *be; + uint32_t ioas_id; + QLIST_HEAD(, VFIOIOASHwpt) hwpt_list; +} VFIOIOMMUFDContainer; +#endif + typedef struct VFIODeviceOps VFIODeviceOps; typedef struct VFIODevice { @@ -116,6 +133,11 @@ typedef struct VFIODevice { OnOffAuto pre_copy_dirty_page_tracking; bool dirty_pages_supported; bool dirty_tracking; +#ifdef CONFIG_IOMMUFD + int devid; + VFIOIOASHwpt *hwpt; + IOMMUFDBackend *iommufd; +#endif } VFIODevice; struct VFIODeviceOps { @@ -209,6 +231,7 @@ typedef QLIST_HEAD(VFIODeviceList, VFIODevice) VFIODeviceList; extern VFIOGroupList vfio_group_list; extern VFIODeviceList vfio_device_list; extern const VFIOIOMMUOps vfio_legacy_ops; +extern const VFIOIOMMUOps vfio_iommufd_ops; extern const MemoryListener vfio_memory_listener; extern int vfio_kvm_device_fd; diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 41a5609e33..6f0534d832 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -1464,10 +1464,13 @@ VFIOAddressSpace *vfio_get_address_space(AddressSpace *as) void vfio_put_address_space(VFIOAddressSpace *space) { - if (QLIST_EMPTY(&space->containers)) { - QLIST_REMOVE(space, list); - g_free(space); + if (!QLIST_EMPTY(&space->containers)) { + return; } + + QLIST_REMOVE(space, list); + g_free(space); + if (QLIST_EMPTY(&vfio_address_spaces)) { qemu_unregister_reset(vfio_reset_handler, NULL); } @@ -1500,8 +1503,16 @@ retry: int vfio_attach_device(char *name, VFIODevice *vbasedev, AddressSpace *as, Error **errp) { - const VFIOIOMMUOps *ops = &vfio_legacy_ops; + const VFIOIOMMUOps *ops; +#ifdef CONFIG_IOMMUFD + if (vbasedev->iommufd) { + ops = &vfio_iommufd_ops; + } else +#endif + { + ops = &vfio_legacy_ops; + } return ops->attach_device(name, vbasedev, as, errp); } diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c new file mode 100644 index 0000000000..aee64d63f3 --- /dev/null +++ b/hw/vfio/iommufd.c @@ -0,0 +1,460 @@ +/* + * iommufd container backend + * + * Copyright (C) 2023 Intel Corporation. + * Copyright Red Hat, Inc. 2023 + * + * Authors: Yi Liu + * Eric Auger + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + + * You should have received a copy of the GNU General Public License along + * with this program; if not, see . + */ + +#include "qemu/osdep.h" +#include +#include +#include + +#include "hw/vfio/vfio-common.h" +#include "qemu/error-report.h" +#include "trace.h" +#include "qapi/error.h" +#include "sysemu/iommufd.h" +#include "hw/qdev-core.h" +#include "sysemu/reset.h" +#include "qemu/cutils.h" +#include "qemu/chardev_open.h" + +static int iommufd_map(VFIOContainerBase *bcontainer, hwaddr iova, + ram_addr_t size, void *vaddr, bool readonly) +{ + VFIOIOMMUFDContainer *container = + container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + + return iommufd_backend_map_dma(container->be, + container->ioas_id, + iova, size, vaddr, readonly); +} + +static int iommufd_unmap(VFIOContainerBase *bcontainer, + hwaddr iova, ram_addr_t size, + IOMMUTLBEntry *iotlb) +{ + VFIOIOMMUFDContainer *container = + container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + + /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */ + return iommufd_backend_unmap_dma(container->be, + container->ioas_id, iova, size); +} + +static void vfio_kvm_device_add_device(VFIODevice *vbasedev) +{ + Error *err = NULL; + + if (vfio_kvm_device_add_fd(vbasedev->fd, &err)) { + error_report_err(err); + } +} + +static void vfio_kvm_device_del_device(VFIODevice *vbasedev) +{ + Error *err = NULL; + + if (vfio_kvm_device_del_fd(vbasedev->fd, &err)) { + error_report_err(err); + } +} + +static int iommufd_connect_and_bind(VFIODevice *vbasedev, Error **errp) +{ + IOMMUFDBackend *iommufd = vbasedev->iommufd; + struct vfio_device_bind_iommufd bind = { + .argsz = sizeof(bind), + .flags = 0, + }; + int ret; + + ret = iommufd_backend_connect(iommufd, errp); + if (ret) { + return ret; + } + + /* + * Add device to kvm-vfio to be prepared for the tracking + * in KVM. Especially for some emulated devices, it requires + * to have kvm information in the device open. + */ + vfio_kvm_device_add_device(vbasedev); + + /* Bind device to iommufd */ + bind.iommufd = iommufd->fd; + ret = ioctl(vbasedev->fd, VFIO_DEVICE_BIND_IOMMUFD, &bind); + if (ret) { + error_setg_errno(errp, errno, "error bind device fd=%d to iommufd=%d", + vbasedev->fd, bind.iommufd); + goto err_bind; + } + + vbasedev->devid = bind.out_devid; + trace_vfio_iommufd_bind_device(bind.iommufd, vbasedev->name, + vbasedev->fd, vbasedev->devid); + return ret; +err_bind: + vfio_kvm_device_del_device(vbasedev); + iommufd_backend_disconnect(iommufd); + return ret; +} + +static void iommufd_unbind_and_disconnect(VFIODevice *vbasedev) +{ + /* Unbind is automatically conducted when device fd is closed */ + vfio_kvm_device_del_device(vbasedev); + iommufd_backend_disconnect(vbasedev->iommufd); +} + +static int vfio_get_devicefd(const char *sysfs_path, Error **errp) +{ + long int ret = -ENOTTY; + char *path, *vfio_dev_path = NULL, *vfio_path = NULL; + DIR *dir = NULL; + struct dirent *dent; + gchar *contents; + struct stat st; + gsize length; + int major, minor; + dev_t vfio_devt; + + path = g_strdup_printf("%s/vfio-dev", sysfs_path); + if (stat(path, &st) < 0) { + error_setg_errno(errp, errno, "no such host device"); + goto out_free_path; + } + + dir = opendir(path); + if (!dir) { + error_setg_errno(errp, errno, "couldn't open dirrectory %s", path); + goto out_free_path; + } + + while ((dent = readdir(dir))) { + if (!strncmp(dent->d_name, "vfio", 4)) { + vfio_dev_path = g_strdup_printf("%s/%s/dev", path, dent->d_name); + break; + } + } + + if (!vfio_dev_path) { + error_setg(errp, "failed to find vfio-dev/vfioX/dev"); + goto out_close_dir; + } + + if (!g_file_get_contents(vfio_dev_path, &contents, &length, NULL)) { + error_setg(errp, "failed to load \"%s\"", vfio_dev_path); + goto out_free_dev_path; + } + + if (sscanf(contents, "%d:%d", &major, &minor) != 2) { + error_setg(errp, "failed to get major:minor for \"%s\"", vfio_dev_path); + goto out_free_dev_path; + } + g_free(contents); + vfio_devt = makedev(major, minor); + + vfio_path = g_strdup_printf("/dev/vfio/devices/%s", dent->d_name); + ret = open_cdev(vfio_path, vfio_devt); + if (ret < 0) { + error_setg(errp, "Failed to open %s", vfio_path); + } + + trace_vfio_iommufd_get_devicefd(vfio_path, ret); + g_free(vfio_path); + +out_free_dev_path: + g_free(vfio_dev_path); +out_close_dir: + closedir(dir); +out_free_path: + if (*errp) { + error_prepend(errp, VFIO_MSG_PREFIX, path); + } + g_free(path); + + return ret; +} + +static VFIOIOASHwpt *vfio_container_get_hwpt(VFIOIOMMUFDContainer *container, + uint32_t hwpt_id) +{ + VFIOIOASHwpt *hwpt; + + QLIST_FOREACH(hwpt, &container->hwpt_list, next) { + if (hwpt->hwpt_id == hwpt_id) { + return hwpt; + } + } + + hwpt = g_malloc0(sizeof(*hwpt)); + + hwpt->hwpt_id = hwpt_id; + QLIST_INIT(&hwpt->device_list); + QLIST_INSERT_HEAD(&container->hwpt_list, hwpt, next); + + return hwpt; +} + +static void vfio_container_put_hwpt(IOMMUFDBackend *be, VFIOIOASHwpt *hwpt) +{ + QLIST_REMOVE(hwpt, next); + g_free(hwpt); +} + +static int vfio_device_attach_container(VFIODevice *vbasedev, + VFIOIOMMUFDContainer *container, + Error **errp) +{ + int ret, iommufd = vbasedev->iommufd->fd; + VFIOIOASHwpt *hwpt; + struct vfio_device_attach_iommufd_pt attach_data = { + .argsz = sizeof(attach_data), + .flags = 0, + .pt_id = container->ioas_id, + }; + + /* Attach device to an ioas within iommufd */ + ret = ioctl(vbasedev->fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data); + if (ret) { + error_setg_errno(errp, errno, + "[iommufd=%d] error attach %s (%d) to ioasid=%d", + container->be->fd, vbasedev->name, vbasedev->fd, + attach_data.pt_id); + return ret; + + } + hwpt = vfio_container_get_hwpt(container, attach_data.pt_id); + + QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, next); + vbasedev->hwpt = hwpt; + + trace_vfio_iommufd_attach_device(iommufd, vbasedev->name, vbasedev->fd, + container->ioas_id, attach_data.pt_id); + return ret; +} + +static void vfio_device_detach_container(VFIODevice *vbasedev, + VFIOIOMMUFDContainer *container) +{ + VFIOIOASHwpt *hwpt = vbasedev->hwpt; + struct vfio_device_attach_iommufd_pt detach_data = { + .argsz = sizeof(detach_data), + .flags = 0, + }; + + if (ioctl(vbasedev->fd, VFIO_DEVICE_DETACH_IOMMUFD_PT, &detach_data)) { + error_report("detach %s from ioas id=%d failed: %m", vbasedev->name, + container->ioas_id); + } + + QLIST_REMOVE(vbasedev, next); + vbasedev->hwpt = NULL; + if (QLIST_EMPTY(&hwpt->device_list)) { + vfio_container_put_hwpt(vbasedev->iommufd, hwpt); + } + + trace_vfio_iommufd_detach_device(container->be->fd, vbasedev->name, + container->ioas_id); +} + +static void vfio_iommufd_container_destroy(VFIOIOMMUFDContainer *container) +{ + VFIOContainerBase *bcontainer = &container->bcontainer; + + if (!QLIST_EMPTY(&container->hwpt_list)) { + return; + } + memory_listener_unregister(&bcontainer->listener); + vfio_container_destroy(bcontainer); + iommufd_backend_put_ioas(container->be, container->ioas_id); + g_free(container); +} + +static int vfio_ram_block_discard_disable(bool state) +{ + /* + * We support coordinated discarding of RAM via the RamDiscardManager. + */ + return ram_block_uncoordinated_discard_disable(state); +} + +static int iommufd_attach_device(char *name, VFIODevice *vbasedev, + AddressSpace *as, Error **errp) +{ + VFIOContainerBase *bcontainer; + VFIOIOMMUFDContainer *container; + VFIOAddressSpace *space; + struct vfio_device_info dev_info = { .argsz = sizeof(dev_info) }; + int ret, devfd; + uint32_t ioas_id; + Error *err = NULL; + + devfd = vfio_get_devicefd(vbasedev->sysfsdev, errp); + if (devfd < 0) { + return devfd; + } + vbasedev->fd = devfd; + + ret = iommufd_connect_and_bind(vbasedev, errp); + if (ret) { + goto err_connect_bind; + } + + space = vfio_get_address_space(as); + + /* try to attach to an existing container in this space */ + QLIST_FOREACH(bcontainer, &space->containers, next) { + container = container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + if (bcontainer->ops != &vfio_iommufd_ops || + vbasedev->iommufd != container->be) { + continue; + } + if (vfio_device_attach_container(vbasedev, container, &err)) { + const char *msg = error_get_pretty(err); + + trace_vfio_iommufd_fail_attach_existing_container(msg); + error_free(err); + err = NULL; + } else { + ret = vfio_ram_block_discard_disable(true); + if (ret) { + error_setg(errp, + "Cannot set discarding of RAM broken (%d)", ret); + goto err_discard_disable; + } + goto found_container; + } + } + + /* Need to allocate a new dedicated container */ + ret = iommufd_backend_get_ioas(vbasedev->iommufd, &ioas_id); + if (ret < 0) { + error_setg_errno(errp, errno, "Failed to alloc ioas"); + goto err_get_ioas; + } + + trace_vfio_iommufd_alloc_ioas(vbasedev->iommufd->fd, ioas_id); + + container = g_malloc0(sizeof(*container)); + container->be = vbasedev->iommufd; + container->ioas_id = ioas_id; + QLIST_INIT(&container->hwpt_list); + + bcontainer = &container->bcontainer; + vfio_container_init(bcontainer, space, &vfio_iommufd_ops); + QLIST_INSERT_HEAD(&space->containers, bcontainer, next); + + ret = vfio_device_attach_container(vbasedev, container, errp); + if (ret) { + goto err_attach_container; + } + + ret = vfio_ram_block_discard_disable(true); + if (ret) { + goto err_discard_disable; + } + + bcontainer->pgsizes = qemu_real_host_page_size(); + + bcontainer->listener = vfio_memory_listener; + memory_listener_register(&bcontainer->listener, bcontainer->space->as); + + if (bcontainer->error) { + ret = -1; + error_propagate_prepend(errp, bcontainer->error, + "memory listener initialization failed: "); + goto err_listener_register; + } + + bcontainer->initialized = true; + +found_container: + ret = ioctl(devfd, VFIO_DEVICE_GET_INFO, &dev_info); + if (ret) { + error_setg_errno(errp, errno, "error getting device info"); + goto err_listener_register; + } + + /* + * TODO: examine RAM_BLOCK_DISCARD stuff, should we do group level + * for discarding incompatibility check as well? + */ + if (vbasedev->ram_block_discard_allowed) { + vfio_ram_block_discard_disable(false); + } + + vbasedev->group = 0; + vbasedev->num_irqs = dev_info.num_irqs; + vbasedev->num_regions = dev_info.num_regions; + vbasedev->flags = dev_info.flags; + vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET); + vbasedev->bcontainer = bcontainer; + QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next); + QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next); + + trace_vfio_iommufd_device_info(vbasedev->name, devfd, vbasedev->num_irqs, + vbasedev->num_regions, vbasedev->flags); + return 0; + +err_listener_register: + vfio_ram_block_discard_disable(false); +err_discard_disable: + vfio_device_detach_container(vbasedev, container); +err_attach_container: + vfio_iommufd_container_destroy(container); +err_get_ioas: + vfio_put_address_space(space); + iommufd_unbind_and_disconnect(vbasedev); +err_connect_bind: + close(vbasedev->fd); + return ret; +} + +static void iommufd_detach_device(VFIODevice *vbasedev) +{ + VFIOContainerBase *bcontainer = vbasedev->bcontainer; + VFIOIOMMUFDContainer *container; + VFIOAddressSpace *space = bcontainer->space; + + QLIST_REMOVE(vbasedev, global_next); + QLIST_REMOVE(vbasedev, container_next); + vbasedev->bcontainer = NULL; + + if (!vbasedev->ram_block_discard_allowed) { + vfio_ram_block_discard_disable(false); + } + + container = container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer); + vfio_device_detach_container(vbasedev, container); + vfio_iommufd_container_destroy(container); + vfio_put_address_space(space); + + iommufd_unbind_and_disconnect(vbasedev); + close(vbasedev->fd); +} + +const VFIOIOMMUOps vfio_iommufd_ops = { + .dma_map = iommufd_map, + .dma_unmap = iommufd_unmap, + .attach_device = iommufd_attach_device, + .detach_device = iommufd_detach_device, +}; diff --git a/hw/vfio/meson.build b/hw/vfio/meson.build index eb6ce6229d..9cae2c9e21 100644 --- a/hw/vfio/meson.build +++ b/hw/vfio/meson.build @@ -7,6 +7,9 @@ vfio_ss.add(files( 'spapr.c', 'migration.c', )) +if have_iommufd + vfio_ss.add(files('iommufd.c')) +endif vfio_ss.add(when: 'CONFIG_VFIO_PCI', if_true: files( 'display.c', 'pci-quirks.c', diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 08a1f9dfa4..9b180cf77c 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -164,3 +164,15 @@ vfio_state_pending_estimate(const char *name, uint64_t precopy, uint64_t postcop vfio_state_pending_exact(const char *name, uint64_t precopy, uint64_t postcopy, uint64_t stopcopy_size, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy 0x%"PRIx64" postcopy 0x%"PRIx64" stopcopy size 0x%"PRIx64" precopy initial size 0x%"PRIx64" precopy dirty size 0x%"PRIx64 vfio_vmstate_change(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s" vfio_vmstate_change_prepare(const char *name, int running, const char *reason, const char *dev_state) " (%s) running %d reason %s device state %s" + +#iommufd.c + +vfio_iommufd_get_devicefd(const char *dev, int devfd) " %s (fd=%d)" +vfio_iommufd_bind_device(int iommufd, const char *name, int devfd, int devid) " [iommufd=%d] Successfully bound device %s (fd=%d): output devid=%d" +vfio_iommufd_fail_attach_existing_hwpt(const char *msg) " %s" +vfio_iommufd_attach_device(int iommufd, const char *name, int devfd, int ioasid, int hwptid) " [iommufd=%d] Successfully attached device %s (%d) to ioasid=%d: output hwptd=%d" +vfio_iommufd_detach_device(int iommufd, const char *name, int ioasid) " [iommufd=%d] Detached %s from ioasid=%d" +vfio_iommufd_alloc_ioas(int iommufd, int ioas_id) " [iommufd=%d] new IOMMUFD container with ioasid=%d" +vfio_iommufd_device_info(char *name, int devfd, int num_irqs, int num_regions, int flags) " %s (%d) num_irqs=%d num_regions=%d flags=%d" +vfio_iommufd_fail_attach_existing_container(const char *msg) " %s" +vfio_iommufd_container_reset(char *name) " Successfully reset %s" From patchwork Thu Oct 26 10:30:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437496 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3FF1FC25B48 for ; Thu, 26 Oct 2023 10:51:41 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxv1-0003RF-87; Thu, 26 Oct 2023 06:48:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxus-0002fF-FJ for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:42 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxug-0001Q0-Ei for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:40 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317310; x=1729853310; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=p/60cmea6GKOpVhxSghBXzAP6SrSy/0FosUd0BMTBBM=; b=en+lmH70g1l25E7lY+XouLJN03Cbnsxp9ZeNvNA3TrhllsNLEmWYfNMM pBfypwIq1E6n/+tXML8tKvOFQ12fkIhct2Zf3C41oreCB4qEtNgOUWTwN SyQMeYWeEBQSrhWeueJlrkQTFMZYhxFG/jn8edXHSlrL5uPf2uZeYUW4a WRvzTbFx33QkGAWvnrplP7Ha+Tuzgtibcdt5/MkPK0DKB4JKOogrOosH8 bllLWdb+glLfK1fHUWrAHJLqBKgTMueonrXCGfKNG3FGsb56yGqN+PD4z 8SsmYupqsPuOMcuE3jl/IGyL1vzBUdOHaNkaISUe1+WqVI8oFVyc7utft A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563719" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563719" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:59 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463804" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:45 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 27/37] vfio/iommufd: Switch to manual hwpt allocation Date: Thu, 26 Oct 2023 18:30:54 +0800 Message-Id: <20231026103104.1686921-28-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org IOMMUFD supports auto allocated hwpt and manually allocated one. Manually allocated hwpt has benefit that its life cycle is under user's control, so it could be used as stage 2 page table by nested feature in the future. Introduce two helpers __vfio_device_attach/detach_hwpt to facilitate this change. Signed-off-by: Zhenzhong Duan --- hw/vfio/iommufd.c | 89 +++++++++++++++++++++++++++++++++++++---------- 1 file changed, 70 insertions(+), 19 deletions(-) diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index aee64d63f3..c1daaf1c39 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -217,38 +217,91 @@ static VFIOIOASHwpt *vfio_container_get_hwpt(VFIOIOMMUFDContainer *container, static void vfio_container_put_hwpt(IOMMUFDBackend *be, VFIOIOASHwpt *hwpt) { QLIST_REMOVE(hwpt, next); + iommufd_backend_free_id(be->fd, hwpt->hwpt_id); g_free(hwpt); } -static int vfio_device_attach_container(VFIODevice *vbasedev, - VFIOIOMMUFDContainer *container, - Error **errp) +static int __vfio_device_attach_hwpt(VFIODevice *vbasedev, uint32_t hwpt_id, + Error **errp) { - int ret, iommufd = vbasedev->iommufd->fd; - VFIOIOASHwpt *hwpt; struct vfio_device_attach_iommufd_pt attach_data = { .argsz = sizeof(attach_data), .flags = 0, - .pt_id = container->ioas_id, + .pt_id = hwpt_id, }; + int ret; - /* Attach device to an ioas within iommufd */ ret = ioctl(vbasedev->fd, VFIO_DEVICE_ATTACH_IOMMUFD_PT, &attach_data); if (ret) { error_setg_errno(errp, errno, - "[iommufd=%d] error attach %s (%d) to ioasid=%d", - container->be->fd, vbasedev->name, vbasedev->fd, - attach_data.pt_id); + "[iommufd=%d] error attach %s (%d) to hwpt_id=%d", + vbasedev->iommufd->fd, vbasedev->name, vbasedev->fd, + hwpt_id); + } + return ret; +} + +static int __vfio_device_detach_hwpt(VFIODevice *vbasedev, Error **errp) +{ + struct vfio_device_detach_iommufd_pt detach_data = { + .argsz = sizeof(detach_data), + .flags = 0, + }; + int ret; + + ret = ioctl(vbasedev->fd, VFIO_DEVICE_DETACH_IOMMUFD_PT, &detach_data); + if (ret) { + error_setg_errno(errp, errno, "detach %s from ioas failed", + vbasedev->name); + } + return ret; +} + +static int vfio_device_attach_container(VFIODevice *vbasedev, + VFIOIOMMUFDContainer *container, + Error **errp) +{ + int ret, iommufd = vbasedev->iommufd->fd; + VFIOIOASHwpt *hwpt; + uint32_t hwpt_id; + Error *err = NULL; + + /* try to attach to an existing hwpt in this container */ + QLIST_FOREACH(hwpt, &container->hwpt_list, next) { + ret = __vfio_device_attach_hwpt(vbasedev, hwpt->hwpt_id, &err); + if (ret) { + const char *msg = error_get_pretty(err); + + trace_vfio_iommufd_fail_attach_existing_hwpt(msg); + error_free(err); + err = NULL; + } else { + goto found_hwpt; + } + } + + ret = iommufd_backend_alloc_hwpt(iommufd, vbasedev->devid, + container->ioas_id, &hwpt_id); + + if (ret) { + error_setg_errno(errp, errno, "error alloc shadow hwpt"); return ret; + } + /* Attach device to an hwpt within iommufd */ + ret = __vfio_device_attach_hwpt(vbasedev, hwpt_id, errp); + if (ret) { + iommufd_backend_free_id(iommufd, hwpt_id); + return ret; } - hwpt = vfio_container_get_hwpt(container, attach_data.pt_id); + hwpt = vfio_container_get_hwpt(container, hwpt_id); +found_hwpt: QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, next); vbasedev->hwpt = hwpt; trace_vfio_iommufd_attach_device(iommufd, vbasedev->name, vbasedev->fd, - container->ioas_id, attach_data.pt_id); + container->ioas_id, hwpt->hwpt_id); return ret; } @@ -256,14 +309,12 @@ static void vfio_device_detach_container(VFIODevice *vbasedev, VFIOIOMMUFDContainer *container) { VFIOIOASHwpt *hwpt = vbasedev->hwpt; - struct vfio_device_attach_iommufd_pt detach_data = { - .argsz = sizeof(detach_data), - .flags = 0, - }; + Error *err = NULL; + int ret; - if (ioctl(vbasedev->fd, VFIO_DEVICE_DETACH_IOMMUFD_PT, &detach_data)) { - error_report("detach %s from ioas id=%d failed: %m", vbasedev->name, - container->ioas_id); + ret = __vfio_device_detach_hwpt(vbasedev, &err); + if (ret) { + error_report_err(err); } QLIST_REMOVE(vbasedev, next); From patchwork Thu Oct 26 10:30:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437489 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D3714C25B6B for ; Thu, 26 Oct 2023 10:50:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvB-0003v7-Jq; Thu, 26 Oct 2023 06:49:01 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxv4-0003ib-MB for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:57 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxv1-0001PR-4x for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317331; x=1729853331; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4fpM8djcG6yzw2/d0EJVHfGLIQl+ddsysHN+ZtJ/98U=; b=mD4S6y2LbuKu/JO8az3vixfI4aX+gol3MdhFbIsKhmq9UOMTtha6Qf2F WcZoO7Ry9feod91+TKVXzB5PUsrtT8vjeZCV+Xqt8c5Tqahk1Dh5DDnwA sW3nfxnaBadbVIfyk/641jV8mdMxV7KPFLwfKHx7NrETCBdmVX9LZ5z3P uKIAy5hppI9PrH4aRia1w4MtOYmpo47q7sDOLJPRiq+wdsz68yyPI5OPT nC2XZFzd7gSy6KKyDEj8WTAA0g2acIKw+WpOh+YOfZ61vZHqCRDqQ4R2q Zq4ImzWJTDGwDs43u3Xfj+H5WGQB/qt+YB3SAnrjASO5/YiU+4zj40iaY w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563726" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563726" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463817" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:49 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 28/37] vfio/iommufd: Add support for iova_ranges Date: Thu, 26 Oct 2023 18:30:55 +0800 Message-Id: <20231026103104.1686921-29-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Some vIOMMU such as virtio-iommu use iova ranges from host side to setup reserved ranges for passthrough device, so that guest will not use an iova range beyond host support. Use an uAPI of IOMMUFD to get iova ranges of host side and pass to vIOMMU just like the legacy backend. Signed-off-by: Zhenzhong Duan --- hw/vfio/iommufd.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index c1daaf1c39..18a09d7f5a 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -348,6 +348,52 @@ static int vfio_ram_block_discard_disable(bool state) return ram_block_uncoordinated_discard_disable(state); } +static int vfio_get_info_iova_range(VFIOIOMMUFDContainer *container, + uint32_t ioas_id) +{ + VFIOContainerBase *bcontainer = &container->bcontainer; + struct iommu_ioas_iova_ranges *info; + struct iommu_iova_range *iova_ranges; + int ret, sz, fd = container->be->fd; + + info = g_malloc0(sizeof(*info)); + info->size = sizeof(*info); + info->ioas_id = ioas_id; + + ret = ioctl(fd, IOMMU_IOAS_IOVA_RANGES, info); + if (ret && errno != EMSGSIZE) { + goto error; + } + + sz = info->num_iovas * sizeof(struct iommu_iova_range); + info = g_realloc(info, sizeof(*info) + sz); + info->allowed_iovas = (uint64_t)(info + 1); + + ret = ioctl(fd, IOMMU_IOAS_IOVA_RANGES, info); + if (ret) { + goto error; + } + + iova_ranges = (struct iommu_iova_range *)info->allowed_iovas; + + for (int i = 0; i < info->num_iovas; i++) { + Range *range = g_new(Range, 1); + + range_set_bounds(range, iova_ranges[i].start, iova_ranges[i].last); + bcontainer->iova_ranges = + range_list_insert(bcontainer->iova_ranges, range); + } + + g_free(info); + return 0; + +error: + ret = -errno; + g_free(info); + error_report("vfio/iommufd: Cannot get iova ranges: %m"); + return ret; +} + static int iommufd_attach_device(char *name, VFIODevice *vbasedev, AddressSpace *as, Error **errp) { @@ -425,6 +471,7 @@ static int iommufd_attach_device(char *name, VFIODevice *vbasedev, } bcontainer->pgsizes = qemu_real_host_page_size(); + vfio_get_info_iova_range(container, ioas_id); bcontainer->listener = vfio_memory_listener; memory_listener_register(&bcontainer->listener, bcontainer->space->as); From patchwork Thu Oct 26 10:30:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437481 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6C2D7C25B6B for ; Thu, 26 Oct 2023 10:49:53 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvG-00047N-39; Thu, 26 Oct 2023 06:49:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvC-00041K-Sf; Thu, 26 Oct 2023 06:49:02 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvB-0001Nv-54; Thu, 26 Oct 2023 06:49:02 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317341; x=1729853341; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GXWVJSJKrA6OskhLM2e5zCd84xdWO+/D3ApsdMrVliA=; b=ZPjF3C+xwaCy0Od1nj1i75bvmcQOZWEAq3ar12501YJoOVufPb0PDSBA pcL0BF8HJkrj7zktUw/4KezS2w904G+74dOY6FKz03rFGZgXWd7p9Ln3v fHlGBZei/xhJIktkHQTO/axGmiNCjq9C1ZMnI3jzENQJFLDdKcQFjLa7E 5XSG5OE31cm2nCzJ5IIH7G1RHshi7t3st3AA4Fu8FI2flIHKA/ccv4GJY HF3FSOH11e1rqVtUisEaJx2NVAcAPx5vQggCZEQqNlgeaXTeWzjFsv/b4 7dsXKm30QjoDnnwHffZ2IdS8v4/jnBsnDoZz939JX5MCKeP9K1SO3jrMv A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563751" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563751" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:08 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463825" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:52 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Nicholas Piggin , Daniel Henrique Barboza , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , David Gibson , Harsh Prateek Bora , qemu-ppc@nongnu.org (open list:sPAPR (pseries)) Subject: [PATCH v3 29/37] vfio/iommufd: Bypass EEH if iommufd backend Date: Thu, 26 Oct 2023 18:30:56 +0800 Message-Id: <20231026103104.1686921-30-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org IBM EEH is only supported by legacy backend currently, bypass it for IOMMUFD backend. Signed-off-by: Zhenzhong Duan --- hw/ppc/spapr_pci_vfio.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/ppc/spapr_pci_vfio.c b/hw/ppc/spapr_pci_vfio.c index d1d07bec46..a2518838a1 100644 --- a/hw/ppc/spapr_pci_vfio.c +++ b/hw/ppc/spapr_pci_vfio.c @@ -93,10 +93,10 @@ static VFIOContainer *vfio_eeh_as_container(AddressSpace *as) bcontainer = QLIST_FIRST(&space->containers); - if (QLIST_NEXT(bcontainer, next)) { + if (QLIST_NEXT(bcontainer, next) || bcontainer->ops != &vfio_legacy_ops) { /* * We don't yet have logic to synchronize EEH state across - * multiple containers + * multiple containers, iommufd isn't supported too. */ bcontainer = NULL; goto out; From patchwork Thu Oct 26 10:30:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437487 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id A1756C25B48 for ; Thu, 26 Oct 2023 10:50:23 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvC-00040f-ND; Thu, 26 Oct 2023 06:49:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxv5-0003ie-0j for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:57 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxv2-0001Q0-SZ for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317332; x=1729853332; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jjlVJ29JqSy3D8jINIsIygktcWD2qs3ZSL2RaHPUFuI=; b=lweTTt+aUD5nX6s5WhcmA7pNZ1TQULBgI62r7xKaaBjW2D7uKK6Uti+Z mYDccRbFRM/2/7aSBH9z0lcgrWXxsUbHSkueVY7qpwIspcdmsafyhWZCH P+ismcyE3G+KZIVwY2ri4FxZpyYqzIYncnAXeFMiXa/oXCmPGcV8Paw7k LdkQUJYgCF9exn8fC0HnmJ0BgxA4gO8G+5joDsw/pMZ7F8h6iIZKBb0Hc 0vVo2I8pmsrHUELqvoWz5nI+w4xy3YW1yQ+IBsMkEhRNdqKFjrhT+pody 1ZOahlaLlEwF97GVD6mVA6gwb4f1XSBtRXAMSblGwBlAjnWtybfnX30qW Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563779" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563779" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:12 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463833" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:47:57 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 30/37] vfio/pci: Extract out a helper vfio_pci_get_pci_hot_reset_info Date: Thu, 26 Oct 2023 18:30:57 +0800 Message-Id: <20231026103104.1686921-31-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This helper will be used by both legacy and iommufd backends. No functional changes intended. Signed-off-by: Zhenzhong Duan --- hw/vfio/pci.c | 54 +++++++++++++++++++++++++++++++++++---------------- 1 file changed, 37 insertions(+), 17 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 5cbc771e55..c17e1f4376 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -2459,22 +2459,13 @@ static bool vfio_pci_host_match(PCIHostDeviceAddress *addr, const char *name) return (strcmp(tmp, name) == 0); } -static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single) +static int vfio_pci_get_pci_hot_reset_info(VFIOPCIDevice *vdev, + struct vfio_pci_hot_reset_info **info_p) { - VFIOGroup *group; struct vfio_pci_hot_reset_info *info; - struct vfio_pci_dependent_device *devices; - struct vfio_pci_hot_reset *reset; - int32_t *fds; - int ret, i, count; - bool multi = false; + int ret, count; - trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi"); - - if (!single) { - vfio_pci_pre_reset(vdev); - } - vdev->vbasedev.needs_reset = false; + assert(info_p && !*info_p); info = g_malloc0(sizeof(*info)); info->argsz = sizeof(*info); @@ -2482,24 +2473,53 @@ static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single) ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info); if (ret && errno != ENOSPC) { ret = -errno; + g_free(info); if (!vdev->has_pm_reset) { error_report("vfio: Cannot reset device %s, " "no available reset mechanism.", vdev->vbasedev.name); } - goto out_single; + return ret; } count = info->count; - info = g_realloc(info, sizeof(*info) + (count * sizeof(*devices))); - info->argsz = sizeof(*info) + (count * sizeof(*devices)); - devices = &info->devices[0]; + info = g_realloc(info, sizeof(*info) + (count * sizeof(info->devices[0]))); + info->argsz = sizeof(*info) + (count * sizeof(info->devices[0])); ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_PCI_HOT_RESET_INFO, info); if (ret) { ret = -errno; + g_free(info); error_report("vfio: hot reset info failed: %m"); + return ret; + } + + *info_p = info; + return 0; +} + +static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single) +{ + VFIOGroup *group; + struct vfio_pci_hot_reset_info *info = NULL; + struct vfio_pci_dependent_device *devices; + struct vfio_pci_hot_reset *reset; + int32_t *fds; + int ret, i, count; + bool multi = false; + + trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi"); + + if (!single) { + vfio_pci_pre_reset(vdev); + } + vdev->vbasedev.needs_reset = false; + + ret = vfio_pci_get_pci_hot_reset_info(vdev, &info); + + if (ret) { goto out_single; } + devices = &info->devices[0]; trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name); From patchwork Thu Oct 26 10:30:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437502 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 51A29C25B6B for ; Thu, 26 Oct 2023 10:52:12 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvE-000444-E6; Thu, 26 Oct 2023 06:49:04 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxv6-0003ik-S8 for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:57 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxv5-0001PR-0b for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:48:56 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317335; x=1729853335; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=R/+ewG8v978gJGlYUDJaC7WmoKKlcpiH18ocYc0hRUo=; b=Qr12zO6hiP/BfMWsgdCsRsCbu7CXt6URhyPlv16lYV43c0OalasYHYYG s3LS0R/f2a83YZsgraIVleufcSnAJvIQnbyAoN3QNWBGYSaq9OfioCoK9 bm/k3+zWNgxqy87y4yvC7OaTwstk2/hfRZFa/3AK8OwTwZ7wcjbeNPQPU s4nXlhRKIigyNdG/GVikTX/k85Y5un9gR7ownusw/J0FL3EXhB0y3rCLC AhmLctATrdx3paWLjehyVaCa/3obMAIyJCFDm6/z9w8U6nJkhIlJhcvf5 VQYPWyLIArsIKHVO9GBo/ee72CGy3r+rBquDoR6BgyJdS3qH0je7w+4p5 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563798" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563798" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:15 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463840" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:00 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 31/37] vfio/pci: Adapt vfio pci hot reset support with iommufd BE Date: Thu, 26 Oct 2023 18:30:58 +0800 Message-Id: <20231026103104.1686921-32-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org As pci hot reset path need to reference pci specific functions and data structures, adding container level callback functions for legacy and iommufd BE and referencing those pci specific func/data is no better than implementing reset support with iommufd BE directly in pci.c This way we can also share the common bus reset and system reset path for both BEs. Signed-off-by: Zhenzhong Duan --- hw/vfio/pci.c | 156 ++++++++++++++++++++++++++++++++++++++++++- hw/vfio/trace-events | 1 + 2 files changed, 156 insertions(+), 1 deletion(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index c17e1f4376..d7a41c8def 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -42,6 +42,7 @@ #include "qapi/error.h" #include "migration/blocker.h" #include "migration/qemu-file.h" +#include "linux/iommufd.h" #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" @@ -2497,7 +2498,7 @@ static int vfio_pci_get_pci_hot_reset_info(VFIOPCIDevice *vdev, return 0; } -static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single) +static int vfio_pci_hot_reset_legacy(VFIOPCIDevice *vdev, bool single) { VFIOGroup *group; struct vfio_pci_hot_reset_info *info = NULL; @@ -2661,6 +2662,159 @@ out_single: return ret; } +#ifdef CONFIG_IOMMUFD +static VFIODevice *vfio_pci_find_by_iommufd_devid(__u32 devid) +{ + VFIODevice *vbasedev_iter; + + QLIST_FOREACH(vbasedev_iter, &vfio_device_list, global_next) { + if (vbasedev_iter->bcontainer->ops != &vfio_iommufd_ops) { + continue; + } + if (devid == vbasedev_iter->devid) { + return vbasedev_iter; + } + } + return NULL; +} + +static int vfio_pci_hot_reset_iommufd(VFIOPCIDevice *vdev, bool single) +{ + struct vfio_pci_hot_reset_info *info = NULL; + struct vfio_pci_dependent_device *devices; + struct vfio_pci_hot_reset *reset; + int ret, i; + bool multi = false; + + trace_vfio_pci_hot_reset(vdev->vbasedev.name, single ? "one" : "multi"); + + if (!single) { + vfio_pci_pre_reset(vdev); + } + vdev->vbasedev.needs_reset = false; + + ret = vfio_pci_get_pci_hot_reset_info(vdev, &info); + + if (ret) { + goto out_single; + } + + assert(info->flags & VFIO_PCI_HOT_RESET_FLAG_DEV_ID); + + devices = &info->devices[0]; + + if (!(info->flags & VFIO_PCI_HOT_RESET_FLAG_DEV_ID_OWNED)) { + if (!vdev->has_pm_reset) { + for (i = 0; i < info->count; i++) { + if (devices[i].devid == VFIO_PCI_DEVID_NOT_OWNED) { + error_report("vfio: Cannot reset device %s, " + "depends on device %04x:%02x:%02x.%x " + "which is not owned.", + vdev->vbasedev.name, devices[i].segment, + devices[i].bus, PCI_SLOT(devices[i].devfn), + PCI_FUNC(devices[i].devfn)); + } + } + } + ret = -EPERM; + goto out_single; + } + + trace_vfio_pci_hot_reset_has_dep_devices(vdev->vbasedev.name); + + for (i = 0; i < info->count; i++) { + VFIOPCIDevice *tmp; + VFIODevice *vbasedev_iter; + + trace_vfio_pci_hot_reset_dep_devices_iommufd(devices[i].segment, + devices[i].bus, + PCI_SLOT(devices[i].devfn), + PCI_FUNC(devices[i].devfn), + devices[i].devid); + + /* + * If a VFIO cdev device is resettable, all the dependent devices + * are either bound to same iommufd or within same iommu_groups as + * one of the iommufd bound devices. + */ + assert(devices[i].devid != VFIO_PCI_DEVID_NOT_OWNED); + + if (devices[i].devid == vdev->vbasedev.devid || + devices[i].devid == VFIO_PCI_DEVID_OWNED) { + continue; + } + + vbasedev_iter = vfio_pci_find_by_iommufd_devid(devices[i].devid); + if (!vbasedev_iter || !vbasedev_iter->dev->realized || + vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) { + continue; + } + tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev); + if (single) { + ret = -EINVAL; + goto out_single; + } + vfio_pci_pre_reset(tmp); + tmp->vbasedev.needs_reset = false; + multi = true; + } + + if (!single && !multi) { + ret = -EINVAL; + goto out_single; + } + + /* Use zero length array for hot reset with iommufd backend */ + reset = g_malloc0(sizeof(*reset)); + reset->argsz = sizeof(*reset); + + /* Bus reset! */ + ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_PCI_HOT_RESET, reset); + g_free(reset); + + trace_vfio_pci_hot_reset_result(vdev->vbasedev.name, + ret ? strerror(errno) : "Success"); + + /* Re-enable INTx on affected devices */ + for (i = 0; i < info->count; i++) { + VFIOPCIDevice *tmp; + VFIODevice *vbasedev_iter; + + if (devices[i].devid == vdev->vbasedev.devid || + devices[i].devid == VFIO_PCI_DEVID_OWNED) { + continue; + } + + vbasedev_iter = vfio_pci_find_by_iommufd_devid(devices[i].devid); + if (!vbasedev_iter || !vbasedev_iter->dev->realized || + vbasedev_iter->type != VFIO_DEVICE_TYPE_PCI) { + continue; + } + tmp = container_of(vbasedev_iter, VFIOPCIDevice, vbasedev); + vfio_pci_post_reset(tmp); + } +out_single: + if (!single) { + vfio_pci_post_reset(vdev); + } + g_free(info); + + return ret; +} +#endif + +static int vfio_pci_hot_reset(VFIOPCIDevice *vdev, bool single) +{ +#ifdef CONFIG_IOMMUFD + if (vdev->vbasedev.iommufd) { + return vfio_pci_hot_reset_iommufd(vdev, single); + } else +#endif + { + return vfio_pci_hot_reset_legacy(vdev, single); + } +} + /* * We want to differentiate hot reset of multiple in-use devices vs hot reset * of a single in-use device. VFIO_DEVICE_RESET will already handle the case diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 9b180cf77c..71c5840636 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -34,6 +34,7 @@ vfio_check_af_flr(const char *name) "%s Supports FLR via AF cap" vfio_pci_hot_reset(const char *name, const char *type) " (%s) %s" vfio_pci_hot_reset_has_dep_devices(const char *name) "%s: hot reset dependent devices:" vfio_pci_hot_reset_dep_devices(int domain, int bus, int slot, int function, int group_id) "\t%04x:%02x:%02x.%x group %d" +vfio_pci_hot_reset_dep_devices_iommufd(int domain, int bus, int slot, int function, int dev_id) "\t%04x:%02x:%02x.%x devid %d" vfio_pci_hot_reset_result(const char *name, const char *result) "%s hot reset: %s" vfio_populate_device_config(const char *name, unsigned long size, unsigned long offset, unsigned long flags) "Device %s config:\n size: 0x%lx, offset: 0x%lx, flags: 0x%lx" vfio_populate_device_get_irq_info_failure(const char *errstr) "VFIO_DEVICE_GET_IRQ_INFO failure: %s" From patchwork Thu Oct 26 10:30:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437497 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D29F7C25B6F for ; Thu, 26 Oct 2023 10:51:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvI-0004C5-IG; Thu, 26 Oct 2023 06:49:08 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvG-000491-Vh for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:49:06 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvF-0001Q0-A1 for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:49:06 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317345; x=1729853345; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=6MR5WCo9v5L0Vdk2+6JvGxis64p5EfrrivNJ1aQxcac=; b=UqHnRX8s1sqG16bN6MKRtFvcWir+eLHmvVHNSdIguXMW6uUr/B4OYSAz d78JCTkAwqeeobmw7lvUj6/9Q9WOMuF/YS0rsesynHl7UgF2B2Gfau8yd LkOzJIoxIl8z6hB3B4vIxmjqLujrnCwvEbqnLuH5R6uWLr+J19MmKI+CK cg/wlLUamVEvnYt5iQcspDIE0icVhxNPliiu1ymlbeDDR60S5wwyjsEUM Hr+NBWMkYBvY9HpCHXmx6RzIOoepBkZVpVjR3aWKxrsCLSOVYJb/RxUku xWLGtXIGdlYraQOsWzzzYaiIEBFFvI/XNd7DLCFoiiFiDDSfnvaeGjmvz w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563813" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563813" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:18 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463846" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:04 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 32/37] vfio/pci: Allow the selection of a given iommu backend Date: Thu, 26 Oct 2023 18:30:59 +0800 Message-Id: <20231026103104.1686921-33-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Eric Auger Now we support two types of iommu backends, let's add the capability to select one of them. This depends on whether an iommufd object has been linked with the vfio-pci device: if the user wants to use the legacy backend, it shall not link the vfio-pci device with any iommufd object: -device vfio-pci,host=0000:02:00.0 This is called the legacy mode/backend. If the user wants to use the iommufd backend (/dev/iommu) it shall pass an iommufd object id in the vfio-pci device options: -object iommufd,id=iommufd0 -device vfio-pci,host=0000:02:00.0,iommufd=iommufd0 Suggested-by: Alex Williamson Signed-off-by: Eric Auger Signed-off-by: Yi Liu Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- hw/vfio/pci.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index d7a41c8def..386c08576a 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -43,6 +43,7 @@ #include "migration/blocker.h" #include "migration/qemu-file.h" #include "linux/iommufd.h" +#include "sysemu/iommufd.h" #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" @@ -3710,6 +3711,10 @@ static Property vfio_pci_dev_properties[] = { * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name), * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name), */ +#ifdef CONFIG_IOMMUFD + DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd, + TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), +#endif DEFINE_PROP_END_OF_LIST(), }; From patchwork Thu Oct 26 10:31:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437493 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 86DC9C25B70 for ; Thu, 26 Oct 2023 10:51:33 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvL-0004Hv-JY; Thu, 26 Oct 2023 06:49:11 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvJ-0004EJ-JP for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:49:09 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvH-0001PR-9a for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:49:09 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317347; x=1729853347; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=unz1VcERlRWpgvRCFPYWoFixGJk6XxMNh9+O79XWD54=; b=L2SXLZimAKpjM5Wxdsr26GoafZC2Litb3s6B/b2jhqQbeQ9LTkSjqt+Q FHs7WZ0WbEz7lR3A350jnxlpjNGYTwApzuRwwzZ6Ww0esH//krAkpQKwW MhE9eDIk0sGyKg34BwMLXGlLC8NFZPm4bUZDpL/VN6XLWNXioAEa76faG mD6ZJH0zCTWnrsZQMXYlY2ne9GC14j0S6EyXCQrlckXmZIUQa0fuo3S9h jH64bWMDCNISvF6jXSVyNH3KazrlFoFp5oxGD+WJPND40l7qYV0jVThf1 9U+24wvr5YSFFuerMvqWcTAH534arw22VQztDxmhdzSuvETqr1Q6TnhRF A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563829" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563829" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463854" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:07 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 33/37] vfio/pci: Make vfio cdev pre-openable by passing a file handle Date: Thu, 26 Oct 2023 18:31:00 +0800 Message-Id: <20231026103104.1686921-34-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This gives management tools like libvirt a chance to open the vfio cdev with privilege and pass FD to qemu. This way qemu never needs to have privilege to open a VFIO or iommu cdev node. Together with the earlier support of pre-opening /dev/iommu device, now we have full support of passing a vfio device to unprivileged qemu by management tool. This mode is no more considered for the legacy backend. So let's remove the "TODO" comment. Add a helper function vfio_device_get_name() to check fd and get device name, it will also be used by other vfio devices. There is no easy way to check if a device is mdev with FD passing, so fail the x-balloon-allowed check unconditionally in this case. There is also no easy way to get BDF as name with FD passing, so we fake a name by VFIO_FD[fd]. Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- include/hw/vfio/vfio-common.h | 1 + hw/vfio/helpers.c | 33 +++++++++++++++++++++++++++++ hw/vfio/iommufd.c | 12 +++++++---- hw/vfio/pci.c | 40 ++++++++++++++++++++++++----------- 4 files changed, 70 insertions(+), 16 deletions(-) diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 95afee7243..5c00699290 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -258,6 +258,7 @@ struct vfio_info_cap_header * vfio_get_device_info_cap(struct vfio_device_info *info, uint16_t id); struct vfio_info_cap_header * vfio_get_cap(void *ptr, uint32_t cap_offset, uint16_t id); +int vfio_device_get_name(VFIODevice *vbasedev, Error **errp); #endif bool vfio_migration_realize(VFIODevice *vbasedev, Error **errp); diff --git a/hw/vfio/helpers.c b/hw/vfio/helpers.c index 168847e7c5..044dbbc501 100644 --- a/hw/vfio/helpers.c +++ b/hw/vfio/helpers.c @@ -609,3 +609,36 @@ bool vfio_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type) return ret; } + +int vfio_device_get_name(VFIODevice *vbasedev, Error **errp) +{ + struct stat st; + + if (vbasedev->fd < 0) { + if (stat(vbasedev->sysfsdev, &st) < 0) { + error_setg_errno(errp, errno, "no such host device"); + error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->sysfsdev); + return -errno; + } + /* User may specify a name, e.g: VFIO platform device */ + if (!vbasedev->name) { + vbasedev->name = g_path_get_basename(vbasedev->sysfsdev); + } + } +#ifdef CONFIG_IOMMUFD + else { + if (!vbasedev->iommufd) { + error_setg(errp, "Use FD passing only with iommufd backend"); + return -EINVAL; + } + /* + * Give a name with fd so any function printing out vbasedev->name + * will not break. + */ + if (!vbasedev->name) { + vbasedev->name = g_strdup_printf("VFIO_FD%d", vbasedev->fd); + } + } +#endif + return 0; +} diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c index 18a09d7f5a..20df807d7c 100644 --- a/hw/vfio/iommufd.c +++ b/hw/vfio/iommufd.c @@ -405,11 +405,15 @@ static int iommufd_attach_device(char *name, VFIODevice *vbasedev, uint32_t ioas_id; Error *err = NULL; - devfd = vfio_get_devicefd(vbasedev->sysfsdev, errp); - if (devfd < 0) { - return devfd; + if (vbasedev->fd < 0) { + devfd = vfio_get_devicefd(vbasedev->sysfsdev, errp); + if (devfd < 0) { + return devfd; + } + vbasedev->fd = devfd; + } else { + devfd = vbasedev->fd; } - vbasedev->fd = devfd; ret = iommufd_connect_and_bind(vbasedev, errp); if (ret) { diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 386c08576a..32a4e8beb0 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -44,6 +44,7 @@ #include "migration/qemu-file.h" #include "linux/iommufd.h" #include "sysemu/iommufd.h" +#include "monitor/monitor.h" #define TYPE_VFIO_PCI_NOHOTPLUG "vfio-pci-nohotplug" @@ -3267,18 +3268,23 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) VFIODevice *vbasedev = &vdev->vbasedev; char *tmp, *subsys; Error *err = NULL; - struct stat st; int i, ret; bool is_mdev; char uuid[UUID_FMT_LEN]; char *name; - if (!vbasedev->sysfsdev) { + if (vbasedev->fd < 0 && !vbasedev->sysfsdev) { if (!(~vdev->host.domain || ~vdev->host.bus || ~vdev->host.slot || ~vdev->host.function)) { error_setg(errp, "No provided host device"); +#ifdef CONFIG_IOMMUFD + error_append_hint(errp, "Use -device vfio-pci,host=DDDD:BB:DD.F, " + "-device vfio-pci,sysfsdev=PATH_TO_DEVICE " + "or -device vfio-pci,fd=DEVICE_FD\n"); +#else error_append_hint(errp, "Use -device vfio-pci,host=DDDD:BB:DD.F " "or -device vfio-pci,sysfsdev=PATH_TO_DEVICE\n"); +#endif return; } vbasedev->sysfsdev = @@ -3287,13 +3293,9 @@ static void vfio_realize(PCIDevice *pdev, Error **errp) vdev->host.slot, vdev->host.function); } - if (stat(vbasedev->sysfsdev, &st) < 0) { - error_setg_errno(errp, errno, "no such host device"); - error_prepend(errp, VFIO_MSG_PREFIX, vbasedev->sysfsdev); + if (vfio_device_get_name(vbasedev, errp)) { return; } - - vbasedev->name = g_path_get_basename(vbasedev->sysfsdev); vbasedev->ops = &vfio_pci_ops; vbasedev->type = VFIO_DEVICE_TYPE_PCI; vbasedev->dev = DEVICE(vdev); @@ -3653,6 +3655,7 @@ static void vfio_instance_init(Object *obj) vdev->host.bus = ~0U; vdev->host.slot = ~0U; vdev->host.function = ~0U; + vdev->vbasedev.fd = -1; vdev->nv_gpudirect_clique = 0xFF; @@ -3706,11 +3709,6 @@ static Property vfio_pci_dev_properties[] = { qdev_prop_nv_gpudirect_clique, uint8_t), DEFINE_PROP_OFF_AUTO_PCIBAR("x-msix-relocation", VFIOPCIDevice, msix_relo, OFF_AUTOPCIBAR_OFF), - /* - * TODO - support passed fds... is this necessary? - * DEFINE_PROP_STRING("vfiofd", VFIOPCIDevice, vfiofd_name), - * DEFINE_PROP_STRING("vfiogroupfd, VFIOPCIDevice, vfiogroupfd_name), - */ #ifdef CONFIG_IOMMUFD DEFINE_PROP_LINK("iommufd", VFIOPCIDevice, vbasedev.iommufd, TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), @@ -3718,6 +3716,21 @@ static Property vfio_pci_dev_properties[] = { DEFINE_PROP_END_OF_LIST(), }; +#ifdef CONFIG_IOMMUFD +static void vfio_pci_set_fd(Object *obj, const char *str, Error **errp) +{ + VFIOPCIDevice *vdev = VFIO_PCI(obj); + int fd = -1; + + fd = monitor_fd_param(monitor_cur(), str, errp); + if (fd == -1) { + error_prepend(errp, "Could not parse remote object fd %s:", str); + return; + } + vdev->vbasedev.fd = fd; +} +#endif + static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -3725,6 +3738,9 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, void *data) dc->reset = vfio_pci_reset; device_class_set_props(dc, vfio_pci_dev_properties); +#ifdef CONFIG_IOMMUFD + object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd); +#endif dc->desc = "VFIO-based PCI device assignment"; set_bit(DEVICE_CATEGORY_MISC, dc->categories); pdc->realize = vfio_realize; From patchwork Thu Oct 26 10:31:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437482 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3D8A5C25B48 for ; Thu, 26 Oct 2023 10:49:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvb-0004kI-EZ; Thu, 26 Oct 2023 06:49:27 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvZ-0004hD-RP; Thu, 26 Oct 2023 06:49:25 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvX-0001Nv-Vm; Thu, 26 Oct 2023 06:49:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317364; x=1729853364; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=xAVnyoxQjXc6bTchK6lRpYFZlZKjJeNFs/piATVUzLM=; b=E17aznTY05Qed/gwfO9hhVaiSveKKLpEZvZOaZqKN82lLE31dmgBiDZu E6q5fDHZCT+EyCuiv5AJ+BHgw3sn1vNSsDYeQZVjXGlfnY8lsdudAV51l Nn5xPD0yRULYikAoyx1Vrzs403l9TEGe/klBh0JfqITxdXkYlTZYVGz/t OhxurImH996M8cT/MH8wqzi/SK5mD0zXNa+5jvBbLPuvLJuiVfkvo7H5j oWx3zKLaMW4dlLMDkUc7uu/Pu+gf0HZCJpjzWjk46jtOf5OJ8nr2pNWZF cfuEQScCZqiuweqHqqz0xUnXV66Ej+eU8vRH6O2dLfv9UcM94hl13PiPf w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563852" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563852" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:26 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463874" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:11 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Thomas Huth , Tony Krowiak , Halil Pasic , Jason Herne , Eric Farman , Matthew Rosato , qemu-s390x@nongnu.org (open list:S390 general arch...) Subject: [PATCH v3 34/37] vfio: Allow the selection of a given iommu backend for platform ap and ccw Date: Thu, 26 Oct 2023 18:31:01 +0800 Message-Id: <20231026103104.1686921-35-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Previously we added support to select iommu backend for vfio pci device. Now we added others, E.g: platform, ap and ccw. Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- include/hw/vfio/vfio-platform.h | 1 + hw/vfio/ap.c | 5 +++++ hw/vfio/ccw.c | 5 +++++ hw/vfio/platform.c | 4 ++++ 4 files changed, 15 insertions(+) diff --git a/include/hw/vfio/vfio-platform.h b/include/hw/vfio/vfio-platform.h index c414c3dffc..f57f4276f2 100644 --- a/include/hw/vfio/vfio-platform.h +++ b/include/hw/vfio/vfio-platform.h @@ -18,6 +18,7 @@ #include "hw/sysbus.h" #include "hw/vfio/vfio-common.h" +#include "sysemu/iommufd.h" #include "qemu/event_notifier.h" #include "qemu/queue.h" #include "qom/object.h" diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c index bbf69ff55a..6a4186ccd3 100644 --- a/hw/vfio/ap.c +++ b/hw/vfio/ap.c @@ -15,6 +15,7 @@ #include #include "qapi/error.h" #include "hw/vfio/vfio-common.h" +#include "sysemu/iommufd.h" #include "hw/s390x/ap-device.h" #include "qemu/error-report.h" #include "qemu/event_notifier.h" @@ -204,6 +205,10 @@ static void vfio_ap_unrealize(DeviceState *dev) static Property vfio_ap_properties[] = { DEFINE_PROP_STRING("sysfsdev", VFIOAPDevice, vdev.sysfsdev), +#ifdef CONFIG_IOMMUFD + DEFINE_PROP_LINK("iommufd", VFIOAPDevice, vdev.iommufd, + TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), +#endif DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index d857bb8d0f..7695ede0fc 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -21,6 +21,7 @@ #include "qapi/error.h" #include "hw/vfio/vfio-common.h" +#include "sysemu/iommufd.h" #include "hw/s390x/s390-ccw.h" #include "hw/s390x/vfio-ccw.h" #include "hw/qdev-properties.h" @@ -677,6 +678,10 @@ static void vfio_ccw_unrealize(DeviceState *dev) static Property vfio_ccw_properties[] = { DEFINE_PROP_STRING("sysfsdev", VFIOCCWDevice, vdev.sysfsdev), DEFINE_PROP_BOOL("force-orb-pfch", VFIOCCWDevice, force_orb_pfch, false), +#ifdef CONFIG_IOMMUFD + DEFINE_PROP_LINK("iommufd", VFIOCCWDevice, vdev.iommufd, + TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), +#endif DEFINE_PROP_END_OF_LIST(), }; diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index 8e3d4ac458..a1c25e0337 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -649,6 +649,10 @@ static Property vfio_platform_dev_properties[] = { DEFINE_PROP_UINT32("mmap-timeout-ms", VFIOPlatformDevice, mmap_timeout, 1100), DEFINE_PROP_BOOL("x-irqfd", VFIOPlatformDevice, irqfd_allowed, true), +#ifdef CONFIG_IOMMUFD + DEFINE_PROP_LINK("iommufd", VFIOPlatformDevice, vbasedev.iommufd, + TYPE_IOMMUFD_BACKEND, IOMMUFDBackend *), +#endif DEFINE_PROP_END_OF_LIST(), }; From patchwork Thu Oct 26 10:31:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437492 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DC37AC25B6B for ; Thu, 26 Oct 2023 10:51:32 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvU-0004WU-C1; Thu, 26 Oct 2023 06:49:20 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvT-0004Uu-9v for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:49:19 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvR-0001Q0-9W for qemu-devel@nongnu.org; Thu, 26 Oct 2023 06:49:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317357; x=1729853357; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=sbnxqcYPWs3o7QPAjjVFtMUDEodelFH40axz2Jl59K0=; b=WbHKjVQofRwpWYw6xWWPGnVzLRilNSdFCSPt8Oub68wyoRQwAEVBp2No HZv7nN4z8gbUzoHrAXx0hCICZaZ/mV14TCl7C4LQFO2P0kg4Nw8TUt8V6 lZ62f45Ho+1DwhKYgSTAC4tXm7LEg8bdvAwTi1SjVZy/R+o8SCWe37q2U 2A6jEb4BLmH/UTuKCFirf3Tq75r37QZ5RnMFkJZ0SvA/f6Su6n3Nbwrwa qdLhYs+BG+XPoVoemA/f+Sgh/kRBjdn6xMERbUlZcEt5BKF33mrpSfhgE 3i4w73s6A+xg9h9EorYN6KsNUAjr6A7JgHuFBYFlch6l+gX+RWDncbRaC A==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563871" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563871" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:30 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463884" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:16 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan Subject: [PATCH v3 35/37] vfio/platform: Make vfio cdev pre-openable by passing a file handle Date: Thu, 26 Oct 2023 18:31:02 +0800 Message-Id: <20231026103104.1686921-36-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This gives management tools like libvirt a chance to open the vfio cdev with privilege and pass FD to qemu. This way qemu never needs to have privilege to open a VFIO or iommu cdev node. Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- hw/vfio/platform.c | 41 +++++++++++++++++++++++++++++++++-------- 1 file changed, 33 insertions(+), 8 deletions(-) diff --git a/hw/vfio/platform.c b/hw/vfio/platform.c index a1c25e0337..aa0b2b9583 100644 --- a/hw/vfio/platform.c +++ b/hw/vfio/platform.c @@ -35,6 +35,7 @@ #include "hw/platform-bus.h" #include "hw/qdev-properties.h" #include "sysemu/kvm.h" +#include "monitor/monitor.h" /* * Functions used whatever the injection method @@ -529,14 +530,13 @@ static VFIODeviceOps vfio_platform_ops = { */ static int vfio_base_device_init(VFIODevice *vbasedev, Error **errp) { - struct stat st; int ret; - /* @sysfsdev takes precedence over @host */ - if (vbasedev->sysfsdev) { + /* @fd takes precedence over @sysfsdev which takes precedence over @host */ + if (vbasedev->fd < 0 && vbasedev->sysfsdev) { g_free(vbasedev->name); vbasedev->name = g_path_get_basename(vbasedev->sysfsdev); - } else { + } else if (vbasedev->fd < 0) { if (!vbasedev->name || strchr(vbasedev->name, '/')) { error_setg(errp, "wrong host device name"); return -EINVAL; @@ -546,10 +546,9 @@ static int vfio_base_device_init(VFIODevice *vbasedev, Error **errp) vbasedev->name); } - if (stat(vbasedev->sysfsdev, &st) < 0) { - error_setg_errno(errp, errno, - "failed to get the sysfs host device file status"); - return -errno; + ret = vfio_device_get_name(vbasedev, errp); + if (ret) { + return ret; } ret = vfio_attach_device(vbasedev->name, vbasedev, @@ -656,6 +655,28 @@ static Property vfio_platform_dev_properties[] = { DEFINE_PROP_END_OF_LIST(), }; +static void vfio_platform_instance_init(Object *obj) +{ + VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(obj); + + vdev->vbasedev.fd = -1; +} + +#ifdef CONFIG_IOMMUFD +static void vfio_platform_set_fd(Object *obj, const char *str, Error **errp) +{ + VFIOPlatformDevice *vdev = VFIO_PLATFORM_DEVICE(obj); + int fd = -1; + + fd = monitor_fd_param(monitor_cur(), str, errp); + if (fd == -1) { + error_prepend(errp, "Could not parse remote object fd %s:", str); + return; + } + vdev->vbasedev.fd = fd; +} +#endif + static void vfio_platform_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); @@ -663,6 +684,9 @@ static void vfio_platform_class_init(ObjectClass *klass, void *data) dc->realize = vfio_platform_realize; device_class_set_props(dc, vfio_platform_dev_properties); +#ifdef CONFIG_IOMMUFD + object_class_property_add_str(klass, "fd", NULL, vfio_platform_set_fd); +#endif dc->vmsd = &vfio_platform_vmstate; dc->desc = "VFIO-based platform device assignment"; sbc->connect_irq_notifier = vfio_start_irqfd_injection; @@ -675,6 +699,7 @@ static const TypeInfo vfio_platform_dev_info = { .name = TYPE_VFIO_PLATFORM, .parent = TYPE_SYS_BUS_DEVICE, .instance_size = sizeof(VFIOPlatformDevice), + .instance_init = vfio_platform_instance_init, .class_init = vfio_platform_class_init, .class_size = sizeof(VFIOPlatformDeviceClass), }; From patchwork Thu Oct 26 10:31:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437483 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9B754C25B6B for ; Thu, 26 Oct 2023 10:49:58 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvh-0004zW-VG; Thu, 26 Oct 2023 06:49:34 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvg-0004vc-Bn; Thu, 26 Oct 2023 06:49:32 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxve-0001PR-IC; Thu, 26 Oct 2023 06:49:32 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317370; x=1729853370; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=0DEzYUesEZNzx3C3dtdSO+Zao1RRZKEyYyRWPcjm4IA=; b=BceY/jUS54DAnUzu8BYhwk1wnPM9uL7BCA0EBHgdEcvoVMP5goDXJMd8 ZYwc3jQwlPXV3x9Rp5sro33zaZbF/CmizRz9gmZ2GT4A2yK/vZJ5Y+9QW wzzNXGwE79C2+iLN/btG9Un6MZkgx1SKwMUidGZ3kisOXLX+xXxYR3sra Cn1THIhJna8dgDDMbcTcel3MkTR/j8pjk4A1UFuWJMyE5DtkEH+5BpMGw 9zcQeqPGtdrP21qgtIxtlmOAeMJqCgODeITRqZ0f7hLAoI/yYfSrQrrFY JdiY7eLrBaMUPmbj1G8001egYO4LD0LCa+tXl4/dtkP4gMhwZXiKSq1PH w==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563893" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563893" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:35 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463892" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:19 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Thomas Huth , Tony Krowiak , Halil Pasic , Jason Herne , qemu-s390x@nongnu.org (open list:S390 general arch...) Subject: [PATCH v3 36/37] vfio/ap: Make vfio cdev pre-openable by passing a file handle Date: Thu, 26 Oct 2023 18:31:03 +0800 Message-Id: <20231026103104.1686921-37-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This gives management tools like libvirt a chance to open the vfio cdev with privilege and pass FD to qemu. This way qemu never needs to have privilege to open a VFIO or iommu cdev node. Opportunisticly, remove some unnecessory double-cast. Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- hw/vfio/ap.c | 32 +++++++++++++++++++++++++++++++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/hw/vfio/ap.c b/hw/vfio/ap.c index 6a4186ccd3..0a810f8b88 100644 --- a/hw/vfio/ap.c +++ b/hw/vfio/ap.c @@ -29,6 +29,7 @@ #include "hw/s390x/ap-bridge.h" #include "exec/address-spaces.h" #include "qom/object.h" +#include "monitor/monitor.h" #define TYPE_VFIO_AP_DEVICE "vfio-ap" @@ -159,7 +160,10 @@ static void vfio_ap_realize(DeviceState *dev, Error **errp) VFIOAPDevice *vapdev = VFIO_AP_DEVICE(dev); VFIODevice *vbasedev = &vapdev->vdev; - vbasedev->name = g_path_get_basename(vbasedev->sysfsdev); + if (vfio_device_get_name(vbasedev, errp)) { + return; + } + vbasedev->ops = &vfio_ap_ops; vbasedev->type = VFIO_DEVICE_TYPE_AP; vbasedev->dev = dev; @@ -229,11 +233,36 @@ static const VMStateDescription vfio_ap_vmstate = { .unmigratable = 1, }; +static void vfio_ap_instance_init(Object *obj) +{ + VFIOAPDevice *vapdev = VFIO_AP_DEVICE(obj); + + vapdev->vdev.fd = -1; +} + +#ifdef CONFIG_IOMMUFD +static void vfio_ap_set_fd(Object *obj, const char *str, Error **errp) +{ + VFIOAPDevice *vapdev = VFIO_AP_DEVICE(obj); + int fd = -1; + + fd = monitor_fd_param(monitor_cur(), str, errp); + if (fd == -1) { + error_prepend(errp, "Could not parse remote object fd %s:", str); + return; + } + vapdev->vdev.fd = fd; +} +#endif + static void vfio_ap_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); device_class_set_props(dc, vfio_ap_properties); +#ifdef CONFIG_IOMMUFD + object_class_property_add_str(klass, "fd", NULL, vfio_ap_set_fd); +#endif dc->vmsd = &vfio_ap_vmstate; dc->desc = "VFIO-based AP device assignment"; set_bit(DEVICE_CATEGORY_MISC, dc->categories); @@ -248,6 +277,7 @@ static const TypeInfo vfio_ap_info = { .name = TYPE_VFIO_AP_DEVICE, .parent = TYPE_AP_DEVICE, .instance_size = sizeof(VFIOAPDevice), + .instance_init = vfio_ap_instance_init, .class_init = vfio_ap_class_init, }; From patchwork Thu Oct 26 10:31:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Duan, Zhenzhong" X-Patchwork-Id: 13437504 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B6CEBC25B6B for ; Thu, 26 Oct 2023 10:53:08 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qvxvr-0006BO-7K; Thu, 26 Oct 2023 06:49:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvp-0005zG-Qr; Thu, 26 Oct 2023 06:49:41 -0400 Received: from mgamail.intel.com ([134.134.136.126]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qvxvo-0001Q0-3k; Thu, 26 Oct 2023 06:49:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1698317380; x=1729853380; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=LCQxX27B9hAKu3fvfHgufolvqB8nmhm4GP+TjWb+FHI=; b=MhrKiyoA5n7L6Ugb7Vh4WThPmLJ7Oljh+Z71DHtGn+fgCdVZINiXJwso Mmf53aXh4o+nINCZ6M9I10QpSKdmmtF8WYP9WrRjdnzZMB7yOjKM8jxPR 0d3G0bpm2d5JhgDQ765huAaUJ+LFlW+992oBlvxJdYLDR3idjjJc+yE4O HRBZuhh0/U3V1YeFqodtdXBgMOjjg/Dg3bC6Ui4VTKRSW4zYgDHCTM8CN 9CDm7vJOSlLNoRBVZITE9MkjdpPShQXwRwKRoAhemt6KvFU9G/HKylJVY jKDtxniGuV+whgBZNhHWmahNmx65ZF6EN6wX8nxZpqrFCZVQeUQnHa6zD Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10874"; a="372563902" X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="372563902" Received: from fmviesa002.fm.intel.com ([10.60.135.142]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.03,253,1694761200"; d="scan'208";a="463903" Received: from duan-server-s2600bt.bj.intel.com ([10.240.192.147]) by fmviesa002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2023 03:48:24 -0700 From: Zhenzhong Duan To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, jgg@nvidia.com, nicolinc@nvidia.com, joao.m.martins@oracle.com, eric.auger@redhat.com, peterx@redhat.com, jasowang@redhat.com, kevin.tian@intel.com, yi.l.liu@intel.com, yi.y.sun@intel.com, chao.p.peng@intel.com, Zhenzhong Duan , Eric Farman , Matthew Rosato , Thomas Huth , qemu-s390x@nongnu.org (open list:vfio-ccw) Subject: [PATCH v3 37/37] vfio/ccw: Make vfio cdev pre-openable by passing a file handle Date: Thu, 26 Oct 2023 18:31:04 +0800 Message-Id: <20231026103104.1686921-38-zhenzhong.duan@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231026103104.1686921-1-zhenzhong.duan@intel.com> References: <20231026103104.1686921-1-zhenzhong.duan@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.126; envelope-from=zhenzhong.duan@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org This gives management tools like libvirt a chance to open the vfio cdev with privilege and pass FD to qemu. This way qemu never needs to have privilege to open a VFIO or iommu cdev node. Opportunisticly, remove a redundant definition of TYPE_VFIO_CCW. Signed-off-by: Zhenzhong Duan Signed-off-by: Cédric Le Goater --- hw/vfio/ccw.c | 34 +++++++++++++++++++++++++++++++--- 1 file changed, 31 insertions(+), 3 deletions(-) diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c index 7695ede0fc..a674bd8d6d 100644 --- a/hw/vfio/ccw.c +++ b/hw/vfio/ccw.c @@ -30,6 +30,7 @@ #include "qemu/error-report.h" #include "qemu/main-loop.h" #include "qemu/module.h" +#include "monitor/monitor.h" struct VFIOCCWDevice { S390CCWDevice cdev; @@ -589,11 +590,12 @@ static void vfio_ccw_realize(DeviceState *dev, Error **errp) } } + if (vfio_device_get_name(vbasedev, errp)) { + return; + } + vbasedev->ops = &vfio_ccw_ops; vbasedev->type = VFIO_DEVICE_TYPE_CCW; - vbasedev->name = g_strdup_printf("%x.%x.%04x", vcdev->cdev.hostid.cssid, - vcdev->cdev.hostid.ssid, - vcdev->cdev.hostid.devid); vbasedev->dev = dev; /* @@ -690,12 +692,37 @@ static const VMStateDescription vfio_ccw_vmstate = { .unmigratable = 1, }; +static void vfio_ccw_instance_init(Object *obj) +{ + VFIOCCWDevice *vcdev = VFIO_CCW(obj); + + vcdev->vdev.fd = -1; +} + +#ifdef CONFIG_IOMMUFD +static void vfio_ccw_set_fd(Object *obj, const char *str, Error **errp) +{ + VFIOCCWDevice *vcdev = VFIO_CCW(obj); + int fd = -1; + + fd = monitor_fd_param(monitor_cur(), str, errp); + if (fd == -1) { + error_prepend(errp, "Could not parse remote object fd %s:", str); + return; + } + vcdev->vdev.fd = fd; +} +#endif + static void vfio_ccw_class_init(ObjectClass *klass, void *data) { DeviceClass *dc = DEVICE_CLASS(klass); S390CCWDeviceClass *cdc = S390_CCW_DEVICE_CLASS(klass); device_class_set_props(dc, vfio_ccw_properties); +#ifdef CONFIG_IOMMUFD + object_class_property_add_str(klass, "fd", NULL, vfio_ccw_set_fd); +#endif dc->vmsd = &vfio_ccw_vmstate; dc->desc = "VFIO-based subchannel assignment"; set_bit(DEVICE_CATEGORY_MISC, dc->categories); @@ -713,6 +740,7 @@ static const TypeInfo vfio_ccw_info = { .name = TYPE_VFIO_CCW, .parent = TYPE_S390_CCW, .instance_size = sizeof(VFIOCCWDevice), + .instance_init = vfio_ccw_instance_init, .class_init = vfio_ccw_class_init, };