From patchwork Thu Jun 13 13:50:22 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Teddy Astie X-Patchwork-Id: 13696864 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5C858C27C6E for ; Thu, 13 Jun 2024 13:50:45 +0000 (UTC) Received: from list by lists.xenproject.org with outflank-mailman.739992.1146985 (Exim 4.92) (envelope-from ) id 1sHkqR-0007Pi-U8; Thu, 13 Jun 2024 13:50:27 +0000 X-Outflank-Mailman: Message body and most headers restored to incoming version Received: by outflank-mailman (output) from mailman id 739992.1146985; Thu, 13 Jun 2024 13:50:27 +0000 Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sHkqR-0007Pb-RT; Thu, 13 Jun 2024 13:50:27 +0000 Received: by outflank-mailman (input) for mailman id 739992; Thu, 13 Jun 2024 13:50:26 +0000 Received: from se1-gles-sth1-in.inumbo.com ([159.253.27.254] helo=se1-gles-sth1.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.92) (envelope-from ) id 1sHkqQ-0007PV-Bp for xen-devel@lists.xenproject.org; Thu, 13 Jun 2024 13:50:26 +0000 Received: from mail177-18.suw61.mandrillapp.com (mail177-18.suw61.mandrillapp.com [198.2.177.18]) by se1-gles-sth1.inumbo.com (Halon) with ESMTPS id e196fc1f-298b-11ef-90a3-e314d9c70b13; Thu, 13 Jun 2024 15:50:24 +0200 (CEST) Received: from pmta14.mandrill.prod.suw01.rsglab.com (localhost [127.0.0.1]) by mail177-18.suw61.mandrillapp.com (Mailchimp) with ESMTP id 4W0P1V5l3fzCf9KVH for ; Thu, 13 Jun 2024 13:50:22 +0000 (GMT) Received: from [37.26.189.201] by mandrillapp.com id c40a10789c0a4bc7a37d42d048579bcb; Thu, 13 Jun 2024 13:50:22 
+0000 X-BeenThere: xen-devel@lists.xenproject.org List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" X-Inumbo-ID: e196fc1f-298b-11ef-90a3-e314d9c70b13 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=mandrillapp.com; s=mte1; t=1718286622; x=1718547122; bh=yOp6n6ZWWUNSdC8h4VQyGHPUzOnfugZqh72W0IfKJNE=; h=From:Subject:To:Cc:Message-Id:Feedback-ID:Date:MIME-Version: Content-Type:Content-Transfer-Encoding:CC:Date:Subject:From; b=bluulqKV2uhDQ3h6rzdi6g/7DS4FjkAGcgjElY3Q58ME0EcBv0/QSPphXCDKAFh++ dhezZMDMNtebajEQNAIQc8t5vWKBCNsgBDnRJsMbS64hlsB7Gr4pncb4uogYHQ0NJo C/JoAwxeWWYO4cz3YMgYL9N9L38WdBAOsuwKuJhibWC/Jgtkhqq1ZKavk8R0N2SaYx Nvmwhy4LvgnDvaPSoS7JJQN74ZZO2DXseJBS/af8RODzff+vgu2hNsP8zcv7qqeTUb kboK836RIq7aEBjyW/V/B/dtfjZLNmN6kdPA+3J84zx7sss1JQK3Qhlk4JO+jMfC7D 4UOR+E2EM3LSA== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vates.tech; s=mte1; t=1718286622; x=1718547122; i=teddy.astie@vates.tech; bh=yOp6n6ZWWUNSdC8h4VQyGHPUzOnfugZqh72W0IfKJNE=; h=From:Subject:To:Cc:Message-Id:Feedback-ID:Date:MIME-Version: Content-Type:Content-Transfer-Encoding:CC:Date:Subject:From; b=LYZGrVAkhwyV/dBB4cixg2j3YKNcjPWpHllqgD+1Gv+j3sTcq7MnrxNQGfusVnntG pd1nndbX8TudUjijsqS5d9DCof+h9T/rQrysVwpnW4nvcTXcNTqhkFGtt0CT/jTKD2 8g8axTGdyd0TvXWf6kKjp0SU58KKoMBVDUA9XjZnM3K7P0u6VnQLkAFDw8DqL3vZHE F6XWy0flnjoSUTy+Nr1f6bDEojrN7H1dbI5GEj7aQD9xOls2yfGI+d33r0vgEkoBw9 MAV1DdHC5F22q2Grk8u3MDeQx+RGzC9QYk8BJHf0BHfGKvw9WfTIymp7CG6p4xcQkN WHR8fhBPhEG9w== From: Teddy Astie Subject: =?utf-8?q?=5BRFC_PATCH=5D_iommu/xen=3A_Add_Xen_PV-IOMMU_driver?= X-Mailer: git-send-email 2.45.2 X-Bm-Disclaimer: Yes X-Bm-Milter-Handled: 4ffbd6c1-ee69-4e1b-aabd-f977039bd3e2 X-Bm-Transport-Timestamp: 1718286620009 To: xen-devel@lists.xenproject.org, iommu@lists.linux.dev Cc: Teddy Astie , Juergen Gross , Stefano Stabellini , Oleksandr Tyshchenko , Joerg Roedel , Will Deacon , Robin Murphy , 
=?utf-8?q?Marek_Marczykowski-G=C3=B3re?= =?utf-8?q?cki?= Message-Id: X-Native-Encoded: 1 X-Report-Abuse: =?utf-8?q?Please_forward_a_copy_of_this_message=2C_including?= =?utf-8?q?_all_headers=2C_to_abuse=40mandrill=2Ecom=2E_You_can_also_report_?= =?utf-8?q?abuse_here=3A_https=3A//mandrillapp=2Ecom/contact/abuse=3Fid=3D30?= =?utf-8?q?504962=2Ec40a10789c0a4bc7a37d42d048579bcb?= X-Mandrill-User: md_30504962 Feedback-ID: 30504962:30504962.20240613:md Date: Thu, 13 Jun 2024 13:50:22 +0000 MIME-Version: 1.0 In the context of Xen, Linux runs as Dom0 and doesn't have access to the machine IOMMU. However, an IOMMU is mandatory to use some kernel features such as VFIO or DMA protection. In Xen, we added a paravirtualized IOMMU with the iommu_op hypercall in order to allow Dom0 to implement such features. This commit introduces a new IOMMU driver that uses this new hypercall interface. Signed-off-by: Teddy Astie --- arch/x86/include/asm/xen/hypercall.h | 6 + drivers/iommu/Kconfig | 9 + drivers/iommu/Makefile | 1 + drivers/iommu/xen-iommu.c | 508 +++++++++++++++++++++++++++ include/xen/interface/memory.h | 33 ++ include/xen/interface/pv-iommu.h | 114 ++++++ include/xen/interface/xen.h | 1 + 7 files changed, 672 insertions(+) create mode 100644 drivers/iommu/xen-iommu.c create mode 100644 include/xen/interface/pv-iommu.h diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h index a2dd24947eb8..6b1857f27c14 100644 --- a/arch/x86/include/asm/xen/hypercall.h +++ b/arch/x86/include/asm/xen/hypercall.h @@ -490,6 +490,12 @@ HYPERVISOR_xenpmu_op(unsigned int op, void *arg) return _hypercall2(int, xenpmu_op, op, arg); } +static inline int +HYPERVISOR_iommu_op(void *arg) +{ + return _hypercall1(int, iommu_op, arg); +} + static inline int HYPERVISOR_dm_op( domid_t dom, unsigned int nr_bufs, struct xen_dm_op_buf *bufs) diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig index 2b12b583ef4b..8d8a22b91e34 100644 --- a/drivers/iommu/Kconfig +++ 
b/drivers/iommu/Kconfig @@ -482,6 +482,15 @@ config VIRTIO_IOMMU Say Y here if you intend to run this kernel as a guest. +config XEN_IOMMU + bool "Xen IOMMU driver" + depends on XEN_DOM0 + select IOMMU_API + help + Xen PV-IOMMU driver for Dom0. + + Say Y here if you intend to run this guest as Xen Dom0. + config SPRD_IOMMU tristate "Unisoc IOMMU Support" depends on ARCH_SPRD || COMPILE_TEST diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile index 769e43d780ce..11fa258d3a04 100644 --- a/drivers/iommu/Makefile +++ b/drivers/iommu/Makefile @@ -30,3 +30,4 @@ obj-$(CONFIG_VIRTIO_IOMMU) += virtio-iommu.o obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o io-pgfault.o obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o obj-$(CONFIG_APPLE_DART) += apple-dart.o +obj-$(CONFIG_XEN_IOMMU) += xen-iommu.o \ No newline at end of file diff --git a/drivers/iommu/xen-iommu.c b/drivers/iommu/xen-iommu.c new file mode 100644 index 000000000000..2c8e42240a6b --- /dev/null +++ b/drivers/iommu/xen-iommu.c @@ -0,0 +1,508 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Xen PV-IOMMU driver. 
+ * + * Copyright (C) 2024 Vates SAS + * + * Author: Teddy Astie + * + */ + +#define pr_fmt(fmt) "xen-iommu: " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include + +MODULE_DESCRIPTION("Xen IOMMU driver"); +MODULE_AUTHOR("Teddy Astie "); +MODULE_LICENSE("GPL"); + +#define MSI_RANGE_START (0xfee00000) +#define MSI_RANGE_END (0xfeefffff) + +#define XEN_IOMMU_PGSIZES (0x1000) + +struct xen_iommu_domain { + struct iommu_domain domain; + + u16 ctx_no; /* Xen PV-IOMMU context number */ +}; + +static struct iommu_device xen_iommu_device; + +static uint32_t max_nr_pages; +static uint64_t max_iova_addr; + +static spinlock_t lock; + +static inline struct xen_iommu_domain *to_xen_iommu_domain(struct iommu_domain *dom) +{ + return container_of(dom, struct xen_iommu_domain, domain); +} + +static inline u64 addr_to_pfn(u64 addr) +{ + return addr >> 12; +} + +static inline u64 pfn_to_addr(u64 pfn) +{ + return pfn << 12; +} + +bool xen_iommu_capable(struct device *dev, enum iommu_cap cap) +{ + switch (cap) { + case IOMMU_CAP_CACHE_COHERENCY: + return true; + + default: + return false; + } +} + +struct iommu_domain *xen_iommu_domain_alloc(unsigned type) +{ + struct xen_iommu_domain *domain; + u16 ctx_no; + int ret; + + if (type & IOMMU_DOMAIN_IDENTITY) { + /* use default domain */ + ctx_no = 0; + } else { + struct pv_iommu_op op = { + .ctx_no = 0, + .flags = 0, + .subop_id = IOMMUOP_alloc_context + }; + + ret = HYPERVISOR_iommu_op(&op); + + if (ret) { + pr_err("Unable to create Xen IOMMU context (%d)", ret); + return ERR_PTR(ret); + } + + ctx_no = op.ctx_no; + } + + domain = kzalloc(sizeof(*domain), GFP_KERNEL); + + domain->ctx_no = ctx_no; + + domain->domain.geometry.aperture_start = 0; + + domain->domain.geometry.aperture_end = max_iova_addr; + domain->domain.geometry.force_aperture = 
true; + + return &domain->domain; +} + +static struct iommu_group *xen_iommu_device_group(struct device *dev) +{ + if (!dev_is_pci(dev)) + return ERR_PTR(-ENODEV); + + return pci_device_group(dev); +} + +static struct iommu_device *xen_iommu_probe_device(struct device *dev) +{ + if (!dev_is_pci(dev)) + return ERR_PTR(-ENODEV); + + return &xen_iommu_device; +} + +static void xen_iommu_probe_finalize(struct device *dev) +{ + set_dma_ops(dev, NULL); + iommu_setup_dma_ops(dev, 0, max_iova_addr); +} + +static void xen_iommu_release_device(struct device *dev) +{ + int ret; + struct pci_dev *pdev; + struct pv_iommu_op op = { + .subop_id = IOMMUOP_reattach_device, + .flags = 0, + .ctx_no = 0 /* reattach device back to default context */ + }; + + if (!dev_is_pci(dev)) + return; + + pdev = to_pci_dev(dev); + + op.reattach_device.dev.seg = pci_domain_nr(pdev->bus); + op.reattach_device.dev.bus = pdev->bus->number; + op.reattach_device.dev.devfn = pdev->devfn; + + ret = HYPERVISOR_iommu_op(&op); + + if (ret) + pr_warn("Unable to release device %p\n", &op.reattach_device.dev); +} + +static int xen_iommu_map_pages(struct iommu_domain *domain, unsigned long iova, + phys_addr_t paddr, size_t pgsize, size_t pgcount, + int prot, gfp_t gfp, size_t *mapped) +{ + size_t xen_pg_count = (pgsize / XEN_PAGE_SIZE) * pgcount; + struct xen_iommu_domain *dom = to_xen_iommu_domain(domain); + struct pv_iommu_op op = { + .subop_id = IOMMUOP_map_pages, + .flags = 0, + .ctx_no = dom->ctx_no + }; + /* NOTE: paddr is actually bound to pfn, not gfn */ + uint64_t pfn = addr_to_pfn(paddr); + uint64_t dfn = addr_to_pfn(iova); + int ret = 0; + + if (WARN(!dom->ctx_no, "Tried to map page to default context")) + return -EINVAL; + + //pr_info("Mapping to %lx %zu %zu paddr %x\n", iova, pgsize, pgcount, paddr); + + if (prot & IOMMU_READ) + op.flags |= IOMMU_OP_readable; + + if (prot & IOMMU_WRITE) + op.flags |= IOMMU_OP_writeable; + + while (xen_pg_count) { + size_t to_map = min(xen_pg_count, max_nr_pages); + 
uint64_t gfn = pfn_to_gfn(pfn); + + //pr_info("Mapping %lx-%lx at %lx-%lx\n", gfn, gfn + to_map - 1, dfn, dfn + to_map - 1); + + op.map_pages.gfn = gfn; + op.map_pages.dfn = dfn; + + op.map_pages.nr_pages = to_map; + + ret = HYPERVISOR_iommu_op(&op); + + //pr_info("map_pages.mapped = %u\n", op.map_pages.mapped); + + if (mapped) + *mapped += XEN_PAGE_SIZE * op.map_pages.mapped; + + if (ret) + break; + + xen_pg_count -= to_map; + + pfn += to_map; + dfn += to_map; + } + + return ret; +} + +static size_t xen_iommu_unmap_pages(struct iommu_domain *domain, unsigned long iova, + size_t pgsize, size_t pgcount, + struct iommu_iotlb_gather *iotlb_gather) +{ + size_t xen_pg_count = (pgsize / XEN_PAGE_SIZE) * pgcount; + struct xen_iommu_domain *dom = to_xen_iommu_domain(domain); + struct pv_iommu_op op = { + .subop_id = IOMMUOP_unmap_pages, + .ctx_no = dom->ctx_no, + .flags = 0, + }; + uint64_t dfn = addr_to_pfn(iova); + int ret = 0; + + if (WARN(!dom->ctx_no, "Tried to unmap page from default context")) + return 0; /* size_t return value: report that nothing was unmapped */ + + while (xen_pg_count) { + size_t to_unmap = min(xen_pg_count, max_nr_pages); + + //pr_info("Unmapping %lx-%lx\n", dfn, dfn + to_unmap - 1); + + op.unmap_pages.dfn = dfn; + op.unmap_pages.nr_pages = to_unmap; + + ret = HYPERVISOR_iommu_op(&op); + + if (ret) + pr_warn("Unmap failure (%llx-%llx)\n", dfn, dfn + to_unmap - 1); + + xen_pg_count -= to_unmap; + + dfn += to_unmap; + } + + return pgcount * pgsize; +} + +int xen_iommu_attach_dev(struct iommu_domain *domain, struct device *dev) +{ + struct pci_dev *pdev; + struct xen_iommu_domain *dom = to_xen_iommu_domain(domain); + struct pv_iommu_op op = { + .subop_id = IOMMUOP_reattach_device, + .flags = 0, + .ctx_no = dom->ctx_no, + }; + + if (!dev_is_pci(dev)) + return -EINVAL; + + pdev = to_pci_dev(dev); + + op.reattach_device.dev.seg = pci_domain_nr(pdev->bus); + op.reattach_device.dev.bus = pdev->bus->number; + op.reattach_device.dev.devfn = pdev->devfn; + + return HYPERVISOR_iommu_op(&op); +} + +static void 
xen_iommu_free(struct iommu_domain *domain) +{ + int ret; + struct xen_iommu_domain *dom = to_xen_iommu_domain(domain); + + if (dom->ctx_no != 0) { + struct pv_iommu_op op = { + .ctx_no = dom->ctx_no, + .flags = 0, + .subop_id = IOMMUOP_free_context + }; + + ret = HYPERVISOR_iommu_op(&op); + + if (ret) + pr_err("Context %hu destruction failure\n", dom->ctx_no); + } + + kfree(domain); +} + +static phys_addr_t xen_iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova) +{ + int ret; + struct xen_iommu_domain *dom = to_xen_iommu_domain(domain); + + struct pv_iommu_op op = { + .ctx_no = dom->ctx_no, + .flags = 0, + .subop_id = IOMMUOP_lookup_page, + }; + + op.lookup_page.dfn = addr_to_pfn(iova); + + ret = HYPERVISOR_iommu_op(&op); + + if (ret) + return 0; + + phys_addr_t page_addr = pfn_to_addr(gfn_to_pfn(op.lookup_page.gfn)); + + /* Consider non-aligned iova */ + return page_addr + (iova & 0xFFF); +} + +static void xen_iommu_get_resv_regions(struct device *dev, struct list_head *head) +{ + struct iommu_resv_region *reg; + struct xen_reserved_device_memory *entries; + struct xen_reserved_device_memory_map map; + struct pci_dev *pdev; + int ret, i; + + if (!dev_is_pci(dev)) + return; + + pdev = to_pci_dev(dev); + + reg = iommu_alloc_resv_region(MSI_RANGE_START, + MSI_RANGE_END - MSI_RANGE_START + 1, + 0, IOMMU_RESV_MSI, GFP_KERNEL); + + if (!reg) + return; + + list_add_tail(&reg->list, head); + + /* Map xen-specific entries */ + + /* First, get number of entries to map */ + map.buffer = NULL; + map.nr_entries = 0; + map.flags = 0; + + map.dev.pci.seg = pci_domain_nr(pdev->bus); + map.dev.pci.bus = pdev->bus->number; + map.dev.pci.devfn = pdev->devfn; + + ret = HYPERVISOR_memory_op(XENMEM_reserved_device_memory_map, &map); + + if (ret == 0) + /* No reserved region, nothing to do */ + return; + + if (ret != -ENOBUFS) { + pr_err("Unable to get reserved region count (%d)\n", ret); + return; + } + + /* Assume a reasonable number of entries, otherwise, something is 
probably wrong */ + if (WARN_ON(map.nr_entries > 256)) + pr_warn("Xen reporting many reserved regions (%u)\n", map.nr_entries); + + /* And finally get actual mappings */ + entries = kcalloc(map.nr_entries, sizeof(struct xen_reserved_device_memory), + GFP_KERNEL); + + if (!entries) { + pr_err("No memory for map entries\n"); + return; + } + + map.buffer = entries; + + ret = HYPERVISOR_memory_op(XENMEM_reserved_device_memory_map, &map); + + if (ret != 0) { + pr_err("Unable to get reserved regions (%d)\n", ret); + kfree(entries); + return; + } + + for (i = 0; i < map.nr_entries; i++) { + struct xen_reserved_device_memory entry = entries[i]; + + reg = iommu_alloc_resv_region(pfn_to_addr(entry.start_pfn), + pfn_to_addr(entry.nr_pages), + 0, IOMMU_RESV_RESERVED, GFP_KERNEL); + + if (!reg) + break; + + list_add_tail(&reg->list, head); + } + + kfree(entries); +} + +static struct iommu_ops xen_iommu_ops = { + .capable = xen_iommu_capable, + .domain_alloc = xen_iommu_domain_alloc, + .probe_device = xen_iommu_probe_device, + .probe_finalize = xen_iommu_probe_finalize, + .device_group = xen_iommu_device_group, + .release_device = xen_iommu_release_device, + .get_resv_regions = xen_iommu_get_resv_regions, + .pgsize_bitmap = XEN_IOMMU_PGSIZES, + .default_domain_ops = &(const struct iommu_domain_ops) { + .map_pages = xen_iommu_map_pages, + .unmap_pages = xen_iommu_unmap_pages, + .attach_dev = xen_iommu_attach_dev, + .iova_to_phys = xen_iommu_iova_to_phys, + .free = xen_iommu_free, + }, +}; + +int __init xen_iommu_init(void) +{ + int ret; + struct pv_iommu_op op = { + .subop_id = IOMMUOP_query_capabilities + }; + + if (!xen_domain()) + return -ENODEV; + + /* Check if iommu_op is supported */ + if (HYPERVISOR_iommu_op(&op) == -ENOSYS) + return -ENODEV; /* No Xen IOMMU hardware */ + + pr_info("Initialising Xen IOMMU driver\n"); + pr_info("max_nr_pages=%d\n", op.cap.max_nr_pages); + pr_info("max_ctx_no=%d\n", op.cap.max_ctx_no); + pr_info("max_iova_addr=%llx\n", op.cap.max_iova_addr); 
+ + if (op.cap.max_ctx_no == 0) { + pr_err("Unable to use IOMMU PV driver (no context available)\n"); + return -ENOTSUPP; /* Unable to use IOMMU PV ? */ + } + + if (xen_domain_type == XEN_PV_DOMAIN) + /* TODO: In PV domain, due to the existing pfn-gfn mapping we need to + * consider that under certain circumstances, we have: + * pfn_to_gfn(x + 1) != pfn_to_gfn(x) + 1 + * + * In these cases, we would want to separate the subop into several calls. + * (only doing the grouped operation when the mapping is actually contiguous) + * Only map operation would be affected, as unmap actually uses dfn which + * doesn't have this kind of mapping. + * + * Force single-page operations to work around this issue for now. + */ + max_nr_pages = 1; + else + /* With HVM domains, pfn_to_gfn is identity, there is no issue regarding this. */ + max_nr_pages = op.cap.max_nr_pages; + + max_iova_addr = op.cap.max_iova_addr; + + spin_lock_init(&lock); + + ret = iommu_device_sysfs_add(&xen_iommu_device, NULL, NULL, "xen-iommu"); + if (ret) { + pr_err("Unable to add Xen IOMMU sysfs\n"); + return ret; + } + + ret = iommu_device_register(&xen_iommu_device, &xen_iommu_ops, NULL); + if (ret) { + pr_err("Unable to register Xen IOMMU device %d\n", ret); + iommu_device_sysfs_remove(&xen_iommu_device); + return ret; + } + + /* swiotlb is redundant when IOMMU is active. 
*/ + x86_swiotlb_enable = false; + + return 0; +} + +void __exit xen_iommu_fini(void) +{ + pr_info("Unregistering Xen IOMMU driver\n"); + + iommu_device_unregister(&xen_iommu_device); + iommu_device_sysfs_remove(&xen_iommu_device); +} + +module_init(xen_iommu_init); +module_exit(xen_iommu_fini); diff --git a/include/xen/interface/memory.h b/include/xen/interface/memory.h index 1a371a825c55..08571add426b 100644 --- a/include/xen/interface/memory.h +++ b/include/xen/interface/memory.h @@ -10,6 +10,7 @@ #ifndef __XEN_PUBLIC_MEMORY_H__ #define __XEN_PUBLIC_MEMORY_H__ +#include "xen/interface/physdev.h" #include /* @@ -214,6 +215,38 @@ struct xen_add_to_physmap_range { }; DEFINE_GUEST_HANDLE_STRUCT(xen_add_to_physmap_range); +/* + * With some legacy devices, certain guest-physical addresses cannot safely + * be used for other purposes, e.g. to map guest RAM. This hypercall + * enumerates those regions so the toolstack can avoid using them. + */ +#define XENMEM_reserved_device_memory_map 27 +struct xen_reserved_device_memory { + xen_pfn_t start_pfn; + xen_ulong_t nr_pages; +}; +DEFINE_GUEST_HANDLE_STRUCT(xen_reserved_device_memory); + +struct xen_reserved_device_memory_map { +#define XENMEM_RDM_ALL 1 /* Request all regions (ignore dev union). */ + /* IN */ + uint32_t flags; + /* + * IN/OUT + * + * Gets set to the required number of entries when too low, + * signaled by error code -ERANGE. + */ + unsigned int nr_entries; + /* OUT */ + GUEST_HANDLE(xen_reserved_device_memory) buffer; + /* IN */ + union { + struct physdev_pci_device pci; + } dev; +}; +DEFINE_GUEST_HANDLE_STRUCT(xen_reserved_device_memory_map); + /* * Returns the pseudo-physical memory map as it was when the domain * was started (specified by XENMEM_set_memory_map). 
diff --git a/include/xen/interface/pv-iommu.h b/include/xen/interface/pv-iommu.h new file mode 100644 index 000000000000..5560609d0e7a --- /dev/null +++ b/include/xen/interface/pv-iommu.h @@ -0,0 +1,114 @@ +/* SPDX-License-Identifier: MIT */ +/****************************************************************************** + * pv-iommu.h + * + * Paravirtualized IOMMU driver interface. + * + * Copyright (c) 2024 Teddy Astie + */ + +#ifndef __XEN_PUBLIC_PV_IOMMU_H__ +#define __XEN_PUBLIC_PV_IOMMU_H__ + +#include "xen.h" +#include "physdev.h" + +#define IOMMU_DEFAULT_CONTEXT (0) + +/** + * Query PV-IOMMU capabilities for this domain. + */ +#define IOMMUOP_query_capabilities 1 + +/** + * Allocate an IOMMU context; the new context handle is written to ctx_no. + */ +#define IOMMUOP_alloc_context 2 + +/** + * Destroy an IOMMU context. + * All devices attached to this context are reattached to the default context. + * + * The default context (0) can't be destroyed. + */ +#define IOMMUOP_free_context 3 + +/** + * Reattach the device to an IOMMU context. + */ +#define IOMMUOP_reattach_device 4 + +#define IOMMUOP_map_pages 5 +#define IOMMUOP_unmap_pages 6 + +/** + * Get the GFN associated with a specific DFN. + */ +#define IOMMUOP_lookup_page 7 + +struct pv_iommu_op { + uint16_t subop_id; + uint16_t ctx_no; + +/** + * Create a context that is cloned from default. + * The new context will be populated with 1:1 mappings covering the entire guest memory. 
+ */ +#define IOMMU_CREATE_clone (1 << 0) + +#define IOMMU_OP_readable (1 << 0) +#define IOMMU_OP_writeable (1 << 1) + uint32_t flags; + + union { + struct { + uint64_t gfn; + uint64_t dfn; + /* Number of pages to map */ + uint32_t nr_pages; + /* Number of pages actually mapped after sub-op */ + uint32_t mapped; + } map_pages; + + struct { + uint64_t dfn; + /* Number of pages to unmap */ + uint32_t nr_pages; + /* Number of pages actually unmapped after sub-op */ + uint32_t unmapped; + } unmap_pages; + + struct { + struct physdev_pci_device dev; + } reattach_device; + + struct { + uint64_t gfn; + uint64_t dfn; + } lookup_page; + + struct { + /* Maximum number of IOMMU contexts this domain can use. */ + uint16_t max_ctx_no; + /* Maximum number of pages that can be modified in a single map/unmap operation. */ + uint32_t max_nr_pages; + /* Maximum device address (iova) that the guest can use for mappings. */ + uint64_t max_iova_addr; + } cap; + }; +}; + +typedef struct pv_iommu_op pv_iommu_op_t; +DEFINE_GUEST_HANDLE_STRUCT(pv_iommu_op_t); + +#endif + +/* + * Local variables: + * mode: C + * c-file-style: "BSD" + * c-basic-offset: 4 + * tab-width: 4 + * indent-tabs-mode: nil + * End: + */ \ No newline at end of file diff --git a/include/xen/interface/xen.h b/include/xen/interface/xen.h index 0ca23eca2a9c..8b1daf3fecc6 100644 --- a/include/xen/interface/xen.h +++ b/include/xen/interface/xen.h @@ -65,6 +65,7 @@ #define __HYPERVISOR_xc_reserved_op 39 /* reserved for XenClient */ #define __HYPERVISOR_xenpmu_op 40 #define __HYPERVISOR_dm_op 41 +#define __HYPERVISOR_iommu_op 43 /* Architecture-specific hypercall definitions. */ #define __HYPERVISOR_arch_0 48
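Note on the TODO in xen_iommu_init(): the driver forces max_nr_pages to 1 in PV domains because pfn_to_gfn(x + 1) may differ from pfn_to_gfn(x) + 1. The TODO suggests grouping pages only where the mapping is actually contiguous. The following is only a standalone userspace sketch of that idea and is not part of this patch. contiguous_gfn_run(), the pfn_to_gfn_fn callback type and demo_pfn_to_gfn() are hypothetical names invented for illustration, and the demo translation is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's pfn_to_gfn(); not a real kernel type. */
typedef uint64_t (*pfn_to_gfn_fn)(uint64_t pfn);

/*
 * Count how many pages, starting at pfn, translate to consecutive gfns.
 * The result is capped at max_pages and is 0 only when max_pages is 0.
 */
size_t contiguous_gfn_run(pfn_to_gfn_fn xlate, uint64_t pfn, size_t max_pages)
{
	uint64_t first;
	size_t n;

	assert(xlate != NULL);

	if (max_pages == 0)
		return 0;

	first = xlate(pfn);
	for (n = 1; n < max_pages; n++) {
		/* Stop at the first page whose gfn breaks the run. */
		if (xlate(pfn + n) != first + n)
			break;
	}

	return n;
}

/* Made-up translation for the demo: pfns 0..3 map to gfns 10..13, the rest to pfn + 100. */
uint64_t demo_pfn_to_gfn(uint64_t pfn)
{
	return pfn < 4 ? pfn + 10 : pfn + 100;
}
```

Under these assumptions, the map loop could limit each hypercall to the smaller of the remaining page count, the hypervisor-reported max_nr_pages and the run length returned here, so that a single IOMMUOP_map_pages call never spans a discontinuity in the gfn space. Whether this is cheaper than single-page calls would depend on how fragmented the pfn-to-gfn mapping is in practice.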