From patchwork Thu Sep 19 13:23:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jan Beulich X-Patchwork-Id: 11152385 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0B6B876 for ; Thu, 19 Sep 2019 13:25:13 +0000 (UTC) Received: from lists.xenproject.org (lists.xenproject.org [192.237.175.120]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D9961217D6 for ; Thu, 19 Sep 2019 13:25:12 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D9961217D6 Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=xen-devel-bounces@lists.xenproject.org Received: from localhost ([127.0.0.1] helo=lists.xenproject.org) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1iAwPZ-0008Bg-Ls; Thu, 19 Sep 2019 13:23:53 +0000 Received: from all-amaz-eas1.inumbo.com ([34.197.232.57] helo=us1-amaz-eas2.inumbo.com) by lists.xenproject.org with esmtp (Exim 4.89) (envelope-from ) id 1iAwPY-0008BL-QX for xen-devel@lists.xenproject.org; Thu, 19 Sep 2019 13:23:52 +0000 X-Inumbo-ID: b396fc62-dae0-11e9-965d-12813bfff9fa Received: from mx1.suse.de (unknown [195.135.220.15]) by us1-amaz-eas2.inumbo.com (Halon) with ESMTPS id b396fc62-dae0-11e9-965d-12813bfff9fa; Thu, 19 Sep 2019 13:23:43 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id EB617B009; Thu, 19 Sep 2019 13:23:42 +0000 (UTC) From: Jan Beulich To: "xen-devel@lists.xenproject.org" References: Message-ID: <6e78af43-45f4-b2b1-26e5-04dfb40f1858@suse.com> Date: Thu, 19 Sep 2019 15:23:51 +0200 User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US Subject: [Xen-devel] [PATCH v6 5/8] AMD/IOMMU: restrict interrupt remapping table sizes X-BeenThere: xen-devel@lists.xenproject.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: Xen developer discussion List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Cc: Andrew Cooper , Suravee Suthikulpanit Errors-To: xen-devel-bounces@lists.xenproject.org Sender: "Xen-devel" There's no point setting up tables with more space than a PCI device can use. For both MSI and MSI-X we can determine how many interrupts could be set up at most. Tables allocated during ACPI table parsing, however, will (for now at least) continue to be set up to have maximum size. Note that until we would want to use sub-page allocations here there's no point checking whether both MSI and MSI-X are supported by a device - an order-0 allocation will fit the dual case in any event, no matter that the MSI-X vector count may be smaller than the MSI one. On my Rome system this reduces space needed from just over 1k pages to about 125. Suggested-by: Andrew Cooper Signed-off-by: Jan Beulich Reviewed-by: Paul Durrant Reviewed-by: Andrew Cooper --- v6: Don't allocate any IRT at all when device is neither MSI-X nor MSI-capable. Re-base over changes earlier in this series. v5: New. --- xen/drivers/passthrough/amd/iommu_acpi.c | 4 +- xen/drivers/passthrough/amd/iommu_init.c | 13 ++++----- xen/drivers/passthrough/amd/iommu_intr.c | 36 +++++++++++++++----------- xen/drivers/passthrough/amd/iommu_map.c | 20 ++++++++++---- xen/drivers/passthrough/amd/pci_amd_iommu.c | 18 +++++++------ xen/include/asm-x86/hvm/svm/amd-iommu-defs.h | 3 -- xen/include/asm-x86/hvm/svm/amd-iommu-proto.h | 5 ++- 7 files changed, 59 insertions(+), 40 deletions(-) --- a/xen/drivers/passthrough/amd/iommu_acpi.c +++ b/xen/drivers/passthrough/amd/iommu_acpi.c @@ -77,7 +77,7 @@ static void __init add_ivrs_mapping_entr { if ( !shared_intremap_table ) shared_intremap_table = amd_iommu_alloc_intremap_table( - iommu, &shared_intremap_inuse); + iommu, &shared_intremap_inuse, 0); if ( !shared_intremap_table ) panic("No memory for shared IRT\n"); @@ -89,7 +89,7 @@ static void __init add_ivrs_mapping_entr { ivrs_mappings[alias_id].intremap_table = amd_iommu_alloc_intremap_table( - iommu, &ivrs_mappings[alias_id].intremap_inuse); + iommu, &ivrs_mappings[alias_id].intremap_inuse, 0); if ( !ivrs_mappings[alias_id].intremap_table ) panic("No memory for %04x:%02x:%02x.%u's IRT\n", --- a/xen/drivers/passthrough/amd/iommu_init.c +++ b/xen/drivers/passthrough/amd/iommu_init.c @@ -1284,12 +1284,14 @@ static int __init amd_iommu_setup_device pcidevs_unlock(); } - if ( pdev ) + if ( pdev && (pdev->msix || pdev->msi_maxvec) ) { ivrs_mappings[bdf].intremap_table = amd_iommu_alloc_intremap_table( ivrs_mappings[bdf].iommu, - &ivrs_mappings[bdf].intremap_inuse); + &ivrs_mappings[bdf].intremap_inuse, + pdev->msix ? pdev->msix->nr_entries + : pdev->msi_maxvec); if ( !ivrs_mappings[bdf].intremap_table ) return -ENOMEM; @@ -1312,11 +1314,8 @@ static int __init amd_iommu_setup_device } amd_iommu_set_intremap_table( - dte, - ivrs_mappings[bdf].intremap_table - ? virt_to_maddr(ivrs_mappings[bdf].intremap_table) - : 0, - iommu_intremap); + dte, ivrs_mappings[bdf].intremap_table, + ivrs_mappings[bdf].iommu, iommu_intremap); } } --- a/xen/drivers/passthrough/amd/iommu_intr.c +++ b/xen/drivers/passthrough/amd/iommu_intr.c @@ -69,7 +69,8 @@ union irte_cptr { const union irte128 *ptr128; } __transparent__; -#define INTREMAP_MAX_ENTRIES (1 << IOMMU_INTREMAP_ORDER) +#define INTREMAP_MAX_ORDER 0xB +#define INTREMAP_MAX_ENTRIES (1 << INTREMAP_MAX_ORDER) struct ioapic_sbdf ioapic_sbdf[MAX_IO_APICS]; struct hpet_sbdf hpet_sbdf; @@ -80,17 +81,13 @@ unsigned int nr_ioapic_sbdf; static void dump_intremap_tables(unsigned char key); -static unsigned int __init intremap_table_order(const struct amd_iommu *iommu) -{ - return iommu->ctrl.ga_en - ? get_order_from_bytes(INTREMAP_MAX_ENTRIES * sizeof(union irte128)) - : get_order_from_bytes(INTREMAP_MAX_ENTRIES * sizeof(union irte32)); -} +#define intremap_page_order(irt) PFN_ORDER(virt_to_page(irt)) unsigned int amd_iommu_intremap_table_order( const void *irt, const struct amd_iommu *iommu) { - return IOMMU_INTREMAP_ORDER; + return intremap_page_order(irt) + PAGE_SHIFT - + (iommu->ctrl.ga_en ? 4 : 2); } static unsigned int intremap_table_entries( @@ -825,7 +822,10 @@ int amd_iommu_free_intremap_table( if ( *tblp ) { - __free_amd_iommu_tables(*tblp, intremap_table_order(iommu)); + unsigned int order = intremap_page_order(*tblp); + + intremap_page_order(*tblp) = 0; + __free_amd_iommu_tables(*tblp, order); *tblp = NULL; } @@ -833,15 +833,23 @@ int amd_iommu_free_intremap_table( } void *amd_iommu_alloc_intremap_table( - const struct amd_iommu *iommu, unsigned long **inuse_map) + const struct amd_iommu *iommu, unsigned long **inuse_map, unsigned int nr) { - unsigned int order = intremap_table_order(iommu); - void *tb = __alloc_amd_iommu_tables(order); + unsigned int order; + void *tb; + if ( !nr ) + nr = INTREMAP_MAX_ENTRIES; + + order = iommu->ctrl.ga_en + ? get_order_from_bytes(nr * sizeof(union irte128)) + : get_order_from_bytes(nr * sizeof(union irte32)); + + tb = __alloc_amd_iommu_tables(order); if ( tb ) { - unsigned int nr = intremap_table_entries(tb, iommu); - + intremap_page_order(tb) = order; + nr = intremap_table_entries(tb, iommu); *inuse_map = xzalloc_array(unsigned long, BITS_TO_LONGS(nr)); if ( *inuse_map ) memset(tb, 0, PAGE_SIZE << order); --- a/xen/drivers/passthrough/amd/iommu_map.c +++ b/xen/drivers/passthrough/amd/iommu_map.c @@ -113,12 +113,22 @@ void amd_iommu_set_root_page_table(struc } void __init amd_iommu_set_intremap_table( - struct amd_iommu_dte *dte, uint64_t intremap_ptr, bool valid) + struct amd_iommu_dte *dte, const void *ptr, + const struct amd_iommu *iommu, bool valid) { - dte->it_root = intremap_ptr >> 6; - dte->int_tab_len = intremap_ptr ? IOMMU_INTREMAP_ORDER : 0; - dte->int_ctl = intremap_ptr ? IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED - : IOMMU_DEV_TABLE_INT_CONTROL_ABORTED; + if ( ptr ) + { + dte->it_root = virt_to_maddr(ptr) >> 6; + dte->int_tab_len = amd_iommu_intremap_table_order(ptr, iommu); + dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED; + } + else + { + dte->it_root = 0; + dte->int_tab_len = 0; + dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_ABORTED; + } + dte->ig = false; /* unmapped interrupts result in i/o page faults */ dte->iv = valid; } --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -470,18 +470,22 @@ static int amd_iommu_add_device(u8 devfn { unsigned long flags; - ivrs_mappings[bdf].intremap_table = - amd_iommu_alloc_intremap_table( - iommu, &ivrs_mappings[bdf].intremap_inuse); - if ( !ivrs_mappings[bdf].intremap_table ) - return -ENOMEM; + if ( pdev->msix || pdev->msi_maxvec ) + { + ivrs_mappings[bdf].intremap_table = + amd_iommu_alloc_intremap_table( + iommu, &ivrs_mappings[bdf].intremap_inuse, + pdev->msix ? pdev->msix->nr_entries + : pdev->msi_maxvec); + if ( !ivrs_mappings[bdf].intremap_table ) + return -ENOMEM; + } spin_lock_irqsave(&iommu->lock, flags); amd_iommu_set_intremap_table( iommu->dev_table.buffer + (bdf * IOMMU_DEV_TABLE_ENTRY_SIZE), - virt_to_maddr(ivrs_mappings[bdf].intremap_table), - iommu_intremap); + ivrs_mappings[bdf].intremap_table, iommu, iommu_intremap); amd_iommu_flush_device(iommu, bdf); --- a/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h @@ -107,9 +107,6 @@ #define IOMMU_DEV_TABLE_INT_CONTROL_FORWARDED 0x1 #define IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED 0x2 -/* For now, we always allocate the maximum: 2048 entries. */ -#define IOMMU_INTREMAP_ORDER 0xB - struct amd_iommu_dte { /* 0 - 63 */ bool v:1; --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h @@ -72,7 +72,8 @@ int __must_check amd_iommu_flush_iotlb_a /* device table functions */ int get_dma_requestor_id(uint16_t seg, uint16_t bdf); void amd_iommu_set_intremap_table(struct amd_iommu_dte *dte, - uint64_t intremap_ptr, + const void *ptr, + const struct amd_iommu *iommu, bool valid); void amd_iommu_set_root_page_table(struct amd_iommu_dte *dte, uint64_t root_ptr, uint16_t domain_id, @@ -99,7 +100,7 @@ struct amd_iommu *find_iommu_for_device( bool iov_supports_xt(void); int amd_iommu_setup_ioapic_remapping(void); void *amd_iommu_alloc_intremap_table( - const struct amd_iommu *, unsigned long **); + const struct amd_iommu *, unsigned long **, unsigned int nr); int amd_iommu_free_intremap_table( const struct amd_iommu *, struct ivrs_mappings *, uint16_t); unsigned int amd_iommu_intremap_table_order(