From patchwork Thu Sep 19 13:21:55 2019
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Suravee Suthikulpanit
Date: Thu, 19 Sep 2019 15:21:55 +0200
Subject: [Xen-devel] [PATCH v6 1/8] AMD/IOMMU: don't blindly allocate interrupt remapping tables

ACPI tables are free to list far more device coordinates than there are
actual devices. By delaying the table allocations for most cases, and
doing them only when an actual device is known to be present at a given
position, overall memory used for the tables goes down from over 500k
pages to just over 1k (on my system having such ACPI table contents).

Tables continue to get allocated right away for special entries
(IO-APIC, HPET) as well as for alias IDs. While in the former case
that's simply because there may not be any device at a given position,
in the latter case this is to avoid having to introduce ref-counting of
table usage.

The change involves invoking
iterate_ivrs_mappings(amd_iommu_setup_device_table) a second time,
because the function now wants to be able to find PCI devices, which
isn't possible yet when IOMMU setup happens very early during x2APIC
mode setup. In this context amd_iommu_init_interrupt() gets renamed as
well.
The logic adjusting a DTE's interrupt remapping attributes also gets
changed, such that the lack of an IRT would result in target-aborted
rather than not-remapped interrupts (should any occur).

Note that for now phantom functions get separate IRTs allocated, as was
the case before.

Signed-off-by: Jan Beulich
Reviewed-by: Paul Durrant
Acked-by: Andrew Cooper
---
v6: Acquire IOMMU lock in code added to amd_iommu_add_device(). Drop a
    pointless use of the conditional operator.
v5: New.
---
TBD: This retains prior (but suspicious) behavior of not calling
     amd_iommu_set_intremap_table() for "invalid" IVRS mapping entries.
     Since DTE.IV=0 means un-remapped interrupts, I wonder if this
     needs changing.

Backporting note: This depends on b5fbe81196 "iommu / x86: move call to
scan_pci_devices() out of vendor code"!
---
 xen/drivers/passthrough/amd/iommu_acpi.c      | 65 +++++++++++++----------
 xen/drivers/passthrough/amd/iommu_init.c      | 73 ++++++++++++++++++++------
 xen/drivers/passthrough/amd/iommu_intr.c      |  4 -
 xen/drivers/passthrough/amd/iommu_map.c       |  5 +
 xen/drivers/passthrough/amd/pci_amd_iommu.c   | 43 ++++++++++++++-
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  2
 6 files changed, 143 insertions(+), 49 deletions(-)

--- a/xen/drivers/passthrough/amd/iommu_acpi.c
+++ b/xen/drivers/passthrough/amd/iommu_acpi.c
@@ -53,7 +53,8 @@ union acpi_ivhd_device {
 };
 
 static void __init add_ivrs_mapping_entry(
-    u16 bdf, u16 alias_id, u8 flags, struct amd_iommu *iommu)
+    uint16_t bdf, uint16_t alias_id, uint8_t flags, bool alloc_irt,
+    struct amd_iommu *iommu)
 {
     struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(iommu->seg);
@@ -69,27 +70,32 @@ static void __init add_ivrs_mapping_entr
     if ( iommu->bdf == bdf )
         return;
 
-    if ( !ivrs_mappings[alias_id].intremap_table )
+    /* Allocate interrupt remapping table if needed. */
+    if ( iommu_intremap && !ivrs_mappings[alias_id].intremap_table )
     {
-        /* allocate per-device interrupt remapping table */
-        if ( amd_iommu_perdev_intremap )
-            ivrs_mappings[alias_id].intremap_table =
-                amd_iommu_alloc_intremap_table(
-                    iommu,
-                    &ivrs_mappings[alias_id].intremap_inuse);
-        else
-        {
-            if ( shared_intremap_table == NULL )
-                shared_intremap_table = amd_iommu_alloc_intremap_table(
-                    iommu,
-                    &shared_intremap_inuse);
-            ivrs_mappings[alias_id].intremap_table = shared_intremap_table;
-            ivrs_mappings[alias_id].intremap_inuse = shared_intremap_inuse;
-        }
-
-        if ( !ivrs_mappings[alias_id].intremap_table )
-            panic("No memory for %04x:%02x:%02x.%u's IRT\n", iommu->seg,
-                  PCI_BUS(alias_id), PCI_SLOT(alias_id), PCI_FUNC(alias_id));
+        if ( !amd_iommu_perdev_intremap )
+        {
+            if ( !shared_intremap_table )
+                shared_intremap_table = amd_iommu_alloc_intremap_table(
+                    iommu, &shared_intremap_inuse);
+
+            if ( !shared_intremap_table )
+                panic("No memory for shared IRT\n");
+
+            ivrs_mappings[alias_id].intremap_table = shared_intremap_table;
+            ivrs_mappings[alias_id].intremap_inuse = shared_intremap_inuse;
+        }
+        else if ( alloc_irt )
+        {
+            ivrs_mappings[alias_id].intremap_table =
+                amd_iommu_alloc_intremap_table(
+                    iommu, &ivrs_mappings[alias_id].intremap_inuse);
+
+            if ( !ivrs_mappings[alias_id].intremap_table )
+                panic("No memory for %04x:%02x:%02x.%u's IRT\n",
+                      iommu->seg, PCI_BUS(alias_id), PCI_SLOT(alias_id),
+                      PCI_FUNC(alias_id));
+        }
     }
 
     ivrs_mappings[alias_id].valid = true;
@@ -433,7 +439,8 @@ static u16 __init parse_ivhd_device_sele
         return 0;
     }
 
-    add_ivrs_mapping_entry(bdf, bdf, select->header.data_setting, iommu);
+    add_ivrs_mapping_entry(bdf, bdf, select->header.data_setting, false,
+                           iommu);
 
     return sizeof(*select);
 }
@@ -479,7 +486,7 @@ static u16 __init parse_ivhd_device_rang
 
     for ( bdf = first_bdf; bdf <= last_bdf; bdf++ )
         add_ivrs_mapping_entry(bdf, bdf, range->start.header.data_setting,
-                               iommu);
+                               false, iommu);
 
     return dev_length;
 }
@@ -513,7 +520,8 @@ static u16 __init parse_ivhd_device_alia
 
     AMD_IOMMU_DEBUG(" Dev_Id Alias: %#x\n", alias_id);
 
-    add_ivrs_mapping_entry(bdf, alias_id, alias->header.data_setting, iommu);
+    add_ivrs_mapping_entry(bdf, alias_id, alias->header.data_setting, true,
+                           iommu);
 
     return dev_length;
 }
@@ -568,7 +576,7 @@ static u16 __init parse_ivhd_device_alia
 
     for ( bdf = first_bdf; bdf <= last_bdf; bdf++ )
         add_ivrs_mapping_entry(bdf, alias_id, range->alias.header.data_setting,
-                               iommu);
+                               true, iommu);
 
     return dev_length;
 }
@@ -593,7 +601,7 @@ static u16 __init parse_ivhd_device_exte
         return 0;
     }
 
-    add_ivrs_mapping_entry(bdf, bdf, ext->header.data_setting, iommu);
+    add_ivrs_mapping_entry(bdf, bdf, ext->header.data_setting, false, iommu);
 
     return dev_length;
 }
@@ -640,7 +648,7 @@ static u16 __init parse_ivhd_device_exte
 
     for ( bdf = first_bdf; bdf <= last_bdf; bdf++ )
         add_ivrs_mapping_entry(bdf, bdf, range->extended.header.data_setting,
-                               iommu);
+                               false, iommu);
 
     return dev_length;
 }
@@ -733,7 +741,8 @@ static u16 __init parse_ivhd_device_spec
     AMD_IOMMU_DEBUG("IVHD Special: %04x:%02x:%02x.%u variety %#x handle %#x\n",
                     seg, PCI_BUS(bdf), PCI_SLOT(bdf), PCI_FUNC(bdf),
                     special->variety, special->handle);
-    add_ivrs_mapping_entry(bdf, bdf, special->header.data_setting, iommu);
+    add_ivrs_mapping_entry(bdf, bdf, special->header.data_setting, true,
+                           iommu);
 
     switch ( special->variety )
     {
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -30,6 +30,7 @@
 #include
 
 static int __initdata nr_amd_iommus;
+static bool __initdata pci_init;
 
 static void do_amd_iommu_irq(unsigned long data);
 static DECLARE_SOFTIRQ_TASKLET(amd_iommu_irq_tasklet, do_amd_iommu_irq, 0);
@@ -1244,17 +1245,20 @@ static int __init amd_iommu_setup_device
 
     BUG_ON( (ivrs_bdf_entries == 0) );
 
-    /* allocate 'device table' on a 4K boundary */
-    device_table.alloc_size = PAGE_SIZE <<
-                              get_order_from_bytes(
-                              PAGE_ALIGN(ivrs_bdf_entries *
-                              IOMMU_DEV_TABLE_ENTRY_SIZE));
-    device_table.entries = device_table.alloc_size /
-                           IOMMU_DEV_TABLE_ENTRY_SIZE;
-
-    device_table.buffer = allocate_buffer(device_table.alloc_size,
-                                          "Device Table");
-    if ( device_table.buffer == NULL )
+    if ( !device_table.buffer )
+    {
+        /* allocate 'device table' on a 4K boundary */
+        device_table.alloc_size = PAGE_SIZE <<
+                                  get_order_from_bytes(
+                                  PAGE_ALIGN(ivrs_bdf_entries *
+                                  IOMMU_DEV_TABLE_ENTRY_SIZE));
+        device_table.entries = device_table.alloc_size /
+                               IOMMU_DEV_TABLE_ENTRY_SIZE;
+
+        device_table.buffer = allocate_buffer(device_table.alloc_size,
+                                              "Device Table");
+    }
+    if ( !device_table.buffer )
         return -ENOMEM;
 
     /* Add device table entries */
@@ -1263,13 +1267,46 @@ static int __init amd_iommu_setup_device
         if ( ivrs_mappings[bdf].valid )
         {
             void *dte;
+            const struct pci_dev *pdev = NULL;
 
             /* add device table entry */
             dte = device_table.buffer + (bdf * IOMMU_DEV_TABLE_ENTRY_SIZE);
             iommu_dte_add_device_entry(dte, &ivrs_mappings[bdf]);
 
+            if ( iommu_intremap &&
+                 ivrs_mappings[bdf].dte_requestor_id == bdf &&
+                 !ivrs_mappings[bdf].intremap_table )
+            {
+                if ( !pci_init )
+                    continue;
+                pcidevs_lock();
+                pdev = pci_get_pdev(seg, PCI_BUS(bdf), PCI_DEVFN2(bdf));
+                pcidevs_unlock();
+            }
+
+            if ( pdev )
+            {
+                unsigned int req_id = bdf;
+
+                do {
+                    ivrs_mappings[req_id].intremap_table =
+                        amd_iommu_alloc_intremap_table(
+                            ivrs_mappings[bdf].iommu,
+                            &ivrs_mappings[req_id].intremap_inuse);
+                    if ( !ivrs_mappings[req_id].intremap_table )
+                        return -ENOMEM;
+
+                    if ( !pdev->phantom_stride )
+                        break;
+                    req_id += pdev->phantom_stride;
+                } while ( PCI_SLOT(req_id) == pdev->sbdf.dev );
+            }
+
             amd_iommu_set_intremap_table(
-                dte, virt_to_maddr(ivrs_mappings[bdf].intremap_table),
+                dte,
+                ivrs_mappings[bdf].intremap_table
+                ? virt_to_maddr(ivrs_mappings[bdf].intremap_table)
+                : 0,
                 iommu_intremap);
         }
     }
@@ -1402,7 +1439,8 @@ int __init amd_iommu_init(bool xt)
     if ( rc )
         goto error_out;
 
-    /* allocate and initialize a global device table shared by all iommus */
+    /* Allocate and initialize device table(s). */
+    pci_init = !xt;
     rc = iterate_ivrs_mappings(amd_iommu_setup_device_table);
     if ( rc )
         goto error_out;
@@ -1422,7 +1460,7 @@ int __init amd_iommu_init(bool xt)
         /*
          * Setting up of the IOMMU interrupts cannot occur yet at the (very
          * early) time we get here when enabling x2APIC mode. Suppress it
-         * here, and do it explicitly in amd_iommu_init_interrupt().
+         * here, and do it explicitly in amd_iommu_init_late().
         */
         rc = amd_iommu_init_one(iommu, !xt);
         if ( rc )
@@ -1436,11 +1474,16 @@ error_out:
     return rc;
 }
 
-int __init amd_iommu_init_interrupt(void)
+int __init amd_iommu_init_late(void)
 {
     struct amd_iommu *iommu;
     int rc = 0;
 
+    /* Further initialize the device table(s). */
+    pci_init = true;
+    if ( iommu_intremap )
+        rc = iterate_ivrs_mappings(amd_iommu_setup_device_table);
+
     for_each_amd_iommu ( iommu )
     {
         struct irq_desc *desc;
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -789,7 +789,7 @@ void amd_iommu_read_msi_from_ire(
     }
 }
 
-int __init amd_iommu_free_intremap_table(
+int amd_iommu_free_intremap_table(
     const struct amd_iommu *iommu, struct ivrs_mappings *ivrs_mapping)
 {
     void **tblp;
@@ -814,7 +814,7 @@ int __init amd_iommu_free_intremap_table
     return 0;
 }
 
-void *__init amd_iommu_alloc_intremap_table(
+void *amd_iommu_alloc_intremap_table(
     const struct amd_iommu *iommu, unsigned long **inuse_map)
 {
     unsigned int order = intremap_table_order(iommu);
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -116,8 +116,9 @@ void __init amd_iommu_set_intremap_table
     struct amd_iommu_dte *dte, uint64_t intremap_ptr, bool valid)
 {
     dte->it_root = intremap_ptr >> 6;
-    dte->int_tab_len = IOMMU_INTREMAP_ORDER;
-    dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED;
+    dte->int_tab_len = intremap_ptr ? IOMMU_INTREMAP_ORDER : 0;
+    dte->int_ctl = intremap_ptr ? IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED
+                                : IOMMU_DEV_TABLE_INT_CONTROL_ABORTED;
     dte->ig = false; /* unmapped interrupts result in i/o page faults */
     dte->iv = valid;
 }
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -164,7 +164,7 @@ static int __init iov_detect(void)
     if ( !iommu_enable && !iommu_intremap )
         return 0;
 
-    if ( (init_done ? amd_iommu_init_interrupt()
+    if ( (init_done ? amd_iommu_init_late()
                     : amd_iommu_init(false)) != 0 )
     {
         printk("AMD-Vi: Error initialization\n");
@@ -428,6 +428,7 @@ static int amd_iommu_add_device(u8 devfn
 {
     struct amd_iommu *iommu;
     u16 bdf;
+    struct ivrs_mappings *ivrs_mappings;
 
     if ( !pdev->domain )
         return -EINVAL;
@@ -457,6 +458,36 @@ static int amd_iommu_add_device(u8 devfn
         return -ENODEV;
     }
 
+    ivrs_mappings = get_ivrs_mappings(pdev->seg);
+    bdf = PCI_BDF2(pdev->bus, devfn);
+    if ( !ivrs_mappings ||
+         !ivrs_mappings[ivrs_mappings[bdf].dte_requestor_id].valid )
+        return -EPERM;
+
+    if ( iommu_intremap &&
+         ivrs_mappings[bdf].dte_requestor_id == bdf &&
+         !ivrs_mappings[bdf].intremap_table )
+    {
+        unsigned long flags;
+
+        ivrs_mappings[bdf].intremap_table =
+            amd_iommu_alloc_intremap_table(
+                iommu, &ivrs_mappings[bdf].intremap_inuse);
+        if ( !ivrs_mappings[bdf].intremap_table )
+            return -ENOMEM;
+
+        spin_lock_irqsave(&iommu->lock, flags);
+
+        amd_iommu_set_intremap_table(
+            iommu->dev_table.buffer + (bdf * IOMMU_DEV_TABLE_ENTRY_SIZE),
+            virt_to_maddr(ivrs_mappings[bdf].intremap_table),
+            iommu_intremap);
+
+        amd_iommu_flush_device(iommu, bdf);
+
+        spin_unlock_irqrestore(&iommu->lock, flags);
+    }
+
     amd_iommu_setup_domain_device(pdev->domain, iommu, devfn, pdev);
     return 0;
 }
@@ -465,6 +496,8 @@ static int amd_iommu_remove_device(u8 de
 {
     struct amd_iommu *iommu;
     u16 bdf;
+    struct ivrs_mappings *ivrs_mappings;
+
     if ( !pdev->domain )
         return -EINVAL;
@@ -480,6 +513,14 @@ static int amd_iommu_remove_device(u8 de
     }
 
     amd_iommu_disable_domain_device(pdev->domain, iommu, devfn, pdev);
+
+    ivrs_mappings = get_ivrs_mappings(pdev->seg);
+    bdf = PCI_BDF2(pdev->bus, devfn);
+    if ( amd_iommu_perdev_intremap &&
+         ivrs_mappings[bdf].dte_requestor_id == bdf &&
+         ivrs_mappings[bdf].intremap_table )
+        amd_iommu_free_intremap_table(iommu, &ivrs_mappings[bdf]);
+
     return 0;
 }
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -50,7 +50,7 @@ void get_iommu_features(struct amd_iommu
 /* amd-iommu-init functions */
 int amd_iommu_prepare(bool xt);
 int amd_iommu_init(bool xt);
-int amd_iommu_init_interrupt(void);
+int amd_iommu_init_late(void);
 int amd_iommu_update_ivrs_mapping_acpi(void);
 int iov_adjust_irq_affinities(void);
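As an aside on the behavior change above: with no IRT attached, the DTE
now reports "target abort" instead of passing interrupts through
un-remapped. A minimal, self-contained model of just that logic (not
Xen code; the struct and constant names here are stand-ins, with the
IntCtl encodings taken from the hunk above):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define INT_CTL_ABORTED    0x0  /* target abort */
#define INT_CTL_TRANSLATED 0x2  /* remap via IRT */

struct dte_model {
    uint64_t it_root;
    unsigned int int_tab_len;
    unsigned int int_ctl;
    bool iv;
};

static void set_intremap_table(struct dte_model *dte, uint64_t irt_maddr,
                               unsigned int order, bool valid)
{
    dte->it_root = irt_maddr >> 6;  /* 64-byte aligned table base */
    dte->int_tab_len = irt_maddr ? order : 0;
    dte->int_ctl = irt_maddr ? INT_CTL_TRANSLATED : INT_CTL_ABORTED;
    dte->iv = valid;
}

int main(void)
{
    struct dte_model dte;

    set_intremap_table(&dte, 0, 0, true);          /* no IRT allocated yet */
    printf("int_ctl=%u (aborted)\n", dte.int_ctl);
    set_intremap_table(&dte, 0x12345ull << 6, 0xB, true);
    printf("int_ctl=%u (translated)\n", dte.int_ctl);
    return 0;
}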
From patchwork Thu Sep 19 13:22:17 2019
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Suravee Suthikulpanit
Date: Thu, 19 Sep 2019 15:22:17 +0200
Message-ID: <06a35251-013f-d215-d70c-70a4c98ac86e@suse.com>
Subject: [Xen-devel] [PATCH v6 2/8] AMD/IOMMU: make phantom functions share interrupt remapping tables

Rather than duplicating entries in amd_iommu_msi_msg_update_ire(), share
the tables. This mainly requires some care while freeing them, to avoid
freeing memory blocks twice.

Signed-off-by: Jan Beulich
Acked-by: Andrew Cooper
Reviewed-by: Paul Durrant
---
v5: New.
---
 xen/drivers/passthrough/amd/iommu_init.c      | 43 +++++++++++++---------
 xen/drivers/passthrough/amd/iommu_intr.c      | 45 +++++++++++++-------------
 xen/drivers/passthrough/amd/pci_amd_iommu.c   |  2 -
 xen/include/asm-x86/amd-iommu.h               |  2 -
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  2 -
 5 files changed, 53 insertions(+), 41 deletions(-)

--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1111,7 +1111,7 @@ static void __init amd_iommu_init_cleanu
         amd_iommu_free_intremap_table(list_first_entry(&amd_iommu_head,
                                                        struct amd_iommu,
                                                        list),
-                                      NULL);
+                                      NULL, 0);
 
     /* free amd iommu list */
     list_for_each_entry_safe ( iommu, next, &amd_iommu_head, list )
@@ -1176,7 +1176,7 @@ int iterate_ivrs_mappings(int (*handler)
 }
 
 int iterate_ivrs_entries(int (*handler)(const struct amd_iommu *,
-                                        struct ivrs_mappings *))
+                                        struct ivrs_mappings *, uint16_t bdf))
 {
     u16 seg = 0;
     int rc = 0;
@@ -1193,7 +1193,7 @@ int iterate_ivrs_entries(int (*handler)(
             const struct amd_iommu *iommu = map[bdf].iommu;
 
             if ( iommu && map[bdf].dte_requestor_id == bdf )
-                rc = handler(iommu, &map[bdf]);
+                rc = handler(iommu, &map[bdf], bdf);
         }
     } while ( !rc && ++seg );
 
@@ -1286,20 +1286,29 @@ static int __init amd_iommu_setup_device
 
             if ( pdev )
             {
-                unsigned int req_id = bdf;
-
-                do {
-                    ivrs_mappings[req_id].intremap_table =
-                        amd_iommu_alloc_intremap_table(
-                            ivrs_mappings[bdf].iommu,
-                            &ivrs_mappings[req_id].intremap_inuse);
-                    if ( !ivrs_mappings[req_id].intremap_table )
-                        return -ENOMEM;
-
-                    if ( !pdev->phantom_stride )
-                        break;
-                    req_id += pdev->phantom_stride;
-                } while ( PCI_SLOT(req_id) == pdev->sbdf.dev );
+                ivrs_mappings[bdf].intremap_table =
+                    amd_iommu_alloc_intremap_table(
+                        ivrs_mappings[bdf].iommu,
+                        &ivrs_mappings[bdf].intremap_inuse);
+                if ( !ivrs_mappings[bdf].intremap_table )
+                    return -ENOMEM;
+
+                if ( pdev->phantom_stride )
+                {
+                    unsigned int req_id = bdf;
+
+                    for ( ; ; )
+                    {
+                        req_id += pdev->phantom_stride;
+                        if ( PCI_SLOT(req_id) != pdev->sbdf.dev )
+                            break;
+
+                        ivrs_mappings[req_id].intremap_table =
+                            ivrs_mappings[bdf].intremap_table;
+                        ivrs_mappings[req_id].intremap_inuse =
+                            ivrs_mappings[bdf].intremap_inuse;
+                    }
+                }
             }
 
             amd_iommu_set_intremap_table(
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -711,33 +711,20 @@ int amd_iommu_msi_msg_update_ire(
 
     if ( msi_desc->remap_index >= 0 && !msg )
     {
-        do {
-            update_intremap_entry_from_msi_msg(iommu, bdf, nr,
-                                               &msi_desc->remap_index,
-                                               NULL, NULL);
-            if ( !pdev || !pdev->phantom_stride )
-                break;
-            bdf += pdev->phantom_stride;
-        } while ( PCI_SLOT(bdf) == PCI_SLOT(pdev->devfn) );
+        update_intremap_entry_from_msi_msg(iommu, bdf, nr,
+                                           &msi_desc->remap_index,
+                                           NULL, NULL);
 
         for ( i = 0; i < nr; ++i )
             msi_desc[i].remap_index = -1;
-        if ( pdev )
-            bdf = PCI_BDF2(pdev->bus, pdev->devfn);
     }
 
     if ( !msg )
         return 0;
 
-    do {
-        rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
-                                                &msi_desc->remap_index,
-                                                msg, &data);
-        if ( rc || !pdev || !pdev->phantom_stride )
-            break;
-        bdf += pdev->phantom_stride;
-    } while ( PCI_SLOT(bdf) == PCI_SLOT(pdev->devfn) );
-
+    rc = update_intremap_entry_from_msi_msg(iommu, bdf, nr,
+                                            &msi_desc->remap_index,
+                                            msg, &data);
     if ( !rc )
     {
         for ( i = 1; i < nr; ++i )
@@ -790,12 +777,27 @@ void amd_iommu_read_msi_from_ire(
 }
 
 int amd_iommu_free_intremap_table(
-    const struct amd_iommu *iommu, struct ivrs_mappings *ivrs_mapping)
+    const struct amd_iommu *iommu, struct ivrs_mappings *ivrs_mapping,
+    uint16_t bdf)
 {
     void **tblp;
 
     if ( ivrs_mapping )
     {
+        unsigned int i;
+
+        /*
+         * PCI device phantom functions use the same tables as their "base"
+         * function: Look ahead to zap the pointers.
+         */
+        for ( i = 1; PCI_FUNC(bdf + i) && bdf + i < ivrs_bdf_entries; ++i )
+            if ( ivrs_mapping[i].intremap_table ==
+                 ivrs_mapping->intremap_table )
+            {
+                ivrs_mapping[i].intremap_table = NULL;
+                ivrs_mapping[i].intremap_inuse = NULL;
+            }
+
         XFREE(ivrs_mapping->intremap_inuse);
         tblp = &ivrs_mapping->intremap_table;
     }
@@ -934,7 +936,8 @@ static void dump_intremap_table(const st
 }
 
 static int dump_intremap_mapping(const struct amd_iommu *iommu,
-                                 struct ivrs_mappings *ivrs_mapping)
+                                 struct ivrs_mappings *ivrs_mapping,
+                                 uint16_t unused)
 {
     unsigned long flags;
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -519,7 +519,7 @@ static int amd_iommu_remove_device(u8 de
     if ( amd_iommu_perdev_intremap &&
          ivrs_mappings[bdf].dte_requestor_id == bdf &&
          ivrs_mappings[bdf].intremap_table )
-        amd_iommu_free_intremap_table(iommu, &ivrs_mappings[bdf]);
+        amd_iommu_free_intremap_table(iommu, &ivrs_mappings[bdf], bdf);
 
     return 0;
 }
--- a/xen/include/asm-x86/amd-iommu.h
+++ b/xen/include/asm-x86/amd-iommu.h
@@ -131,7 +131,7 @@ extern u8 ivhd_type;
 struct ivrs_mappings *get_ivrs_mappings(u16 seg);
 int iterate_ivrs_mappings(int (*)(u16 seg, struct ivrs_mappings *));
 int iterate_ivrs_entries(int (*)(const struct amd_iommu *,
-                                 struct ivrs_mappings *));
+                                 struct ivrs_mappings *, uint16_t));
 
 /* iommu tables in guest space */
 struct mmio_reg {
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -101,7 +101,7 @@ int amd_iommu_setup_ioapic_remapping(voi
 void *amd_iommu_alloc_intremap_table(
     const struct amd_iommu *, unsigned long **);
 int amd_iommu_free_intremap_table(
-    const struct amd_iommu *, struct ivrs_mappings *);
+    const struct amd_iommu *, struct ivrs_mappings *, uint16_t);
 void amd_iommu_ioapic_update_ire(
     unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int amd_iommu_read_ioapic_from_ire(
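To illustrate the phantom-function walk the patch relies on: requestor
IDs advance from the base BDF by phantom_stride for as long as they
stay within the same PCI slot, and each phantom entry now aliases the
base function's table pointer rather than getting its own. A standalone
sketch (not Xen code; the stride value and BDF are made-up examples):

#include <stdint.h>
#include <stdio.h>

#define PCI_SLOT(bdf) (((bdf) >> 3) & 0x1f)
#define PCI_FUNC(bdf) ((bdf) & 0x7)

int main(void)
{
    uint16_t base = (1 << 8) | (2 << 3) | 0;  /* hypothetical 01:02.0 */
    unsigned int stride = 2;                  /* hypothetical phantom stride */
    void *base_table = &base;                 /* stands in for the real IRT */
    void *shared[8] = { 0 };

    shared[PCI_FUNC(base)] = base_table;
    for ( uint16_t req = base + stride;
          PCI_SLOT(req) == PCI_SLOT(base); req += stride )
    {
        shared[PCI_FUNC(req)] = base_table;   /* share, don't duplicate */
        printf("phantom %02x:%02x.%u shares the base IRT\n",
               req >> 8, PCI_SLOT(req), PCI_FUNC(req));
    }
    return 0;
}

Freeing then only has to zap the aliased pointers (the "look ahead"
loop above) so the shared block is released exactly once.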
From patchwork Thu Sep 19 13:22:54 2019
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Wei Liu, Suravee Suthikulpanit, Roger Pau Monné
Date: Thu, 19 Sep 2019 15:22:54 +0200
Message-ID: <14624609-019f-2993-261c-d4f45ce78cbe@suse.com>
Subject: [Xen-devel] [PATCH v6 3/8] x86/PCI: read maximum MSI vector count early

Rather than doing this every time we set up interrupts for a device
anew (and then in several places), fill this invariant field right
after allocating struct pci_dev.

Signed-off-by: Jan Beulich
Reviewed-by: Roger Pau Monné
Reviewed-by: Paul Durrant
Reviewed-by: Andrew Cooper
---
v6: New.
---
 xen/arch/x86/msi.c            | 13 +++++--------
 xen/drivers/passthrough/pci.c | 10 ++++++++++
 xen/drivers/vpci/msi.c        |  9 ++++-----
 xen/include/xen/pci.h         |  3 ++-
 xen/include/xen/vpci.h        |  6 ++----
 5 files changed, 23 insertions(+), 18 deletions(-)

--- a/xen/arch/x86/msi.c
+++ b/xen/arch/x86/msi.c
@@ -664,7 +664,7 @@ static int msi_capability_init(struct pc
 {
     struct msi_desc *entry;
     int pos;
-    unsigned int i, maxvec, mpos;
+    unsigned int i, mpos;
     u16 control, seg = dev->seg;
     u8 bus = dev->bus;
     u8 slot = PCI_SLOT(dev->devfn);
@@ -675,9 +675,8 @@ static int msi_capability_init(struct pc
     if ( !pos )
         return -ENODEV;
     control = pci_conf_read16(dev->sbdf, msi_control_reg(pos));
-    maxvec = multi_msi_capable(control);
-    if ( nvec > maxvec )
-        return maxvec;
+    if ( nvec > dev->msi_maxvec )
+        return dev->msi_maxvec;
     control &= ~PCI_MSI_FLAGS_QSIZE;
     multi_msi_enable(control, nvec);
@@ -711,7 +710,7 @@ static int msi_capability_init(struct pc
 
         /* All MSIs are unmasked by default, Mask them all */
         maskbits = pci_conf_read32(dev->sbdf, mpos);
-        maskbits |= ~(u32)0 >> (32 - maxvec);
+        maskbits |= ~(u32)0 >> (32 - dev->msi_maxvec);
         pci_conf_write32(dev->sbdf, mpos, maskbits);
     }
     list_add_tail(&entry->list, &dev->msi_list);
@@ -1284,7 +1283,6 @@ int pci_msi_conf_write_intercept(struct
     entry = find_msi_entry(pdev, -1, PCI_CAP_ID_MSI);
     if ( entry && entry->msi_attrib.maskbit )
     {
-        uint16_t cntl;
         uint32_t unused;
         unsigned int nvec = entry->msi.nvec;
@@ -1297,8 +1295,7 @@ int pci_msi_conf_write_intercept(struct
         if ( reg < entry->msi.mpos || reg >= entry->msi.mpos + 4 || size != 4 )
             return -EACCES;
 
-        cntl = pci_conf_read16(pdev->sbdf, msi_control_reg(pos));
-        unused = ~(uint32_t)0 >> (32 - multi_msi_capable(cntl));
+        unused = ~(uint32_t)0 >> (32 - pdev->msi_maxvec);
 
         for ( pos = 0; pos < nvec; ++pos, ++entry )
         {
             entry->msi_attrib.guest_masked =
--- a/xen/drivers/passthrough/pci.c
+++ b/xen/drivers/passthrough/pci.c
@@ -340,6 +340,16 @@ static struct pci_dev *alloc_pdev(struct
     pdev->domain = NULL;
     INIT_LIST_HEAD(&pdev->msi_list);
+
+    pos = pci_find_cap_offset(pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
+                              PCI_CAP_ID_MSI);
+    if ( pos )
+    {
+        uint16_t ctrl = pci_conf_read16(pdev->sbdf, msi_control_reg(pos));
+
+        pdev->msi_maxvec = multi_msi_capable(ctrl);
+    }
+
     pos = pci_find_cap_offset(pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn),
                               PCI_CAP_ID_MSIX);
     if ( pos )
--- a/xen/drivers/vpci/msi.c
+++ b/xen/drivers/vpci/msi.c
@@ -27,7 +27,7 @@ static uint32_t control_read(const struc
 {
     const struct vpci_msi *msi = data;
 
-    return MASK_INSR(fls(msi->max_vectors) - 1, PCI_MSI_FLAGS_QMASK) |
+    return MASK_INSR(fls(pdev->msi_maxvec) - 1, PCI_MSI_FLAGS_QMASK) |
            MASK_INSR(fls(msi->vectors) - 1, PCI_MSI_FLAGS_QSIZE) |
            (msi->enabled ? PCI_MSI_FLAGS_ENABLE : 0) |
           (msi->masking ? PCI_MSI_FLAGS_MASKBIT : 0) |
@@ -40,7 +40,7 @@ static void control_write(const struct p
     struct vpci_msi *msi = data;
     unsigned int vectors = min_t(uint8_t,
                                  1u << MASK_EXTR(val, PCI_MSI_FLAGS_QSIZE),
-                                 msi->max_vectors);
+                                 pdev->msi_maxvec);
     bool new_enabled = val & PCI_MSI_FLAGS_ENABLE;
 
     /*
@@ -215,8 +215,7 @@ static int init_msi(struct pci_dev *pdev
      * FIXME: I've only been able to test this code with devices using a single
      * MSI interrupt and no mask register.
      */
-    pdev->vpci->msi->max_vectors = multi_msi_capable(control);
-    ASSERT(pdev->vpci->msi->max_vectors <= 32);
+    ASSERT(pdev->msi_maxvec <= 32);
 
     /* The multiple message enable is 0 after reset (1 message enabled). */
     pdev->vpci->msi->vectors = 1;
@@ -298,7 +297,7 @@ void vpci_dump_msi(void)
             if ( msi->masking )
                 printk(" mask=%08x", msi->mask);
             printk(" vectors max: %u enabled: %u\n",
-                   msi->max_vectors, msi->vectors);
+                   pdev->msi_maxvec, msi->vectors);
 
             vpci_msi_arch_print(msi);
         }
--- a/xen/include/xen/pci.h
+++ b/xen/include/xen/pci.h
@@ -94,7 +94,8 @@ struct pci_dev {
         pci_sbdf_t sbdf;
     };
 
-    u8 phantom_stride;
+    uint8_t msi_maxvec;
+    uint8_t phantom_stride;
 
     nodeid_t node; /* NUMA node */
--- a/xen/include/xen/vpci.h
+++ b/xen/include/xen/vpci.h
@@ -99,14 +99,12 @@ struct vpci {
         uint32_t mask;
         /* Data. */
         uint16_t data;
-        /* Maximum number of vectors supported by the device. */
-        uint8_t max_vectors : 6;
+        /* Number of vectors configured. */
+        uint8_t vectors : 6;
         /* Supports per-vector masking? */
         bool masking : 1;
         /* 64-bit address capable? */
         bool address64 : 1;
-        /* Number of vectors configured. */
-        uint8_t vectors : 6;
         /* Enabled? */
         bool enabled : 1;
         /* Arch-specific data. */
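For reference, the invariant being cached: the MSI capability's Message
Control register encodes the device's maximum vector count as log2 in
its Multiple Message Capable field (bits 3:1), which is what
multi_msi_capable() decodes. A small standalone illustration (not Xen
code; the register value is a made-up example):

#include <stdint.h>
#include <stdio.h>

static unsigned int multi_msi_capable(uint16_t control)
{
    return 1u << ((control >> 1) & 0x7);  /* Multiple Message Capable */
}

int main(void)
{
    /* 0x000a: MMC field = 5 -> 32 vectors, the architectural maximum. */
    uint16_t control = 0x000a;

    printf("msi_maxvec = %u\n", multi_msi_capable(control));
    return 0;
}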
From patchwork Thu Sep 19 13:23:23 2019
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Suravee Suthikulpanit
Date: Thu, 19 Sep 2019 15:23:23 +0200
Subject: [Xen-devel] [PATCH v6 4/8] AMD/IOMMU: replace INTREMAP_ENTRIES

Prepare for the number of entries to not be the maximum possible, by
separating checks against maximum size from ones against actual size.
For caller side simplicity have alloc_intremap_entry() return the
maximum possible value upon allocation failure, rather than the first
just out-of-bounds one.

Have the involved functions take all the subsequently needed arguments
here already, to reduce code churn in the patch actually making the
allocation size dynamic.

Signed-off-by: Jan Beulich
Acked-by: Andrew Cooper
Reviewed-by: Paul Durrant
---
v5: New.
---
 xen/drivers/passthrough/amd/iommu_intr.c      | 93 +++++++++++++-----------
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  2
 2 files changed, 59 insertions(+), 36 deletions(-)

--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -69,7 +69,7 @@ union irte_cptr {
     const union irte128 *ptr128;
 } __transparent__;
 
-#define INTREMAP_ENTRIES (1 << IOMMU_INTREMAP_ORDER)
+#define INTREMAP_MAX_ENTRIES (1 << IOMMU_INTREMAP_ORDER)
 
 struct ioapic_sbdf ioapic_sbdf[MAX_IO_APICS];
 struct hpet_sbdf hpet_sbdf;
@@ -83,8 +83,20 @@ static void dump_intremap_tables(unsigne
 static unsigned int __init intremap_table_order(const struct amd_iommu *iommu)
 {
     return iommu->ctrl.ga_en
-           ? get_order_from_bytes(INTREMAP_ENTRIES * sizeof(union irte128))
-           : get_order_from_bytes(INTREMAP_ENTRIES * sizeof(union irte32));
+           ? get_order_from_bytes(INTREMAP_MAX_ENTRIES * sizeof(union irte128))
+           : get_order_from_bytes(INTREMAP_MAX_ENTRIES * sizeof(union irte32));
+}
+
+unsigned int amd_iommu_intremap_table_order(
+    const void *irt, const struct amd_iommu *iommu)
+{
+    return IOMMU_INTREMAP_ORDER;
+}
+
+static unsigned int intremap_table_entries(
+    const void *irt, const struct amd_iommu *iommu)
+{
+    return 1u << amd_iommu_intremap_table_order(irt, iommu);
 }
 
 unsigned int ioapic_id_to_index(unsigned int apic_id)
@@ -122,20 +134,24 @@ static int get_intremap_requestor_id(int
     return get_ivrs_mappings(seg)[bdf].dte_requestor_id;
 }
 
-static unsigned int alloc_intremap_entry(int seg, int bdf, unsigned int nr)
+static unsigned int alloc_intremap_entry(const struct amd_iommu *iommu,
+                                         unsigned int bdf, unsigned int nr)
 {
-    unsigned long *inuse = get_ivrs_mappings(seg)[bdf].intremap_inuse;
-    unsigned int slot = find_first_zero_bit(inuse, INTREMAP_ENTRIES);
+    const struct ivrs_mappings *ivrs_mappings = get_ivrs_mappings(iommu->seg);
+    unsigned long *inuse = ivrs_mappings[bdf].intremap_inuse;
+    unsigned int nr_ents =
+        intremap_table_entries(ivrs_mappings[bdf].intremap_table, iommu);
+    unsigned int slot = find_first_zero_bit(inuse, nr_ents);
 
     for ( ; ; )
     {
         unsigned int end;
 
-        if ( slot >= INTREMAP_ENTRIES )
+        if ( slot >= nr_ents )
             break;
-        end = find_next_bit(inuse, INTREMAP_ENTRIES, slot + 1);
-        if ( end > INTREMAP_ENTRIES )
-            end = INTREMAP_ENTRIES;
+        end = find_next_bit(inuse, nr_ents, slot + 1);
+        if ( end > nr_ents )
+            end = nr_ents;
         slot = (slot + nr - 1) & ~(nr - 1);
         if ( slot + nr <= end )
         {
@@ -144,12 +160,12 @@ static unsigned int alloc_intremap_entry
             break;
         }
         slot = (end + nr) & ~(nr - 1);
-        if ( slot >= INTREMAP_ENTRIES )
+        if ( slot >= nr_ents )
             break;
-        slot = find_next_zero_bit(inuse, INTREMAP_ENTRIES, slot);
+        slot = find_next_zero_bit(inuse, nr_ents, slot);
     }
 
-    return slot;
+    return slot < nr_ents ? slot : INTREMAP_MAX_ENTRIES;
 }
 
 static union irte_ptr get_intremap_entry(const struct amd_iommu *iommu,
@@ -159,7 +175,7 @@ static union irte_ptr get_intremap_entry
         .ptr = get_ivrs_mappings(iommu->seg)[bdf].intremap_table
     };
 
-    ASSERT(table.ptr && (index < INTREMAP_ENTRIES));
+    ASSERT(table.ptr && (index < intremap_table_entries(table.ptr, iommu)));
 
     if ( iommu->ctrl.ga_en )
         table.ptr128 += index;
@@ -279,10 +295,10 @@ static int update_intremap_entry_from_io
     spin_lock_irqsave(lock, flags);
 
     offset = *index;
-    if ( offset >= INTREMAP_ENTRIES )
+    if ( offset >= INTREMAP_MAX_ENTRIES )
     {
-        offset = alloc_intremap_entry(iommu->seg, req_id, 1);
-        if ( offset >= INTREMAP_ENTRIES )
+        offset = alloc_intremap_entry(iommu, req_id, 1);
+        if ( offset >= INTREMAP_MAX_ENTRIES )
         {
             spin_unlock_irqrestore(lock, flags);
             rte->mask = 1;
@@ -400,8 +416,8 @@ int __init amd_iommu_setup_ioapic_remapp
         }
 
         spin_lock_irqsave(lock, flags);
-        offset = alloc_intremap_entry(seg, req_id, 1);
-        BUG_ON(offset >= INTREMAP_ENTRIES);
+        offset = alloc_intremap_entry(iommu, req_id, 1);
+        BUG_ON(offset >= INTREMAP_MAX_ENTRIES);
         entry = get_intremap_entry(iommu, req_id, offset);
         update_intremap_entry(iommu, entry, vector,
                               delivery_mode, dest_mode, dest);
@@ -476,7 +492,7 @@ void amd_iommu_ioapic_update_ire(
         *(((u32 *)&new_rte) + 1) = value;
     }
 
-    if ( ioapic_sbdf[idx].pin_2_idx[pin] >= INTREMAP_ENTRIES )
+    if ( ioapic_sbdf[idx].pin_2_idx[pin] >= INTREMAP_MAX_ENTRIES )
     {
         ASSERT(saved_mask);
@@ -548,7 +564,7 @@ unsigned int amd_iommu_read_ioapic_from_
         return val;
 
     offset = ioapic_sbdf[idx].pin_2_idx[pin];
-    if ( offset >= INTREMAP_ENTRIES )
+    if ( offset >= INTREMAP_MAX_ENTRIES )
         return val;
 
     seg = ioapic_sbdf[idx].seg;
@@ -561,8 +577,8 @@ unsigned int amd_iommu_read_ioapic_from_
 
     if ( !(reg & 1) )
     {
-        ASSERT(offset == (val & (INTREMAP_ENTRIES - 1)));
-        val &= ~(INTREMAP_ENTRIES - 1);
+        ASSERT(offset == (val & (INTREMAP_MAX_ENTRIES - 1)));
+        val &= ~(INTREMAP_MAX_ENTRIES - 1);
         /* The IntType fields match for both formats. */
         val |= MASK_INSR(entry.ptr32->flds.int_type,
                          IO_APIC_REDIR_DELIV_MODE_MASK);
@@ -622,11 +638,11 @@ static int update_intremap_entry_from_ms
     dest = MASK_EXTR(msg->address_lo, MSI_ADDR_DEST_ID_MASK);
 
     offset = *remap_index;
-    if ( offset >= INTREMAP_ENTRIES )
+    if ( offset >= INTREMAP_MAX_ENTRIES )
     {
         ASSERT(nr);
-        offset = alloc_intremap_entry(iommu->seg, bdf, nr);
-        if ( offset >= INTREMAP_ENTRIES )
+        offset = alloc_intremap_entry(iommu, bdf, nr);
+        if ( offset >= INTREMAP_MAX_ENTRIES )
         {
             spin_unlock_irqrestore(lock, flags);
             return -ENOSPC;
@@ -654,7 +670,7 @@ static int update_intremap_entry_from_ms
     update_intremap_entry(iommu, entry, vector, delivery_mode, dest_mode, dest);
     spin_unlock_irqrestore(lock, flags);
 
-    *data = (msg->data & ~(INTREMAP_ENTRIES - 1)) | offset;
+    *data = (msg->data & ~(INTREMAP_MAX_ENTRIES - 1)) | offset;
 
     /*
     * In some special cases, a pci-e device(e.g SATA controller in IDE mode)
@@ -738,7 +754,7 @@ int amd_iommu_msi_msg_update_ire(
 void amd_iommu_read_msi_from_ire(
     struct msi_desc *msi_desc, struct msi_msg *msg)
 {
-    unsigned int offset = msg->data & (INTREMAP_ENTRIES - 1);
+    unsigned int offset = msg->data & (INTREMAP_MAX_ENTRIES - 1);
     const struct pci_dev *pdev = msi_desc->dev;
     u16 bdf = pdev ? PCI_BDF2(pdev->bus, pdev->devfn) : hpet_sbdf.bdf;
     u16 seg = pdev ? pdev->seg : hpet_sbdf.seg;
@@ -758,7 +774,7 @@ void amd_iommu_read_msi_from_ire(
         offset |= nr;
     }
 
-    msg->data &= ~(INTREMAP_ENTRIES - 1);
+    msg->data &= ~(INTREMAP_MAX_ENTRIES - 1);
     /* The IntType fields match for both formats. */
     msg->data |= MASK_INSR(entry.ptr32->flds.int_type,
                            MSI_DATA_DELIVERY_MODE_MASK);
@@ -824,8 +840,9 @@ void *amd_iommu_alloc_intremap_table(
 
     if ( tb )
     {
-        *inuse_map = xzalloc_array(unsigned long,
-                                   BITS_TO_LONGS(INTREMAP_ENTRIES));
+        unsigned int nr = intremap_table_entries(tb, iommu);
+
+        *inuse_map = xzalloc_array(unsigned long, BITS_TO_LONGS(nr));
         if ( *inuse_map )
             memset(tb, 0, PAGE_SIZE << order);
         else
@@ -869,6 +886,7 @@ bool __init iov_supports_xt(void)
 
 int __init amd_setup_hpet_msi(struct msi_desc *msi_desc)
 {
+    const struct amd_iommu *iommu;
     spinlock_t *lock;
     unsigned long flags;
     int rc = 0;
@@ -886,12 +904,15 @@ int __init amd_setup_hpet_msi(struct msi
         return -ENODEV;
     }
 
+    iommu = find_iommu_for_device(hpet_sbdf.seg, hpet_sbdf.bdf);
+    if ( !iommu )
+        return -ENXIO;
+
     lock = get_intremap_lock(hpet_sbdf.seg, hpet_sbdf.bdf);
     spin_lock_irqsave(lock, flags);
 
-    msi_desc->remap_index = alloc_intremap_entry(hpet_sbdf.seg,
-                                                 hpet_sbdf.bdf, 1);
-    if ( msi_desc->remap_index >= INTREMAP_ENTRIES )
+    msi_desc->remap_index = alloc_intremap_entry(iommu, hpet_sbdf.bdf, 1);
+    if ( msi_desc->remap_index >= INTREMAP_MAX_ENTRIES )
     {
         msi_desc->remap_index = -1;
         rc = -ENXIO;
@@ -906,12 +927,12 @@ static void dump_intremap_table(const st
                                 union irte_cptr tbl,
                                 const struct ivrs_mappings *ivrs_mapping)
 {
-    unsigned int count;
+    unsigned int count, nr = intremap_table_entries(tbl.ptr, iommu);
 
     if ( !tbl.ptr )
         return;
 
-    for ( count = 0; count < INTREMAP_ENTRIES; count++ )
+    for ( count = 0; count < nr; count++ )
     {
         if ( iommu->ctrl.ga_en
              ? !tbl.ptr128[count].raw[0] && !tbl.ptr128[count].raw[1]
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -102,6 +102,8 @@ void *amd_iommu_alloc_intremap_table(
     const struct amd_iommu *, unsigned long **);
 int amd_iommu_free_intremap_table(
     const struct amd_iommu *, struct ivrs_mappings *, uint16_t);
+unsigned int amd_iommu_intremap_table_order(
+    const void *irt, const struct amd_iommu *iommu);
 void amd_iommu_ioapic_update_ire(
     unsigned int apic, unsigned int reg, unsigned int value);
 unsigned int amd_iommu_read_ioapic_from_ire(
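One detail of alloc_intremap_entry() worth spelling out: a multi-vector
MSI block of nr entries (nr a power of 2) must start at an nr-aligned
slot, because the low bits of the MSI data register select the vector
within the block. A standalone sketch of just that alignment arithmetic
(not Xen code; the slot numbers are made-up examples, and the real code
additionally scans an in-use bitmap):

#include <stdio.h>

int main(void)
{
    unsigned int nr = 4;          /* vectors requested (power of 2) */
    unsigned int first_free = 6;  /* hypothetical first free slot */

    /* Round up to the next nr-aligned slot, as the allocator does. */
    unsigned int slot = (first_free + nr - 1) & ~(nr - 1);

    printf("block of %u entries starts at slot %u\n", nr, slot);
    return 0;
}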
From patchwork Thu Sep 19 13:23:51 2019
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Suravee Suthikulpanit
Date: Thu, 19 Sep 2019 15:23:51 +0200
Message-ID: <6e78af43-45f4-b2b1-26e5-04dfb40f1858@suse.com>
Subject: [Xen-devel] [PATCH v6 5/8] AMD/IOMMU: restrict interrupt remapping table sizes

There's no point setting up tables with more space than a PCI device
can use. For both MSI and MSI-X we can determine how many interrupts
could be set up at most. Tables allocated during ACPI table parsing,
however, will (for now at least) continue to be set up to have maximum
size.

Note that until we would want to use sub-page allocations here there's
no point checking whether both MSI and MSI-X are supported by a device -
an order-0 allocation will fit the dual case in any event, no matter
that the MSI-X vector count may be smaller than the MSI one.

On my Rome system this reduces space needed from just over 1k pages to
about 125.

Suggested-by: Andrew Cooper
Signed-off-by: Jan Beulich
Reviewed-by: Paul Durrant
Reviewed-by: Andrew Cooper
---
v6: Don't allocate any IRT at all when device is neither MSI-X nor
    MSI-capable. Re-base over changes earlier in this series.
v5: New.
---
 xen/drivers/passthrough/amd/iommu_acpi.c      |  4 +-
 xen/drivers/passthrough/amd/iommu_init.c      | 13 ++++-----
 xen/drivers/passthrough/amd/iommu_intr.c      | 36 +++++++++++++-----------
 xen/drivers/passthrough/amd/iommu_map.c       | 20 ++++++++++----
 xen/drivers/passthrough/amd/pci_amd_iommu.c   | 18 +++++++------
 xen/include/asm-x86/hvm/svm/amd-iommu-defs.h  |  3 --
 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h |  5 ++-
 7 files changed, 59 insertions(+), 40 deletions(-)

--- a/xen/drivers/passthrough/amd/iommu_acpi.c
+++ b/xen/drivers/passthrough/amd/iommu_acpi.c
@@ -77,7 +77,7 @@ static void __init add_ivrs_mapping_entr
         {
             if ( !shared_intremap_table )
                 shared_intremap_table = amd_iommu_alloc_intremap_table(
-                    iommu, &shared_intremap_inuse);
+                    iommu, &shared_intremap_inuse, 0);
 
             if ( !shared_intremap_table )
                 panic("No memory for shared IRT\n");
@@ -89,7 +89,7 @@ static void __init add_ivrs_mapping_entr
         {
             ivrs_mappings[alias_id].intremap_table =
                 amd_iommu_alloc_intremap_table(
-                    iommu, &ivrs_mappings[alias_id].intremap_inuse);
+                    iommu, &ivrs_mappings[alias_id].intremap_inuse, 0);
 
             if ( !ivrs_mappings[alias_id].intremap_table )
                 panic("No memory for %04x:%02x:%02x.%u's IRT\n",
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1284,12 +1284,14 @@ static int __init amd_iommu_setup_device
                 pcidevs_unlock();
             }
 
-            if ( pdev )
+            if ( pdev && (pdev->msix || pdev->msi_maxvec) )
             {
                 ivrs_mappings[bdf].intremap_table =
                     amd_iommu_alloc_intremap_table(
                         ivrs_mappings[bdf].iommu,
-                        &ivrs_mappings[bdf].intremap_inuse);
+                        &ivrs_mappings[bdf].intremap_inuse,
+                        pdev->msix ? pdev->msix->nr_entries
+                                   : pdev->msi_maxvec);
                 if ( !ivrs_mappings[bdf].intremap_table )
                     return -ENOMEM;
 
@@ -1312,11 +1314,8 @@ static int __init amd_iommu_setup_device
             }
 
             amd_iommu_set_intremap_table(
-                dte,
-                ivrs_mappings[bdf].intremap_table
-                ? virt_to_maddr(ivrs_mappings[bdf].intremap_table)
-                : 0,
-                iommu_intremap);
+                dte, ivrs_mappings[bdf].intremap_table,
+                ivrs_mappings[bdf].iommu, iommu_intremap);
         }
     }
--- a/xen/drivers/passthrough/amd/iommu_intr.c
+++ b/xen/drivers/passthrough/amd/iommu_intr.c
@@ -69,7 +69,8 @@ union irte_cptr {
     const union irte128 *ptr128;
 } __transparent__;
 
-#define INTREMAP_MAX_ENTRIES (1 << IOMMU_INTREMAP_ORDER)
+#define INTREMAP_MAX_ORDER   0xB
+#define INTREMAP_MAX_ENTRIES (1 << INTREMAP_MAX_ORDER)
 
 struct ioapic_sbdf ioapic_sbdf[MAX_IO_APICS];
 struct hpet_sbdf hpet_sbdf;
@@ -80,17 +81,13 @@ unsigned int nr_ioapic_sbdf;
 
 static void dump_intremap_tables(unsigned char key);
 
-static unsigned int __init intremap_table_order(const struct amd_iommu *iommu)
-{
-    return iommu->ctrl.ga_en
-           ? get_order_from_bytes(INTREMAP_MAX_ENTRIES * sizeof(union irte128))
-           : get_order_from_bytes(INTREMAP_MAX_ENTRIES * sizeof(union irte32));
-}
+#define intremap_page_order(irt) PFN_ORDER(virt_to_page(irt))
 
 unsigned int amd_iommu_intremap_table_order(
     const void *irt, const struct amd_iommu *iommu)
 {
-    return IOMMU_INTREMAP_ORDER;
+    return intremap_page_order(irt) + PAGE_SHIFT -
+           (iommu->ctrl.ga_en ? 4 : 2);
 }
 
 static unsigned int intremap_table_entries(
@@ -825,7 +822,10 @@ int amd_iommu_free_intremap_table(
 
         if ( *tblp )
         {
-            __free_amd_iommu_tables(*tblp, intremap_table_order(iommu));
+            unsigned int order = intremap_page_order(*tblp);
+
+            intremap_page_order(*tblp) = 0;
+            __free_amd_iommu_tables(*tblp, order);
             *tblp = NULL;
         }
@@ -833,15 +833,23 @@ int amd_iommu_free_intremap_table(
 }
 
 void *amd_iommu_alloc_intremap_table(
-    const struct amd_iommu *iommu, unsigned long **inuse_map)
+    const struct amd_iommu *iommu, unsigned long **inuse_map, unsigned int nr)
 {
-    unsigned int order = intremap_table_order(iommu);
-    void *tb = __alloc_amd_iommu_tables(order);
+    unsigned int order;
+    void *tb;
 
+    if ( !nr )
+        nr = INTREMAP_MAX_ENTRIES;
+
+    order = iommu->ctrl.ga_en
+            ? get_order_from_bytes(nr * sizeof(union irte128))
+            : get_order_from_bytes(nr * sizeof(union irte32));
+
+    tb = __alloc_amd_iommu_tables(order);
     if ( tb )
     {
-        unsigned int nr = intremap_table_entries(tb, iommu);
-
+        intremap_page_order(tb) = order;
+        nr = intremap_table_entries(tb, iommu);
         *inuse_map = xzalloc_array(unsigned long, BITS_TO_LONGS(nr));
         if ( *inuse_map )
             memset(tb, 0, PAGE_SIZE << order);
--- a/xen/drivers/passthrough/amd/iommu_map.c
+++ b/xen/drivers/passthrough/amd/iommu_map.c
@@ -113,12 +113,22 @@ void amd_iommu_set_root_page_table(struc
 }
 
 void __init amd_iommu_set_intremap_table(
-    struct amd_iommu_dte *dte, uint64_t intremap_ptr, bool valid)
+    struct amd_iommu_dte *dte, const void *ptr,
+    const struct amd_iommu *iommu, bool valid)
 {
-    dte->it_root = intremap_ptr >> 6;
-    dte->int_tab_len = intremap_ptr ? IOMMU_INTREMAP_ORDER : 0;
-    dte->int_ctl = intremap_ptr ? IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED
-                                : IOMMU_DEV_TABLE_INT_CONTROL_ABORTED;
+    if ( ptr )
+    {
+        dte->it_root = virt_to_maddr(ptr) >> 6;
+        dte->int_tab_len = amd_iommu_intremap_table_order(ptr, iommu);
+        dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED;
+    }
+    else
+    {
+        dte->it_root = 0;
+        dte->int_tab_len = 0;
+        dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_ABORTED;
+    }
+
     dte->ig = false; /* unmapped interrupts result in i/o page faults */
     dte->iv = valid;
 }
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -470,18 +470,22 @@ static int amd_iommu_add_device(u8 devfn
     {
         unsigned long flags;
 
-        ivrs_mappings[bdf].intremap_table =
-            amd_iommu_alloc_intremap_table(
-                iommu, &ivrs_mappings[bdf].intremap_inuse);
-        if ( !ivrs_mappings[bdf].intremap_table )
-            return -ENOMEM;
+        if ( pdev->msix || pdev->msi_maxvec )
+        {
+            ivrs_mappings[bdf].intremap_table =
+                amd_iommu_alloc_intremap_table(
+                    iommu, &ivrs_mappings[bdf].intremap_inuse,
+                    pdev->msix ? pdev->msix->nr_entries
+                               : pdev->msi_maxvec);
+            if ( !ivrs_mappings[bdf].intremap_table )
+                return -ENOMEM;
+        }
 
         spin_lock_irqsave(&iommu->lock, flags);
 
         amd_iommu_set_intremap_table(
             iommu->dev_table.buffer + (bdf * IOMMU_DEV_TABLE_ENTRY_SIZE),
-            virt_to_maddr(ivrs_mappings[bdf].intremap_table),
-            iommu_intremap);
+            ivrs_mappings[bdf].intremap_table, iommu, iommu_intremap);
 
         amd_iommu_flush_device(iommu, bdf);
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-defs.h
@@ -107,9 +107,6 @@
 #define IOMMU_DEV_TABLE_INT_CONTROL_FORWARDED   0x1
 #define IOMMU_DEV_TABLE_INT_CONTROL_TRANSLATED  0x2
 
-/* For now, we always allocate the maximum: 2048 entries. */
-#define IOMMU_INTREMAP_ORDER    0xB
-
 struct amd_iommu_dte {
     /* 0 - 63 */
     bool v:1;
--- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
+++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h
@@ -72,7 +72,8 @@ int __must_check amd_iommu_flush_iotlb_a
 /* device table functions */
 int get_dma_requestor_id(uint16_t seg, uint16_t bdf);
 void amd_iommu_set_intremap_table(struct amd_iommu_dte *dte,
-                                  uint64_t intremap_ptr,
+                                  const void *ptr,
+                                  const struct amd_iommu *iommu,
                                   bool valid);
 void amd_iommu_set_root_page_table(struct amd_iommu_dte *dte,
                                    uint64_t root_ptr, uint16_t domain_id,
@@ -99,7 +100,7 @@ struct amd_iommu *find_iommu_for_device(
 bool iov_supports_xt(void);
 int amd_iommu_setup_ioapic_remapping(void);
 void *amd_iommu_alloc_intremap_table(
-    const struct amd_iommu *, unsigned long **);
+    const struct amd_iommu *, unsigned long **, unsigned int nr);
 int amd_iommu_free_intremap_table(
     const struct amd_iommu *, struct ivrs_mappings *, uint16_t);
 unsigned int amd_iommu_intremap_table_order(
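A worked example of the size arithmetic this patch introduces, since it
is easy to misread: the DTE's IntTabLen field wants log2 of the entry
count, which amd_iommu_intremap_table_order() now recovers from the
allocation's page order as page_order + PAGE_SHIFT - log2(sizeof(IRTE)),
with 128-bit IRTEs (ga_en) being 2^4 bytes and 32-bit ones 2^2. A
standalone sketch (not Xen code):

#include <stdio.h>

#define PAGE_SHIFT 12

static unsigned int table_order(unsigned int page_order, int ga_en)
{
    return page_order + PAGE_SHIFT - (ga_en ? 4 : 2);
}

int main(void)
{
    /* One 4k page of 128-bit IRTEs holds 2^(0+12-4) = 256 entries. */
    printf("ga_en, order-0 page: 2^%u entries\n", table_order(0, 1));
    /* The prior fixed scheme: 2^0xB = 2048 entries of 16 bytes = 8 pages. */
    printf("maximum table: %u bytes\n", (1u << 0xB) * 16);
    return 0;
}

This is why a device with, say, a 16-entry MSI-X table now gets by with
a single page where the fixed scheme always used eight.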
From patchwork Thu Sep 19 13:24:11 2019
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Suravee Suthikulpanit
Date: Thu, 19 Sep 2019 15:24:11 +0200
Message-ID: <6de11867-b872-a2a1-7c26-af004164bfea@suse.com>
Subject: [Xen-devel] [PATCH v6 6/8] AMD/IOMMU: tidy struct ivrs_mappings

Move the device flags field up into an unused hole, thus shrinking
overall structure size by 8 bytes. Use bool and uint<N>_t as
appropriate.

Drop pointless (redundant) initializations.

Signed-off-by: Jan Beulich
Reviewed-by: Paul Durrant
Acked-by: Andrew Cooper
---
v6: New.
---
 xen/drivers/passthrough/amd/iommu_acpi.c |  6 +++---
 xen/drivers/passthrough/amd/iommu_init.c |  6 ------
 xen/include/asm-x86/amd-iommu.h          | 17 +++++++++--------
 3 files changed, 12 insertions(+), 17 deletions(-)

--- a/xen/drivers/passthrough/amd/iommu_acpi.c
+++ b/xen/drivers/passthrough/amd/iommu_acpi.c
@@ -165,7 +165,7 @@ static void __init reserve_unity_map_for
     /* extend r/w permissioms and keep aggregate */
     ivrs_mappings[bdf].write_permission = iw;
     ivrs_mappings[bdf].read_permission = ir;
-    ivrs_mappings[bdf].unity_map_enable = IOMMU_CONTROL_ENABLED;
+    ivrs_mappings[bdf].unity_map_enable = true;
     ivrs_mappings[bdf].addr_range_start = base;
     ivrs_mappings[bdf].addr_range_length = length;
 }
@@ -242,8 +242,8 @@ static int __init register_exclusion_ran
     if ( limit >= iommu_top )
     {
         reserve_iommu_exclusion_range(iommu, base, limit);
-        ivrs_mappings[bdf].dte_allow_exclusion = IOMMU_CONTROL_ENABLED;
-        ivrs_mappings[req].dte_allow_exclusion = IOMMU_CONTROL_ENABLED;
+        ivrs_mappings[bdf].dte_allow_exclusion = true;
+        ivrs_mappings[req].dte_allow_exclusion = true;
     }
 
     return 0;
--- a/xen/drivers/passthrough/amd/iommu_init.c
+++ b/xen/drivers/passthrough/amd/iommu_init.c
@@ -1222,12 +1222,6 @@ static int __init alloc_ivrs_mappings(u1
     for ( bdf = 0; bdf < ivrs_bdf_entries; bdf++ )
     {
         ivrs_mappings[bdf].dte_requestor_id = bdf;
-        ivrs_mappings[bdf].dte_allow_exclusion = IOMMU_CONTROL_DISABLED;
-        ivrs_mappings[bdf].unity_map_enable = IOMMU_CONTROL_DISABLED;
-        ivrs_mappings[bdf].iommu = NULL;
-
-        ivrs_mappings[bdf].intremap_table = NULL;
-        ivrs_mappings[bdf].device_flags = 0;
 
         if ( amd_iommu_perdev_intremap )
             spin_lock_init(&ivrs_mappings[bdf].intremap_lock);
--- a/xen/include/asm-x86/amd-iommu.h
+++ b/xen/include/asm-x86/amd-iommu.h
@@ -106,12 +106,16 @@ struct amd_iommu {
 };
 
 struct ivrs_mappings {
-    u16 dte_requestor_id;
-    u8 dte_allow_exclusion;
-    u8 unity_map_enable;
-    u8 write_permission;
-    u8 read_permission;
+    uint16_t dte_requestor_id;
     bool valid;
+    bool dte_allow_exclusion;
+    bool unity_map_enable;
+    bool write_permission;
+    bool read_permission;
+
+    /* ivhd device data settings */
+    uint8_t device_flags;
+
     unsigned long addr_range_start;
     unsigned long addr_range_length;
     struct amd_iommu *iommu;
@@ -120,9 +124,6 @@ struct ivrs_mappings {
     void *intremap_table;
     unsigned long *intremap_inuse;
     spinlock_t intremap_lock;
-
-    /* ivhd device data settings */
-    u8 device_flags;
 };
 
 extern unsigned int ivrs_bdf_entries;
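Why this saves exactly 8 bytes: a lone one-byte field trailing the
pointer-sized members forces the compiler to pad the structure back up
to its alignment, while tucking that byte in beside the other small
fields reuses padding that was already there. An illustration with a
simplified stand-in struct (not the Xen one; sizes assume a typical
LP64 ABI):

#include <stdint.h>
#include <stdio.h>

struct before {
    uint16_t id;
    uint8_t small[5];
    unsigned long big[2];
    uint8_t flags;      /* tail byte -> up to 7 bytes of padding */
};

struct after {
    uint16_t id;
    uint8_t small[5];
    uint8_t flags;      /* fills the hole before 'big' */
    unsigned long big[2];
};

int main(void)
{
    printf("before: %zu bytes, after: %zu bytes\n",
           sizeof(struct before), sizeof(struct after));
    return 0;
}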
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Suravee Suthikulpanit
Date: Thu, 19 Sep 2019 15:24:44 +0200
Subject: [Xen-devel] [PATCH v6 7/8] AMD/IOMMU: allocate one device table per PCI segment

Having a single device table for all segments can't possibly be right. (Even worse, the symbol wasn't static despite being used in just one source file.) Attach the device tables to their respective IVRS mapping ones.

Signed-off-by: Jan Beulich
---
v6: New.
---
 xen/drivers/passthrough/amd/iommu_init.c | 81 ++++++++++++++++---------------
 1 file changed, 43 insertions(+), 38 deletions(-)

--- a/xen/drivers/passthrough/amd/iommu_init.c +++ b/xen/drivers/passthrough/amd/iommu_init.c @@ -39,7 +39,6 @@ unsigned int __read_mostly ivrs_bdf_entr u8 __read_mostly ivhd_type; static struct radix_tree_root ivrs_maps; LIST_HEAD_READ_MOSTLY(amd_iommu_head); -struct table_struct device_table; bool_t iommuv2_enabled; static bool iommu_has_ht_flag(struct amd_iommu *iommu, u8 mask) @@ -989,6 +988,12 @@ static void disable_iommu(struct amd_iom spin_unlock_irqrestore(&iommu->lock, flags); } +static unsigned int __init dt_alloc_size(void) +{ + return PAGE_SIZE << get_order_from_bytes(ivrs_bdf_entries * + IOMMU_DEV_TABLE_ENTRY_SIZE); +} + static void __init deallocate_buffer(void *buf, uint32_t sz) { int order = 0; @@ -999,12 +1004,6 @@ static void __init deallocate_buffer(voi } } -static void __init deallocate_device_table(struct table_struct *table) -{ - deallocate_buffer(table->buffer, table->alloc_size); - table->buffer = NULL; -} - static void __init deallocate_ring_buffer(struct ring_buffer *ring_buf) { deallocate_buffer(ring_buf->buffer, ring_buf->alloc_size); @@ -1068,8 +1067,29 @@ static void * __init allocate_ppr_log(st IOMMU_PPR_LOG_DEFAULT_ENTRIES, "PPR Log"); } +/* + * Within ivrs_mappings[] we allocate an extra array element to store + * - segment number, + * - device table.
+ */ +#define IVRS_MAPPINGS_SEG(m) (m)[ivrs_bdf_entries].dte_requestor_id +#define IVRS_MAPPINGS_DEVTAB(m) (m)[ivrs_bdf_entries].intremap_table + +static void __init free_ivrs_mapping(void *ptr) +{ + const struct ivrs_mappings *ivrs_mappings = ptr; + + if ( IVRS_MAPPINGS_DEVTAB(ivrs_mappings) ) + deallocate_buffer(IVRS_MAPPINGS_DEVTAB(ivrs_mappings), + dt_alloc_size()); + + xfree(ptr); +} + static int __init amd_iommu_init_one(struct amd_iommu *iommu, bool intr) { + const struct ivrs_mappings *ivrs_mappings; + if ( allocate_cmd_buffer(iommu) == NULL ) goto error_out; @@ -1082,13 +1102,15 @@ static int __init amd_iommu_init_one(str if ( intr && !set_iommu_interrupt_handler(iommu) ) goto error_out; - /* To make sure that device_table.buffer has been successfully allocated */ - if ( device_table.buffer == NULL ) + /* Make sure that the device table has been successfully allocated. */ + ivrs_mappings = get_ivrs_mappings(iommu->seg); + if ( !IVRS_MAPPINGS_DEVTAB(ivrs_mappings) ) goto error_out; - iommu->dev_table.alloc_size = device_table.alloc_size; - iommu->dev_table.entries = device_table.entries; - iommu->dev_table.buffer = device_table.buffer; + iommu->dev_table.alloc_size = dt_alloc_size(); + iommu->dev_table.entries = iommu->dev_table.alloc_size / + IOMMU_DEV_TABLE_ENTRY_SIZE; + iommu->dev_table.buffer = IVRS_MAPPINGS_DEVTAB(ivrs_mappings); enable_iommu(iommu); printk("AMD-Vi: IOMMU %d Enabled.\n", nr_amd_iommus ); @@ -1135,11 +1157,8 @@ static void __init amd_iommu_init_cleanu xfree(iommu); } - /* free device table */ - deallocate_device_table(&device_table); - - /* free ivrs_mappings[] */ - radix_tree_destroy(&ivrs_maps, xfree); + /* Free ivrs_mappings[] and their device tables. */ + radix_tree_destroy(&ivrs_maps, free_ivrs_mapping); iommu_enabled = 0; iommu_hwdom_passthrough = false; @@ -1147,12 +1166,6 @@ static void __init amd_iommu_init_cleanu iommuv2_enabled = 0; } -/* - * We allocate an extra array element to store the segment number - * (and in the future perhaps other global information). 
- */ -#define IVRS_MAPPINGS_SEG(m) m[ivrs_bdf_entries].dte_requestor_id - struct ivrs_mappings *get_ivrs_mappings(u16 seg) { return radix_tree_lookup(&ivrs_maps, seg); @@ -1235,24 +1248,18 @@ static int __init alloc_ivrs_mappings(u1 static int __init amd_iommu_setup_device_table( u16 seg, struct ivrs_mappings *ivrs_mappings) { + struct amd_iommu_dte *dt = IVRS_MAPPINGS_DEVTAB(ivrs_mappings); unsigned int bdf; BUG_ON( (ivrs_bdf_entries == 0) ); - if ( !device_table.buffer ) + if ( !dt ) { /* allocate 'device table' on a 4K boundary */ - device_table.alloc_size = PAGE_SIZE << - get_order_from_bytes( - PAGE_ALIGN(ivrs_bdf_entries * - IOMMU_DEV_TABLE_ENTRY_SIZE)); - device_table.entries = device_table.alloc_size / - IOMMU_DEV_TABLE_ENTRY_SIZE; - - device_table.buffer = allocate_buffer(device_table.alloc_size, - "Device Table"); + dt = IVRS_MAPPINGS_DEVTAB(ivrs_mappings) = + allocate_buffer(dt_alloc_size(), "Device Table"); } - if ( !device_table.buffer ) + if ( !dt ) return -ENOMEM; /* Add device table entries */ @@ -1260,12 +1267,10 @@ static int __init amd_iommu_setup_device { if ( ivrs_mappings[bdf].valid ) { - void *dte; const struct pci_dev *pdev = NULL; /* add device table entry */ - dte = device_table.buffer + (bdf * IOMMU_DEV_TABLE_ENTRY_SIZE); - iommu_dte_add_device_entry(dte, &ivrs_mappings[bdf]); + iommu_dte_add_device_entry(&dt[bdf], &ivrs_mappings[bdf]); if ( iommu_intremap && ivrs_mappings[bdf].dte_requestor_id == bdf && @@ -1308,7 +1313,7 @@ } amd_iommu_set_intremap_table( - dte, ivrs_mappings[bdf].intremap_table, + &dt[bdf], ivrs_mappings[bdf].intremap_table, ivrs_mappings[bdf].iommu, iommu_intremap); } }

From patchwork Thu Sep 19 13:25:20 2019
X-Patchwork-Submitter: Jan Beulich
X-Patchwork-Id: 11152393
From: Jan Beulich
To: "xen-devel@lists.xenproject.org"
Cc: Andrew Cooper, Suravee Suthikulpanit
Date: Thu, 19 Sep 2019 15:25:20 +0200
Subject: [Xen-devel] [PATCH v6 8/8] AMD/IOMMU: pre-fill all DTEs right after table allocation

Make sure we don't leave behind any DTEs through which unexpected requests would be passed untranslated. Set V and IV right away (with all other fields left as zero), relying on the V and/or IV bits getting cleared only by amd_iommu_set_root_page_table() and amd_iommu_set_intremap_table() under special pass-through circumstances. Switch back to the initial settings in amd_iommu_disable_domain_device(). Take the liberty and also make the latter function static, constifying its first parameter at the same time.

Signed-off-by: Jan Beulich
Reviewed-by: Paul Durrant
---
v6: New.
---
 xen/drivers/passthrough/amd/iommu_init.c | 22 +++++++++++++++++++---
 xen/drivers/passthrough/amd/pci_amd_iommu.c | 20 ++++++++++++++++----
 2 files changed, 35 insertions(+), 7 deletions(-)

--- a/xen/drivers/passthrough/amd/iommu_init.c +++ b/xen/drivers/passthrough/amd/iommu_init.c @@ -1255,12 +1255,28 @@ static int __init amd_iommu_setup_device if ( !dt ) { + unsigned int size = dt_alloc_size(); + /* allocate 'device table' on a 4K boundary */ dt = IVRS_MAPPINGS_DEVTAB(ivrs_mappings) = - allocate_buffer(dt_alloc_size(), "Device Table"); + allocate_buffer(size, "Device Table"); + if ( !dt ) + return -ENOMEM; + + /* + * Prefill every DTE such that all kinds of requests will get aborted. + * Besides the two bits set to true below this builds upon + * IOMMU_DEV_TABLE_SYS_MGT_DMA_ABORTED, + * IOMMU_DEV_TABLE_IO_CONTROL_ABORTED, as well as + * IOMMU_DEV_TABLE_INT_CONTROL_ABORTED all being zero, and us also + * wanting at least TV, GV, I, and EX set to false. + */ + for ( bdf = 0, size /= sizeof(*dt); bdf < size; ++bdf ) + { + dt[bdf].v = true; + dt[bdf].iv = true; + } } - if ( !dt ) - return -ENOMEM; /* Add device table entries */ for ( bdf = 0; bdf < ivrs_bdf_entries; bdf++ ) --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c @@ -267,9 +267,9 @@ static void __hwdom_init amd_iommu_hwdom setup_hwdom_pci_devices(d, amd_iommu_add_device); } -void amd_iommu_disable_domain_device(struct domain *domain, - struct amd_iommu *iommu, - u8 devfn, struct pci_dev *pdev) +static void amd_iommu_disable_domain_device(const struct domain *domain, + struct amd_iommu *iommu, + uint8_t devfn, struct pci_dev *pdev) { struct amd_iommu_dte *table, *dte; unsigned long flags; @@ -284,9 +284,21 @@ void amd_iommu_disable_domain_device(str spin_lock_irqsave(&iommu->lock, flags); if ( dte->tv || dte->v ) { + /* See the comment in amd_iommu_setup_device_table(). */ + dte->int_ctl = IOMMU_DEV_TABLE_INT_CONTROL_ABORTED; + smp_wmb(); + dte->iv = true; dte->tv = false; - dte->v = false; + dte->gv = false; dte->i = false; + dte->ex = false; + dte->sa = false; + dte->se = false; + dte->sd = false; + dte->sys_mgt = IOMMU_DEV_TABLE_SYS_MGT_DMA_ABORTED; + dte->ioctl = IOMMU_DEV_TABLE_IO_CONTROL_ABORTED; + smp_wmb(); + dte->v = true; amd_iommu_flush_device(iommu, req_id);
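For reference, the ordering in amd_iommu_disable_domain_device() above reflects a general pattern for updating an entry which hardware may sample at any time: first switch the entry to a fail-safe "abort" policy, make that visible, rewrite the remaining fields, and only then re-assert V, so no window exposes a half-updated entry. A self-contained sketch (hypothetical type and field names; smp_wmb() approximated here by a C11 release fence, not the actual Xen primitive):

    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define smp_wmb() atomic_thread_fence(memory_order_release)

    struct entry {               /* stand-in for a device table entry */
        bool v;                  /* valid */
        bool iv;                 /* interrupt remapping enforced */
        bool tv;                 /* translation information valid */
        uint8_t int_ctl;         /* interrupt handling policy */
    };

    #define INT_CTL_ABORTED 0    /* "target abort" policy */

    static void disable_entry(struct entry *e)
    {
        e->int_ctl = INT_CTL_ABORTED; /* step 1: no interrupt slips through */
        smp_wmb();                    /* publish the abort policy first */
        e->iv = true;
        e->tv = false;                /* step 2: invalidate translation state */
        smp_wmb();                    /* publish before re-validating */
        e->v = true;                  /* step 3: V=1 with TV=0 aborts requests */
    }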