From patchwork Fri Jul 29 16:29:04 2016
X-Patchwork-Submitter: Roger Pau Monné
X-Patchwork-Id: 9252685
From: Roger Pau Monne <roger.pau@citrix.com>
To: xen-devel@lists.xenproject.org
Date: Fri, 29 Jul 2016 18:29:04 +0200
Message-ID: <1469809747-11176-10-git-send-email-roger.pau@citrix.com>
X-Mailer: git-send-email 2.7.4 (Apple Git-66)
In-Reply-To: <1469809747-11176-1-git-send-email-roger.pau@citrix.com>
References: <1469809747-11176-1-git-send-email-roger.pau@citrix.com>
Cc: Andrew Cooper, Jan Beulich, Roger Pau Monne
Subject: [Xen-devel] [PATCH RFC 09/12] xen/x86: setup PVHv2 Dom0 ACPI tables

This maps all the regions in the e820 marked as E820_ACPI or E820_NVS to
Dom0 1:1. It also shadows the page(s) where the native MADT is placed by
mapping RAM pages over them, copying the original data, and modifying the
copy afterwards so that it represents the real CPU topology exposed to Dom0.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Jan Beulich
Cc: Andrew Cooper
---
FWIW, I don't think the approach currently used here to craft the MADT is
the best one. IMHO it would be better to place the MADT at the end of the
E820_ACPI region (expanding its size by one page) and modify the XSDT/RSDT
to point to it; that way we avoid shadowing any other ACPI data that might
live in the same page as the native MADT (and that needs to be modified by
Dom0). A small standalone sketch of the ACPI checksum fix-up used when
crafting the new MADT is appended after the diff.
---
 xen/arch/x86/domain_build.c | 250 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 250 insertions(+)

diff --git a/xen/arch/x86/domain_build.c b/xen/arch/x86/domain_build.c
index 89ef59e..fad4f5c 100644
--- a/xen/arch/x86/domain_build.c
+++ b/xen/arch/x86/domain_build.c
@@ -23,6 +23,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -38,6 +39,8 @@
 #include
 #include

+#include
+
 #include
 #include
 #include
@@ -50,6 +53,8 @@ static long __initdata dom0_max_nrpages = LONG_MAX;
 #define HVM_IDENT_PT_GFN 0xfeffeu

 static unsigned int __initdata hvm_mem_stats[MAX_ORDER + 1];
+static unsigned int __initdata acpi_intr_overrrides = 0;
+static struct acpi_madt_interrupt_override __initdata *intsrcovr = NULL;

 /*
  * dom0_mem=[min:,][max:,][]
@@ -1932,6 +1937,7 @@ static int __init hvm_load_kernel(struct domain *d, const module_t *image,
     last_addr += sizeof(mod);
     start_info.magic = XEN_HVM_START_MAGIC_VALUE;
     start_info.flags = SIF_PRIVILEGED | SIF_INITDOMAIN;
+    start_info.rsdp_paddr = acpi_os_get_root_pointer();
     rc = hvm_copy_to_guest_phys(last_addr, &start_info, sizeof(start_info));
     if ( rc != HVMCOPY_okay )
     {
@@ -2044,6 +2050,243 @@ static int __init hvm_setup_cpus(struct domain *d, paddr_t entry,
     return 0;
 }

+static int __init acpi_count_intr_ov(struct acpi_subtable_header *header,
+                                     const unsigned long end)
+{
+
+    acpi_intr_overrrides++;
+    return 0;
+}
+
+static int __init acpi_set_intr_ov(struct acpi_subtable_header *header,
+                                   const unsigned long end)
+{
+    struct acpi_madt_interrupt_override *intr =
+        container_of(header, struct acpi_madt_interrupt_override, header);
+
+    ACPI_MEMCPY(intsrcovr, intr, sizeof(*intr));
+    intsrcovr++;
+
+    return 0;
+}
+
+static void __init acpi_zap_table_signature(char *name)
+{
+    struct acpi_table_header *table;
+    acpi_status status;
+    union {
+        char str[ACPI_NAME_SIZE];
+        uint32_t bits;
+    } signature;
+    char tmp;
+    int i;
+
+    status = acpi_get_table(name, 0, &table);
+    if ( ACPI_SUCCESS(status) )
+    {
+        memcpy(&signature.str[0], &table->signature[0], ACPI_NAME_SIZE);
+        for ( i = 0; i < ACPI_NAME_SIZE / 2; i++ )
+        {
+            tmp = signature.str[ACPI_NAME_SIZE - i - 1];
+            signature.str[ACPI_NAME_SIZE - i - 1] = signature.str[i];
+            signature.str[i] = tmp;
+        }
+        write_atomic((uint32_t*)&table->signature[0], signature.bits);
+    }
+}
+
+static int __init acpi_map(struct domain *d, unsigned long pfn,
+                           unsigned long nr_pages)
+{
+    int rc;
+
+    while ( nr_pages > 0 )
+    {
+        rc = map_mmio_regions(d, _gfn(pfn), nr_pages, _mfn(pfn));
+        if ( rc == 0 )
+            break;
+        if ( rc < 0 )
+        {
+            printk("Failed to map %#lx - %#lx into Dom0 memory map: %d\n",
+                   pfn, pfn + nr_pages, rc);
+            return rc;
+        }
+        nr_pages -= rc;
+        pfn += rc;
+        process_pending_softirqs();
+    }
+
+    return rc;
+}
+
+static int __init hvm_setup_acpi(struct domain *d)
+{
+    struct vcpu *saved_current, *v = d->vcpu[0];
+    unsigned long pfn, nr_pages;
+    uint64_t size, start_addr, end_addr;
+    uint64_t madt_addr[2] = { 0, 0 };
+    struct acpi_table_header *table;
+    struct acpi_table_madt *madt;
+    struct acpi_madt_io_apic *io_apic;
+    struct acpi_madt_local_apic *local_apic;
+    acpi_status status;
+    int rc, i;
+
+    printk("** Setup ACPI tables **\n");
+
+    /* ZAP the HPET, SLIT, SRAT, MPST and PMTT tables. */
+    acpi_zap_table_signature(ACPI_SIG_HPET);
+    acpi_zap_table_signature(ACPI_SIG_SLIT);
+    acpi_zap_table_signature(ACPI_SIG_SRAT);
+    acpi_zap_table_signature(ACPI_SIG_MPST);
+    acpi_zap_table_signature(ACPI_SIG_PMTT);
+
+    /* Map ACPI tables 1:1 */
+    for ( i = 0; i < d->arch.nr_e820; i++ )
+    {
+        if ( d->arch.e820[i].type != E820_ACPI &&
+             d->arch.e820[i].type != E820_NVS )
+            continue;
+
+        pfn = PFN_DOWN(d->arch.e820[i].addr);
+        nr_pages = DIV_ROUND_UP(d->arch.e820[i].size, PAGE_SIZE);
+
+        rc = acpi_map(d, pfn, nr_pages);
+        if ( rc )
+        {
+            printk(
+                "Failed to map ACPI region %#lx - %#lx into Dom0 memory map\n",
+                pfn, pfn + nr_pages);
+            return rc;
+        }
+    }
+
+    /* Map the first 1MB 1:1 also */
+    pfn = 0;
+    nr_pages = 0x100;
+    rc = acpi_map(d, pfn, nr_pages);
+    if ( rc )
+    {
+        printk(
+            "Failed to map low 1MB region %#lx - %#lx into Dom0 memory map\n",
+            pfn, pfn + nr_pages);
+        return rc;
+    }
+
+    acpi_get_table_phys(ACPI_SIG_MADT, 0, &madt_addr[0], &size);
+    if ( !madt_addr[0] )
+    {
+        printk("Unable to find ACPI MADT table\n");
+        return -EINVAL;
+    }
+    if ( size > PAGE_SIZE )
+    {
+        printk("MADT table is bigger than PAGE_SIZE, aborting\n");
+        return -EINVAL;
+    }
+
+    acpi_get_table_phys(ACPI_SIG_MADT, 2, &madt_addr[1], &size);
+    if ( madt_addr[1] != 0 && madt_addr[1] != madt_addr[0] )
+    {
+        printk("Multiple MADT tables found, aborting\n");
+        return -EINVAL;
+    }
+
+    /*
+     * Populate the guest physical memory where the MADT resides with empty
+     * RAM pages. This will remove the 1:1 mapping in this area, so that Xen
+     * can modify it without any side-effects.
+     */
+    start_addr = madt_addr[0] & PAGE_MASK;
+    end_addr = PAGE_ALIGN(madt_addr[0] + size);
+    hvm_populate_memory_range(d, start_addr, end_addr - start_addr);
+
+    /* Get the address where the MADT is currently mapped. */
+    status = acpi_get_table(ACPI_SIG_MADT, 0, &table);
+    if ( !ACPI_SUCCESS(status) )
+    {
+        printk("Failed to get MADT ACPI table, aborting.\n");
+        return -EINVAL;
+    }
+
+    /*
+     * Copy the original MADT table (and whatever is around it) to the
+     * guest physmap.
+     */
+    saved_current = current;
+    set_current(v);
+    rc = hvm_copy_to_guest_phys(start_addr,
+                                (void *)((uintptr_t)table & PAGE_MASK),
+                                end_addr - start_addr);
+    set_current(saved_current);
+    if ( rc != HVMCOPY_okay )
+    {
+        printk("Unable to copy original MADT page(s)\n");
+        return -EFAULT;
+    }
+
+    /* Craft a new MADT for the guest */
+
+    /* Count number of interrupt overrides. */
+    acpi_table_parse_madt(ACPI_MADT_TYPE_INTERRUPT_OVERRIDE, acpi_count_intr_ov,
+                          MAX_IRQ_SOURCES);
+    size = sizeof(struct acpi_table_madt);
+    size += sizeof(struct acpi_madt_interrupt_override) * acpi_intr_overrrides;
+    size += sizeof(struct acpi_madt_io_apic);
+    size += sizeof(struct acpi_madt_local_apic) * dom0_max_vcpus();
+
+    madt = xzalloc_bytes(size);
+    ACPI_MEMCPY(madt, table, sizeof(*madt));
+    madt->address = APIC_DEFAULT_PHYS_BASE;
+    io_apic = (struct acpi_madt_io_apic *)(madt + 1);
+    io_apic->header.type = ACPI_MADT_TYPE_IO_APIC;
+    io_apic->header.length = sizeof(*io_apic);
+    io_apic->id = 1;
+    io_apic->address = VIOAPIC_DEFAULT_BASE_ADDRESS;
+
+    if ( dom0_max_vcpus() > num_online_cpus() )
+    {
+        printk("CPU overcommit is not supported for Dom0\n");
+        xfree(madt);
+        return -EINVAL;
+    }
+
+    local_apic = (struct acpi_madt_local_apic *)(io_apic + 1);
+    for ( i = 0; i < dom0_max_vcpus(); i++ )
+    {
+        local_apic->header.type = ACPI_MADT_TYPE_LOCAL_APIC;
+        local_apic->header.length = sizeof(*local_apic);
+        local_apic->processor_id = i;
+        local_apic->id = i * 2;
+        local_apic->lapic_flags = ACPI_MADT_ENABLED;
+        local_apic++;
+    }
+
+    intsrcovr = (struct acpi_madt_interrupt_override *)local_apic;
+    acpi_table_parse_madt(ACPI_MADT_TYPE_INTERRUPT_OVERRIDE, acpi_set_intr_ov,
+                          MAX_IRQ_SOURCES);
+    ASSERT(((unsigned char *)intsrcovr - (unsigned char *)madt) == size);
+    madt->header.length = size;
+    madt->header.checksum -= acpi_tb_checksum(ACPI_CAST_PTR(u8, madt),
+                                              madt->header.length);
+
+    /* Copy the new MADT table to the guest physmap. */
+    saved_current = current;
+    set_current(v);
+    rc = hvm_copy_to_guest_phys(madt_addr[0], madt, size);
+    set_current(saved_current);
+    if ( rc != HVMCOPY_okay )
+    {
+        printk("Unable to copy modified MADT page(s)\n");
+        xfree(madt);
+        return -EFAULT;
+    }
+
+    xfree(madt);
+
+    return 0;
+}
+
 static int __init construct_dom0_hvm(struct domain *d, const module_t *image,
                                      unsigned long image_headroom,
                                      module_t *initrd,
@@ -2085,6 +2328,13 @@ static int __init construct_dom0_hvm(struct domain *d, const module_t *image,
         return rc;
     }
 
+    rc = hvm_setup_acpi(d);
+    if ( rc )
+    {
+        printk("Failed to setup Dom0 ACPI tables: %d\n", rc);
+        return rc;
+    }
+
     return 0;
 }
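
As referenced before the diff, the crafted MADT relies on the usual ACPI rule
that all bytes of a table, checksum field included, must sum to zero modulo
256; acpi_tb_checksum() returns that byte-wise sum, so subtracting it from the
current checksum field leaves the whole table summing to zero regardless of
the starting value. Below is a minimal, standalone sketch of the same rule
(plain C, a hypothetical 40-byte table buffer and a local table_sum() helper,
not the Xen code above), which clears the checksum byte and recomputes it
after the table contents have been edited:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * ACPI requires that all bytes of a table, including the checksum field
 * itself, add up to zero modulo 256.  This helper returns the byte-wise
 * sum of a buffer, like acpi_tb_checksum() does in ACPICA.
 */
static uint8_t table_sum(const uint8_t *buf, size_t len)
{
    uint8_t sum = 0;
    size_t i;

    for ( i = 0; i < len; i++ )
        sum += buf[i];

    return sum;
}

int main(void)
{
    /*
     * Hypothetical table: in the standard ACPI table header the checksum
     * byte is at offset 9 (after the 4-byte signature, 4-byte length and
     * 1-byte revision).
     */
    uint8_t table[40];

    memset(table, 0xab, sizeof(table));          /* pretend table payload */

    table[9] = 0;                                /* clear the old checksum */
    table[9] = -table_sum(table, sizeof(table)); /* total now wraps to zero */

    printf("byte-wise sum of table: %u\n", table_sum(table, sizeof(table)));
    return 0;
}

The patch gets the same result without clearing the field first: the crafted
MADT starts as a copy of the native header, and subtracting the byte-wise sum
of the modified table from whatever checksum value is already there makes the
table sum to zero again.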