From patchwork Fri Aug  5 23:04:41 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Daniel Kiper
X-Patchwork-Id: 9265797
Received: from olila.local.net-space.pl
 (/10.175.255.156) by default (Oracle Beehive Gateway v4.0) with ESMTP;
 Fri, 05 Aug 2016 16:06:05 -0700
From: Daniel Kiper
To: xen-devel@lists.xenproject.org
Cc: jgross@suse.com, sstabellini@kernel.org, andrew.cooper3@citrix.com,
 cardoe@cardoe.com, pgnet.dev@gmail.com, ning.sun@intel.com,
 david.vrabel@citrix.com, jbeulich@suse.com, qiaowei.ren@intel.com,
 richard.l.maliszewski@intel.com, gang.wei@intel.com, fu.wei@linaro.org
Date: Sat, 6 Aug 2016 01:04:41 +0200
Message-Id: <1470438282-4226-19-git-send-email-daniel.kiper@oracle.com>
In-Reply-To: <1470438282-4226-1-git-send-email-daniel.kiper@oracle.com>
References: <1470438282-4226-1-git-send-email-daniel.kiper@oracle.com>
X-Mailer: git-send-email 1.7.10.4
Subject: [Xen-devel] [PATCH v4 18/19] x86: make Xen early boot code relocatable

Every multiboot protocol (regardless of version) compatible image must
specify its load address (in the ELF or multiboot header). A multiboot
protocol compatible loader has to load the image at the specified address.
However, there is no guarantee that the requested memory region (in the
case of Xen it starts at 1 MiB and ends at 17 MiB), where the image should
be loaded initially, is RAM and is free (legacy BIOS platforms are merciful
for Xen, but I found at least one EFI platform on which the Xen load address
conflicts with EFI boot services; it is a Dell PowerEdge R820 with the
latest firmware). To cope with that problem we must make the Xen early boot
code relocatable and help the boot loader relocate the image properly by
suggesting allowed address ranges instead of requesting specific load
addresses, as is done right now. This patch does the former.
It does not add the multiboot2 protocol interface, which is done in the
"x86: add multiboot2 protocol support for relocatable images" patch.

This patch changes the following things:
  - the default load address is changed from 1 MiB to 2 MiB; I did that
    because the initial page tables use 2 MiB huge pages and this way the
    required updates for them are quite easy; it means that, e.g., we avoid
    special cases for the start and end of the required memory region if it
    lives at an address not aligned to 2 MiB,
  - the %esi and %r15d registers are used as storage for the Xen image load
    base address (%r15d only briefly, because %rsi is used for the EFI
    SystemTable address in 64-bit code); both registers are (%esi is mostly)
    unused in early boot code and preserved across C function calls,
  - %fs is used as the base for Xen data relative addressing in 32-bit code
    where possible; %esi is used for that during error printing because it
    is not always possible to properly and efficiently initialize %fs.

PS I am still not convinced that the move to %fs relative addressing is
   a good idea. As you can see, the code grows larger due to GDT
   initialization stuff, etc. However, I cannot see potential gains for now
   or the future (there probably would be some if the whole Xen code, not
   just the early boot code, played segment register games). Well, maybe in
   one or two places where the base register is not used in full SIB
   addressing mode. So, the question is: does it pay? Do the gains outweigh
   all the efforts related to the %fs games? Maybe we should stay with %esi
   relative addressing? Of course I am aware that it is not perfect.
   However, IMO, it is much simpler and clearer. This is my suggestion. If
   you agree with me I can change the code once again and go back to %esi.
   This is not a big problem. If not, I am not going to argue any longer.
   I will do what you request. Well, it would be nice if you convinced me
   that your idea is good and I am wrong then...
   ;-)))

Signed-off-by: Daniel Kiper
---
v4 - suggestions/fixes:
   - do not relocate Xen image if boot loader did work for us
     (suggested by Andrew Cooper and Jan Beulich),
   - initialize xen_img_load_base_addr in EFI boot code too,
   - properly initialize trampoline_xen_phys_start,
   - calculate Xen image load base address in
     x86_64 code ourselves
     (suggested by Jan Beulich),
   - change how and when Xen image base address is printed,
   - use %fs instead of %esi for relative addressing
     (suggested by Andrew Cooper and Jan Beulich),
   - create esi_offset and fs_offset() macros in assembly,
   - calculate mkelf32 argument automatically,
   - optimize and cleanup code,
   - improve comments,
   - improve commit message.

v3 - suggestions/fixes:
   - improve segment registers initialization
     (suggested by Jan Beulich),
   - simplify Xen image load base address calculation
     (suggested by Jan Beulich),
   - use %esi and %r15d instead of %ebp to store
     Xen image load base address,
   - use %esi instead of %fs for relative addressing;
     this way we get shorter and simpler code,
   - rename some variables and constants
     (suggested by Jan Beulich),
   - improve comments
     (suggested by Konrad Rzeszutek Wilk),
   - improve commit message
     (suggested by Jan Beulich).
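To make the %fs descriptor setup easier to follow: trampoline_setup in the
diff below ORs the load base address from %esi into the BOOT_FS entry of
trampoline_gdt in three steps (bits 0-15, 16-23 and 24-31). Here is a minimal
sketch of the same arithmetic in C, assuming the standard x86 segment
descriptor layout and a descriptor whose base fields start out as zero. The
helper names and the sample descriptor value are hypothetical and not part
of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): OR a 32-bit base address
 * into an x86 segment descriptor whose base fields are still zero.
 * desc[0] is the low dword of the descriptor, desc[1] the high dword.
 */
static void gdt_or_base(uint32_t desc[2], uint32_t base)
{
    /* Base bits 0-15 live in bits 16-31 of the low dword. */
    desc[0] |= base << 16;
    /* Base bits 16-23 live in bits 0-7 of the high dword. */
    desc[1] |= (base >> 16) & 0x000000ff;
    /* Base bits 24-31 live in bits 24-31 of the high dword. */
    desc[1] |= base & 0xff000000;
}

/* Hypothetical inverse, handy for checking the encoding. */
static uint32_t gdt_get_base(const uint32_t desc[2])
{
    return (desc[0] >> 16) |
           ((desc[1] & 0x000000ff) << 16) |
           (desc[1] & 0xff000000);
}
```

For example, starting from a flat data descriptor with low dword 0x0000ffff
and high dword 0x00cf9200 (an illustrative value, not taken from the patch)
and a base of 0x12345678, the low dword becomes 0x5678ffff and the high dword
0x12cf9234.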
---
 xen/arch/x86/Makefile          |    4 +-
 xen/arch/x86/Rules.mk          |    4 +
 xen/arch/x86/boot/head.S       |  204 +++++++++++++++++++++++++++++++---------
 xen/arch/x86/boot/trampoline.S |   10 +-
 xen/arch/x86/boot/wakeup.S     |    4 +-
 xen/arch/x86/boot/x86_64.S     |   51 ++++------
 xen/arch/x86/efi/efi-boot.h    |    3 +-
 xen/arch/x86/setup.c           |   31 +++---
 xen/arch/x86/xen.lds.S         |    8 +-
 xen/include/asm-x86/config.h   |    1 +
 xen/include/asm-x86/page.h     |    2 +-
 11 files changed, 217 insertions(+), 105 deletions(-)

diff --git a/xen/arch/x86/Makefile b/xen/arch/x86/Makefile
index 9464b7b..df899c1 100644
--- a/xen/arch/x86/Makefile
+++ b/xen/arch/x86/Makefile
@@ -89,8 +89,8 @@ all_symbols =
 endif
 
 $(TARGET): $(TARGET)-syms $(efi-y) boot/mkelf32
-	./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) 0x100000 \
-	`$(NM) -nr $(TARGET)-syms | head -n 1 | sed -e 's/^\([^ ]*\).*/0x\1/'`
+	./boot/mkelf32 $(notes_phdrs) $(TARGET)-syms $(TARGET) $(XEN_IMG_OFFSET) \
+	`$(NM) -nr $(TARGET)-syms | awk '$$3 == "__end_of_image__" {print "0x"$$1}'`
 
 .PHONY: tests
 tests:
diff --git a/xen/arch/x86/Rules.mk b/xen/arch/x86/Rules.mk
index 42be4bc..dd10afe 100644
--- a/xen/arch/x86/Rules.mk
+++ b/xen/arch/x86/Rules.mk
@@ -1,6 +1,10 @@
 ########################################
 # x86-specific definitions
 
+XEN_IMG_OFFSET = 0x200000
+
+CFLAGS += -DXEN_IMG_OFFSET=$(XEN_IMG_OFFSET)
+
 CFLAGS += -I$(BASEDIR)/include
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-generic
 CFLAGS += -I$(BASEDIR)/include/asm-x86/mach-default
diff --git a/xen/arch/x86/boot/head.S b/xen/arch/x86/boot/head.S
index b832b21..a1b0c05 100644
--- a/xen/arch/x86/boot/head.S
+++ b/xen/arch/x86/boot/head.S
@@ -12,13 +12,16 @@
         .text
         .code32
 
-#define sym_phys(sym)     ((sym) - __XEN_VIRT_START)
+#define sym_offset(sym)   ((sym) - __XEN_VIRT_START)
+#define esi_offset(sym)   sym_offset(sym)(%esi)
+#define fs_offset(sym)    %fs:sym_offset(sym)
 
 #define BOOT_CS32        0x0008
 #define BOOT_CS64        0x0010
 #define BOOT_DS          0x0018
 #define BOOT_PSEUDORM_CS 0x0020
 #define BOOT_PSEUDORM_DS 0x0028
+#define BOOT_FS          0x0030
 
 #define MB2_HT(name)      (MULTIBOOT2_HEADER_TAG_##name)
 #define MB2_TT(name)      (MULTIBOOT2_TAG_TYPE_##name)
@@ -94,7 +97,7 @@ multiboot2_header_start:
 
         /* EFI64 entry point. */
         mb2ht_init MB2_HT(ENTRY_ADDRESS_EFI64), MB2_HT(OPTIONAL), \
-                   sym_phys(__efi64_start)
+                   sym_offset(__efi64_start)
 
         /* Multiboot2 header end tag. */
         mb2ht_init MB2_HT(END), MB2_HT(REQUIRED)
@@ -105,12 +108,13 @@ multiboot2_header_start:
 
         .word   0
 gdt_boot_descr:
-        .word   6*8-1
-        .long   sym_phys(trampoline_gdt)
+        .word   7*8-1
+gdt_boot_base:
+        .long   sym_offset(trampoline_gdt)
         .long   0 /* Needed for 64-bit lgdt */
 
 cs32_switch_addr:
-        .long   sym_phys(cs32_switch)
+        .long   sym_offset(cs32_switch)
         .word   BOOT_CS32
 
 vga_text_buffer:
@@ -126,26 +130,26 @@ vga_text_buffer:
         .section .init.text, "ax", @progbits
 
 bad_cpu:
-        mov     $(sym_phys(.Lbad_cpu_msg)),%esi # Error message
+        lea     esi_offset(.Lbad_cpu_msg),%esi  # Error message
         jmp     0f
 not_multiboot:
-        mov     $(sym_phys(.Lbad_ldr_msg)),%esi # Error message
+        lea     esi_offset(.Lbad_ldr_msg),%esi  # Error message
         jmp     0f
 mb2_no_st:
-        mov     $(sym_phys(.Lbad_ldr_nst)),%esi # Error message
+        lea     esi_offset(.Lbad_ldr_nst),%esi  # Error message
         jmp     0f
 mb2_no_ih:
-        mov     $(sym_phys(.Lbad_ldr_nih)),%esi # Error message
+        lea     esi_offset(.Lbad_ldr_nih),%esi  # Error message
         jmp     0f
 mb2_no_bs:
-        mov     $(sym_phys(.Lbad_ldr_nbs)),%esi # Error message
+        lea     esi_offset(.Lbad_ldr_nbs),%esi  # Error message
         xor     %edi,%edi                       # No VGA text buffer
         jmp     1f
 mb2_efi_ia_32:
-        mov     $(sym_phys(.Lbad_efi_msg)),%esi # Error message
+        lea     esi_offset(.Lbad_efi_msg),%esi  # Error message
         xor     %edi,%edi                       # No VGA text buffer
         jmp     1f
-0:      mov     sym_phys(vga_text_buffer),%edi
+0:      mov     esi_offset(vga_text_buffer),%edi
 1:      mov     (%esi),%bl
         test    %bl,%bl        # Terminate on '\0' sentinel
         je      .Lhalt
@@ -173,6 +177,9 @@ __efi64_start:
         /* VGA is not available on EFI platforms. */
         movl   $0,vga_text_buffer(%rip)
 
+        /* Load Xen image load base address. */
+        lea     __image_base__(%rip),%r15d
+
         /* Check for Multiboot2 bootloader. */
         cmp     $MULTIBOOT2_BOOTLOADER_MAGIC,%eax
         je      .Lefi_multiboot2_proto
@@ -288,6 +295,9 @@ run_bs:
 
         pop     %rax
 
+        /* Store Xen image load base address in place accessible for 32-bit code. */
+        mov     %r15d,%esi
+
         /* Jump to trampoline_setup after switching CPU to x86_32 mode. */
         lea     trampoline_setup(%rip),%edi
 
@@ -295,9 +305,11 @@ x86_32_switch:
         cli
 
         /* Initialise GDT. */
+        add     %esi,gdt_boot_base(%rip)
         lgdt    gdt_boot_descr(%rip)
 
         /* Reload code selector. */
+        add     %esi,cs32_switch_addr(%rip)
         ljmpl   *cs32_switch_addr(%rip)
 
         .code32
@@ -327,12 +339,8 @@ __start:
         cld
         cli
 
-        /* Initialise GDT and basic data segments. */
-        lgdt    %cs:sym_phys(gdt_boot_descr)
-        mov     $BOOT_DS,%ecx
-        mov     %ecx,%ds
-        mov     %ecx,%es
-        mov     %ecx,%ss
+        /* Load default Xen image load base address. */
+        mov     $sym_offset(__image_base__),%esi
 
         /* Bootloaders may set multiboot{1,2}.mem_lower to a nonzero value. */
         xor     %edx,%edx
@@ -388,6 +396,25 @@ __start:
         jmp     0b
 
 trampoline_bios_setup:
+        /*
+         * Called on legacy BIOS platforms only.
+         *
+         * Initialise GDT and basic data segments.
+         */
+        add     %esi,esi_offset(gdt_boot_base)
+        lgdt    esi_offset(gdt_boot_descr)
+
+        mov     $BOOT_DS,%ecx
+        mov     %ecx,%ds
+        mov     %ecx,%es
+        mov     %ecx,%ss
+        /* %esp is initialised later. */
+
+        /* Load null descriptor to unused segment registers. */
+        xor     %ecx,%ecx
+        mov     %ecx,%fs
+        mov     %ecx,%gs
+
         /* Set up trampoline segment 64k below EBDA */
         movzwl  0x40e,%ecx          /* EBDA segment */
         cmp     $0xa000,%ecx        /* sanity check (high) */
@@ -409,36 +436,93 @@ trampoline_bios_setup:
         cmovb   %edx,%ecx           /* and use the smaller */
 
 trampoline_setup:
+        /*
+         * Called on legacy BIOS and EFI platforms.
+         *
+         * Compute 0-15 bits of BOOT_FS segment descriptor base address.
+         */
+        mov     %esi,%edx
+        shl     $16,%edx
+        or      %edx,BOOT_FS+esi_offset(trampoline_gdt)
+
+        /* Compute 16-23 bits of BOOT_FS segment descriptor base address. */
+        mov     %esi,%edx
+        shr     $16,%edx
+        and     $0x000000ff,%edx
+        or      %edx,BOOT_FS+4+esi_offset(trampoline_gdt)
+
+        /* Compute 24-31 bits of BOOT_FS segment descriptor base address. */
+        mov     %esi,%edx
+        and     $0xff000000,%edx
+        or      %edx,BOOT_FS+4+esi_offset(trampoline_gdt)
+
+        /*
+         * Initialise %fs and later use it to access Xen data if possible.
+         * According to Intel 64 and IA-32 Architectures Software Developer’s
+         * Manual it is safe to do that without reloading GDTR before.
+         *
+         * Please check Intel 64 and IA-32 Architectures Software Developer’s
+         * Manual, Volume 2 (2A, 2B & 2C): Instruction Set Reference,
+         * LGDT and MOV instructions description and
+         * Intel 64 and IA-32 Architectures Software Developer’s
+         * Manual Volume 3 (3A, 3B & 3C): System Programming Guide,
+         * section 3.4.3, Segment Registers for more details.
+         *
+         * AIUI, only GDT address and limit are loaded into GDTR when
+         * lgdt is executed. Segment descriptor is loaded directly from
+         * memory into segment register (hidden part) only when relevant
+         * load instruction is used (e.g. mov %edx,%fs). GDT content
+         * probably could be stored in CPU cache but nothing suggests that
+         * CPU caching interferes in one way or another with segment descriptor
+         * load. So, it looks that every change in active GDT is immediately
+         * available for relevant segment descriptor load instruction.
+         *
+         * I was not able to find anything which invalidates the above.
+         * So, everything suggests that we do not need an extra lgdt here.
+         */
+        mov     $BOOT_FS,%edx
+        mov     %edx,%fs
+
         /* Reserve 64kb for the trampoline. */
         sub     $0x1000,%ecx
 
         /* From arch/x86/smpboot.c: start_eip had better be page-aligned! */
         xor     %cl, %cl
         shl     $4, %ecx
-        mov     %ecx,sym_phys(trampoline_phys)
+        mov     %ecx,fs_offset(trampoline_phys)
+
+        /* Save Xen image load base address for later use. */
+        mov     %esi,fs_offset(xen_img_load_base_addr)
+        mov     %esi,fs_offset(trampoline_xen_phys_start)
+
+        /* Setup stack. %ss was initialized earlier. */
+        lea     1024+esi_offset(cpu0_stack),%esp
 
         /* Save the Multiboot info struct (after relocation) for later use. */
-        mov     $sym_phys(cpu0_stack)+1024,%esp
         push    %ecx                /* Boot trampoline address. */
         push    %ebx                /* Multiboot information address. */
         push    %eax                /* Multiboot magic. */
         call    reloc
-        mov     %eax,sym_phys(multiboot_ptr)
+        mov     %eax,fs_offset(multiboot_ptr)
 
         /*
          * Do not zero BSS on EFI platform here.
          * It was initialized earlier.
          */
-        cmpb    $0,sym_phys(skip_realmode)
+        cmpb    $0,fs_offset(skip_realmode)
         jnz     1f
 
         /* Initialize BSS (no nasty surprises!). */
-        mov     $sym_phys(__bss_start),%edi
-        mov     $sym_phys(__bss_end),%ecx
+        mov     $sym_offset(__bss_start),%edi
+        mov     $sym_offset(__bss_end),%ecx
+        push    %fs
+        pop     %es
         sub     %edi,%ecx
         shr     $2,%ecx
         xor     %eax,%eax
         rep stosl
+        push    %ds
+        pop     %es
 
 1:
         /* Interrogate CPU extended features via CPUID. */
@@ -452,8 +536,8 @@ trampoline_setup:
         jbe     1f
         mov     $0x80000001,%eax
         cpuid
-1:      mov     %edx,sym_phys(cpuid_ext_features)
-        mov     %edx,sym_phys(boot_cpu_data)+CPUINFO_FEATURE_OFFSET(X86_FEATURE_LM)
+1:      mov     %edx,fs_offset(cpuid_ext_features)
+        mov     %edx,fs_offset(boot_cpu_data)+CPUINFO_FEATURE_OFFSET(X86_FEATURE_LM)
 
         /* Check for availability of long mode. */
         bt      $cpufeat_bit(X86_FEATURE_LM),%edx
@@ -461,62 +545,88 @@ trampoline_setup:
 
         /* Stash TSC to calculate a good approximation of time-since-boot */
         rdtsc
-        mov     %eax,sym_phys(boot_tsc_stamp)
-        mov     %edx,sym_phys(boot_tsc_stamp+4)
+        mov     %eax,fs_offset(boot_tsc_stamp)
+        mov     %edx,fs_offset(boot_tsc_stamp)+4
+
+        /* Update frame addresses in page tables. */
+        mov     $((__page_tables_end-__page_tables_start)/8),%ecx
+1:      testl   $_PAGE_PRESENT,fs_offset(__page_tables_start)-8(,%ecx,8)
+        jz      2f
+        add     %esi,fs_offset(__page_tables_start)-8(,%ecx,8)
+2:      loop    1b
+
+        /* Initialise L2 boot-map/direct map page table entries (14MB). */
+        lea     esi_offset(start),%ebx
+        lea     (1<= end )
         return NULL;
 
-    if ( end <= BOOTSTRAP_MAP_BASE )
-        return (void *)(unsigned long)start;
-
     ret = (void *)(map_cur + (unsigned long)(start & mask));
     start &= ~mask;
     end = (end + mask) & ~mask;
@@ -673,6 +670,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     printk("Command line: %s\n", cmdline);
 
+    printk("Xen image load base address: 0x%08x\n", xen_img_load_base_addr);
+
     printk("Video information:\n");
 
     /* Print VGA display mode information. */
@@ -860,15 +859,17 @@ void __init noreturn __start_xen(unsigned long mbi_p)
     highmem_start &= ~((1UL << L3_PAGETABLE_SHIFT) - 1);
 #endif
 
+    /* Do not relocate Xen image if boot loader did work for us. */
+    if ( xen_img_load_base_addr )
+        xen_phys_start = xen_img_load_base_addr;
+
     for ( i = boot_e820.nr_map-1; i >= 0; i-- )
     {
         uint64_t s, e, mask = (1UL << L2_PAGETABLE_SHIFT) - 1;
         uint64_t end, limit = ARRAY_SIZE(l2_identmap) << L2_PAGETABLE_SHIFT;
 
-        /* Superpage-aligned chunks from BOOTSTRAP_MAP_BASE. */
         s = (boot_e820.map[i].addr + mask) & ~mask;
         e = (boot_e820.map[i].addr + boot_e820.map[i].size) & ~mask;
-        s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
         if ( (boot_e820.map[i].type != E820_RAM) || (s >= e) )
             continue;
 
@@ -900,7 +901,6 @@ void __init noreturn __start_xen(unsigned long mbi_p)
             l4_pgentry_t *pl4e;
             l3_pgentry_t *pl3e;
             l2_pgentry_t *pl2e;
-            uint64_t load_start;
             int i, j, k;
 
             /* Select relocation address. */
@@ -914,9 +914,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
              * with a barrier(). After this we must *not* modify static/global
              * data until after we have switched to the relocated pagetables!
              */
-            load_start = (unsigned long)_start - XEN_VIRT_START;
             barrier();
-            move_memory(e + load_start, load_start, _end - _start, 1);
+            move_memory(e + XEN_IMG_OFFSET, XEN_IMG_OFFSET, _end - _start, 1);
 
             /* Walk initial pagetables, relocating page directory entries. */
             pl4e = __va(__pa(idle_pg_table));
@@ -932,7 +931,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                     /* Not present, 1GB mapping, or already relocated? */
                     if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) ||
                          (l3e_get_flags(*pl3e) & _PAGE_PSE) ||
-                         (l3e_get_pfn(*pl3e) > 0x1000) )
+                         (l3e_get_pfn(*pl3e) > PFN_DOWN(xen_phys_start)) )
                         continue;
                     *pl3e = l3e_from_intpte(l3e_get_intpte(*pl3e) +
                                             xen_phys_start);
@@ -942,7 +941,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                         /* Not present, PSE, or already relocated? */
                         if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
                              (l2e_get_flags(*pl2e) & _PAGE_PSE) ||
-                             (l2e_get_pfn(*pl2e) > 0x1000) )
+                             (l2e_get_pfn(*pl2e) > PFN_DOWN(xen_phys_start)) )
                             continue;
                         *pl2e = l2e_from_intpte(l2e_get_intpte(*pl2e) +
                                                 xen_phys_start);
@@ -956,15 +955,14 @@ void __init noreturn __start_xen(unsigned long mbi_p)
              * Undo the temporary-hooking of the l1_identmap.  __2M_text_start
              * is contained in this PTE.
              */
-            BUG_ON(l2_table_offset((unsigned long)_erodata) ==
-                   l2_table_offset((unsigned long)_stext));
             *pl2e++ = l2e_from_pfn(xen_phys_start >> PAGE_SHIFT,
                                    PAGE_HYPERVISOR_RX | _PAGE_PSE);
             for ( i = 1; i < L2_PAGETABLE_ENTRIES; i++, pl2e++ )
             {
                 unsigned int flags;
 
-                if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
+                if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) ||
+                     (l2e_get_pfn(*pl2e) > PFN_DOWN(xen_phys_start)) )
                     continue;
 
                 if ( !using_2M_mapping() )
@@ -1018,6 +1016,8 @@ void __init noreturn __start_xen(unsigned long mbi_p)
                 : "memory" );
 
             bootstrap_map(NULL);
+
+            printk("New Xen image base address: 0x%08lx\n", xen_phys_start);
         }
 
         /* Is the region suitable for relocating the multiboot modules? */
@@ -1081,6 +1081,7 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
     if ( !xen_phys_start )
         panic("Not enough memory to relocate Xen.");
+
     reserve_e820_ram(&boot_e820, __pa(&_start), __pa(&_end));
 
     /* Late kexec reservation (dynamic start address). */
@@ -1153,14 +1154,12 @@ void __init noreturn __start_xen(unsigned long mbi_p)
 
         set_pdx_range(s >> PAGE_SHIFT, e >> PAGE_SHIFT);
 
-        /* Need to create mappings above BOOTSTRAP_MAP_BASE. */
-        map_s = max_t(uint64_t, s, BOOTSTRAP_MAP_BASE);
+        map_s = s;
         map_e = min_t(uint64_t, e,
                       ARRAY_SIZE(l2_identmap) << L2_PAGETABLE_SHIFT);
 
         /* Pass mapped memory to allocator /before/ creating new mappings. */
         init_boot_pages(s, min(map_s, e));
-        s = map_s;
         if ( s < map_e )
         {
             uint64_t mask = (1UL << L2_PAGETABLE_SHIFT) - 1;
diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S
index 595137f..046fd25 100644
--- a/xen/arch/x86/xen.lds.S
+++ b/xen/arch/x86/xen.lds.S
@@ -55,7 +55,7 @@ SECTIONS
   __2M_text_start = .;         /* Start of 2M superpages, mapped RX. */
 #endif
 
-  . = __XEN_VIRT_START + MB(1);
+  . = __XEN_VIRT_START + XEN_IMG_OFFSET;
   _start = .;
   .text : {
         _stext = .;            /* Text and read-only data */
@@ -260,12 +260,14 @@ SECTIONS
   .reloc : {
     *(.reloc)
   } :text
-  /* Trick the linker into setting the image size to exactly 16Mb. */
   . = ALIGN(__section_alignment__);
+#endif
+
+  /* Trick the linker into setting the image size to exactly 16Mb. */
   .pad : {
     . = ALIGN(MB(16));
+    __end_of_image__ = .;
   } :text
-#endif
 
   /* Sections to be discarded */
   /DISCARD/ : {
diff --git a/xen/include/asm-x86/config.h b/xen/include/asm-x86/config.h
index 6fd84e7..f5a2d2f 100644
--- a/xen/include/asm-x86/config.h
+++ b/xen/include/asm-x86/config.h
@@ -96,6 +96,7 @@ extern unsigned long trampoline_phys;
                  trampoline_phys-__pa(trampoline_start)))
 extern char trampoline_start[], trampoline_end[];
 extern char trampoline_realmode_entry[];
+extern unsigned int xen_img_load_base_addr;
 extern unsigned int trampoline_xen_phys_start;
 extern unsigned char trampoline_cpu_started;
 extern char wakeup_start[];
diff --git a/xen/include/asm-x86/page.h b/xen/include/asm-x86/page.h
index 4ae387f..7324afe 100644
--- a/xen/include/asm-x86/page.h
+++ b/xen/include/asm-x86/page.h
@@ -288,7 +288,7 @@ extern root_pgentry_t idle_pg_table[ROOT_PAGETABLE_ENTRIES];
 extern l2_pgentry_t  *compat_idle_pg_table_l2;
 extern unsigned int   m2p_compat_vstart;
 extern l2_pgentry_t l2_xenmap[L2_PAGETABLE_ENTRIES],
-    l2_bootmap[L2_PAGETABLE_ENTRIES];
+    l2_bootmap[4*L2_PAGETABLE_ENTRIES];
 extern l3_pgentry_t l3_bootmap[L3_PAGETABLE_ENTRIES];
 extern l2_pgentry_t l2_identmap[4*L2_PAGETABLE_ENTRIES];
 extern l1_pgentry_t l1_identmap[L1_PAGETABLE_ENTRIES],
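
The "Update frame addresses in page tables" loop in the head.S hunk above
walks every 8-byte entry between __page_tables_start and __page_tables_end
and adds the load base address to each present entry. A rough C equivalent
follows, as a sketch only: the function name and the array argument are
hypothetical stand-ins for the linker symbols, and it uses a 64-bit add where
the assembly adds %esi to the low 32 bits of each entry, which gives the same
result as long as the sum fits in 32 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_PRESENT 0x001ULL /* x86 page table entry "present" flag (bit 0) */

/*
 * Hypothetical stand-in for the assembly loop: add load_base to the frame
 * address of every present entry. Entries are visited from last to first,
 * mirroring the %ecx countdown driven by the loop instruction.
 */
static void relocate_page_tables(uint64_t *tables, size_t nr_entries,
                                 uint32_t load_base)
{
    for (size_t i = nr_entries; i > 0; i--) {
        if (tables[i - 1] & PAGE_PRESENT)
            tables[i - 1] += load_base;
    }
}
```

Adding the base leaves the flag bits of an entry untouched only if the base
is suitably aligned; that is presumably one reason why the commit message
ties the 2 MiB default load address to the 2 MiB pages used by the initial
page tables.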