Message ID | 20220412173407.13637-8-varad.gautam@suse.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | SMP Support for x86 UEFI Tests | expand |
On Tue, Apr 12, 2022, Varad Gautam wrote: > Sending INIT/SIPI to APs from ap_init() resets them into 16-bit mode > to loop into sipi_entry(). > > To drive the APs into 32-bit mode, the SIPI vector needs: > 1. A GDT descriptor reachable from 16-bit code (gdt32_descr). > 2. A 32-bit entrypoint reachable from 16-bit code (ap_start32). > 3. The locations of GDT and the 32-bit entrypoint. > > Setting these up at compile time (like on non-EFI builds) is not > possible since EFI builds with -shared -fPIC and efistart64.S cannot > reference any absolute addresses. > > Relative addressing is unavailable on 16-bit mode. > > Moreover, EFI may not load the 32-bit entrypoint to be reachable from > 16-bit mode. > > To overcome these problems, > 1. Fill the GDT descriptor at runtime after relocating > [sipi_entry-sipi_end] to lowmem. Since sipi_entry does not know the > address of this descriptor, use the last two bytes of SIPI page to > communicate it. > 2. Place a call gate in the GDT to point to ap_start32. > 3. Popluate sipi_entry() to lcall to ap_start32. > > With this, the APs can transition to 32-bit mode and loop at a known > location. > > Signed-off-by: Varad Gautam <varad.gautam@suse.com> > --- > lib/x86/smp.c | 56 ++++++++++++++++++++++++++++++++++++++++++++ > x86/efi/efistart64.S | 29 ++++++++++++++++++++++- > 2 files changed, 84 insertions(+), 1 deletion(-) > > diff --git a/lib/x86/smp.c b/lib/x86/smp.c > index d7f5aba..5cc1648 100644 > --- a/lib/x86/smp.c > +++ b/lib/x86/smp.c > @@ -6,6 +6,7 @@ > #include "apic.h" > #include "fwcfg.h" > #include "desc.h" > +#include "asm/page.h" > > #define IPI_VECTOR 0x20 > > @@ -144,16 +145,71 @@ void smp_reset_apic(void) > atomic_inc(&active_cpus); > } > > +#ifdef CONFIG_EFI > +extern u8 gdt32_descr, gdt32, gdt32_end; > +extern u8 ap_start32; > +#endif > + > void ap_init(void) > { > u8 *dst_addr = 0; > size_t sipi_sz = (&sipi_end - &sipi_entry) + 1; > > + assert(sipi_sz < PAGE_SIZE); > + > asm volatile("cld"); > > /* Relocate SIPI vector to dst_addr so it can run in 16-bit mode. */ > + memset(dst_addr, 0, PAGE_SIZE); > memcpy(dst_addr, &sipi_entry, sipi_sz); > > +#ifdef CONFIG_EFI > + volatile struct descriptor_table_ptr *gdt32_descr_rel; > + idt_entry_t *gate_descr; > + u16 *gdt32_descr_reladdr = (u16 *) (PAGE_SIZE - sizeof(u16)); Ah, the gdt32 name confused me for a bit. Can we use something more unique? Maybe efi_rm_trampoline_gdt? Or just efi_trampoline_gdt since the "rm" part is technically wrong (though consistent). Oooh, and the "PAGE_SIZE - sizeof(u16)" is another instance of open coding "PAGE_SIZE - 2". Definitely need a #define for that. > + > + /* > + * gdt32_descr for CONFIG_EFI needs to be filled here dynamically > + * since compile time calculation of offsets is not allowed when > + * building with -shared, and rip-relative addressing is not supported > + * in 16-bit mode. > + * > + * Use the last two bytes of SIPI page to store relocated gdt32_descr > + * addr. Ooh, I see, it's a double indirection. Load the address to the descriptor, then load the descriptor which holds the address to the GDT. I kept thinking that 2 bytes were the limit chunk of the descriptor. Maybe instead of "relocated gdt32_descr addr", "a pointer to the relocated descriptor used by the trampoline to load the GDT"? > + */ > + *gdt32_descr_reladdr = (&gdt32_descr - &sipi_entry); > + > + gdt32_descr_rel = (struct descriptor_table_ptr *) ((u64) *gdt32_descr_reladdr); > + gdt32_descr_rel->limit = (u16) (&gdt32_end - &gdt32 - 1); > + gdt32_descr_rel->base = (ulong) ((u32) (&gdt32 - &sipi_entry)); > + > + /* > + * EFI may not load the 32-bit AP entrypoint (ap_start32) low enough > + * to be reachable from the SIPI vector. Since we build with -shared, this Nit, please avoid "we" and other pronous, e.g. "Since KVM-unit-tests is built with..." avoids any ambiguity. > + * location needs to be fetched at runtime, and rip-relative addressing is > + * not supported in 16-bit mode. > + * To perform 16-bit -> 32-bit far jump, our options are: "our options" can be tweaked to something like "Alternatives to a CALL GATE are". > + * - ljmpl $cs, $label : unusable since $label is not known at build time. > + * - push $cs; push $label; lret : requires an intermediate trampoline since > + * $label must still be within 0 - 0xFFFF for 16-bit far return to work. > + * - lcall into a call-gate : best suited. I very much appreciate the comment, but it's backwards in the sense that I had no idea what I was reading until the very end. I.e. lead with the below "Set up a call gate ..." blurb, then dive into the alternatives. First and forement, help the reader understand what is actually implemented, then call out the alternatives and why they're inferior. > + * > + * Set up call gate to ap_start32 within GDT. > + * > + * gdt32 layout: > + * > + * Entry | Segment > + * 0 | NULL descr > + * 1 | Code segment descr > + * 2 | Data segment descr > + * 3 | Call gate descr > + */ > + gate_descr = (idt_entry_t *) ((u8 *)(&gdt32 - &sipi_entry) > + + 3 * sizeof(gdt_entry_t)); Ah, it's just a coincidence that this GDT has the same layout as the "real" GDT, e.g. has the same KERNEL_DS and KERNEL_CS. It took me a while to suss out that the APs do indeed reload a different GDT during ap_start64(). In the comment above, can you add a blurb to call out that it's purely coincidental and not strictly required that this matches the initial part of the final GDT, and that APs will load the final GDT in ap_start64()? I kept trying to figure out how this didn't break tests that use other selectors :-) > + set_idt_entry_t(gate_descr, sizeof(gdt_entry_t), (void *) &ap_start32, > + 0x8 /* sel */, 0xc /* type */, 0 /* dpl */); > +#endif > + > /* INIT */ > apic_icr_write(APIC_DEST_ALLBUT | APIC_DEST_PHYSICAL | APIC_DM_INIT | APIC_INT_ASSERT, 0); >
diff --git a/lib/x86/smp.c b/lib/x86/smp.c index d7f5aba..5cc1648 100644 --- a/lib/x86/smp.c +++ b/lib/x86/smp.c @@ -6,6 +6,7 @@ #include "apic.h" #include "fwcfg.h" #include "desc.h" +#include "asm/page.h" #define IPI_VECTOR 0x20 @@ -144,16 +145,71 @@ void smp_reset_apic(void) atomic_inc(&active_cpus); } +#ifdef CONFIG_EFI +extern u8 gdt32_descr, gdt32, gdt32_end; +extern u8 ap_start32; +#endif + void ap_init(void) { u8 *dst_addr = 0; size_t sipi_sz = (&sipi_end - &sipi_entry) + 1; + assert(sipi_sz < PAGE_SIZE); + asm volatile("cld"); /* Relocate SIPI vector to dst_addr so it can run in 16-bit mode. */ + memset(dst_addr, 0, PAGE_SIZE); memcpy(dst_addr, &sipi_entry, sipi_sz); +#ifdef CONFIG_EFI + volatile struct descriptor_table_ptr *gdt32_descr_rel; + idt_entry_t *gate_descr; + u16 *gdt32_descr_reladdr = (u16 *) (PAGE_SIZE - sizeof(u16)); + + /* + * gdt32_descr for CONFIG_EFI needs to be filled here dynamically + * since compile time calculation of offsets is not allowed when + * building with -shared, and rip-relative addressing is not supported + * in 16-bit mode. + * + * Use the last two bytes of SIPI page to store relocated gdt32_descr + * addr. + */ + *gdt32_descr_reladdr = (&gdt32_descr - &sipi_entry); + + gdt32_descr_rel = (struct descriptor_table_ptr *) ((u64) *gdt32_descr_reladdr); + gdt32_descr_rel->limit = (u16) (&gdt32_end - &gdt32 - 1); + gdt32_descr_rel->base = (ulong) ((u32) (&gdt32 - &sipi_entry)); + + /* + * EFI may not load the 32-bit AP entrypoint (ap_start32) low enough + * to be reachable from the SIPI vector. Since we build with -shared, this + * location needs to be fetched at runtime, and rip-relative addressing is + * not supported in 16-bit mode. + * To perform 16-bit -> 32-bit far jump, our options are: + * - ljmpl $cs, $label : unusable since $label is not known at build time. + * - push $cs; push $label; lret : requires an intermediate trampoline since + * $label must still be within 0 - 0xFFFF for 16-bit far return to work. + * - lcall into a call-gate : best suited. + * + * Set up call gate to ap_start32 within GDT. + * + * gdt32 layout: + * + * Entry | Segment + * 0 | NULL descr + * 1 | Code segment descr + * 2 | Data segment descr + * 3 | Call gate descr + */ + gate_descr = (idt_entry_t *) ((u8 *)(&gdt32 - &sipi_entry) + + 3 * sizeof(gdt_entry_t)); + set_idt_entry_t(gate_descr, sizeof(gdt_entry_t), (void *) &ap_start32, + 0x8 /* sel */, 0xc /* type */, 0 /* dpl */); +#endif + /* INIT */ apic_icr_write(APIC_DEST_ALLBUT | APIC_DEST_PHYSICAL | APIC_DM_INIT | APIC_INT_ASSERT, 0); diff --git a/x86/efi/efistart64.S b/x86/efi/efistart64.S index 1c38355..00279b8 100644 --- a/x86/efi/efistart64.S +++ b/x86/efi/efistart64.S @@ -65,7 +65,34 @@ setup_segments64: .globl sipi_entry sipi_entry: - jmp sipi_entry + mov %cr0, %eax + or $1, %eax + mov %eax, %cr0 + + /* Retrieve relocated gdt32_descr address at (PAGE_SIZE - 2). */ + mov (PAGE_SIZE - 2), %ebx + lgdtl (%ebx) + + lcall $0x18, $0x0 + +.globl gdt32 +gdt32: + .quad 0 + .quad 0x00cf9b000000ffff // flat 32-bit code segment + .quad 0x00cf93000000ffff // flat 32-bit data segment + .quad 0 // call gate to 32-bit AP entrypoint +.globl gdt32_end +gdt32_end: + +.globl gdt32_descr +gdt32_descr: + .word 0 + .long 0 .globl sipi_end sipi_end: + +.code32 +.globl ap_start32 +ap_start32: + jmp ap_start32
Sending INIT/SIPI to APs from ap_init() resets them into 16-bit mode to loop into sipi_entry(). To drive the APs into 32-bit mode, the SIPI vector needs: 1. A GDT descriptor reachable from 16-bit code (gdt32_descr). 2. A 32-bit entrypoint reachable from 16-bit code (ap_start32). 3. The locations of GDT and the 32-bit entrypoint. Setting these up at compile time (like on non-EFI builds) is not possible since EFI builds with -shared -fPIC and efistart64.S cannot reference any absolute addresses. Relative addressing is unavailable on 16-bit mode. Moreover, EFI may not load the 32-bit entrypoint to be reachable from 16-bit mode. To overcome these problems, 1. Fill the GDT descriptor at runtime after relocating [sipi_entry-sipi_end] to lowmem. Since sipi_entry does not know the address of this descriptor, use the last two bytes of SIPI page to communicate it. 2. Place a call gate in the GDT to point to ap_start32. 3. Popluate sipi_entry() to lcall to ap_start32. With this, the APs can transition to 32-bit mode and loop at a known location. Signed-off-by: Varad Gautam <varad.gautam@suse.com> --- lib/x86/smp.c | 56 ++++++++++++++++++++++++++++++++++++++++++++ x86/efi/efistart64.S | 29 ++++++++++++++++++++++- 2 files changed, 84 insertions(+), 1 deletion(-)