Message ID | 20200501225838.9866-8-andrew.cooper3@citrix.com (mailing list archive)
---|---
State | Superseded
Series | x86: Support for CET Supervisor Shadow Stacks
On 02.05.2020 00:58, Andrew Cooper wrote:
> --- a/xen/arch/x86/cpu/common.c
> +++ b/xen/arch/x86/cpu/common.c
> @@ -732,14 +732,14 @@ void load_system_tables(void)
>          .rsp2 = 0x8600111111111111ul,
>
>          /*
> -         * MCE, NMI and Double Fault handlers get their own stacks.
> +         * #DB, NMI, DF and #MCE handlers get their own stacks.

Then also #DF and #MC?

> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -6002,25 +6002,18 @@ void memguard_unguard_range(void *p, unsigned long l)
>
>  void memguard_guard_stack(void *p)
>  {
> -    /* IST_MAX IST pages + at least 1 guard page + primary stack. */
> -    BUILD_BUG_ON((IST_MAX + 1) * PAGE_SIZE + PRIMARY_STACK_SIZE > STACK_SIZE);
> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, _PAGE_NONE);
>
> -    memguard_guard_range(p + IST_MAX * PAGE_SIZE,
> -                         STACK_SIZE - PRIMARY_STACK_SIZE - IST_MAX * PAGE_SIZE);
> +    p += 5 * PAGE_SIZE;

The literal 5 here and ...

> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, _PAGE_NONE);
>  }
>
>  void memguard_unguard_stack(void *p)
>  {
> -    memguard_unguard_range(p + IST_MAX * PAGE_SIZE,
> -                           STACK_SIZE - PRIMARY_STACK_SIZE - IST_MAX * PAGE_SIZE);
> -}
> -
> -bool memguard_is_stack_guard_page(unsigned long addr)
> -{
> -    addr &= STACK_SIZE - 1;
> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_RW);
>
> -    return addr >= IST_MAX * PAGE_SIZE &&
> -           addr < STACK_SIZE - PRIMARY_STACK_SIZE;
> +    p += 5 * PAGE_SIZE;

... here could do with macro-izing: IST_MAX + 1 would already be a little better, I guess.

Preferably with adjustments along these lines
Reviewed-by: Jan Beulich <jbeulich@suse.com>

Jan
On 04/05/2020 15:24, Jan Beulich wrote:
> On 02.05.2020 00:58, Andrew Cooper wrote:
>> --- a/xen/arch/x86/cpu/common.c
>> +++ b/xen/arch/x86/cpu/common.c
>> @@ -732,14 +732,14 @@ void load_system_tables(void)
>>          .rsp2 = 0x8600111111111111ul,
>>
>>          /*
>> -         * MCE, NMI and Double Fault handlers get their own stacks.
>> +         * #DB, NMI, DF and #MCE handlers get their own stacks.
> Then also #DF and #MC?

Ok.

>
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -6002,25 +6002,18 @@ void memguard_unguard_range(void *p, unsigned long l)
>>
>>  void memguard_guard_stack(void *p)
>>  {
>> -    /* IST_MAX IST pages + at least 1 guard page + primary stack. */
>> -    BUILD_BUG_ON((IST_MAX + 1) * PAGE_SIZE + PRIMARY_STACK_SIZE > STACK_SIZE);
>> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, _PAGE_NONE);
>>
>> -    memguard_guard_range(p + IST_MAX * PAGE_SIZE,
>> -                         STACK_SIZE - PRIMARY_STACK_SIZE - IST_MAX * PAGE_SIZE);
>> +    p += 5 * PAGE_SIZE;
> The literal 5 here and ...
>
>> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, _PAGE_NONE);
>>  }
>>
>>  void memguard_unguard_stack(void *p)
>>  {
>> -    memguard_unguard_range(p + IST_MAX * PAGE_SIZE,
>> -                           STACK_SIZE - PRIMARY_STACK_SIZE - IST_MAX * PAGE_SIZE);
>> -}
>> -
>> -bool memguard_is_stack_guard_page(unsigned long addr)
>> -{
>> -    addr &= STACK_SIZE - 1;
>> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_RW);
>>
>> -    return addr >= IST_MAX * PAGE_SIZE &&
>> -           addr < STACK_SIZE - PRIMARY_STACK_SIZE;
>> +    p += 5 * PAGE_SIZE;
> ... here could do with macro-izing: IST_MAX + 1 would already be
> a little better, I guess.

The problem is that "IST_MAX + 1" is now less meaningful than a literal 5, because at least 5 obviously matches up with the comment describing which page does what.

~Andrew

>
> Preferably with adjustments along these lines
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
> Jan
On 11.05.2020 19:48, Andrew Cooper wrote:
> On 04/05/2020 15:24, Jan Beulich wrote:
>> On 02.05.2020 00:58, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/cpu/common.c
>>> +++ b/xen/arch/x86/cpu/common.c
>>> @@ -732,14 +732,14 @@ void load_system_tables(void)
>>>          .rsp2 = 0x8600111111111111ul,
>>>
>>>          /*
>>> -         * MCE, NMI and Double Fault handlers get their own stacks.
>>> +         * #DB, NMI, DF and #MCE handlers get their own stacks.
>> Then also #DF and #MC?
>
> Ok.
>
>>
>>> --- a/xen/arch/x86/mm.c
>>> +++ b/xen/arch/x86/mm.c
>>> @@ -6002,25 +6002,18 @@ void memguard_unguard_range(void *p, unsigned long l)
>>>
>>>  void memguard_guard_stack(void *p)
>>>  {
>>> -    /* IST_MAX IST pages + at least 1 guard page + primary stack. */
>>> -    BUILD_BUG_ON((IST_MAX + 1) * PAGE_SIZE + PRIMARY_STACK_SIZE > STACK_SIZE);
>>> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, _PAGE_NONE);
>>>
>>> -    memguard_guard_range(p + IST_MAX * PAGE_SIZE,
>>> -                         STACK_SIZE - PRIMARY_STACK_SIZE - IST_MAX * PAGE_SIZE);
>>> +    p += 5 * PAGE_SIZE;
>> The literal 5 here and ...
>>
>>> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, _PAGE_NONE);
>>>  }
>>>
>>>  void memguard_unguard_stack(void *p)
>>>  {
>>> -    memguard_unguard_range(p + IST_MAX * PAGE_SIZE,
>>> -                           STACK_SIZE - PRIMARY_STACK_SIZE - IST_MAX * PAGE_SIZE);
>>> -}
>>> -
>>> -bool memguard_is_stack_guard_page(unsigned long addr)
>>> -{
>>> -    addr &= STACK_SIZE - 1;
>>> +    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_RW);
>>>
>>> -    return addr >= IST_MAX * PAGE_SIZE &&
>>> -           addr < STACK_SIZE - PRIMARY_STACK_SIZE;
>>> +    p += 5 * PAGE_SIZE;
>> ... here could do with macro-izing: IST_MAX + 1 would already be
>> a little better, I guess.
>
> The problem is that "IST_MAX + 1" is now less meaningful than a literal
> 5, because at least 5 obviously matches up with the comment describing
> which page does what.

If you don't like IST_MAX+1, can I at least talk you into introducing a separate constant? This is the only way to connect together the various places where it is used. We can't be sure that we're not going to touch this code anymore, ever. And if we do, it'll help if related places are easy to spot.

Jan
diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
index 131ff03fcf..290f9f1c30 100644
--- a/xen/arch/x86/cpu/common.c
+++ b/xen/arch/x86/cpu/common.c
@@ -732,14 +732,14 @@ void load_system_tables(void)
         .rsp2 = 0x8600111111111111ul,
 
         /*
-         * MCE, NMI and Double Fault handlers get their own stacks.
+         * #DB, NMI, DF and #MCE handlers get their own stacks.
          * All others poisoned.
          */
         .ist = {
-            [IST_MCE - 1] = stack_top + IST_MCE * PAGE_SIZE,
-            [IST_DF  - 1] = stack_top + IST_DF  * PAGE_SIZE,
-            [IST_NMI - 1] = stack_top + IST_NMI * PAGE_SIZE,
-            [IST_DB  - 1] = stack_top + IST_DB  * PAGE_SIZE,
+            [IST_MCE - 1] = stack_top + (1 + IST_MCE) * PAGE_SIZE,
+            [IST_NMI - 1] = stack_top + (1 + IST_NMI) * PAGE_SIZE,
+            [IST_DB  - 1] = stack_top + (1 + IST_DB)  * PAGE_SIZE,
+            [IST_DF  - 1] = stack_top + (1 + IST_DF)  * PAGE_SIZE,
 
             [IST_MAX ... ARRAY_SIZE(tss->ist) - 1] = 0x8600111111111111ul,
 
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 355c50ff91..bc44d865ef 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -6002,25 +6002,18 @@ void memguard_unguard_range(void *p, unsigned long l)
 
 void memguard_guard_stack(void *p)
 {
-    /* IST_MAX IST pages + at least 1 guard page + primary stack. */
-    BUILD_BUG_ON((IST_MAX + 1) * PAGE_SIZE + PRIMARY_STACK_SIZE > STACK_SIZE);
+    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, _PAGE_NONE);
 
-    memguard_guard_range(p + IST_MAX * PAGE_SIZE,
-                         STACK_SIZE - PRIMARY_STACK_SIZE - IST_MAX * PAGE_SIZE);
+    p += 5 * PAGE_SIZE;
+    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, _PAGE_NONE);
 }
 
 void memguard_unguard_stack(void *p)
 {
-    memguard_unguard_range(p + IST_MAX * PAGE_SIZE,
-                           STACK_SIZE - PRIMARY_STACK_SIZE - IST_MAX * PAGE_SIZE);
-}
-
-bool memguard_is_stack_guard_page(unsigned long addr)
-{
-    addr &= STACK_SIZE - 1;
+    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_RW);
 
-    return addr >= IST_MAX * PAGE_SIZE &&
-           addr < STACK_SIZE - PRIMARY_STACK_SIZE;
+    p += 5 * PAGE_SIZE;
+    map_pages_to_xen((unsigned long)p, virt_to_mfn(p), 1, PAGE_HYPERVISOR_RW);
 }
 
 void arch_dump_shared_mem_info(void)
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index f999323bc4..e0f421ca3d 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -823,8 +823,7 @@ static int setup_cpu_root_pgt(unsigned int cpu)
 
     /* Install direct map page table entries for stack, IDT, and TSS. */
     for ( off = rc = 0; !rc && off < STACK_SIZE; off += PAGE_SIZE )
-        if ( !memguard_is_stack_guard_page(off) )
-            rc = clone_mapping(__va(__pa(stack_base[cpu])) + off, rpt);
+        rc = clone_mapping(__va(__pa(stack_base[cpu])) + off, rpt);
 
     if ( !rc )
         rc = clone_mapping(idt_tables[cpu], rpt);
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index ddbe312f89..1cf00c1f4a 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -369,20 +369,15 @@ static void show_guest_stack(struct vcpu *v, const struct cpu_user_regs *regs)
 
 /*
  * Notes for get_stack_trace_bottom() and get_stack_dump_bottom()
  *
- * Stack pages 0 - 3:
+ * Stack pages 1 - 4:
  *   These are all 1-page IST stacks.  Each of these stacks have an exception
  *   frame and saved register state at the top.  The interesting bound for a
  *   trace is the word adjacent to this, while the bound for a dump is the
  *   very top, including the exception frame.
  *
- * Stack pages 4 and 5:
- *   None of these are particularly interesting.  With MEMORY_GUARD, page 5 is
- *   explicitly not present, so attempting to dump or trace it is
- *   counterproductive.  Without MEMORY_GUARD, it is possible for a call chain
- *   to use the entire primary stack and wander into page 5.  In this case,
- *   consider these pages an extension of the primary stack to aid debugging
- *   hopefully rare situations where the primary stack has effective been
- *   overflown.
+ * Stack pages 0 and 5:
+ *   Shadow stacks.  These are mapped read-only, and used by CET-SS capable
+ *   processors.  They will never contain regular stack data.
  *
  * Stack pages 6 and 7:
  *   These form the primary stack, and have a cpu_info at the top.  For a
@@ -396,13 +391,10 @@ unsigned long get_stack_trace_bottom(unsigned long sp)
 {
     switch ( get_stack_page(sp) )
     {
-    case 0 ... 3:
+    case 1 ... 4:
         return ROUNDUP(sp, PAGE_SIZE) -
             offsetof(struct cpu_user_regs, es) - sizeof(unsigned long);
 
-#ifndef MEMORY_GUARD
-    case 4 ... 5:
-#endif
     case 6 ... 7:
         return ROUNDUP(sp, STACK_SIZE) -
             sizeof(struct cpu_info) - sizeof(unsigned long);
@@ -416,12 +408,9 @@ unsigned long get_stack_dump_bottom(unsigned long sp)
 {
     switch ( get_stack_page(sp) )
     {
-    case 0 ... 3:
+    case 1 ... 4:
         return ROUNDUP(sp, PAGE_SIZE) - sizeof(unsigned long);
 
-#ifndef MEMORY_GUARD
-    case 4 ... 5:
-#endif
     case 6 ... 7:
         return ROUNDUP(sp, STACK_SIZE) - sizeof(unsigned long);
 
diff --git a/xen/include/asm-x86/current.h b/xen/include/asm-x86/current.h
index 5b8f4dbc79..99b66a0087 100644
--- a/xen/include/asm-x86/current.h
+++ b/xen/include/asm-x86/current.h
@@ -16,12 +16,12 @@
  *
  *   7 - Primary stack (with a struct cpu_info at the top)
  *   6 - Primary stack
- *   5 - Optionally not present (MEMORY_GUARD)
- *   4 - Unused; optionally not present (MEMORY_GUARD)
- *   3 - Unused; optionally not present (MEMORY_GUARD)
- *   2 - MCE IST stack
- *   1 - NMI IST stack
- *   0 - Double Fault IST stack
+ *   5 - Primary Shadow Stack (read-only)
+ *   4 - #DF IST stack
+ *   3 - #DB IST stack
+ *   2 - NMI IST stack
+ *   1 - #MC IST stack
+ *   0 - IST Shadow Stacks (4x 1k, read-only)
  */
 
 /*
diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
index 3d3f9d49ac..7e74996053 100644
--- a/xen/include/asm-x86/mm.h
+++ b/xen/include/asm-x86/mm.h
@@ -536,7 +536,6 @@ void memguard_unguard_range(void *p, unsigned long l);
 
 void memguard_guard_stack(void *p);
 void memguard_unguard_stack(void *p);
-bool __attribute_const__ memguard_is_stack_guard_page(unsigned long addr);
 
 struct mmio_ro_emulate_ctxt {
     unsigned long cr2;
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 5e8a0fb649..f7e80d12e4 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -439,10 +439,10 @@ struct tss_page {
 DECLARE_PER_CPU(struct tss_page, tss_page);
 
 #define IST_NONE 0UL
-#define IST_DF   1UL
+#define IST_MCE  1UL
 #define IST_NMI  2UL
-#define IST_MCE  3UL
-#define IST_DB   4UL
+#define IST_DB   3UL
+#define IST_DF   4UL
 #define IST_MAX  4UL
 
 /* Set the Interrupt Stack Table used by a particular IDT entry. */
We have two free pages in the current stack.  A useful property of shadow stacks and regular stacks is that they act as each other's guard pages as far as OoB writes go.

Move the regular IST stacks up by one page, to allow their shadow stack page to be in slot 0.  The primary shadow stack uses slot 5.

As the shadow IST stacks are only 1k large, shuffle the order of IST vectors to have #DF numerically highest (so there is no chance of a shadow stack overflow clobbering the supervisor token).

The XPTI code already breaks the MEMORY_GUARD abstraction for stacks by forcing it to be present.  To avoid having too many configurations, do away with the concept entirely, and unconditionally unmap the pages in all cases.

A later change will turn these properly into shadow stacks.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Wei Liu <wl@xen.org>
CC: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/cpu/common.c       | 10 +++++-----
 xen/arch/x86/mm.c               | 19 ++++++-------------
 xen/arch/x86/smpboot.c          |  3 +--
 xen/arch/x86/traps.c            | 23 ++++++-----------------
 xen/include/asm-x86/current.h   | 12 ++++++------
 xen/include/asm-x86/mm.h        |  1 -
 xen/include/asm-x86/processor.h |  6 +++---
 7 files changed, 27 insertions(+), 47 deletions(-)