Message ID | 20210325102959.GD31322@zn.tnic (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/tlb: Flush global mappings when KAISER is disabled | expand |
On 25/03/21 11:29, Borislav Petkov wrote: > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h > index f5ca15622dc9..2bfa4deb8cae 100644 > --- a/arch/x86/include/asm/tlbflush.h > +++ b/arch/x86/include/asm/tlbflush.h > @@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr) > * ASID. But, userspace flushes are probably much more > * important performance-wise. > * > - * Make sure to do only a single invpcid when KAISER is > - * disabled and we have only a single ASID. > + * In the KAISER disabled case, do an INVLPG to make sure > + * the mapping is flushed in case it is a global one. > */ > - if (kaiser_enabled) > + if (kaiser_enabled) { > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr); > - invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); > + invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); > + } else { > + asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); > + } > } > > static inline void __flush_tlb_all(void) > Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
On 3/25/21 5:29 AM, Borislav Petkov wrote: > Ok, > > I tried to be as specific as possible in the commit message so that we > don't forget. Please lemme know if I've missed something. > > Babu, Jim, I'd appreciate it if you ran this to confirm. > > Thx. > > --- > From: Borislav Petkov <bp@suse.de> > Date: Thu, 25 Mar 2021 11:02:31 +0100 > > Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel > are exploding during alternatives patching: > > kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709! > invalid opcode: 0000 [#1] SMP > Modules linked in: > CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > Call Trace: > swap_entry_free > swap_entry_free > text_poke_bp > swap_entry_free > arch_jump_label_transform > set_debug_rodata > __jump_label_update > static_key_slow_inc > frontswap_register_ops > init_zswap > init_frontswap > do_one_initcall > set_debug_rodata > kernel_init_freeable > rest_init > kernel_init > ret_from_fork > > triggering the BUG_ON in text_poke() which verifies whether patched > instruction bytes have actually landed at the destination. > > Further debugging showed that the TLB flush before that check is > insufficient because there could be global mappings left in the TLB, > leading to a stale mapping getting used. > > I say "global mappings" because the hardware configuration is a new one: > machine is an AMD, which means, KAISER/PTI doesn't need to be enabled > there, which also means there's no user/kernel pagetables split and > therefore the TLB can have global mappings. > > And the configuration is new one for a second reason: because that AMD > machine supports PCID and INVPCID, which leads the CPU detection code to > set the synthetic X86_FEATURE_INVPCID_SINGLE flag. > > Now, __native_flush_tlb_single() does invalidate global mappings when > X86_FEATURE_INVPCID_SINGLE is *not* set and returns. > > When X86_FEATURE_INVPCID_SINGLE is set, however, it invalidates the > requested address from both PCIDs in the KAISER-enabled case. But if > KAISER is not enabled and the machine has global mappings in the TLB, > then those global mappings do not get invalidated, which would lead to > the above mismatch from using a stale TLB entry. > > So make sure to flush those global mappings in the KAISER disabled case. > > Co-debugged by Babu Moger <babu.moger@amd.com>. > > Reported-by: Jim Mattson <jmattson@google.com> > Signed-off-by: Borislav Petkov <bp@suse.de> > Link: https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flkml.kernel.org%2Fr%2FCALMp9eRDSW66%252BXvbHVF4ohL7XhThoPoT0BrB0TcS0cgk%3DdkcBg%40mail.gmail.com&data=04%7C01%7Cbabu.moger%40amd.com%7Cf4e0aacf81744dc8be4408d8ef78f2cf%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637522650066097649%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=1c4MQ9I9KrLxWLqghGCI%2BC%2Bvs0c9vYaNC5d%2FiYL0oMA%3D&reserved=0 > --- > arch/x86/include/asm/tlbflush.h | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h > index f5ca15622dc9..2bfa4deb8cae 100644 > --- a/arch/x86/include/asm/tlbflush.h > +++ b/arch/x86/include/asm/tlbflush.h > @@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr) > * ASID. But, userspace flushes are probably much more > * important performance-wise. > * > - * Make sure to do only a single invpcid when KAISER is > - * disabled and we have only a single ASID. > + * In the KAISER disabled case, do an INVLPG to make sure > + * the mapping is flushed in case it is a global one. > */ > - if (kaiser_enabled) > + if (kaiser_enabled) { > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr); > - invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); > + invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); > + } else { > + asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); > + } > } > > static inline void __flush_tlb_all(void) > Thanks Boris. As you updated the patch little bit since yesterday, I retested them again both on host and guest kernel. They are looking good. Tested-by: Babu Moger <babu.moger@amd.com>
On Thu, 25 Mar 2021, Borislav Petkov wrote: > Ok, > > I tried to be as specific as possible in the commit message so that we > don't forget. Please lemme know if I've missed something. > > Babu, Jim, I'd appreciate it if you ran this to confirm. > > Thx. > > --- > From: Borislav Petkov <bp@suse.de> > Date: Thu, 25 Mar 2021 11:02:31 +0100 > > Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel > are exploding during alternatives patching: > > kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709! > invalid opcode: 0000 [#1] SMP > Modules linked in: > CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 > Call Trace: > swap_entry_free > swap_entry_free > text_poke_bp > swap_entry_free > arch_jump_label_transform > set_debug_rodata > __jump_label_update > static_key_slow_inc > frontswap_register_ops > init_zswap > init_frontswap > do_one_initcall > set_debug_rodata > kernel_init_freeable > rest_init > kernel_init > ret_from_fork > > triggering the BUG_ON in text_poke() which verifies whether patched > instruction bytes have actually landed at the destination. > > Further debugging showed that the TLB flush before that check is > insufficient because there could be global mappings left in the TLB, > leading to a stale mapping getting used. > > I say "global mappings" because the hardware configuration is a new one: > machine is an AMD, which means, KAISER/PTI doesn't need to be enabled > there, which also means there's no user/kernel pagetables split and > therefore the TLB can have global mappings. > > And the configuration is new one for a second reason: because that AMD > machine supports PCID and INVPCID, which leads the CPU detection code to > set the synthetic X86_FEATURE_INVPCID_SINGLE flag. > > Now, __native_flush_tlb_single() does invalidate global mappings when > X86_FEATURE_INVPCID_SINGLE is *not* set and returns. > > When X86_FEATURE_INVPCID_SINGLE is set, however, it invalidates the > requested address from both PCIDs in the KAISER-enabled case. But if > KAISER is not enabled and the machine has global mappings in the TLB, > then those global mappings do not get invalidated, which would lead to > the above mismatch from using a stale TLB entry. > > So make sure to flush those global mappings in the KAISER disabled case. > > Co-debugged by Babu Moger <babu.moger@amd.com>. > > Reported-by: Jim Mattson <jmattson@google.com> > Signed-off-by: Borislav Petkov <bp@suse.de> > Link: https://lkml.kernel.org/r/CALMp9eRDSW66%2BXvbHVF4ohL7XhThoPoT0BrB0TcS0cgk=dkcBg@mail.gmail.com Acked-by: Hugh Dickins <hughd@google.com> Great write-up too: many thanks. > --- > arch/x86/include/asm/tlbflush.h | 11 +++++++---- > 1 file changed, 7 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h > index f5ca15622dc9..2bfa4deb8cae 100644 > --- a/arch/x86/include/asm/tlbflush.h > +++ b/arch/x86/include/asm/tlbflush.h > @@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr) > * ASID. But, userspace flushes are probably much more > * important performance-wise. > * > - * Make sure to do only a single invpcid when KAISER is > - * disabled and we have only a single ASID. > + * In the KAISER disabled case, do an INVLPG to make sure > + * the mapping is flushed in case it is a global one. > */ > - if (kaiser_enabled) > + if (kaiser_enabled) { > invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr); > - invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); > + invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); > + } else { > + asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); > + } > } > > static inline void __flush_tlb_all(void) > -- > 2.29.2 > > -- > Regards/Gruss, > Boris. > > https://people.kernel.org/tglx/notes-about-netiquette
On Thu, Mar 25, 2021 at 9:33 AM Hugh Dickins <hughd@google.com> wrote: > > On Thu, 25 Mar 2021, Borislav Petkov wrote: > > > Ok, > > > > I tried to be as specific as possible in the commit message so that we > > don't forget. Please lemme know if I've missed something. > > > > Babu, Jim, I'd appreciate it if you ran this to confirm. Tested-by: Jim Mattson <jmattson@google.com>
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index f5ca15622dc9..2bfa4deb8cae 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr) * ASID. But, userspace flushes are probably much more * important performance-wise. * - * Make sure to do only a single invpcid when KAISER is - * disabled and we have only a single ASID. + * In the KAISER disabled case, do an INVLPG to make sure + * the mapping is flushed in case it is a global one. */ - if (kaiser_enabled) + if (kaiser_enabled) { invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr); - invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); + invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr); + } else { + asm volatile("invlpg (%0)" ::"r" (addr) : "memory"); + } } static inline void __flush_tlb_all(void)