[v2,1/5] x86: Detection of Knights Landing A/D leak

Message ID 20211021122112.592634-2-namit@vmware.com (mailing list archive)
State New
Series mm/mprotect: avoid unnecessary TLB flushes

Commit Message

Nadav Amit Oct. 21, 2021, 12:21 p.m. UTC
From: Nadav Amit <namit@vmware.com>

Knights Landing has an issue where a thread setting the A or D bits may
not do so atomically with respect to checking the present bit.  A thread
that is about to page fault may still set those bits, even though the
present bit was already atomically cleared.

This implies that after the kernel atomically clears the present bit, an
entry that is supposed to be zero could later be corrupted with stray A
or D bits.

Since the PTE could already be used for storing a swap index, or a NUMA
migration index, this cannot be tolerated. Most of the time the kernel
detects the problem, but in some rare cases it may not.
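
To make the failure mode concrete, here is a minimal user-space sketch, not
kernel code. It assumes the usual x86 PTE bit positions (present = bit 0,
accessed = bit 5, dirty = bit 6). The index encoding and every helper name
below are hypothetical, chosen only so that the stored index overlaps the
A/D bit positions; the kernel's real swap entry layout differs.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 PTE flag bits (positions from the architectural page-table format). */
#define PTE_PRESENT  (UINT64_C(1) << 0)
#define PTE_ACCESSED (UINT64_C(1) << 5)
#define PTE_DIRTY    (UINT64_C(1) << 6)

/*
 * Hypothetical encoding: keep an index in a non-present PTE, starting at
 * bit 1 so that the present bit stays clear.
 */
static inline uint64_t encode_swap_index(uint64_t index)
{
	return index << 1;
}

static inline uint64_t decode_swap_index(uint64_t pte)
{
	return pte >> 1;
}

/* Model of the erratum: A/D bits get ORed in after present was cleared. */
static inline uint64_t stray_ad_write(uint64_t pte)
{
	return pte | PTE_ACCESSED | PTE_DIRTY;
}

/* True if the stored index reads back unchanged after the stray write. */
static inline bool index_survives(uint64_t index)
{
	uint64_t pte = encode_swap_index(index);

	return decode_swap_index(stray_ad_write(pte)) == index;
}
```

In this model an index survives only if it happens to have both overlapping
bits set already, which is why a stray write cannot be tolerated in general.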

This patch adds the X86_BUG_PTE_LEAK CPU bug flag, set on Knights
Landing, so that the kernel can detect the bug. The flag will be used in
the following patch.
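
As a rough illustration of what such a bug flag amounts to, the sketch below
models recording and querying one bit in a small user-space program. The
struct and the model_* helpers are hypothetical stand-ins; in the kernel the
flag lives in struct cpuinfo_x86 and X86_BUG(25) is an index offset past the
feature words, not a bare bit 25.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit number the patch assigns: X86_BUG_PTE_LEAK is X86_BUG(25). */
#define MODEL_BUG_PTE_LEAK 25

/* Hypothetical stand-in for the bug words kept per CPU. */
struct model_cpu {
	uint32_t bug_words[1];
};

/* Model of set_cpu_bug(): record that the CPU has the given bug. */
static inline void model_set_cpu_bug(struct model_cpu *c, unsigned int bit)
{
	c->bug_words[bit / 32] |= UINT32_C(1) << (bit % 32);
}

/* Model of a bug query: true if the given bug bit was recorded. */
static inline bool model_cpu_has_bug(const struct model_cpu *c, unsigned int bit)
{
	return (c->bug_words[bit / 32] >> (bit % 32)) & 1;
}
```

In the kernel itself a caller would more likely use an existing helper such
as boot_cpu_has_bug(X86_BUG_PTE_LEAK); the consumer added by the following
patch is not shown here.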

[ Based on a patch by Andi Kleen ]

Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: x86@kernel.org
Link: https://lore.kernel.org/lkml/1465919919-2093-1-git-send-email-lukasz.anaczkowski@intel.com/
Signed-off-by: Nadav Amit <namit@vmware.com>
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/intel.c        | 5 +++++
 2 files changed, 6 insertions(+)

Comments

Dave Hansen Oct. 26, 2021, 3:54 p.m. UTC | #1
On 10/21/21 5:21 AM, Nadav Amit wrote:
> --- a/arch/x86/kernel/cpu/intel.c
> +++ b/arch/x86/kernel/cpu/intel.c
> @@ -296,6 +296,11 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>  		}
>  	}
>  
> +	if (c->x86_model == 87) {
> +		pr_info_once("Enabling PTE leaking workaround\n");
> +		set_cpu_bug(c, X86_BUG_PTE_LEAK);
> +	}

Please take a look at:

	arch/x86/include/asm/intel-family.h

specifically:

#define INTEL_FAM6_XEON_PHI_KNL         0x57 /* Knights Landing */

Nadav Amit Oct. 26, 2021, 3:57 p.m. UTC | #2
> On Oct 26, 2021, at 8:54 AM, Dave Hansen <dave.hansen@intel.com> wrote:
> 
> On 10/21/21 5:21 AM, Nadav Amit wrote:
>> --- a/arch/x86/kernel/cpu/intel.c
>> +++ b/arch/x86/kernel/cpu/intel.c
>> @@ -296,6 +296,11 @@ static void early_init_intel(struct cpuinfo_x86 *c)
>> 		}
>> 	}
>> 
>> +	if (c->x86_model == 87) {
>> +		pr_info_once("Enabling PTE leaking workaround\n");
>> +		set_cpu_bug(c, X86_BUG_PTE_LEAK);
>> +	}
> 
> Please take a look at:
> 
> 	arch/x86/include/asm/intel-family.h
> 
> specifically:
> 
> #define INTEL_FAM6_XEON_PHI_KNL         0x57 /* Knights Landing */

Thanks, I will fix it. I really just copy-pasted from Andi’s patch
(for better or worse).
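
For reference, the check with the named constant might look roughly like the
user-space sketch below. The constant and its value are the ones quoted in
the review; struct cpu_id and is_knights_landing() are hypothetical names,
and the family check is an assumption of this sketch, not something the
review asked for.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value quoted in the review from arch/x86/include/asm/intel-family.h. */
#define INTEL_FAM6_XEON_PHI_KNL 0x57 /* Knights Landing */

/* Hypothetical, trimmed-down stand-in for the kernel's struct cpuinfo_x86. */
struct cpu_id {
	uint8_t family; /* corresponds to c->x86 */
	uint8_t model;  /* corresponds to c->x86_model */
};

/* 0x57 == 87, so this matches the same model number as the posted hunk. */
static inline bool is_knights_landing(const struct cpu_id *c)
{
	return c->family == 6 && c->model == INTEL_FAM6_XEON_PHI_KNL;
}
```

The named constant documents which CPU is meant, where the bare 87 in the
posted hunk does not.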

Patch

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index d0ce5cfd3ac1..32d0aabd788d 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -436,5 +436,6 @@ 
 #define X86_BUG_TAA			X86_BUG(22) /* CPU is affected by TSX Async Abort(TAA) */
 #define X86_BUG_ITLB_MULTIHIT		X86_BUG(23) /* CPU may incur MCE during certain page attribute changes */
 #define X86_BUG_SRBDS			X86_BUG(24) /* CPU may leak RNG bits if not mitigated */
+#define X86_BUG_PTE_LEAK		X86_BUG(25) /* PTE may leak A/D bits after clear */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 8321c43554a1..40bcba6e3641 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -296,6 +296,11 @@  static void early_init_intel(struct cpuinfo_x86 *c)
 		}
 	}
 
+	if (c->x86_model == 87) {
+		pr_info_once("Enabling PTE leaking workaround\n");
+		set_cpu_bug(c, X86_BUG_PTE_LEAK);
+	}
+
 	/*
 	 * Intel Quark Core DevMan_001.pdf section 6.4.11
 	 * "The operating system also is required to invalidate (i.e., flush)