diff mbox

[1/4] x86/speculation: Use IBRS if available before calling into firmware

Message ID 1518615874-13806-2-git-send-email-dwmw@amazon.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Woodhouse, David Feb. 14, 2018, 1:44 p.m. UTC
Retpoline means the kernel is safe because it has no indirect branches.
But firmware isn't, so use IBRS for firmware calls if it's available.

Block preemption while IBRS is set, although in practice the call sites
already had to be doing that.

Ignore hpwdt.c for now. It's taking spinlocks and calling into firmware
code, from an NMI handler. I don't want to touch that with a bargepole.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 arch/x86/include/asm/apm.h           |  6 ++++++
 arch/x86/include/asm/cpufeatures.h   |  1 +
 arch/x86/include/asm/efi.h           | 17 ++++++++++++++--
 arch/x86/include/asm/nospec-branch.h | 39 +++++++++++++++++++++++++++---------
 arch/x86/kernel/cpu/bugs.c           | 12 ++++++++++-
 5 files changed, 63 insertions(+), 12 deletions(-)

Comments

Tom Lendacky Feb. 14, 2018, 4:07 p.m. UTC | #1
On 2/14/2018 7:44 AM, David Woodhouse wrote:
> Retpoline means the kernel is safe because it has no indirect branches.
> But firmware isn't, so use IBRS for firmware calls if it's available.
> 
> Block preemption while IBRS is set, although in practice the call sites
> already had to be doing that.
> 
> Ignore hpwdt.c for now. It's taking spinlocks and calling into firmware
> code, from an NMI handler. I don't want to touch that with a bargepole.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
>  arch/x86/include/asm/apm.h           |  6 ++++++
>  arch/x86/include/asm/cpufeatures.h   |  1 +
>  arch/x86/include/asm/efi.h           | 17 ++++++++++++++--
>  arch/x86/include/asm/nospec-branch.h | 39 +++++++++++++++++++++++++++---------
>  arch/x86/kernel/cpu/bugs.c           | 12 ++++++++++-
>  5 files changed, 63 insertions(+), 12 deletions(-)
> 

... <snip> ...

> +/*
> + * With retpoline, we must use IBRS to restrict branch prediction
> + * before calling into firmware.
> + */
> +static inline void firmware_restrict_branch_speculation_start(void)
> +{
> +	preempt_disable();
> +	alternative_msr_write(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS,
> +			      X86_FEATURE_USE_IBRS_FW);
> +}
> +
> +static inline void firmware_restrict_branch_speculation_end(void)
> +{
> +	alternative_msr_write(MSR_IA32_SPEC_CTRL, 0,
> +			      X86_FEATURE_USE_IBRS_FW);
> +	preempt_enable();
>  }

Shouldn't these writes to the MSR be just for the IBRS bit?  The spec
also defines the STIBP bit for this MSR, and if that bit had been set by
BIOS for example, these writes will clear it.  And who knows what future
bits may be defined and how they'll be used.

Thanks,
Tom

>  
>  #endif /* __ASSEMBLY__ */
> diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
> index d71c8b5..bfca937 100644
> --- a/arch/x86/kernel/cpu/bugs.c
> +++ b/arch/x86/kernel/cpu/bugs.c
> @@ -300,6 +300,15 @@ static void __init spectre_v2_select_mitigation(void)
>  		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
>  		pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
>  	}
> +
> +	/*
> +	 * Retpoline means the kernel is safe because it has no indirect
> +	 * branches. But firmware isn't, so use IBRS to protect that.
> +	 */
> +	if (boot_cpu_has(X86_FEATURE_IBRS)) {
> +		setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
> +		pr_info("Enabling Restricted Speculation for firmware calls\n");
> +	}
>  }
>  
>  #undef pr_fmt
> @@ -326,8 +335,9 @@ ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c
>  	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
>  		return sprintf(buf, "Not affected\n");
>  
> -	return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
> +	return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
>  		       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
> +		       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
>  		       spectre_v2_module_string());
>  }
>  #endif
>
David Woodhouse Feb. 14, 2018, 4:11 p.m. UTC | #2
On Wed, 2018-02-14 at 10:07 -0600, Tom Lendacky wrote:
> Shouldn't these writes to the MSR be just for the IBRS bit?  The spec
> also defines the STIBP bit for this MSR, and if that bit had been set by
> BIOS for example, these writes will clear it.  And who knows what future
> bits may be defined and how they'll be used.

We don't use STIBP. If one day we do decide to set it in userspace for
"sensitive" processes, if we're done having the debate about what those
are, then that seems unlikely to conflict what what this code is doing
anyway, as we would presumably *clear* it again on the way back into
the kernel.

I certainly don't want to add a read/modify/write cycle here just to
cope with some hypothetical future use case for STIBP, when there would
be better ways to cope.
Tom Lendacky Feb. 14, 2018, 4:36 p.m. UTC | #3
On 2/14/2018 10:11 AM, David Woodhouse wrote:
> 
> 
> On Wed, 2018-02-14 at 10:07 -0600, Tom Lendacky wrote:
>> Shouldn't these writes to the MSR be just for the IBRS bit?  The spec
>> also defines the STIBP bit for this MSR, and if that bit had been set by
>> BIOS for example, these writes will clear it.  And who knows what future
>> bits may be defined and how they'll be used.
> 
> We don't use STIBP. If one day we do decide to set it in userspace for

Right, I understand the kernel doesn't use STIBP, that's why I mentioned
BIOS as an example.

> "sensitive" processes, if we're done having the debate about what those
> are, then that seems unlikely to conflict what what this code is doing
> anyway, as we would presumably *clear* it again on the way back into
> the kernel.
> 
> I certainly don't want to add a read/modify/write cycle here just to

Right, definitely to be avoided.  Maybe the value could be tracked in a
per-cpu variable so you never have to read it before the write.  Just
change the bit in question and write.  Not sure that's really feasible
though.

> cope with some hypothetical future use case for STIBP, when there would
> be better ways to cope.

Just putting it out there, no worries.

Thanks,
Tom

>
David Woodhouse Feb. 14, 2018, 4:53 p.m. UTC | #4
On Wed, 2018-02-14 at 10:36 -0600, Tom Lendacky wrote:
> On 2/14/2018 10:11 AM, David Woodhouse wrote:
> > 
> > 
> > 
> > On Wed, 2018-02-14 at 10:07 -0600, Tom Lendacky wrote:
> > > 
> > > Shouldn't these writes to the MSR be just for the IBRS bit?  The spec
> > > also defines the STIBP bit for this MSR, and if that bit had been set by
> > > BIOS for example, these writes will clear it.  And who knows what future
> > > bits may be defined and how they'll be used.
> >
> > We don't use STIBP. If one day we do decide to set it in userspace for
>
> Right, I understand the kernel doesn't use STIBP, that's why I mentioned
> BIOS as an example.

BIOS has no business setting this for us either. Either we support it
and turn it on in the kernel ourselves, or it's off.
diff mbox

Patch

diff --git a/arch/x86/include/asm/apm.h b/arch/x86/include/asm/apm.h
index 4d4015d..c356098 100644
--- a/arch/x86/include/asm/apm.h
+++ b/arch/x86/include/asm/apm.h
@@ -7,6 +7,8 @@ 
 #ifndef _ASM_X86_MACH_DEFAULT_APM_H
 #define _ASM_X86_MACH_DEFAULT_APM_H
 
+#include <asm/nospec-branch.h>
+
 #ifdef APM_ZERO_SEGS
 #	define APM_DO_ZERO_SEGS \
 		"pushl %%ds\n\t" \
@@ -32,6 +34,7 @@  static inline void apm_bios_call_asm(u32 func, u32 ebx_in, u32 ecx_in,
 	 * N.B. We do NOT need a cld after the BIOS call
 	 * because we always save and restore the flags.
 	 */
+	firmware_restrict_branch_speculation_start();
 	__asm__ __volatile__(APM_DO_ZERO_SEGS
 		"pushl %%edi\n\t"
 		"pushl %%ebp\n\t"
@@ -44,6 +47,7 @@  static inline void apm_bios_call_asm(u32 func, u32 ebx_in, u32 ecx_in,
 		  "=S" (*esi)
 		: "a" (func), "b" (ebx_in), "c" (ecx_in)
 		: "memory", "cc");
+	firmware_restrict_branch_speculation_end();
 }
 
 static inline bool apm_bios_call_simple_asm(u32 func, u32 ebx_in,
@@ -56,6 +60,7 @@  static inline bool apm_bios_call_simple_asm(u32 func, u32 ebx_in,
 	 * N.B. We do NOT need a cld after the BIOS call
 	 * because we always save and restore the flags.
 	 */
+	firmware_restrict_branch_speculation_start();
 	__asm__ __volatile__(APM_DO_ZERO_SEGS
 		"pushl %%edi\n\t"
 		"pushl %%ebp\n\t"
@@ -68,6 +73,7 @@  static inline bool apm_bios_call_simple_asm(u32 func, u32 ebx_in,
 		  "=S" (si)
 		: "a" (func), "b" (ebx_in), "c" (ecx_in)
 		: "memory", "cc");
+	firmware_restrict_branch_speculation_end();
 	return error;
 }
 
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 73b5fff..66c1434 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -211,6 +211,7 @@ 
 #define X86_FEATURE_RSB_CTXSW		( 7*32+19) /* "" Fill RSB on context switches */
 
 #define X86_FEATURE_USE_IBPB		( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */
+#define X86_FEATURE_USE_IBRS_FW		( 7*32+22) /* "" Use IBRS during runtime firmware calls */
 
 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW		( 8*32+ 0) /* Intel TPR Shadow */
diff --git a/arch/x86/include/asm/efi.h b/arch/x86/include/asm/efi.h
index 85f6ccb..a399c1e 100644
--- a/arch/x86/include/asm/efi.h
+++ b/arch/x86/include/asm/efi.h
@@ -6,6 +6,7 @@ 
 #include <asm/pgtable.h>
 #include <asm/processor-flags.h>
 #include <asm/tlb.h>
+#include <asm/nospec-branch.h>
 
 /*
  * We map the EFI regions needed for runtime services non-contiguously,
@@ -36,8 +37,18 @@ 
 
 extern asmlinkage unsigned long efi_call_phys(void *, ...);
 
-#define arch_efi_call_virt_setup()	kernel_fpu_begin()
-#define arch_efi_call_virt_teardown()	kernel_fpu_end()
+#define arch_efi_call_virt_setup()					\
+({									\
+	kernel_fpu_begin();						\
+	firmware_restrict_branch_speculation_start();			\
+})
+
+#define arch_efi_call_virt_teardown()					\
+({									\
+	firmware_restrict_branch_speculation_end();			\
+	kernel_fpu_end();						\
+})
+
 
 /*
  * Wrap all the virtual calls in a way that forces the parameters on the stack.
@@ -73,6 +84,7 @@  struct efi_scratch {
 	efi_sync_low_kernel_mappings();					\
 	preempt_disable();						\
 	__kernel_fpu_begin();						\
+	firmware_restrict_branch_speculation_start();			\
 									\
 	if (efi_scratch.use_pgd) {					\
 		efi_scratch.prev_cr3 = __read_cr3();			\
@@ -91,6 +103,7 @@  struct efi_scratch {
 		__flush_tlb_all();					\
 	}								\
 									\
+	firmware_restrict_branch_speculation_end();			\
 	__kernel_fpu_end();						\
 	preempt_enable();						\
 })
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
index 76b0585..0995c6a 100644
--- a/arch/x86/include/asm/nospec-branch.h
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -163,17 +163,38 @@  static inline void vmexit_fill_RSB(void)
 #endif
 }
 
+#define alternative_msr_write(_msr, _val, _feature)		\
+	asm volatile(ALTERNATIVE("",				\
+				 "movl %[msr], %%ecx\n\t"	\
+				 "movl %[val], %%eax\n\t"	\
+				 "movl $0, %%edx\n\t"		\
+				 "wrmsr",			\
+				 _feature)			\
+		     : : [msr] "i" (_msr), [val] "i" (_val)	\
+		     : "eax", "ecx", "edx", "memory")
+
 static inline void indirect_branch_prediction_barrier(void)
 {
-	asm volatile(ALTERNATIVE("",
-				 "movl %[msr], %%ecx\n\t"
-				 "movl %[val], %%eax\n\t"
-				 "movl $0, %%edx\n\t"
-				 "wrmsr",
-				 X86_FEATURE_USE_IBPB)
-		     : : [msr] "i" (MSR_IA32_PRED_CMD),
-			 [val] "i" (PRED_CMD_IBPB)
-		     : "eax", "ecx", "edx", "memory");
+	alternative_msr_write(MSR_IA32_PRED_CMD, PRED_CMD_IBPB,
+			      X86_FEATURE_USE_IBPB);
+}
+
+/*
+ * With retpoline, we must use IBRS to restrict branch prediction
+ * before calling into firmware.
+ */
+static inline void firmware_restrict_branch_speculation_start(void)
+{
+	preempt_disable();
+	alternative_msr_write(MSR_IA32_SPEC_CTRL, SPEC_CTRL_IBRS,
+			      X86_FEATURE_USE_IBRS_FW);
+}
+
+static inline void firmware_restrict_branch_speculation_end(void)
+{
+	alternative_msr_write(MSR_IA32_SPEC_CTRL, 0,
+			      X86_FEATURE_USE_IBRS_FW);
+	preempt_enable();
 }
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index d71c8b5..bfca937 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -300,6 +300,15 @@  static void __init spectre_v2_select_mitigation(void)
 		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
 		pr_info("Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier\n");
 	}
+
+	/*
+	 * Retpoline means the kernel is safe because it has no indirect
+	 * branches. But firmware isn't, so use IBRS to protect that.
+	 */
+	if (boot_cpu_has(X86_FEATURE_IBRS)) {
+		setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
+		pr_info("Enabling Restricted Speculation for firmware calls\n");
+	}
 }
 
 #undef pr_fmt
@@ -326,8 +335,9 @@  ssize_t cpu_show_spectre_v2(struct device *dev, struct device_attribute *attr, c
 	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
 		return sprintf(buf, "Not affected\n");
 
-	return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+	return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 		       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
+		       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
 		       spectre_v2_module_string());
 }
 #endif