Message ID | 1516741116.13558.11.camel@infradead.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Jan 23, 2018 at 08:58:36PM +0000, David Woodhouse wrote: > +static const struct sku_microcode spectre_bad_microcodes[] = { > + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0B, 0x80 }, > + { INTEL_FAM6_KABYLAKE_MOBILE, 0x0A, 0x80 }, > + { INTEL_FAM6_KABYLAKE_MOBILE, 0x0A, 0x80 }, > + { INTEL_FAM6_KABYLAKE_MOBILE, 0x09, 0x80 }, > + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x09, 0x80 }, > + { INTEL_FAM6_SKYLAKE_X, 0x04, 0x0200003C }, > + { INTEL_FAM6_SKYLAKE_MOBILE, 0x03, 0x000000C2 }, > + { INTEL_FAM6_SKYLAKE_DESKTOP, 0x03, 0x000000C2 }, > + { INTEL_FAM6_BROADWELL_CORE, 0x04, 0x28 }, > + { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x0000001B }, > + { INTEL_FAM6_HASWELL_ULT, 0x01, 0x21 }, > + { INTEL_FAM6_HASWELL_GT3E, 0x01, 0x18 }, > + { INTEL_FAM6_HASWELL_CORE, 0x03, 0x23 }, > + { INTEL_FAM6_IVYBRIDGE_X, 0x04, 0x42a }, > + { INTEL_FAM6_HASWELL_X, 0x02, 0x3b }, > + { INTEL_FAM6_HASWELL_X, 0x04, 0x10 }, > + { INTEL_FAM6_HASWELL_CORE, 0x03, 0x23 }, > + { INTEL_FAM6_BROADWELL_XEON_D, 0x02, 0x14 }, > + { INTEL_FAM6_BROADWELL_XEON_D, 0x03, 0x7000011 }, > + { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x0000001B }, > + /* For 406F1 Intel says "0x25, 0x23" while VMware says 0x0B000025 > + * and a real CPU has a firmware in the 0x0B0000xx range. So: */ > + { INTEL_FAM6_BROADWELL_X, 0x01, 0x0b000025 }, > + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x09, 0x80 }, > + { INTEL_FAM6_SKYLAKE_X, 0x03, 0x100013e }, > + { INTEL_FAM6_SKYLAKE_X, 0x04, 0x200003c }, > +}; Typically tglx likes to use x86_match_cpu() for these things; see also commit: bd9240a18edfb ("x86/apic: Add TSC_DEADLINE quirk due to errata"). > + > +static int bad_spectre_microcode(struct cpuinfo_x86 *c) > +{ > + int i; > + > + for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) { > + if (c->x86_model == spectre_bad_microcodes[i].model && > + c->x86_mask == spectre_bad_microcodes[i].stepping) > + return (c->microcode <= spectre_bad_microcodes[i].microcode); > + } > + return 0; > +} The above is Intel only, you should check vendor too I think. > static void early_init_intel(struct cpuinfo_x86 *c) > { > u64 misc_enable; > @@ -122,6 +173,18 @@ static void early_init_intel(struct cpuinfo_x86 *c) > if (c->x86 >= 6 && !cpu_has(c, X86_FEATURE_IA64)) > c->microcode = intel_get_microcode_revision(); > > + if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) || > + cpu_has(c, X86_FEATURE_AMD_SPEC_CTRL) || > + cpu_has(c, X86_FEATURE_AMD_PRED_CMD) || > + cpu_has(c, X86_FEATURE_AMD_STIBP)) && bad_spectre_microcode(c)) { > + pr_warn("Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL\n"); > + clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL); > + clear_cpu_cap(c, X86_FEATURE_STIBP); > + clear_cpu_cap(c, X86_FEATURE_AMD_SPEC_CTRL); > + clear_cpu_cap(c, X86_FEATURE_AMD_PRED_CMD); > + clear_cpu_cap(c, X86_FEATURE_AMD_STIBP); > + } And since its Intel only, what are those AMD features doing there?
On Wed, 2018-01-24 at 09:47 +0100, Peter Zijlstra wrote: > Typically tglx likes to use x86_match_cpu() for these things; see also > commit: bd9240a18edfb ("x86/apic: Add TSC_DEADLINE quirk due to > errata"). Thanks, will fix. I think we might also end up in whitelist mode, adding "known good" microcodes to the list as they get released or retroactively blessed. I would really have liked a new bit in IA32_ARCH_CAPABILITIES to say that it's safe, but that's not possible for *existing* microcode which actually turns out to be OK in the end. That means the whitelist ends up basically empty right now. Should I add a command line parameter to override it? Otherwise we end up having to rebuild the kernel every time there's a microcode release which covers a new CPU SKU (which is why I kind of hate the whitelist, but Arjan is very insistent...) I'm kind of tempted to turn it into a whitelist just by adding 1 to the microcode revision in each table entry. Sure, that N+1 might be another microcode build that also has issues but never saw the light of day... but that's OK as long it never *does*. And yes we'd have to tweak it if revisions that are blacklisted in the Intel doc are subsequently cleared. But at least it'd require *less* tweaking. > > > > + > > +static int bad_spectre_microcode(struct cpuinfo_x86 *c) > > +{ > > + int i; > > + > > + for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) { > > + if (c->x86_model == spectre_bad_microcodes[i].model && > > + c->x86_mask == spectre_bad_microcodes[i].stepping) > > + return (c->microcode <= spectre_bad_microcodes[i].microcode); > > + } > > + return 0; > > +} > The above is Intel only, you should check vendor too I think. It's in intel.c, called from early_init_intel(). Isn't that sufficient? > > > > static void early_init_intel(struct cpuinfo_x86 *c) > > { > > u64 misc_enable; > > @@ -122,6 +173,18 @@ static void early_init_intel(struct cpuinfo_x86 *c) > > if (c->x86 >= 6 && !cpu_has(c, X86_FEATURE_IA64)) > > c->microcode = intel_get_microcode_revision(); > > > > + if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) || > > + cpu_has(c, X86_FEATURE_AMD_SPEC_CTRL) || > > + cpu_has(c, X86_FEATURE_AMD_PRED_CMD) || > > + cpu_has(c, X86_FEATURE_AMD_STIBP)) && bad_spectre_microcode(c)) { > > + pr_warn("Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL\n"); > > + clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL); > > + clear_cpu_cap(c, X86_FEATURE_STIBP); > > + clear_cpu_cap(c, X86_FEATURE_AMD_SPEC_CTRL); > > + clear_cpu_cap(c, X86_FEATURE_AMD_PRED_CMD); > > + clear_cpu_cap(c, X86_FEATURE_AMD_STIBP); > > + } > And since its Intel only, what are those AMD features doing there? Hypervisors which only want to expose PRED_CMD may do so using the AMD feature bit. SPEC_CTRL requires save/restore and live migration support, and isn't needed with retpoline anyway (since guests won't be calling directly into firmware).
On Wed, Jan 24, 2018 at 09:02:21AM +0000, David Woodhouse wrote: > On Wed, 2018-01-24 at 09:47 +0100, Peter Zijlstra wrote: > > Typically tglx likes to use x86_match_cpu() for these things; see also > > commit: bd9240a18edfb ("x86/apic: Add TSC_DEADLINE quirk due to > > errata"). > > Thanks, will fix. I think we might also end up in whitelist mode, > adding "known good" microcodes to the list as they get released or > retroactively blessed. > > I would really have liked a new bit in IA32_ARCH_CAPABILITIES to say > that it's safe, but that's not possible for *existing* microcode which > actually turns out to be OK in the end. > > That means the whitelist ends up basically empty right now. Should I > add a command line parameter to override it? Otherwise we end up having > to rebuild the kernel every time there's a microcode release which > covers a new CPU SKU (which is why I kind of hate the whitelist, but > Arjan is very insistent...) Ick, no, whitelists are a pain for everyone involved. Don't do that unless it is absolutely the only way it will ever work. Arjan, why do you think this can only be done as a whitelist? It's much easier to just mark the "bad" microcode versions as those _should_ be a much smaller list that Intel knows about today. And of course, any future microcode updates will not be "bad" because they know how to properly test for this now before they are released :) thanks, greg k-h
> > > + for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) { > > > + if (c->x86_model == spectre_bad_microcodes[i].model && > > > + c->x86_mask == spectre_bad_microcodes[i].stepping) > > > + return (c->microcode <= spectre_bad_microcodes[i].microcode); > > > + } > > > + return 0; > > > +} > > The above is Intel only, you should check vendor too I think. > > It's in intel.c, called from early_init_intel(). Isn't that sufficient? Duh, so much for reading skillz on my end ;-) > > > + pr_warn("Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL\n"); > > > + clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL); > > > + clear_cpu_cap(c, X86_FEATURE_STIBP); > > > + clear_cpu_cap(c, X86_FEATURE_AMD_SPEC_CTRL); > > > + clear_cpu_cap(c, X86_FEATURE_AMD_PRED_CMD); > > > + clear_cpu_cap(c, X86_FEATURE_AMD_STIBP); > > > + } > > And since its Intel only, what are those AMD features doing there? > > Hypervisors which only want to expose PRED_CMD may do so using the AMD > feature bit. SPEC_CTRL requires save/restore and live migration > support, and isn't needed with retpoline anyway (since guests won't be > calling directly into firmware). Egads, I suppose that makes some sense, but it does make a horrible muddle of things.
On Wed, 24 Jan 2018, David Woodhouse wrote: > I'm kind of tempted to turn it into a whitelist just by adding 1 to the > microcode revision in each table entry. Sure, that N+1 might be another > microcode build that also has issues but never saw the light of day... Watch out for the (AFAIK) still not properly documented where it should be (i.e. the microcode chapter of the Intel SDM) weirdness in Skylake+ microcode revision. Actually, this is related to SGX, so anything that has SGX. When it has SGX inside, Intel will release microcode only with even revision numbers, but the processor may report it as odd (and will do so by subtracting 1, so microcode 0xb0 is the same as microcode 0xaf) when the update is loaded by the processor itself from FIT (as opposed as being loaded by WRMSR from BIOS/UEFI/OS). So, you could see N-1 from within Linux if we did not update the microcode, and fail to trigger a whitelist (or mistrigger a blacklist).
On Wed, 2018-01-24 at 09:47 +0100, Peter Zijlstra wrote: > > Typically tglx likes to use x86_match_cpu() for these things; see also > commit: bd9240a18edfb ("x86/apic: Add TSC_DEADLINE quirk due to > errata"). Ewww. static u32 hsx_deadline_rev(void) { switch (boot_cpu_data.x86_mask) { case 0x02: return 0x3a; /* EP */ case 0x04: return 0x0f; /* EX */ } return ~0U; } ... static const struct x86_cpu_id deadline_match[] = { DEADLINE_MODEL_MATCH_FUNC( INTEL_FAM6_HASWELL_X, hsx_deadline_rev), DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_BROADWELL_X, 0x0b000020), DEADLINE_MODEL_MATCH_FUNC( INTEL_FAM6_BROADWELL_XEON_D, bdx_deadline_rev), DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_X, 0x02000014), ... /* * Function pointers will have the MSB set due to address layout, * immediate revisions will not. */ if ((long)m->driver_data < 0) rev = ((u32 (*)(void))(m->driver_data))(); else rev = (u32)m->driver_data; EWWWW! Shan't.
On Wed, Jan 24, 2018 at 12:14:51PM +0000, David Woodhouse wrote: > On Wed, 2018-01-24 at 09:47 +0100, Peter Zijlstra wrote: > > > > Typically tglx likes to use x86_match_cpu() for these things; see also > > commit: bd9240a18edfb ("x86/apic: Add TSC_DEADLINE quirk due to > > errata"). > > Ewww. > > static u32 hsx_deadline_rev(void) > { > switch (boot_cpu_data.x86_mask) { > case 0x02: return 0x3a; /* EP */ > case 0x04: return 0x0f; /* EX */ > } > > return ~0U; > } > ... > static const struct x86_cpu_id deadline_match[] = { > DEADLINE_MODEL_MATCH_FUNC( INTEL_FAM6_HASWELL_X, hsx_deadline_rev), > DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_BROADWELL_X, 0x0b000020), > DEADLINE_MODEL_MATCH_FUNC( INTEL_FAM6_BROADWELL_XEON_D, bdx_deadline_rev), > DEADLINE_MODEL_MATCH_REV ( INTEL_FAM6_SKYLAKE_X, 0x02000014), > ... > > /* > * Function pointers will have the MSB set due to address layout, > * immediate revisions will not. > */ > if ((long)m->driver_data < 0) > rev = ((u32 (*)(void))(m->driver_data))(); > else > rev = (u32)m->driver_data; > > EWWWW! > Yes :/ We could look at extending x86_cpu_id and x86_match_cpu with a stepping option I suppose, but that might be lots of churn. Thomas?
On Wed, 2018-01-24 at 08:49 -0200, Henrique de Moraes Holschuh wrote: > On Wed, 24 Jan 2018, David Woodhouse wrote: > > > > I'm kind of tempted to turn it into a whitelist just by adding 1 to the > > microcode revision in each table entry. Sure, that N+1 might be another > > microcode build that also has issues but never saw the light of day... > Watch out for the (AFAIK) still not properly documented where it should > be (i.e. the microcode chapter of the Intel SDM) weirdness in Skylake+ > microcode revision. Actually, this is related to SGX, so anything that > has SGX. > > When it has SGX inside, Intel will release microcode only with even > revision numbers, but the processor may report it as odd (and will do so > by subtracting 1, so microcode 0xb0 is the same as microcode 0xaf) when > the update is loaded by the processor itself from FIT (as opposed as > being loaded by WRMSR from BIOS/UEFI/OS). > > So, you could see N-1 from within Linux if we did not update the > microcode, and fail to trigger a whitelist (or mistrigger a blacklist). That's OK. If they ship a fixed 0x0200003E firmware for SKX, for example, which appears as 0x0200003D when it's loaded from FIT, that's still >= 0x0200003C *and* !(<0x0200003D) if we were to do that. In fact, the code for the "whitelist X+1" vs. "blacklist X" approach is *entirely* equivalent; it's purely a cosmetic change. Because !(< X) ≡ ≥ (X+1) The *real* change here is that for ∀ SKU, we are being asked to blacklist all microcode revisions <= 0xFFFFFFFF¹ for now, and change that only once new microcode is actually released. Every time, and then get people to rebuild their kernels because they can *use* the features from the new microcode. ¹(OK, *there's* a functional difference between whitelist and blacklist approach. But we'll never actually see 0xffffffff so that's not important right now :)
On Wed, 2018-01-24 at 13:29 +0100, Peter Zijlstra wrote: > > Yes :/ > > We could look at extending x86_cpu_id and x86_match_cpu with a stepping > option I suppose, but that might be lots of churn. That goes all the way to mod_deviceinfo, and would be horrid. We could add an x86_match_cpu_stepping() function, I suppose? But I'm mostly trying to avoid depending on other stuff like that, for patches which are going to need to be backported to all the stable kernels. I'd much rather do it this way and then if we see another use case for it (that commit you mentioned could be nicer, I suppose), consolidate into a single stepping-capable lookup function in a later "cleanup".
On 1/24/2018 1:10 AM, Greg Kroah-Hartman wrote: > >> That means the whitelist ends up basically empty right now. Should I >> add a command line parameter to override it? Otherwise we end up having >> to rebuild the kernel every time there's a microcode release which >> covers a new CPU SKU (which is why I kind of hate the whitelist, but >> Arjan is very insistent...) > > Ick, no, whitelists are a pain for everyone involved. Don't do that > unless it is absolutely the only way it will ever work. > > Arjan, why do you think this can only be done as a whitelist? I suggested a minimum version list for those cpus that need it. microcode versions are tricky (and we've released betas etc etc with their own numbers) and as a result there might be several numbers that have those issues with their IBRS for the same F/M/S
On Wed, 2018-01-24 at 07:09 -0800, Arjan van de Ven wrote: > On 1/24/2018 1:10 AM, Greg Kroah-Hartman wrote: > > Arjan, why do you think this can only be done as a whitelist? > > I suggested a minimum version list for those cpus that need it. > > microcode versions are tricky (and we've released betas etc etc with their own numbers) > and as a result there might be several numbers that have those issues with their IBRS for the same F/M/S I really think that's fine. Anyone who uses beta microcodes, should be perfectly prepared to deal with the results. And probably *wanted* to be able to actually test them, instead of having the kernel refuse to do so. So if there are beta microcodes floating around with numbers higher than in Intel's currently-published list, which are not yet known to be safe (or even if they're known not to be), that's absolutely OK. If you're telling me that there will be *publicly* released microcodes with version numbers higher than those in the list, which still have the same issues... well, then I think Mr Shouty is going to come for another visit.
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c index b720dacac051..52855d1a4f9a 100644 --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -102,6 +102,57 @@ static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *c) ELF_HWCAP2 |= HWCAP2_RING3MWAIT; } +/* + * Early microcode releases for the Spectre v2 mitigation were broken: + * https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/microcode-update-guidance.pdf + * VMware also has a list at https://kb.vmware.com/s/article/52345 + */ +struct sku_microcode { + u8 model; + u8 stepping; + u32 microcode; +}; +static const struct sku_microcode spectre_bad_microcodes[] = { + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x0B, 0x80 }, + { INTEL_FAM6_KABYLAKE_MOBILE, 0x0A, 0x80 }, + { INTEL_FAM6_KABYLAKE_MOBILE, 0x0A, 0x80 }, + { INTEL_FAM6_KABYLAKE_MOBILE, 0x09, 0x80 }, + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x09, 0x80 }, + { INTEL_FAM6_SKYLAKE_X, 0x04, 0x0200003C }, + { INTEL_FAM6_SKYLAKE_MOBILE, 0x03, 0x000000C2 }, + { INTEL_FAM6_SKYLAKE_DESKTOP, 0x03, 0x000000C2 }, + { INTEL_FAM6_BROADWELL_CORE, 0x04, 0x28 }, + { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x0000001B }, + { INTEL_FAM6_HASWELL_ULT, 0x01, 0x21 }, + { INTEL_FAM6_HASWELL_GT3E, 0x01, 0x18 }, + { INTEL_FAM6_HASWELL_CORE, 0x03, 0x23 }, + { INTEL_FAM6_IVYBRIDGE_X, 0x04, 0x42a }, + { INTEL_FAM6_HASWELL_X, 0x02, 0x3b }, + { INTEL_FAM6_HASWELL_X, 0x04, 0x10 }, + { INTEL_FAM6_HASWELL_CORE, 0x03, 0x23 }, + { INTEL_FAM6_BROADWELL_XEON_D, 0x02, 0x14 }, + { INTEL_FAM6_BROADWELL_XEON_D, 0x03, 0x7000011 }, + { INTEL_FAM6_BROADWELL_GT3E, 0x01, 0x0000001B }, + /* For 406F1 Intel says "0x25, 0x23" while VMware says 0x0B000025 + * and a real CPU has a firmware in the 0x0B0000xx range. So: */ + { INTEL_FAM6_BROADWELL_X, 0x01, 0x0b000025 }, + { INTEL_FAM6_KABYLAKE_DESKTOP, 0x09, 0x80 }, + { INTEL_FAM6_SKYLAKE_X, 0x03, 0x100013e }, + { INTEL_FAM6_SKYLAKE_X, 0x04, 0x200003c }, +}; + +static int bad_spectre_microcode(struct cpuinfo_x86 *c) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) { + if (c->x86_model == spectre_bad_microcodes[i].model && + c->x86_mask == spectre_bad_microcodes[i].stepping) + return (c->microcode <= spectre_bad_microcodes[i].microcode); + } + return 0; +} + static void early_init_intel(struct cpuinfo_x86 *c) { u64 misc_enable; @@ -122,6 +173,18 @@ static void early_init_intel(struct cpuinfo_x86 *c) if (c->x86 >= 6 && !cpu_has(c, X86_FEATURE_IA64)) c->microcode = intel_get_microcode_revision(); + if ((cpu_has(c, X86_FEATURE_SPEC_CTRL) || + cpu_has(c, X86_FEATURE_AMD_SPEC_CTRL) || + cpu_has(c, X86_FEATURE_AMD_PRED_CMD) || + cpu_has(c, X86_FEATURE_AMD_STIBP)) && bad_spectre_microcode(c)) { + pr_warn("Intel Spectre v2 broken microcode detected; disabling SPEC_CTRL\n"); + clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL); + clear_cpu_cap(c, X86_FEATURE_STIBP); + clear_cpu_cap(c, X86_FEATURE_AMD_SPEC_CTRL); + clear_cpu_cap(c, X86_FEATURE_AMD_PRED_CMD); + clear_cpu_cap(c, X86_FEATURE_AMD_STIBP); + } + /* * Atom erratum AAE44/AAF40/AAG38/AAH41: *