Message ID | 20240123205948.1782556-1-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | x86/ucode: Fix stability of the Raw CPU Policy rescan | expand |
On 23.01.2024 21:59, Andrew Cooper wrote: > Always run microcode_update_helper() on the BSP, so the the updated Raw CPU > policy doesn't get non-BSP topology details included. Wouldn't it be better (and consistent with ... > Have calculate_raw_cpu_policy() clear the instantanious XSTATE sizes. The > value XCR0 | MSR_XSS had when we scanned the policy isn't terribly interesting > to report. ... this) to purge these details from the raw policy as well then? > When CPUID Masking is active, it affects CPUID instructions issued by Xen > too. Transiently disable masking to get a clean scan. > > Fixes: 694d79ed5aac ("x86/ucode: Refresh raw CPU policy after microcode load") > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Irrespective of the question above, I'm also okay with the change as is: Reviewed-by: Jan Beulich <jbeulich@suse.com> Jan
On 24/01/2024 3:37 pm, Jan Beulich wrote: > On 23.01.2024 21:59, Andrew Cooper wrote: >> Always run microcode_update_helper() on the BSP, so the the updated Raw CPU >> policy doesn't get non-BSP topology details included. > Wouldn't it be better (and consistent with ... > >> Have calculate_raw_cpu_policy() clear the instantanious XSTATE sizes. The >> value XCR0 | MSR_XSS had when we scanned the policy isn't terribly interesting >> to report. > ... this) to purge these details from the raw policy as well then? I did spend some time considering this. Rerunning on the same CPU is more resilient to new topology leaves, so we'd want to be doing that irrespective. The XCR0/XSS state really is transient, and the useful information is everywhere else in leaf 0xd. > >> When CPUID Masking is active, it affects CPUID instructions issued by Xen >> too. Transiently disable masking to get a clean scan. >> >> Fixes: 694d79ed5aac ("x86/ucode: Refresh raw CPU policy after microcode load") >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > Irrespective of the question above, I'm also okay with the change as is: > Reviewed-by: Jan Beulich <jbeulich@suse.com> Thanks. ~Andrew
On 24.01.2024 18:23, Andrew Cooper wrote: > On 24/01/2024 3:37 pm, Jan Beulich wrote: >> On 23.01.2024 21:59, Andrew Cooper wrote: >>> Always run microcode_update_helper() on the BSP, so the the updated Raw CPU >>> policy doesn't get non-BSP topology details included. >> Wouldn't it be better (and consistent with ... >> >>> Have calculate_raw_cpu_policy() clear the instantanious XSTATE sizes. The >>> value XCR0 | MSR_XSS had when we scanned the policy isn't terribly interesting >>> to report. >> ... this) to purge these details from the raw policy as well then? > > I did spend some time considering this. > > Rerunning on the same CPU is more resilient to new topology leaves, so > we'd want to be doing that irrespective. I'm afraid I don't understand this: If a ucode update surfaced new leaves, they surely would appear on all CPUs? IOW my question still stands: Wouldn't we better zap topology data from the raw policy (thus also not propagating it into other policies)? At which point retrieval becomes independent of what CPU it is run on (if there were any other CPU-specific pieces of data, similar zapping should happen for them). Surely using CPU0 here isn't much of a problem, as this is a pretty infrequent event. But generally I'd like to avoid "preferring" CPU0 as much as possible. Hence I'd prefer if even in cases like this one we could avoid it. > The XCR0/XSS state really is transient, and the useful information is > everywhere else in leaf 0xd. Sure, but this is still independent on what CPU the retrieval is run on. Jan
diff --git a/xen/arch/x86/cpu-policy.c b/xen/arch/x86/cpu-policy.c index 76efb050edf7..82b10de03efd 100644 --- a/xen/arch/x86/cpu-policy.c +++ b/xen/arch/x86/cpu-policy.c @@ -353,6 +353,13 @@ void calculate_raw_cpu_policy(void) /* Nothing good will come from Xen and libx86 disagreeing on vendor. */ ASSERT(p->x86_vendor == boot_cpu_data.x86_vendor); + /* + * Clear the truly dynamic fields. These vary with the in-context XCR0 + * and MSR_XSS, and aren't interesting fields in the raw policy. + */ + p->xstate.raw[0].ebx = 0; + p->xstate.raw[1].ebx = 0; + /* 0x000000ce MSR_INTEL_PLATFORM_INFO */ /* Was already added by probe_cpuid_faulting() */ } diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c index 120a11d5036d..6f95f7bbe223 100644 --- a/xen/arch/x86/cpu/microcode/core.c +++ b/xen/arch/x86/cpu/microcode/core.c @@ -680,8 +680,18 @@ static long cf_check microcode_update_helper(void *data) microcode_update_cache(patch); spin_unlock(µcode_mutex); - /* Refresh the raw CPU policy, in case the features have changed. */ + /* + * Refresh the raw CPU policy, in case the features have changed. + * Disable CPUID masking if in use, to avoid having current's + * cpu_policy affect the rescan. + */ + if ( ctxt_switch_masking ) + alternative_vcall(ctxt_switch_masking, NULL); + calculate_raw_cpu_policy(); + + if ( ctxt_switch_masking ) + alternative_vcall(ctxt_switch_masking, current); } else microcode_free_patch(patch); @@ -721,8 +731,12 @@ int microcode_update(XEN_GUEST_HANDLE(const_void) buf, unsigned long len) } buffer->len = len; - return continue_hypercall_on_cpu(smp_processor_id(), - microcode_update_helper, buffer); + /* + * Always queue microcode_update_helper() on CPU0. Most of the logic + * won't care, but the update of the Raw CPU policy wants to (re)run on + * the BSP. + */ + return continue_hypercall_on_cpu(0, microcode_update_helper, buffer); } static int __init cf_check microcode_init(void)
Always run microcode_update_helper() on the BSP, so the the updated Raw CPU policy doesn't get non-BSP topology details included. Have calculate_raw_cpu_policy() clear the instantanious XSTATE sizes. The value XCR0 | MSR_XSS had when we scanned the policy isn't terribly interesting to report. When CPUID Masking is active, it affects CPUID instructions issued by Xen too. Transiently disable masking to get a clean scan. Fixes: 694d79ed5aac ("x86/ucode: Refresh raw CPU policy after microcode load") Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> I debased adding named fields for the instantious xstate sizes, but decided not to. There's no (other) case where I can see them reasonably being used. --- xen/arch/x86/cpu-policy.c | 7 +++++++ xen/arch/x86/cpu/microcode/core.c | 20 +++++++++++++++++--- 2 files changed, 24 insertions(+), 3 deletions(-)