Message ID | 1568272949-1086-16-git-send-email-chao.gao@intel.com (mailing list archive) |
---|---|
State | Superseded |
Series | improve late microcode loading |
On 12.09.2019 09:22, Chao Gao wrote:
> @@ -283,6 +284,27 @@ static enum microcode_match_result compare_patch(
>          : OLD_UCODE;
>  }
>
> +static bool is_blacklisted(void)
> +{
> +    struct cpuinfo_x86 *c = &current_cpu_data;
> +    uint64_t llc_size = c->x86_cache_size * 1024ULL;
> +    struct cpu_signature *sig = &this_cpu(cpu_sig);
> +
> +    do_div(llc_size, c->x86_max_cores);
> +
> +    /*
> +     * Late loading on model 79 with microcode revision less than 0x0b000021
> +     * and LLC size per core bigger than 2.5MB may result in a system hang.
> +     * This behavior is documented in item BDF90, #334165 (Intel Xeon
> +     * Processor E7-8800/4800 v4 Product Family).
> +     */
> +    if ( c->x86 == 6 && c->x86_model == 0x4F && c->x86_mask == 0x1 &&
> +         llc_size > 2621440 && sig->rev < 0x0b000021 )
> +        return true;
> +
> +    return false;
> +}

Isn't this misbehavior worked around by the wbinvd() you add in the next
patch?

> --- a/xen/include/asm-x86/microcode.h
> +++ b/xen/include/asm-x86/microcode.h
> @@ -30,6 +30,7 @@ struct microcode_ops {
>      bool (*match_cpu)(const struct microcode_patch *patch);
>      enum microcode_match_result (*compare_patch)(
>          const struct microcode_patch *new, const struct microcode_patch *old);
> +    bool (*is_blacklisted)(void);

Why a hook rather than a boolean flag, which could be set by
microcode_update_one() (as invoked during AP bringup)?

Jan

On 12/09/2019 08:22, Chao Gao wrote:
> It ports the implementation of is_blacklisted() in linux kernel
> to Xen.
>
> Late loading may cause system hang if CPUs are affected by BDF90.
> Check against BDF90 before performing a late loading.
>
> Signed-off-by: Chao Gao <chao.gao@intel.com>

There is an Intel-blessed workaround for this issue, which is to perform
parallel loading and issue a WBINVD inside the rendezvous, which is
being used in production by a number of vendors.

We should prohibit late loading cases which we know to be broken, but
parallel loading should be permitted.

~Andrew

On Fri, Sep 13, 2019 at 11:22:59AM +0200, Jan Beulich wrote:
>On 12.09.2019 09:22, Chao Gao wrote:
>> @@ -283,6 +284,27 @@ static enum microcode_match_result compare_patch(
>>          : OLD_UCODE;
>>  }
>>
>> +static bool is_blacklisted(void)
>> +{
>> +    struct cpuinfo_x86 *c = &current_cpu_data;
>> +    uint64_t llc_size = c->x86_cache_size * 1024ULL;
>> +    struct cpu_signature *sig = &this_cpu(cpu_sig);
>> +
>> +    do_div(llc_size, c->x86_max_cores);
>> +
>> +    /*
>> +     * Late loading on model 79 with microcode revision less than 0x0b000021
>> +     * and LLC size per core bigger than 2.5MB may result in a system hang.
>> +     * This behavior is documented in item BDF90, #334165 (Intel Xeon
>> +     * Processor E7-8800/4800 v4 Product Family).
>> +     */
>> +    if ( c->x86 == 6 && c->x86_model == 0x4F && c->x86_mask == 0x1 &&
>> +         llc_size > 2621440 && sig->rev < 0x0b000021 )
>> +        return true;
>> +
>> +    return false;
>> +}
>
>Isn't this misbehavior worked around by the wbinvd() you add in the next
>patch?

Hi Jan and Andrew,

Perhaps I misunderstood what I was told. I am confirming with Ashok
whether this patch is necessary.

>
>> --- a/xen/include/asm-x86/microcode.h
>> +++ b/xen/include/asm-x86/microcode.h
>> @@ -30,6 +30,7 @@ struct microcode_ops {
>>      bool (*match_cpu)(const struct microcode_patch *patch);
>>      enum microcode_match_result (*compare_patch)(
>>          const struct microcode_patch *new, const struct microcode_patch *old);
>> +    bool (*is_blacklisted)(void);
>
>Why a hook rather than a boolean flag, which could be set by
>microcode_update_one() (as invoked during AP bringup)?

How about set the boolean flag in Intel_errata_workarounds?

One limitation of setting the flag in microcode_update_one() is: BSP
also calls microcode_update_one(). But calculating LLC size per core
on BSP would meet the same issue as the following patch (i.e. patch 16/16):
BSP's current_cpu_data isn't initialized properly.
We might need to revert commit f97838bbd980a01 in some way and reenumerate
features after ucode loading is done.

Thanks
Chao

On 17.09.2019 11:01, Chao Gao wrote:
> On Fri, Sep 13, 2019 at 11:22:59AM +0200, Jan Beulich wrote:
>> On 12.09.2019 09:22, Chao Gao wrote:
>>> --- a/xen/include/asm-x86/microcode.h
>>> +++ b/xen/include/asm-x86/microcode.h
>>> @@ -30,6 +30,7 @@ struct microcode_ops {
>>>      bool (*match_cpu)(const struct microcode_patch *patch);
>>>      enum microcode_match_result (*compare_patch)(
>>>          const struct microcode_patch *new, const struct microcode_patch *old);
>>> +    bool (*is_blacklisted)(void);
>>
>> Why a hook rather than a boolean flag, which could be set by
>> microcode_update_one() (as invoked during AP bringup)?
>
> How about set the boolean flag in Intel_errata_workarounds?

Wherever it ends up working best. My suggestion was just a wild guess.

Jan

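
For readers comparing the two designs discussed above, the flag-based alternative might look roughly like the standalone sketch below. It is not code from this series: `late_load_blacklisted`, `note_late_load_erratum()` and `late_load_allowed()` are hypothetical names, and where the flag would actually be set in Xen (Intel_errata_workarounds, microcode_update_one(), or elsewhere) is left open, as in the thread.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag replacing the is_blacklisted() hook in microcode_ops. */
static bool late_load_blacklisted;

/*
 * Hypothetical helper, assumed to be called once per CPU during setup.
 * The flag is sticky: once any CPU reports the erratum it stays set.
 */
static void note_late_load_erratum(bool affected)
{
    if ( affected )
        late_load_blacklisted = true;
}

/* Gate for the late-loading path: 0 if allowed, -EPERM if blacklisted. */
static int late_load_allowed(void)
{
    return late_load_blacklisted ? -EPERM : 0;
}
```

The update path would then test `late_load_allowed()` where the patch currently calls `microcode_ops->is_blacklisted()`, so the decision is made once at CPU setup rather than on every late-load request.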
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 64a4321..dbd2730 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -561,6 +561,12 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( microcode_ops == NULL )
         return -EINVAL;
 
+    if ( microcode_ops->is_blacklisted && microcode_ops->is_blacklisted() )
+    {
+        printk(XENLOG_WARNING "Late ucode loading is disabled!\n");
+        return -EPERM;
+    }
+
     buffer = xmalloc_bytes(len);
     if ( !buffer )
         return -ENOMEM;
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 19f1ba0..bcef668 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -28,6 +28,7 @@
 #include <xen/smp.h>
 #include <xen/spinlock.h>
 
+#include <asm/div64.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/microcode.h>
@@ -283,6 +284,27 @@ static enum microcode_match_result compare_patch(
          : OLD_UCODE;
 }
 
+static bool is_blacklisted(void)
+{
+    struct cpuinfo_x86 *c = &current_cpu_data;
+    uint64_t llc_size = c->x86_cache_size * 1024ULL;
+    struct cpu_signature *sig = &this_cpu(cpu_sig);
+
+    do_div(llc_size, c->x86_max_cores);
+
+    /*
+     * Late loading on model 79 with microcode revision less than 0x0b000021
+     * and LLC size per core bigger than 2.5MB may result in a system hang.
+     * This behavior is documented in item BDF90, #334165 (Intel Xeon
+     * Processor E7-8800/4800 v4 Product Family).
+     */
+    if ( c->x86 == 6 && c->x86_model == 0x4F && c->x86_mask == 0x1 &&
+         llc_size > 2621440 && sig->rev < 0x0b000021 )
+        return true;
+
+    return false;
+}
+
 static int apply_microcode(const struct microcode_patch *patch)
 {
     uint64_t msr_content;
@@ -415,6 +437,7 @@ static const struct microcode_ops microcode_intel_ops = {
     .free_patch = free_patch,
     .compare_patch = compare_patch,
     .match_cpu = match_cpu,
+    .is_blacklisted = is_blacklisted,
 };
 
 int __init microcode_init_intel(void)
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 7d5a1f8..9ffd9d2 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -30,6 +30,7 @@ struct microcode_ops {
     bool (*match_cpu)(const struct microcode_patch *patch);
     enum microcode_match_result (*compare_patch)(
         const struct microcode_patch *new, const struct microcode_patch *old);
+    bool (*is_blacklisted)(void);
 };
 
 struct cpu_signature {

It ports the implementation of is_blacklisted() in linux kernel
to Xen.

Late loading may cause system hang if CPUs are affected by BDF90.
Check against BDF90 before performing a late loading.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/microcode.c        |  6 ++++++
 xen/arch/x86/microcode_intel.c  | 23 +++++++++++++++++++++++
 xen/include/asm-x86/microcode.h |  1 +
 3 files changed, 30 insertions(+)
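
The arithmetic behind the BDF90 check can be tried outside Xen with a sketch like the one below. It is a minimal standalone illustration, not the patch's code: `struct cpu_desc`, `llc_bytes_per_core()` and `bdf90_affected()` are hypothetical names standing in for the `cpuinfo_x86` fields and `sig->rev` used by is_blacklisted(), plain 64-bit division replaces `do_div()`, and the zero-core guard is an addition of this sketch. The limit 2621440 is 2.5 * 1024 * 1024 bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields that is_blacklisted() reads. */
struct cpu_desc {
    unsigned int family;        /* c->x86 */
    unsigned int model;         /* c->x86_model */
    unsigned int stepping;      /* c->x86_mask */
    unsigned int cache_size_kb; /* c->x86_cache_size, in KiB */
    unsigned int max_cores;     /* c->x86_max_cores */
    uint32_t ucode_rev;         /* sig->rev */
};

#define BDF90_LLC_PER_CORE_LIMIT 2621440ULL   /* 2.5 MiB in bytes */
#define BDF90_FIXED_UCODE_REV    0x0b000021u  /* first unaffected revision */

/* LLC bytes per core; returns 0 when the core count is unknown. */
static uint64_t llc_bytes_per_core(const struct cpu_desc *c)
{
    if ( c->max_cores == 0 )
        return 0; /* guard added in this sketch, not present in the patch */

    return (c->cache_size_kb * 1024ULL) / c->max_cores;
}

/* Same condition as the patch: family 6, model 0x4F, stepping 1, ... */
static bool bdf90_affected(const struct cpu_desc *c)
{
    return c->family == 6 && c->model == 0x4F && c->stepping == 0x1 &&
           llc_bytes_per_core(c) > BDF90_LLC_PER_CORE_LIMIT &&
           c->ucode_rev < BDF90_FIXED_UCODE_REV;
}
```

With illustrative values, 61440 KiB of cache across 4 cores gives 15728640 bytes per core, so the part is flagged while its revision is below 0x0b000021. 56320 KiB across 22 cores gives exactly 2621440 bytes per core, which does not exceed the limit because the comparison is strict.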