Message ID | 0e60520c-d660-1a83-3f57-3466a0ad617b@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86: reduce cache flushing overhead | expand |
On 19/04/2023 11:45 am, Jan Beulich wrote: > --- a/xen/arch/x86/mm.c > +++ b/xen/arch/x86/mm.c > @@ -3772,7 +3772,7 @@ long do_mmuext_op( > else if ( unlikely(!cache_flush_permitted(currd)) ) > rc = -EACCES; > else > - wbinvd(); > + wbnoinvd(); > break; It occurs to me that this is fundamentally broken. The guest is not in any position to know (or control) whether it gets rescheduled elsewhere between now and it logically continuing execution. So if there is actually any cache maintenance to do, it can't be a WB{...} of any form on just this CPU alone. > --- a/xen/include/public/arch-x86/cpufeatureset.h > +++ b/xen/include/public/arch-x86/cpufeatureset.h > @@ -238,7 +238,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / > /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ > XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ > XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A (F)X{SAVE,RSTOR} always saves/restores FPU Error pointers */ > -XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*S WBNOINVD instruction */ > +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ Given that PV guests do have several real hypercalls for doing this, I'm not particularly inclined to let them do it via an emulated path, however easy it might be at a technical level. I doubt anything good can come from it. ~Andrew
On 19.04.2023 22:10, Andrew Cooper wrote: > On 19/04/2023 11:45 am, Jan Beulich wrote: >> --- a/xen/arch/x86/mm.c >> +++ b/xen/arch/x86/mm.c >> @@ -3772,7 +3772,7 @@ long do_mmuext_op( >> else if ( unlikely(!cache_flush_permitted(currd)) ) >> rc = -EACCES; >> else >> - wbinvd(); >> + wbnoinvd(); >> break; > > It occurs to me that this is fundamentally broken. > > The guest is not in any position to know (or control) whether it gets > rescheduled elsewhere between now and it logically continuing execution. > > So if there is actually any cache maintenance to do, it can't be a > WB{...} of any form on just this CPU alone. Aren't you merely confirming the respective paragraph in the cover letter here? (The implication would be that at least the data added in the last patch wants to be guest-type-independent, so that it can be re-used for PV, even if maybe not in the same patch.) >> --- a/xen/include/public/arch-x86/cpufeatureset.h >> +++ b/xen/include/public/arch-x86/cpufeatureset.h >> @@ -238,7 +238,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / >> /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ >> XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ >> XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A (F)X{SAVE,RSTOR} always saves/restores FPU Error pointers */ >> -XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*S WBNOINVD instruction */ >> +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ > > Given that PV guests do have several real hypercalls for doing this, I'm > not particularly inclined to let them do it via an emulated path, > however easy it might be at a technical level. > > I doubt anything good can come from it. We permit use of WBINVD; it looks inconsistent to me to not also permit WBNOINVD then. Furthermore exposing the feature then can serve as a (better) hint that behind the scenes we actually carry out the mmuext ops as write-back. not evict. Jan
--- a/xen/arch/x86/mm.c +++ b/xen/arch/x86/mm.c @@ -3772,7 +3772,7 @@ long do_mmuext_op( else if ( unlikely(!cache_flush_permitted(currd)) ) rc = -EACCES; else - wbinvd(); + wbnoinvd(); break; case MMUEXT_FLUSH_CACHE_GLOBAL: @@ -3788,7 +3788,7 @@ long do_mmuext_op( if ( !cpumask_intersects(mask, per_cpu(cpu_sibling_mask, cpu)) ) __cpumask_set_cpu(cpu, mask); - flush_mask(mask, FLUSH_CACHE); + flush_mask(mask, FLUSH_WRITEBACK); } else rc = -EINVAL; --- a/xen/arch/x86/pv/emul-priv-op.c +++ b/xen/arch/x86/pv/emul-priv-op.c @@ -1196,10 +1196,8 @@ static int cf_check cache_op( * newer linux uses this in some start-of-day timing loops. */ ; - else if ( op == x86emul_wbnoinvd /* && cpu_has_wbnoinvd */ ) - wbnoinvd(); else - wbinvd(); + wbnoinvd(); return X86EMUL_OKAY; } --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -238,7 +238,7 @@ XEN_CPUFEATURE(EFRO, 7*32+10) / /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */ XEN_CPUFEATURE(CLZERO, 8*32+ 0) /*A CLZERO instruction */ XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A (F)X{SAVE,RSTOR} always saves/restores FPU Error pointers */ -XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*S WBNOINVD instruction */ +XEN_CPUFEATURE(WBNOINVD, 8*32+ 9) /*A WBNOINVD instruction */ XEN_CPUFEATURE(IBPB, 8*32+12) /*A IBPB support only (no IBRS, used by AMD) */ XEN_CPUFEATURE(IBRS, 8*32+14) /*S MSR_SPEC_CTRL.IBRS */ XEN_CPUFEATURE(AMD_STIBP, 8*32+15) /*S MSR_SPEC_CTRL.STIBP */
We allow its use for writeback purposes only anyway, so let's also carry these out that way on capable hardware. We can then also expose the WBNOINVD feature, as there's no difference to WBINVD anymore. Note that respective emulation logic has already been in place since ad3abc47dd23 ("x86emul: support WBNOINVD"). Signed-off-by: Jan Beulich <jbeulich@suse.com>