diff mbox series

[3/5] x86/PV: restrict guest-induced WBINVD (or alike) to cache writeback

Message ID 0e60520c-d660-1a83-3f57-3466a0ad617b@suse.com (mailing list archive)
State Superseded
Headers show
Series x86: reduce cache flushing overhead | expand

Commit Message

Jan Beulich April 19, 2023, 10:45 a.m. UTC
We allow its use for writeback purposes only anyway, so let's also carry
these out that way on capable hardware.

We can then also expose the WBNOINVD feature, as there's no difference
to WBINVD anymore. Note that respective emulation logic has already been
in place since ad3abc47dd23 ("x86emul: support WBNOINVD").

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Andrew Cooper April 19, 2023, 8:10 p.m. UTC | #1
On 19/04/2023 11:45 am, Jan Beulich wrote:
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -3772,7 +3772,7 @@ long do_mmuext_op(
>              else if ( unlikely(!cache_flush_permitted(currd)) )
>                  rc = -EACCES;
>              else
> -                wbinvd();
> +                wbnoinvd();
>              break;

It occurs to me that this is fundamentally broken.

The guest is not in any position to know (or control) whether it gets
rescheduled elsewhere between now and it logically continuing execution.

So if there is actually any cache maintenance to do, it can't be a
WB{...} of any form on just this CPU alone.

> --- a/xen/include/public/arch-x86/cpufeatureset.h
> +++ b/xen/include/public/arch-x86/cpufeatureset.h
> @@ -238,7 +238,7 @@ XEN_CPUFEATURE(EFRO,          7*32+10) /
>  /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
>  XEN_CPUFEATURE(CLZERO,        8*32+ 0) /*A  CLZERO instruction */
>  XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A  (F)X{SAVE,RSTOR} always saves/restores FPU Error pointers */
> -XEN_CPUFEATURE(WBNOINVD,      8*32+ 9) /*S  WBNOINVD instruction */
> +XEN_CPUFEATURE(WBNOINVD,      8*32+ 9) /*A  WBNOINVD instruction */

Given that PV guests do have several real hypercalls for doing this, I'm
not particularly inclined to let them do it via an emulated path,
however easy it might be at a technical level.

I doubt anything good can come from it.

~Andrew
Jan Beulich April 20, 2023, 9:08 a.m. UTC | #2
On 19.04.2023 22:10, Andrew Cooper wrote:
> On 19/04/2023 11:45 am, Jan Beulich wrote:
>> --- a/xen/arch/x86/mm.c
>> +++ b/xen/arch/x86/mm.c
>> @@ -3772,7 +3772,7 @@ long do_mmuext_op(
>>              else if ( unlikely(!cache_flush_permitted(currd)) )
>>                  rc = -EACCES;
>>              else
>> -                wbinvd();
>> +                wbnoinvd();
>>              break;
> 
> It occurs to me that this is fundamentally broken.
> 
> The guest is not in any position to know (or control) whether it gets
> rescheduled elsewhere between now and it logically continuing execution.
> 
> So if there is actually any cache maintenance to do, it can't be a
> WB{...} of any form on just this CPU alone.

Aren't you merely confirming the respective paragraph in the cover letter
here? (The implication would be that at least the data added in the last
patch wants to be guest-type-independent, so that it can be re-used for
PV, even if maybe not in the same patch.)

>> --- a/xen/include/public/arch-x86/cpufeatureset.h
>> +++ b/xen/include/public/arch-x86/cpufeatureset.h
>> @@ -238,7 +238,7 @@ XEN_CPUFEATURE(EFRO,          7*32+10) /
>>  /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
>>  XEN_CPUFEATURE(CLZERO,        8*32+ 0) /*A  CLZERO instruction */
>>  XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A  (F)X{SAVE,RSTOR} always saves/restores FPU Error pointers */
>> -XEN_CPUFEATURE(WBNOINVD,      8*32+ 9) /*S  WBNOINVD instruction */
>> +XEN_CPUFEATURE(WBNOINVD,      8*32+ 9) /*A  WBNOINVD instruction */
> 
> Given that PV guests do have several real hypercalls for doing this, I'm
> not particularly inclined to let them do it via an emulated path,
> however easy it might be at a technical level.
> 
> I doubt anything good can come from it.

We permit use of WBINVD; it looks inconsistent to me to not also permit
WBNOINVD then. Furthermore exposing the feature then can serve as a
(better) hint that behind the scenes we actually carry out the mmuext
ops as write-back. not evict.

Jan
diff mbox series

Patch

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3772,7 +3772,7 @@  long do_mmuext_op(
             else if ( unlikely(!cache_flush_permitted(currd)) )
                 rc = -EACCES;
             else
-                wbinvd();
+                wbnoinvd();
             break;
 
         case MMUEXT_FLUSH_CACHE_GLOBAL:
@@ -3788,7 +3788,7 @@  long do_mmuext_op(
                     if ( !cpumask_intersects(mask,
                                              per_cpu(cpu_sibling_mask, cpu)) )
                         __cpumask_set_cpu(cpu, mask);
-                flush_mask(mask, FLUSH_CACHE);
+                flush_mask(mask, FLUSH_WRITEBACK);
             }
             else
                 rc = -EINVAL;
--- a/xen/arch/x86/pv/emul-priv-op.c
+++ b/xen/arch/x86/pv/emul-priv-op.c
@@ -1196,10 +1196,8 @@  static int cf_check cache_op(
          * newer linux uses this in some start-of-day timing loops.
          */
         ;
-    else if ( op == x86emul_wbnoinvd /* && cpu_has_wbnoinvd */ )
-        wbnoinvd();
     else
-        wbinvd();
+        wbnoinvd();
 
     return X86EMUL_OKAY;
 }
--- a/xen/include/public/arch-x86/cpufeatureset.h
+++ b/xen/include/public/arch-x86/cpufeatureset.h
@@ -238,7 +238,7 @@  XEN_CPUFEATURE(EFRO,          7*32+10) /
 /* AMD-defined CPU features, CPUID level 0x80000008.ebx, word 8 */
 XEN_CPUFEATURE(CLZERO,        8*32+ 0) /*A  CLZERO instruction */
 XEN_CPUFEATURE(RSTR_FP_ERR_PTRS, 8*32+ 2) /*A  (F)X{SAVE,RSTOR} always saves/restores FPU Error pointers */
-XEN_CPUFEATURE(WBNOINVD,      8*32+ 9) /*S  WBNOINVD instruction */
+XEN_CPUFEATURE(WBNOINVD,      8*32+ 9) /*A  WBNOINVD instruction */
 XEN_CPUFEATURE(IBPB,          8*32+12) /*A  IBPB support only (no IBRS, used by AMD) */
 XEN_CPUFEATURE(IBRS,          8*32+14) /*S  MSR_SPEC_CTRL.IBRS */
 XEN_CPUFEATURE(AMD_STIBP,     8*32+15) /*S  MSR_SPEC_CTRL.STIBP */