Message ID | 20190221115020.12385-13-bigeasy@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [01/22] x86/fpu: Remove fpu->initialized usage in __fpu__restore_sig() | expand |
On 2/21/19 3:50 AM, Sebastian Andrzej Siewior wrote: > @@ -111,6 +111,12 @@ static inline void __write_pkru(u32 pkru) > { > u32 ecx = 0, edx = 0; > > + /* > + * WRPKRU is relatively expensive compared to RDPKRU. > + * Avoid WRPKRU when it would not change the value. > + */ > + if (pkru == __read_pkru()) > + return; > /* > * "wrpkru" instruction. Loads contents in EAX to PKRU, > * requires that ecx = edx = 0. Could we do this in write_pkru() instead? I tried to keep the __ versions of read and write as simple and symmetric as possible.
On 2019-02-25 10:08:10 [-0800], Dave Hansen wrote: > On 2/21/19 3:50 AM, Sebastian Andrzej Siewior wrote: > > @@ -111,6 +111,12 @@ static inline void __write_pkru(u32 pkru) > > { > > u32 ecx = 0, edx = 0; > > > > + /* > > + * WRPKRU is relatively expensive compared to RDPKRU. > > + * Avoid WRPKRU when it would not change the value. > > + */ > > + if (pkru == __read_pkru()) > > + return; > > /* > > * "wrpkru" instruction. Loads contents in EAX to PKRU, > > * requires that ecx = edx = 0. > > Could we do this in write_pkru() instead? I tried to keep the __ > versions of read and write as simple and symmetric as possible. write_pkru() has: | static inline void write_pkru(u32 pkru) | { | struct pkru_state *pk; | | if (!boot_cpu_has(X86_FEATURE_OSPKE)) | return; | | pk = get_xsave_addr(¤t->thread.fpu.state.xsave, XFEATURE_PKRU); | /* | * The PKRU value in xstate needs to be in sync with the value that is | * written to the CPU. The FPU restore on return to userland would | * otherwise load the previous value again. | */ | fpregs_lock(); | if (pk) | pk->pkru = pkru; | __write_pkru(pkru); | fpregs_unlock(); | } and is other callers besides write_pkru(): |arch/x86/include/asm/fpu/internal.h: __write_pkru(pkru_val); |arch/x86/kvm/vmx/vmx.c: __write_pkru(vcpu->arch.pkru); |arch/x86/kvm/vmx/vmx.c: __write_pkru(vmx->host_pkru) Do want __write_pkru_isn() and __read_pkru_isn() so those are symmetric and then we would have __write_pkru() which does the read check and invokes __write_pkru_isn()? Sebastian
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h index 43c029cdc3fe8..2d3adeb268e38 100644 --- a/arch/x86/include/asm/special_insns.h +++ b/arch/x86/include/asm/special_insns.h @@ -111,6 +111,12 @@ static inline void __write_pkru(u32 pkru) { u32 ecx = 0, edx = 0; + /* + * WRPKRU is relatively expensive compared to RDPKRU. + * Avoid WRPKRU when it would not change the value. + */ + if (pkru == __read_pkru()) + return; /* * "wrpkru" instruction. Loads contents in EAX to PKRU, * requires that ecx = edx = 0.
Dave Hansen says that the `wrpkru' is more expensive than `rdpkru'. It has a higher cycle cost and it's also practically a (light) speculation barrier. As an optimisation read the current PKRU value and only write the new one if it is different. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> --- arch/x86/include/asm/special_insns.h | 6 ++++++ 1 file changed, 6 insertions(+)