Message ID | 0c2ddae9-3222-9755-b6e1-35e51410093b@suse.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86emul: misc additions | expand |
On 04/04/2023 3:50 pm, Jan Beulich wrote: > This insn differs from WRMSR solely in the lack of serialization. Hence > the code used there can simply be used here as well, plus a feature > check of course. I have a feeling this is a bit stale now that 0f01.c has moved into a separate file ? > As there's no other infrastructure needed beyond > permitting the insn for PV privileged-op emulation (in particular no > separate new VMEXIT) we can expose the insn to guests right away. There are good reasons not to expose this to PV guests. The #GP fault to trigger emulation is serialising, so from the guest's point of view there is literally no difference between WRMSR and WRMSRNS. All code is going to need to default to WRMSR for compatibility reasons, so not advertising WRMSRNS will save the guest going through and self-modifying itself to use a "more efficient" instruction which is not actually more efficient. But, in general and as said before, I feel it is irresponsible to be continuing to add slow guest-fastpaths without introducing a real fastpath. We desperately need HYPERCALL_pv_priv_op which can take an array of inputs, and has an sub-op for each of CPUID, RDMSR, WRMSR and whatever other ops are trapping because of no fastpath. Perf testing routinely shows that #GP/#UD faults are only 2 orders of magnitude less rare in PV guests than #PF, which is an insane wastage when guests are supposed to be enlightened and use fastpaths in the first place. The emulation implementation is fine, so Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com> but dependent on the PV guest changes being dropped. ~Andrew
On 05.04.2023 23:29, Andrew Cooper wrote: > On 04/04/2023 3:50 pm, Jan Beulich wrote: >> This insn differs from WRMSR solely in the lack of serialization. Hence >> the code used there can simply be used here as well, plus a feature >> check of course. > > I have a feeling this is a bit stale now that 0f01.c has moved into a > separate file ? Hmm, no, not really. This code was written long after the split anyway. "Used here as well" is meant as "copy that code", not "goto ...". >> As there's no other infrastructure needed beyond >> permitting the insn for PV privileged-op emulation (in particular no >> separate new VMEXIT) we can expose the insn to guests right away. > > There are good reasons not to expose this to PV guests. > > The #GP fault to trigger emulation is serialising, so from the guest's > point of view there is literally no difference between WRMSR and WRMSRNS. > > All code is going to need to default to WRMSR for compatibility reasons, > so not advertising WRMSRNS will save the guest going through and > self-modifying itself to use a "more efficient" instruction which is not > actually more efficient. Well, sure, I can drop that again. I don't view "self-modifying itself" as meaningful wastage. Plus I'd consider it reasonable to expose the feature by default only to HVM, while permitting (non-default) exposure to PV (in which case the emul-priv-op.c change would want retaining), except that we can't express this afaict. Even without exposing at all I'd consider it kind of reasonable to allow use of the insn, in case a guest infers its availability (or simply knows from executing raw CPUID). > The emulation implementation is fine, so Reviewed-by: Andrew Cooper > <andrew.cooper3@citrix.com> but dependent on the PV guest changes being > dropped. Thanks. Jan
--- a/tools/tests/x86_emulator/predicates.c +++ b/tools/tests/x86_emulator/predicates.c @@ -341,6 +341,7 @@ static const struct { /*{ 0x01, 0xc3 }, { 2, 2 }, F, R }, vmresume */ { { 0x01, 0xc4 }, { 2, 2 }, F, N }, /* vmxoff */ { { 0x01, 0xc5 }, { 2, 2 }, F, N }, /* pconfig */ + { { 0x01, 0xc6 }, { 2, 2 }, F, N }, /* wrmsrns */ { { 0x01, 0xc8 }, { 2, 2 }, F, N }, /* monitor */ { { 0x01, 0xc9 }, { 2, 2 }, F, N }, /* mwait */ { { 0x01, 0xca }, { 2, 2 }, F, N }, /* clac */ --- a/tools/tests/x86_emulator/x86-emulate.c +++ b/tools/tests/x86_emulator/x86-emulate.c @@ -87,6 +87,7 @@ bool emul_test_init(void) cp.feat.avx512pf = cp.feat.avx512f; cp.feat.rdpid = true; cp.feat.lkgs = true; + cp.feat.wrmsrns = true; cp.extd.clzero = true; if ( cpu_has_xsave ) --- a/xen/arch/x86/pv/emul-priv-op.c +++ b/xen/arch/x86/pv/emul-priv-op.c @@ -1252,8 +1252,11 @@ static int cf_check validate( { unsigned int modrm_rm, modrm_reg; - if ( x86_insn_modrm(state, &modrm_rm, &modrm_reg) != 3 || - (modrm_rm & 7) != 1 ) + if ( x86_insn_modrm(state, &modrm_rm, &modrm_reg) != 3 ) + break; + if ( (modrm_rm & 7) == 6 && !(modrm_reg & 7) ) /* wrmsrns, {rd,wr}msrlist */ + return X86EMUL_OKAY; + if ( (modrm_rm & 7) != 1 ) break; switch ( modrm_reg & 7 ) { --- a/xen/arch/x86/x86_emulate/0f01.c +++ b/xen/arch/x86/x86_emulate/0f01.c @@ -43,6 +43,20 @@ int x86emul_0f01(struct x86_emulate_stat struct segment_register sreg; uint64_t msr_val; + case 0xc6: + switch ( s->vex.pfx ) + { + case vex_none: /* wrmsrns */ + vcpu_must_have(wrmsrns); + generate_exception_if(!mode_ring0(), X86_EXC_GP, 0); + fail_if(!ops->write_msr); + rc = ops->write_msr(regs->ecx, + ((uint64_t)regs->r(dx) << 32) | regs->eax, + ctxt); + goto done; + } + generate_exception(X86_EXC_UD); + case 0xca: /* clac */ case 0xcb: /* stac */ vcpu_must_have(smap); --- a/xen/arch/x86/x86_emulate/private.h +++ b/xen/arch/x86/x86_emulate/private.h @@ -595,6 +595,7 @@ amd_like(const struct x86_emulate_ctxt * #define vcpu_has_avx_vnni() (ctxt->cpuid->feat.avx_vnni) #define vcpu_has_avx512_bf16() (ctxt->cpuid->feat.avx512_bf16) #define vcpu_has_lkgs() (ctxt->cpuid->feat.lkgs) +#define vcpu_has_wrmsrns() (ctxt->cpuid->feat.wrmsrns) #define vcpu_must_have(feat) \ generate_exception_if(!vcpu_has_##feat(), X86_EXC_UD) --- a/xen/include/public/arch-x86/cpufeatureset.h +++ b/xen/include/public/arch-x86/cpufeatureset.h @@ -283,7 +283,7 @@ XEN_CPUFEATURE(FSRS, 10*32+11) / XEN_CPUFEATURE(FSRCS, 10*32+12) /*A Fast Short REP CMPSB/SCASB */ XEN_CPUFEATURE(FRED, 10*32+17) /* Flexible Return and Event Delivery */ XEN_CPUFEATURE(LKGS, 10*32+18) /*S Load Kernel GS Base */ -XEN_CPUFEATURE(WRMSRNS, 10*32+19) /* WRMSR Non-Serialising */ +XEN_CPUFEATURE(WRMSRNS, 10*32+19) /*A WRMSR Non-Serialising */ /* AMD-defined CPU features, CPUID level 0x80000021.eax, word 11 */ XEN_CPUFEATURE(LFENCE_DISPATCH, 11*32+ 2) /*A LFENCE always serializing */
This insn differs from WRMSR solely in the lack of serialization. Hence the code used there can simply be used here as well, plus a feature check of course. As there's no other infrastructure needed beyond permitting the insn for PV privileged-op emulation (in particular no separate new VMEXIT) we can expose the insn to guests right away. Signed-off-by: Jan Beulich <jbeulich@suse.com>