Message ID | 20220126084452.28975-4-andrew.cooper3@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | x86: MSR_SPEC_CTRL support for SVM guests | expand |
On 26.01.2022 09:44, Andrew Cooper wrote: > 1) It would be slightly more efficient to pass curr and cpu_info into > vm{entry,exit}_spec_ctrl(), but setup of such state can't be in the > ALTERNATIVE block because then the call displacement won't get fixed up. > All the additional accesses are hot off the stack, so almost certainly > negligible compared to the WRMSR. What's wrong with using two instances of ALTERNATIVE, one to setup the call arguments and the 2nd for just the CALL? > --- a/xen/arch/x86/hvm/svm/entry.S > +++ b/xen/arch/x86/hvm/svm/entry.S > @@ -55,11 +55,11 @@ __UNLIKELY_END(nsvm_hap) > mov %rsp, %rdi > call svm_vmenter_helper > > - mov VCPU_arch_msrs(%rbx), %rax > - mov VCPUMSR_spec_ctrl_raw(%rax), %eax > + clgi > > /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ > - /* SPEC_CTRL_EXIT_TO_SVM (nothing currently) */ > + /* SPEC_CTRL_EXIT_TO_SVM Req: Clob: C */ > + ALTERNATIVE "", __stringify(call vmentry_spec_ctrl), X86_FEATURE_SC_MSR_HVM I guess the new upper case C after Clob: stands for "all call-clobbered registers"? In which case ... > @@ -86,8 +85,9 @@ __UNLIKELY_END(nsvm_hap) > > GET_CURRENT(bx) > > - /* SPEC_CTRL_ENTRY_FROM_SVM Req: b=curr %rsp=regs/cpuinfo, Clob: ac */ > + /* SPEC_CTRL_ENTRY_FROM_SVM Req: Clob: ac,C */ > ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM > + ALTERNATIVE "", __stringify(call vmexit_spec_ctrl), X86_FEATURE_SC_MSR_HVM ... why the explicit further "ac" here? Is the intention to annotate every individual ALTERNATIVE this way? > --- a/xen/arch/x86/hvm/svm/svm.c > +++ b/xen/arch/x86/hvm/svm/svm.c > @@ -3086,6 +3086,36 @@ void svm_vmexit_handler(struct cpu_user_regs *regs) > vmcb_set_vintr(vmcb, intr); > } > > +/* Called with GIF=0. */ > +void vmexit_spec_ctrl(void) > +{ > + struct cpu_info *info = get_cpu_info(); > + unsigned int val = info->xen_spec_ctrl; > + > + /* > + * Write to MSR_SPEC_CTRL unconditionally, for the RAS[:32] flushing side > + * effect. > + */ > + wrmsr(MSR_SPEC_CTRL, val, 0); > + info->last_spec_ctrl = val; > +} > + > +/* Called with GIF=0. */ > +void vmentry_spec_ctrl(void) > +{ > + struct cpu_info *info = get_cpu_info(); > + const struct vcpu *curr = current; > + unsigned int val = curr->arch.msrs->spec_ctrl.raw; > + > + if ( val != info->last_spec_ctrl ) > + { > + wrmsr(MSR_SPEC_CTRL, val, 0); > + info->last_spec_ctrl = val; > + } Is this correct for the very first use on a CPU? last_spec_ctrl starts out as zero afaict, and hence this very first write would be skipped if the guest value is also zero (which it will be for a vCPU first launched), even if we have a non-zero value in the MSR at that point. Jan
On 26/01/2022 16:50, Jan Beulich wrote: > On 26.01.2022 09:44, Andrew Cooper wrote: >> 1) It would be slightly more efficient to pass curr and cpu_info into >> vm{entry,exit}_spec_ctrl(), but setup of such state can't be in the >> ALTERNATIVE block because then the call displacement won't get fixed up. >> All the additional accesses are hot off the stack, so almost certainly >> negligible compared to the WRMSR. > What's wrong with using two instances of ALTERNATIVE, one to setup the > call arguments and the 2nd for just the CALL? Hmm diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S index c718328ac4cf..1d4be7e97ae2 100644 --- a/xen/arch/x86/hvm/svm/entry.S +++ b/xen/arch/x86/hvm/svm/entry.S @@ -59,6 +59,7 @@ __UNLIKELY_END(nsvm_hap) /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ /* SPEC_CTRL_EXIT_TO_SVM Req: Clob: C */ + ALTERNATIVE "", __stringify(mov %rbx, %rdi; mov %rsp, %rsi), X86_FEATURE_SC_MSR_HVM ALTERNATIVE "", __stringify(call vmentry_spec_ctrl), X86_FEATURE_SC_MSR_HVM pop %r15 is somewhat of a long line, but isn't too terrible. I'm tempted to switch back to using STR() seeing as we have both and it is much more concise. >> --- a/xen/arch/x86/hvm/svm/entry.S >> +++ b/xen/arch/x86/hvm/svm/entry.S >> @@ -55,11 +55,11 @@ __UNLIKELY_END(nsvm_hap) >> mov %rsp, %rdi >> call svm_vmenter_helper >> >> - mov VCPU_arch_msrs(%rbx), %rax >> - mov VCPUMSR_spec_ctrl_raw(%rax), %eax >> + clgi >> >> /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ >> - /* SPEC_CTRL_EXIT_TO_SVM (nothing currently) */ >> + /* SPEC_CTRL_EXIT_TO_SVM Req: Clob: C */ >> + ALTERNATIVE "", __stringify(call vmentry_spec_ctrl), X86_FEATURE_SC_MSR_HVM > I guess the new upper case C after Clob: stands for "all call-clobbered > registers"? That was the intention, yes. > In which case ... > >> @@ -86,8 +85,9 @@ __UNLIKELY_END(nsvm_hap) >> >> GET_CURRENT(bx) >> >> - /* SPEC_CTRL_ENTRY_FROM_SVM Req: b=curr %rsp=regs/cpuinfo, Clob: ac */ >> + /* SPEC_CTRL_ENTRY_FROM_SVM Req: Clob: ac,C */ >> ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM >> + ALTERNATIVE "", __stringify(call vmexit_spec_ctrl), X86_FEATURE_SC_MSR_HVM > ... why the explicit further "ac" here? Is the intention to annotate > every individual ALTERNATIVE this way? Fair point. I'll switch to just C. The clobbers are rather more important for the PV side where the logic has multiple live variables and it's not totally obvious that all GPRs are available. > >> --- a/xen/arch/x86/hvm/svm/svm.c >> +++ b/xen/arch/x86/hvm/svm/svm.c >> @@ -3086,6 +3086,36 @@ void svm_vmexit_handler(struct cpu_user_regs *regs) >> vmcb_set_vintr(vmcb, intr); >> } >> >> +/* Called with GIF=0. */ >> +void vmexit_spec_ctrl(void) >> +{ >> + struct cpu_info *info = get_cpu_info(); >> + unsigned int val = info->xen_spec_ctrl; >> + >> + /* >> + * Write to MSR_SPEC_CTRL unconditionally, for the RAS[:32] flushing side >> + * effect. >> + */ >> + wrmsr(MSR_SPEC_CTRL, val, 0); >> + info->last_spec_ctrl = val; >> +} >> + >> +/* Called with GIF=0. */ >> +void vmentry_spec_ctrl(void) >> +{ >> + struct cpu_info *info = get_cpu_info(); >> + const struct vcpu *curr = current; >> + unsigned int val = curr->arch.msrs->spec_ctrl.raw; >> + >> + if ( val != info->last_spec_ctrl ) >> + { >> + wrmsr(MSR_SPEC_CTRL, val, 0); >> + info->last_spec_ctrl = val; >> + } > Is this correct for the very first use on a CPU? last_spec_ctrl > starts out as zero afaict, and hence this very first write would be > skipped if the guest value is also zero (which it will be for a > vCPU first launched), even if we have a non-zero value in the MSR > at that point. Ish. We intentionally write MSR_SPEC_CTRL once on each CPU to clear out any previous-environment settings, but those boot paths need to latch last_spec_ctrl too for this to work correctly. Making this safe is slightly nasty. I think the best option would be to reorder this patch to be after the patch 6, and tweak the wording in patch 6's commit message. That way, we're not adding latching to later-dropped codepaths. ~Andrew
On 26.01.2022 21:16, Andrew Cooper wrote: > On 26/01/2022 16:50, Jan Beulich wrote: >> On 26.01.2022 09:44, Andrew Cooper wrote: >>> 1) It would be slightly more efficient to pass curr and cpu_info into >>> vm{entry,exit}_spec_ctrl(), but setup of such state can't be in the >>> ALTERNATIVE block because then the call displacement won't get fixed up. >>> All the additional accesses are hot off the stack, so almost certainly >>> negligible compared to the WRMSR. >> What's wrong with using two instances of ALTERNATIVE, one to setup the >> call arguments and the 2nd for just the CALL? > > Hmm > > diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S > index c718328ac4cf..1d4be7e97ae2 100644 > --- a/xen/arch/x86/hvm/svm/entry.S > +++ b/xen/arch/x86/hvm/svm/entry.S > @@ -59,6 +59,7 @@ __UNLIKELY_END(nsvm_hap) > > /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ > /* SPEC_CTRL_EXIT_TO_SVM Req: > Clob: C */ > + ALTERNATIVE "", __stringify(mov %rbx, %rdi; mov %rsp, %rsi), > X86_FEATURE_SC_MSR_HVM > ALTERNATIVE "", __stringify(call vmentry_spec_ctrl), > X86_FEATURE_SC_MSR_HVM > > pop %r15 > > is somewhat of a long line, but isn't too terrible. > > I'm tempted to switch back to using STR() seeing as we have both and it > is much more concise. No objections. In fact I think when I introduced stringify.h over 10 years back, I didn't realize we already had STR(). Quite certainly first and foremost because of its bogus placement in xen/config.h (same would go for at least IS_ALIGNED() as well as KB() and friends). I wouldn't even mind phasing out stringify.h again. Or maybe we want to move STR() there ... Jan
On 27/01/2022 07:25, Jan Beulich wrote: > On 26.01.2022 21:16, Andrew Cooper wrote: >> On 26/01/2022 16:50, Jan Beulich wrote: >>> On 26.01.2022 09:44, Andrew Cooper wrote: >>>> 1) It would be slightly more efficient to pass curr and cpu_info into >>>> vm{entry,exit}_spec_ctrl(), but setup of such state can't be in the >>>> ALTERNATIVE block because then the call displacement won't get fixed up. >>>> All the additional accesses are hot off the stack, so almost certainly >>>> negligible compared to the WRMSR. >>> What's wrong with using two instances of ALTERNATIVE, one to setup the >>> call arguments and the 2nd for just the CALL? >> Hmm >> >> diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S >> index c718328ac4cf..1d4be7e97ae2 100644 >> --- a/xen/arch/x86/hvm/svm/entry.S >> +++ b/xen/arch/x86/hvm/svm/entry.S >> @@ -59,6 +59,7 @@ __UNLIKELY_END(nsvm_hap) >> >> /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ >> /* SPEC_CTRL_EXIT_TO_SVM Req: >> Clob: C */ >> + ALTERNATIVE "", __stringify(mov %rbx, %rdi; mov %rsp, %rsi), >> X86_FEATURE_SC_MSR_HVM >> ALTERNATIVE "", __stringify(call vmentry_spec_ctrl), >> X86_FEATURE_SC_MSR_HVM >> >> pop %r15 >> >> is somewhat of a long line, but isn't too terrible. >> >> I'm tempted to switch back to using STR() seeing as we have both and it >> is much more concise. > No objections. In fact I think when I introduced stringify.h over 10 years > back, I didn't realize we already had STR(). Quite certainly first and > foremost because of its bogus placement in xen/config.h (same would go for > at least IS_ALIGNED() as well as KB() and friends). > > I wouldn't even mind phasing out stringify.h again. Or maybe we want to > move STR() there ... Now that we've given up using freestanding headers anywhere, there's a xen/stddef.h shaped hole. Things like NULL should move across, but it would also be the appropriate place for ARRAY_SIZE() and pretty much everything else we expect to be ubiquitous. Linux includes things like offsetof(). ~Andrew
diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S index 276215d36aff..c718328ac4cf 100644 --- a/xen/arch/x86/hvm/svm/entry.S +++ b/xen/arch/x86/hvm/svm/entry.S @@ -55,11 +55,11 @@ __UNLIKELY_END(nsvm_hap) mov %rsp, %rdi call svm_vmenter_helper - mov VCPU_arch_msrs(%rbx), %rax - mov VCPUMSR_spec_ctrl_raw(%rax), %eax + clgi /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */ - /* SPEC_CTRL_EXIT_TO_SVM (nothing currently) */ + /* SPEC_CTRL_EXIT_TO_SVM Req: Clob: C */ + ALTERNATIVE "", __stringify(call vmentry_spec_ctrl), X86_FEATURE_SC_MSR_HVM pop %r15 pop %r14 @@ -78,7 +78,6 @@ __UNLIKELY_END(nsvm_hap) pop %rsi pop %rdi - clgi sti vmrun @@ -86,8 +85,9 @@ __UNLIKELY_END(nsvm_hap) GET_CURRENT(bx) - /* SPEC_CTRL_ENTRY_FROM_SVM Req: b=curr %rsp=regs/cpuinfo, Clob: ac */ + /* SPEC_CTRL_ENTRY_FROM_SVM Req: Clob: ac,C */ ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM + ALTERNATIVE "", __stringify(call vmexit_spec_ctrl), X86_FEATURE_SC_MSR_HVM /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */ stgi diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c index bb6b8e560a9f..8fdb530b4004 100644 --- a/xen/arch/x86/hvm/svm/svm.c +++ b/xen/arch/x86/hvm/svm/svm.c @@ -3086,6 +3086,36 @@ void svm_vmexit_handler(struct cpu_user_regs *regs) vmcb_set_vintr(vmcb, intr); } +/* Called with GIF=0. */ +void vmexit_spec_ctrl(void) +{ + struct cpu_info *info = get_cpu_info(); + unsigned int val = info->xen_spec_ctrl; + + /* + * Write to MSR_SPEC_CTRL unconditionally, for the RAS[:32] flushing side + * effect. + */ + wrmsr(MSR_SPEC_CTRL, val, 0); + info->last_spec_ctrl = val; +} + +/* Called with GIF=0. */ +void vmentry_spec_ctrl(void) +{ + struct cpu_info *info = get_cpu_info(); + const struct vcpu *curr = current; + unsigned int val = curr->arch.msrs->spec_ctrl.raw; + + if ( val != info->last_spec_ctrl ) + { + wrmsr(MSR_SPEC_CTRL, val, 0); + info->last_spec_ctrl = val; + } + + /* No Spectre v1 concerns. Execution is going to hit VMRUN imminently. */ +} + /* * Local variables: * mode: C diff --git a/xen/arch/x86/include/asm/current.h b/xen/arch/x86/include/asm/current.h index cfbedc31983f..dc0edd9ed07d 100644 --- a/xen/arch/x86/include/asm/current.h +++ b/xen/arch/x86/include/asm/current.h @@ -56,6 +56,7 @@ struct cpu_info { /* See asm/spec_ctrl_asm.h for usage. */ unsigned int shadow_spec_ctrl; uint8_t xen_spec_ctrl; + uint8_t last_spec_ctrl; uint8_t spec_ctrl_flags; /* @@ -73,7 +74,6 @@ struct cpu_info { */ bool use_pv_cr3; - unsigned long __pad; /* get_stack_bottom() must be 16-byte aligned */ }; diff --git a/xen/arch/x86/include/asm/msr.h b/xen/arch/x86/include/asm/msr.h index 657a3295613d..ce4fe51afe54 100644 --- a/xen/arch/x86/include/asm/msr.h +++ b/xen/arch/x86/include/asm/msr.h @@ -297,6 +297,15 @@ struct vcpu_msrs * * For VT-x guests, the guest value is held in the MSR guest load/save * list. + * + * For SVM, the guest value lives in the VMCB, and hardware saves/restores + * the host value automatically. However, guests run with the OR of the + * host and guest value, which allows Xen to set protections behind the + * guest's back. + * + * We must clear/restore Xen's value before/after VMRUN to avoid unduly + * influencing the guest. In order to support "behind the guest's back" + * protections, we load this value (commonly 0) before VMRUN. */ struct { uint32_t raw; diff --git a/xen/arch/x86/include/asm/spec_ctrl_asm.h b/xen/arch/x86/include/asm/spec_ctrl_asm.h index bf82528a12ae..02b3b18ce69f 100644 --- a/xen/arch/x86/include/asm/spec_ctrl_asm.h +++ b/xen/arch/x86/include/asm/spec_ctrl_asm.h @@ -46,6 +46,9 @@ * - On VMX by using MSR load/save lists to have vmentry/exit atomically * load/save the guest value. Xen's value is loaded in regular code, and * there is no need to use the shadow logic (below). + * - On SVM by altering MSR_SPEC_CTRL inside the CLGI/STGI region. This + * makes the changes atomic with respect to NMIs/etc, so no need for + * shadowing logic. * * Factor 2 is harder. We maintain a shadow_spec_ctrl value, and a use_shadow * boolean in the per cpu spec_ctrl_flags. The synchronous use is: @@ -67,6 +70,10 @@ * steps 2 and 6 will restore the shadow value rather than leaving Xen's value * loaded and corrupting the value used in guest context. * + * Additionally, in some cases it is safe to skip writes to MSR_SPEC_CTRL when + * we don't require any of the side effects of an identical write. Maintain a + * per-cpu last_spec_ctrl value for this purpose. + * * The following ASM fragments implement this algorithm. See their local * comments for further details. * - SPEC_CTRL_ENTRY_FROM_PV
Hardware maintains both host and guest versions of MSR_SPEC_CTRL, but guests run with the logical OR of both values. Therefore, in principle we want to clear Xen's value before entering the guest. However, for migration compatibiltiy, and for performance reasons with SEV-SNP guests, we want the ability to use a nonzero value behind the guest's back. Use vcpu_msrs to hold this value, with the guest value in the VMCB. On the VMEntry path, adjusting MSR_SPEC_CTRL must be done after CLGI so as to be atomic with respect to NMIs/etc. The loading of spec_ctrl_raw into %eax was also stale from the unused old code, so can be dropped too. Implement both pieces of logic as small pieces of C, and alternative the call to get there based on X86_FEATURE_SC_MSR_HVM. While adjusting the clobber lists, drop the stale requirements on the VMExit side. The common case is that host and "guest-protection" values are both 0, so maintain a per-cpu last_spec_ctrl value to allow us to skip redundant WRMSRs. The value needs to live in the cpu_info block for subsequent use with PV guests, and compatibility with XPTI. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> --- CC: Jan Beulich <JBeulich@suse.com> CC: Roger Pau Monné <roger.pau@citrix.com> CC: Wei Liu <wl@xen.org> Several points: 1) It would be slightly more efficient to pass curr and cpu_info into vm{entry,exit}_spec_ctrl(), but setup of such state can't be in the ALTERNATIVE block because then the call displacement won't get fixed up. All the additional accesses are hot off the stack, so almost certainly negligible compared to the WRMSR. 2) The RAS[:32] flushing side effect is under reconsideration. It is actually a very awkward side effect in practice, and not applicable to any implementations (that I'm aware of), but for now, it's the documented safe action to take. Furthermore, it avoids complicating the logic with an lfence in the else case for Spectre v1 safety. --- xen/arch/x86/hvm/svm/entry.S | 10 +++++----- xen/arch/x86/hvm/svm/svm.c | 30 ++++++++++++++++++++++++++++++ xen/arch/x86/include/asm/current.h | 2 +- xen/arch/x86/include/asm/msr.h | 9 +++++++++ xen/arch/x86/include/asm/spec_ctrl_asm.h | 7 +++++++ 5 files changed, 52 insertions(+), 6 deletions(-)