Message ID | 1456225539-9162-3-git-send-email-david.vrabel@citrix.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 23/02/16 11:05, David Vrabel wrote: > @@ -653,6 +657,12 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, > /* PV/PVH guests get an emulated PIT too for video BIOSes to use. */ > pit_init(d, cpu_khz); > > + /* > + * If the FPU not to save FCS/FDS then we can always save/restore "If the FPU does not" ? > @@ -284,7 +284,11 @@ void xsave(struct vcpu *v, uint64_t mask) > return; > } > > - if ( word_size > 0 && > + /* > + * If the FIP/FDP[63:32] are both zero, it is safe to use the > + * 32-bit restore to also restore the selectors. > + */ This comment is presumably applicable to the fxsave path? The functional content of the patch, Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
On 23/02/16 11:10, Andrew Cooper wrote: > On 23/02/16 11:05, David Vrabel wrote: >> @@ -653,6 +657,12 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, >> /* PV/PVH guests get an emulated PIT too for video BIOSes to use. */ >> pit_init(d, cpu_khz); >> >> + /* >> + * If the FPU not to save FCS/FDS then we can always save/restore > > "If the FPU does not" ? > >> @@ -284,7 +284,11 @@ void xsave(struct vcpu *v, uint64_t mask) >> return; >> } >> >> - if ( word_size > 0 && >> + /* >> + * If the FIP/FDP[63:32] are both zero, it is safe to use the >> + * 32-bit restore to also restore the selectors. >> + */ > > This comment is presumably applicable to the fxsave path? Yes. Do you want this comment added there? David
>>> On 23.02.16 at 12:05, <david.vrabel@citrix.com> wrote: > @@ -292,17 +296,17 @@ void xsave(struct vcpu *v, uint64_t mask) > asm volatile ( "fnstenv %0" : "=m" (fpu_env) ); > ptr->fpu_sse.fip.sel = fpu_env.fcs; > ptr->fpu_sse.fdp.sel = fpu_env.fds; > - word_size = 4; > + fip_width = 4; > } > + else > + fip_width = 8; > } > else > { > XSAVE(""); > - word_size = 4; > } > #undef XSAVE > - if ( word_size >= 0 ) > - ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = word_size; > + ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = fip_width; > } There's actually a pre-existing bug here that I think should get fixed at once: The 64-bit save path avoided to update ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] when FPU state didn't get saved (by setting word_size to -1), which continues to be the case due to patch 1's changes. The 32-bit code path violated this even before your change. I.e. the last else visible above should get extended to return when !(mask & XSTATE_FP). Jan
On 23/02/16 15:24, Jan Beulich wrote: >>>> On 23.02.16 at 12:05, <david.vrabel@citrix.com> wrote: >> @@ -292,17 +296,17 @@ void xsave(struct vcpu *v, uint64_t mask) >> asm volatile ( "fnstenv %0" : "=m" (fpu_env) ); >> ptr->fpu_sse.fip.sel = fpu_env.fcs; >> ptr->fpu_sse.fdp.sel = fpu_env.fds; >> - word_size = 4; >> + fip_width = 4; >> } >> + else >> + fip_width = 8; >> } >> else >> { >> XSAVE(""); >> - word_size = 4; >> } >> #undef XSAVE >> - if ( word_size >= 0 ) >> - ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = word_size; >> + ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = fip_width; >> } > > There's actually a pre-existing bug here that I think should get > fixed at once: The 64-bit save path avoided to update > ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] when FPU state > didn't get saved (by setting word_size to -1), which continues to be > the case due to patch 1's changes. The 32-bit code path violated > this even before your change. I.e. the last else visible above > should get extended to return when !(mask & XSTATE_FP). XSTATE_FP is always set in the mask. (see fpu_xsave()). David
>>> On 23.02.16 at 17:27, <david.vrabel@citrix.com> wrote: > On 23/02/16 15:24, Jan Beulich wrote: >>>>> On 23.02.16 at 12:05, <david.vrabel@citrix.com> wrote: >>> @@ -292,17 +296,17 @@ void xsave(struct vcpu *v, uint64_t mask) >>> asm volatile ( "fnstenv %0" : "=m" (fpu_env) ); >>> ptr->fpu_sse.fip.sel = fpu_env.fcs; >>> ptr->fpu_sse.fdp.sel = fpu_env.fds; >>> - word_size = 4; >>> + fip_width = 4; >>> } >>> + else >>> + fip_width = 8; >>> } >>> else >>> { >>> XSAVE(""); >>> - word_size = 4; >>> } >>> #undef XSAVE >>> - if ( word_size >= 0 ) >>> - ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = word_size; >>> + ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = fip_width; >>> } >> >> There's actually a pre-existing bug here that I think should get >> fixed at once: The 64-bit save path avoided to update >> ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] when FPU state >> didn't get saved (by setting word_size to -1), which continues to be >> the case due to patch 1's changes. The 32-bit code path violated >> this even before your change. I.e. the last else visible above >> should get extended to return when !(mask & XSTATE_FP). > > XSTATE_FP is always set in the mask. (see fpu_xsave()). fpu_xsave() sets the flag in XCR0, but not all values returned from vcpu_xsave_mask() have that flag set (and hence the "mask" parameter doesn't either). Jan
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c index 9d43f7b..e863bbd 100644 --- a/xen/arch/x86/domain.c +++ b/xen/arch/x86/domain.c @@ -343,6 +343,8 @@ int switch_native(struct domain *d) hvm_set_mode(v, 8); } + d->arch.x87_fip_width = cpu_has_fpu_sel ? 0 : 8; + return 0; } @@ -377,6 +379,8 @@ int switch_compat(struct domain *d) domain_set_alloc_bitsize(d); + d->arch.x87_fip_width = 4; + return 0; undo_and_fail: @@ -653,6 +657,12 @@ int arch_domain_create(struct domain *d, unsigned int domcr_flags, /* PV/PVH guests get an emulated PIT too for video BIOSes to use. */ pit_init(d, cpu_khz); + /* + * If the FPU not to save FCS/FDS then we can always save/restore + * the 64-bit FIP/FDP and ignore the selectors. + */ + d->arch.x87_fip_width = cpu_has_fpu_sel ? 0 : 8; + return 0; fail: diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c index 67016c9..ef08a52 100644 --- a/xen/arch/x86/i387.c +++ b/xen/arch/x86/i387.c @@ -144,9 +144,9 @@ static inline void fpu_xsave(struct vcpu *v) static inline void fpu_fxsave(struct vcpu *v) { typeof(v->arch.xsave_area->fpu_sse) *fpu_ctxt = v->arch.fpu_ctxt; - int word_size = cpu_has_fpu_sel ? 8 : 0; + unsigned int fip_width = v->domain->arch.x87_fip_width; - if ( !is_pv_32bit_vcpu(v) ) + if ( fip_width != 4 ) { /* * The only way to force fxsaveq on a wide range of gas versions. @@ -164,7 +164,7 @@ static inline void fpu_fxsave(struct vcpu *v) boot_cpu_data.x86_vendor == X86_VENDOR_AMD ) return; - if ( word_size > 0 && + if ( !fip_width && !((fpu_ctxt->fip.addr | fpu_ctxt->fdp.addr) >> 32) ) { struct ix87_env fpu_env; @@ -172,17 +172,18 @@ static inline void fpu_fxsave(struct vcpu *v) asm volatile ( "fnstenv %0" : "=m" (fpu_env) ); fpu_ctxt->fip.sel = fpu_env.fcs; fpu_ctxt->fdp.sel = fpu_env.fds; - word_size = 4; + fip_width = 4; } + else + fip_width = 8; } else { asm volatile ( "fxsave %0" : "=m" (*fpu_ctxt) ); - word_size = 4; + fip_width = 4; } - if ( word_size >= 0 ) - fpu_ctxt->x[FPU_WORD_SIZE_OFFSET] = word_size; + fpu_ctxt->x[FPU_WORD_SIZE_OFFSET] = fip_width; } /*******************************/ diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c index 0869789..0d2a3a4 100644 --- a/xen/arch/x86/xstate.c +++ b/xen/arch/x86/xstate.c @@ -249,7 +249,7 @@ void xsave(struct vcpu *v, uint64_t mask) struct xsave_struct *ptr = v->arch.xsave_area; uint32_t hmask = mask >> 32; uint32_t lmask = mask; - int word_size = mask & XSTATE_FP ? (cpu_has_fpu_sel ? 8 : 0) : -1; + unsigned int fip_width = v->domain->arch.x87_fip_width; #define XSAVE(pfx) \ alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \ ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \ @@ -261,7 +261,7 @@ void xsave(struct vcpu *v, uint64_t mask) "=m" (*ptr), \ "a" (lmask), "d" (hmask), "D" (ptr)) - if ( word_size <= 0 || !is_pv_32bit_vcpu(v) ) + if ( fip_width != 4 ) { uint64_t bad_fip; @@ -284,7 +284,11 @@ void xsave(struct vcpu *v, uint64_t mask) return; } - if ( word_size > 0 && + /* + * If the FIP/FDP[63:32] are both zero, it is safe to use the + * 32-bit restore to also restore the selectors. + */ + if ( !fip_width && !((ptr->fpu_sse.fip.addr | ptr->fpu_sse.fdp.addr) >> 32) ) { struct ix87_env fpu_env; @@ -292,17 +296,17 @@ void xsave(struct vcpu *v, uint64_t mask) asm volatile ( "fnstenv %0" : "=m" (fpu_env) ); ptr->fpu_sse.fip.sel = fpu_env.fcs; ptr->fpu_sse.fdp.sel = fpu_env.fds; - word_size = 4; + fip_width = 4; } + else + fip_width = 8; } else { XSAVE(""); - word_size = 4; } #undef XSAVE - if ( word_size >= 0 ) - ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = word_size; + ptr->fpu_sse.x[FPU_WORD_SIZE_OFFSET] = fip_width; } void xrstor(struct vcpu *v, uint64_t mask) diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h index 4fad638..7135709 100644 --- a/xen/include/asm-x86/domain.h +++ b/xen/include/asm-x86/domain.h @@ -339,6 +339,21 @@ struct arch_domain u8 x86_vendor; /* CPU vendor */ u8 x86_model; /* CPU model */ + /* + * The width of the FIP/FDP register in the FPU that needs to be + * saved/restored during a context switch. This is needed because + * the FPU can either: a) restore the 64-bit FIP/FDP and clear FCS + * and FDS; or b) restore the 32-bit FIP/FDP (clearing the upper + * 32-bits of FIP/FDP) and restore FCS/FDS. + * + * Which one is needed depends on the guest. + * + * This can be either: 8, 4 or 0. 0 means auto-detect the size + * based on the width of FIP/FDP values that are written by the + * guest. + */ + uint8_t x87_fip_width; + cpuid_input_t *cpuids; struct PITState vpit;
The x86 architecture allows either: a) the 64-bit FIP/FDP registers to be restored (clearing FCS and FDS); or b) the 32-bit FIP/FDP and FCS/FDS registers to be restored (clearing the upper 32-bits). Add a per-domain field to indicate which of these options a guest needs. The options are: 8, 4 or 0. Where 0 indicates that the hypervisor should automatically guess the FIP width by checking the value of FIP/FDP when saving the state (this is the existing behaviour). The FIP width is initially automatic but is set explicitly in the following cases: - 32-bit PV guest: 4 - Newer CPUs that do not save FCS/FDS: 8 The x87_fip_width field is placed into an existing 1 byte hole in struct arch_domain. Signed-off-by: David Vrabel <david.vrabel@citrix.com> --- v2: - Retain logic to detect if XSAVE* writes FIP/FDP. - Keep 64-bit PV guests in auto-mode. --- xen/arch/x86/domain.c | 10 ++++++++++ xen/arch/x86/i387.c | 15 ++++++++------- xen/arch/x86/xstate.c | 18 +++++++++++------- xen/include/asm-x86/domain.h | 15 +++++++++++++++ 4 files changed, 44 insertions(+), 14 deletions(-)