Message ID | 1457089255-20424-1-git-send-email-shuai.ruan@linux.intel.com (mailing list archive) |
---|---|
State | New, archived |
>>> On 04.03.16 at 12:00, <shuai.ruan@linux.intel.com> wrote:
> --- a/xen/arch/x86/domctl.c
> +++ b/xen/arch/x86/domctl.c
> @@ -934,8 +934,14 @@ long arch_do_domctl(
>                  goto vcpuextstate_out;
>              }
> 
> -            expand_xsave_states(v, xsave_area,
> -                                size - 2 * sizeof(uint64_t));
> +            ret = expand_xsave_states(v, xsave_area,
> +                                      size - 2 * sizeof(uint64_t));
> +            if ( ret )
> +            {
> +                xfree(xsave_area);
> +                vcpu_unpause(v);
> +                goto vcpuextstate_out;
> +            }

Well, while this is one way to deal with the problem, it's certainly
not the most desirable one: We should try to avoid runtime
allocations, failures of which then cause other things to fail (in
perhaps not very graceful ways). And doing so is pretty simple
here, and you even have two options: Either allocate a per-CPU
array, or - considering that XCNTXT_MASK has only a limited
number of bits set - even use an on-stack array of suitable
(compile time determined from XCNTXT_MASK) size. If you
want to save space, this could presumably even be uint16_t[], in
which case I think it would further be okay if all 63 possible
features were known (totaling to a variable size of 126 bytes).

This is even more so considering that you forgot to adjust hvm.c
in a similar manner.

> @@ -111,21 +111,27 @@ static int setup_xstate_features(bool_t bsp)
>      for ( leaf = 2; leaf < xstate_features; leaf++ )
>      {
>          if ( bsp )
> +        {
>              cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
> -                        &xstate_offsets[leaf], &tmp, &tmp);
> +                        &xstate_offsets[leaf], &ecx, &edx);
> +            if ( ecx & XSTATE_ALIGN64 )
> +                set_bit(leaf, &xstate_align);

__set_bit()

> @@ -134,22 +140,26 @@ static void __init setup_xstate_comp(void)
>       * in the fixed offsets in the xsave area in either compacted form
>       * or standard form.
>       */
> -    xstate_comp_offsets[0] = 0;
>      xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
> 
>      xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> 
>      for ( i = 3; i < xstate_features; i++ )
>      {
> -        xstate_comp_offsets[i] = xstate_comp_offsets[i - 1] +
> -                                 (((1ul << i) & xfeature_mask)
> -                                  ? xstate_sizes[i - 1] : 0);
> -        ASSERT(xstate_comp_offsets[i] + xstate_sizes[i] <= xsave_cntxt_size);
> +        if ( (1ul << i) & xcomp_bv )
> +        {
> +            xstate_comp_offsets[i] = test_bit(i, &xstate_align) ?
> +                                     ROUNDUP(xstate_comp_offsets[i - 1], 64) :
> +                                     xstate_comp_offsets[i -1];
> +            xstate_comp_offsets[i] += xstate_sizes[i - 1];
> +            ASSERT(xstate_comp_offsets[i] + xstate_sizes[i] <= xsave_cntxt_size);
> +        }

This now seems wrong for the case when there are discontiguous bits
in xcomp_bv, as you leave zeros in the array (which by itself is fine,
but confuses the subsequent iteration). I think you'd benefit from
having a local variable holding the offset you're currently at:

    offset = xstate_comp_offsets[2];
    for ( i = 2; i < xstate_features; i++ )
    {
        if ( (1ul << i) & xcomp_bv )
        {
            if ( test_bit(i, &xstate_align) )
                offset = ROUNDUP(offset, 64);
            xstate_comp_offsets[i] = offset;
            offset += xstate_sizes[i];
            ASSERT(offset <= xsave_cntxt_size);
        }
    }

or something along those lines.
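To see the failure mode described here, suppose xcomp_bv has bits 2 and 5 set but not 3 and 4: the patch's loop leaves xstate_comp_offsets[3] and [4] at zero, so the offset for feature 5 is then derived from that stale zero. Below is a small standalone sketch of the running-offset approach suggested above; the component sizes, alignment flags, mask and starting offset are invented purely for illustration and are not taken from the patch.

    #include <stdint.h>
    #include <stdio.h>

    #define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))

    int main(void)
    {
        /* Hypothetical per-component sizes and 64-byte-alignment flags for
         * features 2..7; the real values come from CPUID leaf 0xD. */
        const unsigned int size[8]  = { 0, 0, 256, 64, 64, 40, 1024, 128 };
        const unsigned int align[8] = { 0, 0, 0,   0,  0,  0,  0,    1 };
        uint64_t xcomp_bv = (1u << 2) | (1u << 5) | (1u << 7); /* discontiguous bits */
        unsigned int comp_off[8] = { 0 };
        unsigned int offset = 512 + 64;   /* FXSAVE_SIZE + XSAVE_HDR_SIZE */
        unsigned int i;

        for ( i = 2; i < 8; i++ )
        {
            if ( !((1ul << i) & xcomp_bv) )
                continue;             /* absent components consume no space */
            if ( align[i] )
                offset = ROUNDUP(offset, 64);
            comp_off[i] = offset;     /* running offset is unaffected by the gaps */
            offset += size[i];
            printf("feature %u at offset %u\n", i, comp_off[i]);
        }
        return 0;
    }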
> static void *get_xsave_addr(struct xsave_struct *xsave,
> -                            unsigned int xfeature_idx)
> +                            unsigned int *xstate_comp_offsets, const

> @@ -94,8 +96,8 @@ bool_t xsave_enabled(const struct vcpu *v);
>  int __must_check validate_xstate(u64 xcr0, u64 xcr0_accum,
>                                   const struct xsave_hdr *);
>  int __must_check handle_xsetbv(u32 index, u64 new_bv);
> -void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size);
> -void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size);
> +int expand_xsave_states(struct vcpu *v, void *dest, unsigned int size);
> +int compress_xsave_states(struct vcpu *v, const void *src, unsigned int size);

While with the above I assume these are going to remain what they have been, a piece of general advice: if you convert a function from having a void return type to something else, and that something else can indicate an error, use __must_check so the compiler can tell you whether you've forgotten to update some caller.

Jan
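A tiny self-contained illustration of that advice (the __must_check macro below is a local stand-in; in Xen the equivalent comes from the common headers and wraps the same warn_unused_result attribute):

    #include <stdio.h>

    #define __must_check __attribute__((__warn_unused_result__))

    static int __must_check might_fail(void)
    {
        return -1;
    }

    int main(void)
    {
        might_fail();        /* compiler warns: return value ignored */

        if ( might_fail() )  /* checked call: no warning */
            fprintf(stderr, "handle the error here\n");

        return 0;
    }

Had the v3 patch marked the new int-returning expand_xsave_states()/compress_xsave_states() this way, any caller left unconverted (for example the hvm.c call sites mentioned above) would have been flagged at build time.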
On Fri, Mar 04, 2016 at 06:56:35AM -0700, Jan Beulich wrote:
> >>> On 04.03.16 at 12:00, <shuai.ruan@linux.intel.com> wrote:
> > --- a/xen/arch/x86/domctl.c
> > +++ b/xen/arch/x86/domctl.c
> > @@ -934,8 +934,14 @@ long arch_do_domctl(
> >                  goto vcpuextstate_out;
> >              }
> > 
> > -            expand_xsave_states(v, xsave_area,
> > -                                size - 2 * sizeof(uint64_t));
> > +            ret = expand_xsave_states(v, xsave_area,
> > +                                      size - 2 * sizeof(uint64_t));
> > +            if ( ret )
> > +            {
> > +                xfree(xsave_area);
> > +                vcpu_unpause(v);
> > +                goto vcpuextstate_out;
> > +            }
> 
> Well, while this is one way to deal with the problem, it's certainly
> not the most desirable one: We should try to avoid runtime
> allocations, failures of which then cause other things to fail (in
> perhaps not very graceful ways). And doing so is pretty simple
> here, and you even have two options: Either allocate a per-CPU
> array, or - considering that XCNTXT_MASK has only a limited
> number of bits set - even use an on-stack array of suitable
> (compile time determined from XCNTXT_MASK) size. If you

Thanks.
I will change it to an on-stack array.
For "size compile time determined from XCNTXT_MASK", hweight64(XCNTXT_MASK)
can return the number of bits set. But we need to calculate the highest bit
set in XCNTXT_MASK at compile time; is there any macro that can be used here?
>>> On 07.03.16 at 11:19, <shuai.ruan@linux.intel.com> wrote:
> On Fri, Mar 04, 2016 at 06:56:35AM -0700, Jan Beulich wrote:
>> >>> On 04.03.16 at 12:00, <shuai.ruan@linux.intel.com> wrote:
>> > --- a/xen/arch/x86/domctl.c
>> > +++ b/xen/arch/x86/domctl.c
>> > @@ -934,8 +934,14 @@ long arch_do_domctl(
>> >                  goto vcpuextstate_out;
>> >              }
>> > 
>> > -            expand_xsave_states(v, xsave_area,
>> > -                                size - 2 * sizeof(uint64_t));
>> > +            ret = expand_xsave_states(v, xsave_area,
>> > +                                      size - 2 * sizeof(uint64_t));
>> > +            if ( ret )
>> > +            {
>> > +                xfree(xsave_area);
>> > +                vcpu_unpause(v);
>> > +                goto vcpuextstate_out;
>> > +            }
>> 
>> Well, while this is one way to deal with the problem, it's certainly
>> not the most desirable one: We should try to avoid runtime
>> allocations, failures of which then cause other things to fail (in
>> perhaps not very graceful ways). And doing so is pretty simple
>> here, and you even have two options: Either allocate a per-CPU
>> array, or - considering that XCNTXT_MASK has only a limited
>> number of bits set - even use an on-stack array of suitable
>> (compile time determined from XCNTXT_MASK) size. If you
> Thanks.
> I will change it to an on-stack array.
> For "size compile time determined from XCNTXT_MASK", hweight64(XCNTXT_MASK)
> can return the number of bits set. But we need to calculate the highest bit
> set in XCNTXT_MASK at compile time; is there any macro that can be used here?

You may want to pull in Linux'es ilog2().

Jan
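For the compile-time sizing question, a minimal standalone sketch follows; ILOG2_U64 and EXAMPLE_MASK are invented stand-ins for an ilog2()-style helper and for XCNTXT_MASK, purely to show the idea of sizing the on-stack array from the mask at compile time.

    #include <stdint.h>
    #include <stdio.h>

    /* Stand-in for an ilog2()-style helper on constant 64-bit masks:
     * index of the highest set bit. */
    #define ILOG2_U64(x) (63 - __builtin_clzll(x))

    /* Hypothetical feature mask standing in for XCNTXT_MASK. */
    #define EXAMPLE_MASK 0x2ffULL

    int main(void)
    {
        /* One slot per possible component index 0 .. highest set bit, so the
         * expand/compress paths need no runtime allocation that could fail. */
        uint16_t comp_offsets[ILOG2_U64(EXAMPLE_MASK) + 1];
        unsigned int i;

        for ( i = 0; i < sizeof(comp_offsets) / sizeof(comp_offsets[0]); i++ )
            comp_offsets[i] = 0;

        printf("%zu entries, %zu bytes on the stack\n",
               sizeof(comp_offsets) / sizeof(comp_offsets[0]),
               sizeof(comp_offsets));
        return 0;
    }

hweight64() gives the number of set bits, but for an array indexed directly by feature number it is the index of the highest set bit, plus one, that matters - which is what an ilog2()-style constant expression provides.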
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 55aecdc..3f891b4 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -934,8 +934,14 @@ long arch_do_domctl(
                 goto vcpuextstate_out;
             }
 
-            expand_xsave_states(v, xsave_area,
-                                size - 2 * sizeof(uint64_t));
+            ret = expand_xsave_states(v, xsave_area,
+                                      size - 2 * sizeof(uint64_t));
+            if ( ret )
+            {
+                xfree(xsave_area);
+                vcpu_unpause(v);
+                goto vcpuextstate_out;
+            }
 
             if ( copy_to_guest_offset(evc->buffer, offset, xsave_area,
                                       size - 2 * sizeof(uint64_t)) )
@@ -1000,8 +1006,15 @@ long arch_do_domctl(
                 v->arch.xcr0_accum = _xcr0_accum;
                 if ( _xcr0_accum & XSTATE_NONLAZY )
                     v->arch.nonlazy_xstate_used = 1;
-                compress_xsave_states(v, _xsave_area,
-                                      evc->size - 2 * sizeof(uint64_t));
+                ret = compress_xsave_states(v, _xsave_area,
+                                            evc->size - 2 * sizeof(uint64_t));
+                if ( ret )
+                {
+                    xfree(receive_buf);
+                    vcpu_unpause(v);
+                    goto vcpuextstate_out;
+                }
+
                 vcpu_unpause(v);
             }
             else
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 8316bd9..d25f045 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -26,8 +26,8 @@ u64 __read_mostly xfeature_mask;
 
 static unsigned int *__read_mostly xstate_offsets;
 unsigned int *__read_mostly xstate_sizes;
+static u64 __read_mostly xstate_align;
 static unsigned int __read_mostly xstate_features;
-static unsigned int __read_mostly xstate_comp_offsets[sizeof(xfeature_mask)*8];
 
 static uint32_t __read_mostly mxcsr_mask = 0x0000ffbf;
 
@@ -94,7 +94,7 @@ static bool_t xsave_area_compressed(const struct xsave_struct *xsave_area)
 
 static int setup_xstate_features(bool_t bsp)
 {
-    unsigned int leaf, tmp, eax, ebx;
+    unsigned int leaf, eax, ebx, ecx, edx;
 
     if ( bsp )
     {
@@ -111,21 +111,27 @@ static int setup_xstate_features(bool_t bsp)
     for ( leaf = 2; leaf < xstate_features; leaf++ )
     {
         if ( bsp )
+        {
             cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
-                        &xstate_offsets[leaf], &tmp, &tmp);
+                        &xstate_offsets[leaf], &ecx, &edx);
+            if ( ecx & XSTATE_ALIGN64 )
+                set_bit(leaf, &xstate_align);
+        }
         else
         {
             cpuid_count(XSTATE_CPUID, leaf, &eax,
-                        &ebx, &tmp, &tmp);
+                        &ebx, &ecx, &edx);
 
             BUG_ON(eax != xstate_sizes[leaf]);
             BUG_ON(ebx != xstate_offsets[leaf]);
+            BUG_ON((ecx & XSTATE_ALIGN64) != test_bit(leaf, &xstate_align));
         }
     }
 
     return 0;
 }
 
-static void __init setup_xstate_comp(void)
+static void setup_xstate_comp(unsigned int *xstate_comp_offsets,
+                              const u64 xcomp_bv)
 {
     unsigned int i;
 
@@ -134,22 +140,26 @@ static void __init setup_xstate_comp(void)
      * in the fixed offsets in the xsave area in either compacted form
      * or standard form.
      */
-    xstate_comp_offsets[0] = 0;
     xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
 
     xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
 
     for ( i = 3; i < xstate_features; i++ )
     {
-        xstate_comp_offsets[i] = xstate_comp_offsets[i - 1] +
-                                 (((1ul << i) & xfeature_mask)
-                                  ? xstate_sizes[i - 1] : 0);
-        ASSERT(xstate_comp_offsets[i] + xstate_sizes[i] <= xsave_cntxt_size);
+        if ( (1ul << i) & xcomp_bv )
+        {
+            xstate_comp_offsets[i] = test_bit(i, &xstate_align) ?
+                                     ROUNDUP(xstate_comp_offsets[i - 1], 64) :
+                                     xstate_comp_offsets[i -1];
+            xstate_comp_offsets[i] += xstate_sizes[i - 1];
+            ASSERT(xstate_comp_offsets[i] + xstate_sizes[i] <= xsave_cntxt_size);
+        }
     }
 }
 
 static void *get_xsave_addr(struct xsave_struct *xsave,
-                            unsigned int xfeature_idx)
+                            unsigned int *xstate_comp_offsets,
+                            unsigned int xfeature_idx)
 {
     if ( !((1ul << xfeature_idx) & xsave->xsave_hdr.xstate_bv) )
         return NULL;
@@ -159,19 +169,26 @@ static void *get_xsave_addr(struct xsave_struct *xsave,
             : xstate_offsets)[xfeature_idx];
 }
 
-void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
+int expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
 {
     struct xsave_struct *xsave = v->arch.xsave_area;
+    unsigned int *xstate_comp_offsets;
     u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
     u64 valid;
 
     if ( !cpu_has_xsaves && !cpu_has_xsavec )
     {
         memcpy(dest, xsave, size);
-        return;
+        return 0;
     }
 
+    xstate_comp_offsets = xzalloc_array(unsigned int, xstate_features);
+    if ( !xstate_comp_offsets )
+        return -ENOMEM;
+
     ASSERT(xsave_area_compressed(xsave));
+    setup_xstate_comp(xstate_comp_offsets, xsave->xsave_hdr.xcomp_bv);
+
     /*
      * Copy legacy XSAVE area and XSAVE hdr area.
      */
@@ -188,7 +205,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
     {
         u64 feature = valid & -valid;
         unsigned int index = fls(feature) - 1;
-        const void *src = get_xsave_addr(xsave, index);
+        const void *src = get_xsave_addr(xsave, xstate_comp_offsets, index);
 
         if ( src )
         {
@@ -198,20 +215,28 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
 
         valid &= ~feature;
     }
+
+    xfree(xstate_comp_offsets);
+    return 0;
 }
 
-void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
+int compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
 {
     struct xsave_struct *xsave = v->arch.xsave_area;
+    unsigned int *xstate_comp_offsets;
     u64 xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv;
     u64 valid;
 
     if ( !cpu_has_xsaves && !cpu_has_xsavec )
     {
         memcpy(xsave, src, size);
-        return;
+        return 0;
     }
 
+    xstate_comp_offsets = xzalloc_array(unsigned int, xstate_features);
+    if ( !xstate_comp_offsets )
+        return -ENOMEM;
+
     ASSERT(!xsave_area_compressed(src));
     /*
      * Copy legacy XSAVE area, to avoid complications with CPUID
@@ -223,6 +248,8 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
     xsave->xsave_hdr.xstate_bv = xstate_bv;
     xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum | XSTATE_COMPACTION_ENABLED;
 
+    setup_xstate_comp(xstate_comp_offsets, xsave->xsave_hdr.xcomp_bv);
+
     /*
      * Copy each region from the non-compacted offset to the
      * possibly compacted offset.
@@ -232,7 +259,7 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
     {
         u64 feature = valid & -valid;
         unsigned int index = fls(feature) - 1;
-        void *dest = get_xsave_addr(xsave, index);
+        void *dest = get_xsave_addr(xsave, xstate_comp_offsets, index);
 
         if ( dest )
         {
@@ -242,6 +269,9 @@ void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
 
         valid &= ~feature;
     }
+
+    xfree(xstate_comp_offsets);
+    return 0;
 }
 
 void xsave(struct vcpu *v, uint64_t mask)
@@ -575,8 +605,6 @@ void xstate_init(struct cpuinfo_x86 *c)
 
     if ( setup_xstate_features(bsp) && bsp )
         BUG();
-    if ( bsp && (cpu_has_xsaves || cpu_has_xsavec) )
-        setup_xstate_comp();
 }
 
 static bool_t valid_xcr0(u64 xcr0)
diff --git a/xen/include/asm-x86/xstate.h b/xen/include/asm-x86/xstate.h
index c28cea5..782f9bf 100644
--- a/xen/include/asm-x86/xstate.h
+++ b/xen/include/asm-x86/xstate.h
@@ -46,6 +46,8 @@
 #define XSTATE_LAZY      (XSTATE_ALL & ~XSTATE_NONLAZY)
 #define XSTATE_COMPACTION_ENABLED  (1ULL << 63)
 
+#define XSTATE_ALIGN64 (1U << 1)
+
 extern u64 xfeature_mask;
 extern unsigned int *xstate_sizes;
 
@@ -94,8 +96,8 @@ bool_t xsave_enabled(const struct vcpu *v);
 int __must_check validate_xstate(u64 xcr0, u64 xcr0_accum,
                                  const struct xsave_hdr *);
 int __must_check handle_xsetbv(u32 index, u64 new_bv);
-void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size);
-void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size);
+int expand_xsave_states(struct vcpu *v, void *dest, unsigned int size);
+int compress_xsave_states(struct vcpu *v, const void *src, unsigned int size);
 
 /* extended state init and cleanup functions */
 void xstate_free_save_area(struct vcpu *v);
The previous patch used all available features to calculate xstate_comp_offsets, which is wrong. This patch fixes the bug by calculating xstate_comp_offsets based on the xcomp_bv of the current guest. The calculation also has to take the alignment of components into consideration.

Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
---
V3: Address comments from Jan:
1. Fix xstate_comp_offsets being used as a static array.
2. Change xstate_align from an array to a u64 used as a bitmap.
3. Split the xstate_comp_offsets calculation into three steps:
   1) check whether the component is set in the xsave area,
   2) check whether the component needs alignment,
   3) add xstate_sizes[i - 1].

V2: Address comments from Jan:
1. Code style fixes.
2. setup_xstate_comp takes xcomp_bv as a parameter.

 xen/arch/x86/domctl.c        | 21 +++++++++++---
 xen/arch/x86/xstate.c        | 66 +++++++++++++++++++++++++++++++-------------
 xen/include/asm-x86/xstate.h |  6 ++--
 3 files changed, 68 insertions(+), 25 deletions(-)