diff mbox

[V4] x86/xsaves: fix overwriting between non-lazy/lazy xsaves

Message ID 1457598165-10393-1-git-send-email-shuai.ruan@linux.intel.com (mailing list archive)
State New, archived
Headers show

Commit Message

Shuai Ruan March 10, 2016, 8:22 a.m. UTC
The offset at which components xsaved by xsave[sc] are not fixed.
So when when a save with v->fpu_dirtied set is followed by one
with v->fpu_dirtied clear, non-lazy xsave[sc] may overwriting data
written by the lazy one.

The solution is when using_xsave_compact is enabled and taking xcr0_accum into
consideration, if guest has ever used XSTATE_LAZY & ~XSTATE_FP_SSE
(XSTATE_FP_SSE will be excluded beacause xsave will write XSTATE_FP_SSE
part in legacy region of xsave area which is fixed, saving XSTATE_FS_SSE
will not cause overwriting problem), vcpu_xsave_mask will return XSTATE_ALL.
Otherwise vcpu_xsave_mask will return XSTATE_NONLAZY.

This may cause overhead save on lazy states which will cause performance
impact. After doing some performance tests on xsavec and xsaveopt
(suggested by jan), the results show xsaveopt performs better than xsavec.
So hypervisor will not use xsavec anymore.

As xsaves will be used until suprevised state is instroduced in hypervisor.
So in this patch add a new synthetic CPU feature X86_FEATURE_XSAVE_COMPACT
which indicates whether hypervisor uses compact xsave area or not.
Code related with hypervisor xsave will depend on this feature.

Signed-off-by: Shuai Ruan <shuai.ruan@linux.intel.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
---
v4: Address comments from Jan
1. Add synthetic CPU feature X86_FEATURE_XSAVE_COMPACT and use this feature 
   indicate whether hypervisor use compact xsave area or not.
2. Fix type/grammer errors of the comment in vcpu_xsave_mask.

v3: Address comments from Jan
1. Add xsavc clean up code and disable xsaves.
2. Add comment on why certain mask should be return in vcpu_xsave_mask.

v2: Address comments from Jan
add performance impact and next step to do in the description.

 xen/arch/x86/domain.c            |  2 +-
 xen/arch/x86/domctl.c            |  2 +-
 xen/arch/x86/hvm/hvm.c           |  4 ++--
 xen/arch/x86/i387.c              | 18 ++++++++++++++++--
 xen/arch/x86/xstate.c            | 22 ++++++++++------------
 xen/include/asm-x86/cpufeature.h |  2 ++
 6 files changed, 32 insertions(+), 18 deletions(-)

Comments

Jan Beulich March 10, 2016, 9:30 a.m. UTC | #1
>>> On 10.03.16 at 09:22, <shuai.ruan@linux.intel.com> wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -948,7 +948,7 @@ int arch_set_info_guest(
>          fpu_sse->fcw = FCW_DEFAULT;
>          fpu_sse->mxcsr = MXCSR_DEFAULT;
>      }
> -    if ( cpu_has_xsaves )
> +    if ( using_xsave_compact )
>      {
>          ASSERT(v->arch.xsave_area);
>          v->arch.xsave_area->xsave_hdr.xcomp_bv = XSTATE_COMPACTION_ENABLED |

With the common pattern being that cpu_has_xsaves ||
cpu_has_xsavec gets replaced by using_xsave_compact this
looks wrong - imo it needs to be using_xsave_compact &&
cpu_has_xsaves, since when using plain XRSTOR we don't
need to set the compaction bit. The same would apply
elsewhere in the patch.

> --- a/xen/arch/x86/i387.c
> +++ b/xen/arch/x86/i387.c
> @@ -118,7 +118,21 @@ static inline uint64_t vcpu_xsave_mask(const struct vcpu *v)
>      if ( v->fpu_dirtied )
>          return v->arch.nonlazy_xstate_used ? XSTATE_ALL : XSTATE_LAZY;
>  
> -    return v->arch.nonlazy_xstate_used ? XSTATE_NONLAZY : 0;

Considering this being deleted, wouldn't you better add
ASSERT(v->arch.nonlazy_xstate_used) in its stead, making
more obvious that (taking into account the other return path
above) in the using_xsave_compact case both paths will
produce XSTATE_ALL?

> +    /*
> +     * The offsets of components which live in the extended region of
> +     * compact xsave area are not fixed. Xsave area may be overwritten
> +     * when v->fpu_dirtied set is followed by one with v->fpu_dirtied
> +     * clear.

I think this sentence still lack a few words, e.g. "... when a save
with v->fpu_dirtied set is followed ..."

> +     * In such case, if hypervisor uses compact xsave area and guest
> +     * has ever used lazy states (checking xcr0_accum also excluding

I'm not sure about the "also" here. Perhaps just drop it? Or replace
it by "yet"? A native speaker's input would be appreciated.

> +     * XSTATE_FP_SSE), vcpu_xsave_mask will return XSTATE_ALL. Otherwise
> +     * return XSTATE_NONLAZY.
> +     * XSTATE_FP_SSE may be excluded, because the offsets of XSTATE_FP_SSE
> +     * (in the legacy region of xsave area) are fixed, so saving
> +     * XSTATE_FS_SSE will not cause overwriting problem.

s/FS/FP/

> +     */
> +    return using_xsave_compact && (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE) ?
> +           XSTATE_ALL : XSTATE_NONLAZY;

Long line and indentation.

> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -165,7 +165,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>      u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
>      u64 valid;
>  
> -    if ( !cpu_has_xsaves && !cpu_has_xsavec )
> +    if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
>      {
>          memcpy(dest, xsave, size);
>          return;

This one looks correct, but ...

> @@ -206,7 +206,7 @@ void compress_xsave_states(struct vcpu *v, const void 
> *src, unsigned int size)
>      u64 xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv;
>      u64 valid;
>  
> -    if ( !cpu_has_xsaves && !cpu_has_xsavec )
> +    if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
>      {
>          memcpy(xsave, src, size);
>          return;

... how can this one be? You are in the process of compressing
known uncompressed input.

> @@ -370,7 +368,7 @@ void xrstor(struct vcpu *v, uint64_t mask)
>                         "   .previous\n" \
>                         _ASM_EXTABLE(1b, 2b), \
>                         ".byte " pfx "0x0f,0xc7,0x1f\n", \
> -                       X86_FEATURE_XSAVES, \
> +                       X86_FEATURE_XSAVE_COMPACT, \
>                         ASM_OUTPUT2([mem] "+m" (*ptr), [faults] "+g" (faults)), \
>                         [lmask] "a" (lmask), [hmask] "d" (hmask), \
>                         [ptr] "D" (ptr))

I don't think you can stick to alternative patching here - whether
to use XRSTORS now depends on what state is to be restored.

Or maybe (to amend the first comment above)
"using_xsave_compact" is actually the wrong term now, and this
really needs to become "using_xsaves" (in which case the change
suggested in that first comment wouldn't be needed anymore). In
the end, at least the code outside of xstate.c should be in a state
where xstate.c's choice of whether to use XSAVEC doesn't matter
(and ideally this would also extend to all code in that file except
for the relevant parts of xsave()).

Jan
Shuai Ruan March 11, 2016, 6:45 a.m. UTC | #2
On Thu, Mar 10, 2016 at 02:30:34AM -0700, Jan Beulich wrote:
> I'm not sure about the "also" here. Perhaps just drop it? Or replace
> it by "yet"? A native speaker's input would be appreciated.
> 
Thanks. I will drop it . 
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -165,7 +165,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
> >      u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
> >      u64 valid;
> >  
> > -    if ( !cpu_has_xsaves && !cpu_has_xsavec )
> > +    if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
> >      {
> >          memcpy(dest, xsave, size);
> >          return;
> 
> This one looks correct, but ...
> 
> > @@ -206,7 +206,7 @@ void compress_xsave_states(struct vcpu *v, const void 
> > *src, unsigned int size)
> >      u64 xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv;
> >      u64 valid;
> >  
> > -    if ( !cpu_has_xsaves && !cpu_has_xsavec )
> > +    if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
> >      {
> >          memcpy(xsave, src, size);
> >          return;
> 
> ... how can this one be? You are in the process of compressing
> known uncompressed input.
I think this one is corret, here this check means whether we use
xsaves in xen or not (actually when we use xsaves in xen 
xsave->xsave_hdr.xcomp_bv will set XSTATE_COMPACTION_ENABLED).
For more clearly, I can add 
if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) &&
     !xsave_area_compressed(src) )
But I do think !xsave_area_compressed(src) is useless. 
There is a "ASSERT(!xsave_area_compressed(src))" follow "if ()".
> 
> > @@ -370,7 +368,7 @@ void xrstor(struct vcpu *v, uint64_t mask)
> >                         "   .previous\n" \
> >                         _ASM_EXTABLE(1b, 2b), \
> >                         ".byte " pfx "0x0f,0xc7,0x1f\n", \
> > -                       X86_FEATURE_XSAVES, \
> > +                       X86_FEATURE_XSAVE_COMPACT, \
> >                         ASM_OUTPUT2([mem] "+m" (*ptr), [faults] "+g" (faults)), \
> >                         [lmask] "a" (lmask), [hmask] "d" (hmask), \
> >                         [ptr] "D" (ptr))
> 
> I don't think you can stick to alternative patching here - whether
> to use XRSTORS now depends on what state is to be restored.
> 
X86_FEATURE_XSAVE_COMPACT is confusing. I will change
X86_FEATURE_XSAVE_COMPACT -> X86_FEATURE_USE_XSAVES
Then, XRSTORS in the alternative patch can depend on 
X86_FEATURE_USE_XSAVES. 
> Or maybe (to amend the first comment above)
> "using_xsave_compact" is actually the wrong term now, and this
> really needs to become "using_xsaves" (in which case the change
> suggested in that first comment wouldn't be needed anymore). In
The term using_xsave_compact is confusing(actually here using_xsave_compact
means using_xsaves). Will change using_xsave_compact -> using_xsaves.
> the end, at least the code outside of xstate.c should be in a state
> where xstate.c's choice of whether to use XSAVEC doesn't matter
XSAVEC? 
Oh, I now realise that I simply drop xsavec support code is
too much of a step backwards(what you want here is using a synthetic CPU 
feature X86_FEATRUE_USE_XSAVEC and using_xsavec to disable xsavec like
xsaves, code depend on cpu_has_xsavec will depend on useing_xsavec).
The code should be ok even if we use xsavc in xen.  
Is that what you mean ?
> (and ideally this would also extend to all code in that file except
> for the relevant parts of xsave()).
If I understand you clearly (my comments above is right), I think we can
also add xsavec aternative patching depend on X86_FEATRUE_USE_XSAVEC 
in xsave(), and just keep X86_FEATRUE_USE_XSAVEC clear in x86_cap.

> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
Jan Beulich March 11, 2016, 10:16 a.m. UTC | #3
>>> On 11.03.16 at 07:45, <shuai.ruan@linux.intel.com> wrote:
> On Thu, Mar 10, 2016 at 02:30:34AM -0700, Jan Beulich wrote:
>> > --- a/xen/arch/x86/xstate.c
>> > +++ b/xen/arch/x86/xstate.c
>> > @@ -165,7 +165,7 @@ void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
>> >      u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
>> >      u64 valid;
>> >  
>> > -    if ( !cpu_has_xsaves && !cpu_has_xsavec )
>> > +    if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
>> >      {
>> >          memcpy(dest, xsave, size);
>> >          return;
>> 
>> This one looks correct, but ...
>> 
>> > @@ -206,7 +206,7 @@ void compress_xsave_states(struct vcpu *v, const void 
>> > *src, unsigned int size)
>> >      u64 xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv;
>> >      u64 valid;
>> >  
>> > -    if ( !cpu_has_xsaves && !cpu_has_xsavec )
>> > +    if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
>> >      {
>> >          memcpy(xsave, src, size);
>> >          return;
>> 
>> ... how can this one be? You are in the process of compressing
>> known uncompressed input.
> I think this one is corret, here this check means whether we use
> xsaves in xen or not (actually when we use xsaves in xen 
> xsave->xsave_hdr.xcomp_bv will set XSTATE_COMPACTION_ENABLED).
> For more clearly, I can add 
> if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) &&
>      !xsave_area_compressed(src) )

Hmm, my reply indeed was slightly wrong, as you're not looking at
the incoming guest data here. But looking at the format the data
is currently in doesn't seem very reasonable either (even if maybe
it is correct). Hence, together with this ...

> But I do think !xsave_area_compressed(src) is useless. 
> There is a "ASSERT(!xsave_area_compressed(src))" follow "if ()".

... I think you should use !xsave_area_compressed(src) and delete
the ASSERT().

>> > @@ -370,7 +368,7 @@ void xrstor(struct vcpu *v, uint64_t mask)
>> >                         "   .previous\n" \
>> >                         _ASM_EXTABLE(1b, 2b), \
>> >                         ".byte " pfx "0x0f,0xc7,0x1f\n", \
>> > -                       X86_FEATURE_XSAVES, \
>> > +                       X86_FEATURE_XSAVE_COMPACT, \
>> >                         ASM_OUTPUT2([mem] "+m" (*ptr), [faults] "+g" (faults)), \
>> >                         [lmask] "a" (lmask), [hmask] "d" (hmask), \
>> >                         [ptr] "D" (ptr))
>> 
>> I don't think you can stick to alternative patching here - whether
>> to use XRSTORS now depends on what state is to be restored.
>> 
> X86_FEATURE_XSAVE_COMPACT is confusing. I will change
> X86_FEATURE_XSAVE_COMPACT -> X86_FEATURE_USE_XSAVES
> Then, XRSTORS in the alternative patch can depend on 
> X86_FEATURE_USE_XSAVES. 

I don't think this is what we want. In no case is this what I have
been asking for (which also applies to the remainder of your reply).
Just to re-iterate: Code outside of the code xsave() / xrstor()
functions should not be concerned at all what specific save and
restore instructions are being used. All it needs to care about is
to know what layout the data is in, and whether compaction or
expansion is needed while transferring state from / to a guest.

The fact that we introduce a synthetic feature here is solely to
satisfy the alternative instruction patching mechanism (and it
could be dropped if both the save and restore paths came to
use further conditionals, which may well be desirable - I think I
had suggested this for one of the two paths already). And
perhaps it was a mistake to scatter around the setting of
XSTATE_COMPACTION_ENABLED.

May I ask that you take a little step back and think about what
our needs here really are? For this please consider that we want
to save/restore state with as little overhead as possible (i.e. it
may be warranted to make the choice of instruction depend on
the set of components that need saving/restoring, rather than
just the availability of certain instructions). And that choice of
instruction(s) should be as transparent to the rest of the
hypervisor as possible. Which for example means ...

>> Or maybe (to amend the first comment above)
>> "using_xsave_compact" is actually the wrong term now, and this
>> really needs to become "using_xsaves" (in which case the change
>> suggested in that first comment wouldn't be needed anymore). In
> The term using_xsave_compact is confusing(actually here using_xsave_compact
> means using_xsaves). Will change using_xsave_compact -> using_xsaves.

... that "using_xsaves" is not what the rest of the hypervisor is
in need of knowing/checking. All that other code a most needs to
know/check whether the state is / needs to be in compacted form.

Jan

>> the end, at least the code outside of xstate.c should be in a state
>> where xstate.c's choice of whether to use XSAVEC doesn't matter
> XSAVEC? 
> Oh, I now realise that I simply drop xsavec support code is
> too much of a step backwards(what you want here is using a synthetic CPU 
> feature X86_FEATRUE_USE_XSAVEC and using_xsavec to disable xsavec like
> xsaves, code depend on cpu_has_xsavec will depend on useing_xsavec).
> The code should be ok even if we use xsavc in xen.  
> Is that what you mean ?
>> (and ideally this would also extend to all code in that file except
>> for the relevant parts of xsave()).
> If I understand you clearly (my comments above is right), I think we can
> also add xsavec aternative patching depend on X86_FEATRUE_USE_XSAVEC 
> in xsave(), and just keep X86_FEATRUE_USE_XSAVEC clear in x86_cap.
> 
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org 
>> http://lists.xen.org/xen-devel
Shuai Ruan March 15, 2016, 9:40 a.m. UTC | #4
On Fri, Mar 11, 2016 at 03:16:05AM -0700, Jan Beulich wrote:
> I don't think this is what we want. In no case is this what I have
> been asking for (which also applies to the remainder of your reply).
> Just to re-iterate: Code outside of the code xsave() / xrstor()
> functions should not be concerned at all what specific save and
> restore instructions are being used. All it needs to care about is
> to know what layout the data is in, and whether compaction or
> expansion is needed while transferring state from / to a guest.
> 
> The fact that we introduce a synthetic feature here is solely to
> satisfy the alternative instruction patching mechanism (and it
> could be dropped if both the save and restore paths came to
> use further conditionals, which may well be desirable - I think I
> had suggested this for one of the two paths already). And
> perhaps it was a mistake to scatter around the setting of
> XSTATE_COMPACTION_ENABLED.
> 
> May I ask that you take a little step back and think about what
> our needs here really are? For this please consider that we want
> to save/restore state with as little overhead as possible (i.e. it
> may be warranted to make the choice of instruction depend on
> the set of components that need saving/restoring, rather than
> just the availability of certain instructions). And that choice of
> instruction(s) should be as transparent to the rest of the
> hypervisor as possible. Which for example means ...
> 
> >> Or maybe (to amend the first comment above)
> >> "using_xsave_compact" is actually the wrong term now, and this
> >> really needs to become "using_xsaves" (in which case the change
> >> suggested in that first comment wouldn't be needed anymore). In
> > The term using_xsave_compact is confusing(actually here using_xsave_compact
> > means using_xsaves). Will change using_xsave_compact -> using_xsaves.
> 
> ... that "using_xsaves" is not what the rest of the hypervisor is
> in need of knowing/checking. All that other code a most needs to
> know/check whether the state is / needs to be in compacted form.
> 
> Jan
> 
Sure. I write a few key points here.

1. For when to use "XSTATE_COMPACTION_ENABLED"
1). it will only be set in xrstor().
2). all code outside xsave()/xrstor() (exclude compress_xsave_states()) 
    only check whether XSTATE_COMPACTION_ENABLED is set or not.

2. For when to use "using_xsaves"
1). only used in xrstor()/xsave().
2). xrstor will not stick to alternative patching. Will use
    if(use_xsaves) instead.

3. For save/restore(migration)
1). for save, it is ok to check XSTATE_COMPACTION_ENABLED of
    xsave->xsave_hdr.xcomp_bv to decide whether expanded is 
    needed or not.
2). for restore, in compress_xsave_states(), we can not check
    XSTATE_COMPACTION_ENABLED of xsave->xsave_hdr.xcomp_bv 
    to decide whether compress is needed or not (for                          
    XSTATE_COMPACTION_ENABLED will only be set when perform                     
    first xrstor()).
    we should use "using_xsaves" is tell whether compact is needed
    or not.(this is the only place outside xsave()/xrstor()
    depend on "using_xsaves")
    Code in compress_xsave_states looks as follow.
    ....
    if ( !using_xsaves && !xsave_area_compress(src) )
    {
  	memcpy
	return 
    }
    .....
    compress src
    

For more detail:
	xrstor() will look as follow:
	if ( using_xsaves )
	{
		if ( unlikely(!(prt->xsave_hdr->xcom_bv &
		              XSTATE_COMPACTION_ENABLED)) )
		       ptr->xsave_hdr->xcomp_bv =
		                    ptr->xsave_hdr->xstate_bv |
		                    XSTATE_COMPACTION_ENABLED;
		XRSTORS;
	}
	else
		XRSTOR;

Any comments on this? 

I know you are busy :) and really thanks for 
your time spent on making this clear to me.

> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
Jan Beulich March 15, 2016, 1:33 p.m. UTC | #5
>>> On 15.03.16 at 10:40, <shuai.ruan@linux.intel.com> wrote:
> 	xrstor() will look as follow:
> 	if ( using_xsaves )
> 	{
> 		if ( unlikely(!(prt->xsave_hdr->xcom_bv &
> 		              XSTATE_COMPACTION_ENABLED)) )
> 		       ptr->xsave_hdr->xcomp_bv =
> 		                    ptr->xsave_hdr->xstate_bv |
> 		                    XSTATE_COMPACTION_ENABLED;
> 		XRSTORS;
> 	}
> 	else
> 		XRSTOR;

This makes me imply that "using_xsaves" is still a global variable,
set depending on CPU features. That's exactly what I've said
would presumably not be sufficient in code like xrstor(). What
point is there in using XSTORS if the guest never touched XSS?
I would much rather have expected for you to introduce a
flag paralleling v->arch.nonlazy_xstate_used indicating
whether for a particular vCPU XSAVES/XRSTORS really need to be
used (or maybe just looking at xcr0_accum would be sufficient,
and no new flag is needed; in fact I think that flag would also
better go away in favor of just inspecting xcr0_accum).

Jan
Shuai Ruan March 16, 2016, 9:35 a.m. UTC | #6
On Tue, Mar 15, 2016 at 07:33:40AM -0600, Jan Beulich wrote:
> >>> On 15.03.16 at 10:40, <shuai.ruan@linux.intel.com> wrote:
> > 	xrstor() will look as follow:
> > 	if ( using_xsaves )
> > 	{
> > 		if ( unlikely(!(prt->xsave_hdr->xcom_bv &
> > 		              XSTATE_COMPACTION_ENABLED)) )
> > 		       ptr->xsave_hdr->xcomp_bv =
> > 		                    ptr->xsave_hdr->xstate_bv |
> > 		                    XSTATE_COMPACTION_ENABLED;
> > 		XRSTORS;
> > 	}
> > 	else
> > 		XRSTOR;
> 
> This makes me imply that "using_xsaves" is still a global variable,
> set depending on CPU features. That's exactly what I've said
> would presumably not be sufficient in code like xrstor(). What
> point is there in using XSTORS if the guest never touched XSS?
> I would much rather have expected for you to introduce a
> flag paralleling v->arch.nonlazy_xstate_used indicating
> whether for a particular vCPU XSAVES/XRSTORS really need to be
> used (or maybe just looking at xcr0_accum would be sufficient,
> and no new flag is needed; in fact I think that flag would also
> better go away in favor of just inspecting xcr0_accum).

if xrstor() side depend on checking xcr0_accum and using_xsaves,
then xsave() can not only depend on using_xsaves (or
X86_FEATURE_USE_XSAVES). So I will drop alternativer patching 
in xsave() side.
And both xsave() xrestor() will depend on using_xsaves and checking 
xcr0_accum. And compress_xsave_states() will check this too.

Detail is :

#define XSTATE_SUPER 0
#define using_xsaves 0

if ( using_xsaves && (v->arch.xcr0_accum & XSTATE_SUPER) )
{

.....
	XSAVES/XRSTORS;
}

....

Thanks
> 
> Jan
>
Jan Beulich March 16, 2016, 10:08 a.m. UTC | #7
>>> On 16.03.16 at 10:35, <shuai.ruan@linux.intel.com> wrote:
> On Tue, Mar 15, 2016 at 07:33:40AM -0600, Jan Beulich wrote:
>> >>> On 15.03.16 at 10:40, <shuai.ruan@linux.intel.com> wrote:
>> > 	xrstor() will look as follow:
>> > 	if ( using_xsaves )
>> > 	{
>> > 		if ( unlikely(!(prt->xsave_hdr->xcom_bv &
>> > 		              XSTATE_COMPACTION_ENABLED)) )
>> > 		       ptr->xsave_hdr->xcomp_bv =
>> > 		                    ptr->xsave_hdr->xstate_bv |
>> > 		                    XSTATE_COMPACTION_ENABLED;
>> > 		XRSTORS;
>> > 	}
>> > 	else
>> > 		XRSTOR;
>> 
>> This makes me imply that "using_xsaves" is still a global variable,
>> set depending on CPU features. That's exactly what I've said
>> would presumably not be sufficient in code like xrstor(). What
>> point is there in using XSTORS if the guest never touched XSS?
>> I would much rather have expected for you to introduce a
>> flag paralleling v->arch.nonlazy_xstate_used indicating
>> whether for a particular vCPU XSAVES/XRSTORS really need to be
>> used (or maybe just looking at xcr0_accum would be sufficient,
>> and no new flag is needed; in fact I think that flag would also
>> better go away in favor of just inspecting xcr0_accum).
> 
> if xrstor() side depend on checking xcr0_accum and using_xsaves,
> then xsave() can not only depend on using_xsaves (or
> X86_FEATURE_USE_XSAVES). So I will drop alternativer patching 
> in xsave() side.
> And both xsave() xrestor() will depend on using_xsaves and checking 
> xcr0_accum. And compress_xsave_states() will check this too.
> 
> Detail is :
> 
> #define XSTATE_SUPER 0

Not an ideal name I would say. XSTATE_XSAVES_ONLY maybe?

> #define using_xsaves 0
> 
> if ( using_xsaves && (v->arch.xcr0_accum & XSTATE_SUPER) )
> {
> 
> .....
> 	XSAVES/XRSTORS;
> }

So what does the left side of the && then do that the right side
doesn't already cover? When there's no XSAVES support, then
code elsewhere should (and already does afaict) guarantee that
the respective bits in xcr0_accum can't ever get turned on.

Jan
Shuai Ruan March 16, 2016, 10:51 a.m. UTC | #8
On Wed, Mar 16, 2016 at 04:08:07AM -0600, Jan Beulich wrote:
> >>> On 16.03.16 at 10:35, <shuai.ruan@linux.intel.com> wrote:
> > #define XSTATE_SUPER 0
> 
> Not an ideal name I would say. XSTATE_XSAVES_ONLY maybe?
> 
Ok.
> > #define using_xsaves 0
> > 
> > if ( using_xsaves && (v->arch.xcr0_accum & XSTATE_SUPER) )
> > {
> > 
> > .....
> > 	XSAVES/XRSTORS;
> > }
> 
> So what does the left side of the && then do that the right side
> doesn't already cover? When there's no XSAVES support, then
> code elsewhere should (and already does afaict) guarantee that
> the respective bits in xcr0_accum can't ever get turned on.
Yes. "v->arch.xcr0_accum & XSTATE_XSAVES_ONLY" has already cover the 
left one. For only the states set in xfeature_mask(will never turn on
XSTATE_XSAVES_ONLY nowdays) will appear in xcr0_accum. Then I will
drop all "using_xsaves" and only use v->arch.xcr0_accum & XSTATE_XSAVES_ONLY.

Really thanks for reminding me this.
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
diff mbox

Patch

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index a6d721b..83db3a5 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -948,7 +948,7 @@  int arch_set_info_guest(
         fpu_sse->fcw = FCW_DEFAULT;
         fpu_sse->mxcsr = MXCSR_DEFAULT;
     }
-    if ( cpu_has_xsaves )
+    if ( using_xsave_compact )
     {
         ASSERT(v->arch.xsave_area);
         v->arch.xsave_area->xsave_hdr.xcomp_bv = XSTATE_COMPACTION_ENABLED |
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 55aecdc..1b56732 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -922,7 +922,7 @@  long arch_do_domctl(
                 ret = -EFAULT;
 
             offset += sizeof(v->arch.xcr0_accum);
-            if ( !ret && (cpu_has_xsaves || cpu_has_xsavec) )
+            if ( !ret && using_xsave_compact )
             {
                 void *xsave_area;
 
diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 5bc2812..215df9a 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -2202,7 +2202,7 @@  static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
         xsave_area->xsave_hdr.xstate_bv = 0;
         xsave_area->fpu_sse.mxcsr = MXCSR_DEFAULT;
     }
-    if ( cpu_has_xsaves && xsave_area )
+    if ( using_xsave_compact && xsave_area )
         xsave_area->xsave_hdr.xcomp_bv = XSTATE_COMPACTION_ENABLED |
             xsave_area->xsave_hdr.xstate_bv;
 
@@ -5591,7 +5591,7 @@  void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
     if ( v->arch.xsave_area )
     {
         v->arch.xsave_area->xsave_hdr.xstate_bv = XSTATE_FP;
-        v->arch.xsave_area->xsave_hdr.xcomp_bv = cpu_has_xsaves
+        v->arch.xsave_area->xsave_hdr.xcomp_bv = using_xsave_compact
             ? XSTATE_COMPACTION_ENABLED | XSTATE_FP : 0;
     }
 
diff --git a/xen/arch/x86/i387.c b/xen/arch/x86/i387.c
index c29d0fa..51a8614 100644
--- a/xen/arch/x86/i387.c
+++ b/xen/arch/x86/i387.c
@@ -118,7 +118,21 @@  static inline uint64_t vcpu_xsave_mask(const struct vcpu *v)
     if ( v->fpu_dirtied )
         return v->arch.nonlazy_xstate_used ? XSTATE_ALL : XSTATE_LAZY;
 
-    return v->arch.nonlazy_xstate_used ? XSTATE_NONLAZY : 0;
+    /*
+     * The offsets of components which live in the extended region of
+     * compact xsave area are not fixed. Xsave area may be overwritten
+     * when v->fpu_dirtied set is followed by one with v->fpu_dirtied
+     * clear.
+     * In such case, if hypervisor uses compact xsave area and guest
+     * has ever used lazy states (checking xcr0_accum also excluding
+     * XSTATE_FP_SSE), vcpu_xsave_mask will return XSTATE_ALL. Otherwise
+     * return XSTATE_NONLAZY.
+     * XSTATE_FP_SSE may be excluded, because the offsets of XSTATE_FP_SSE
+     * (in the legacy region of xsave area) are fixed, so saving
+     * XSTATE_FS_SSE will not cause overwriting problem.
+     */
+    return using_xsave_compact && (v->arch.xcr0_accum & XSTATE_LAZY & ~XSTATE_FP_SSE) ?
+           XSTATE_ALL : XSTATE_NONLAZY;
 }
 
 /* Save x87 extended state */
@@ -277,7 +291,7 @@  int vcpu_init_fpu(struct vcpu *v)
     if ( v->arch.xsave_area )
     {
         v->arch.fpu_ctxt = &v->arch.xsave_area->fpu_sse;
-        if ( cpu_has_xsaves )
+        if ( using_xsave_compact )
             v->arch.xsave_area->xsave_hdr.xcomp_bv = XSTATE_COMPACTION_ENABLED;
     }
     else
diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index 8316bd9..c08e171 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -165,7 +165,7 @@  void expand_xsave_states(struct vcpu *v, void *dest, unsigned int size)
     u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
     u64 valid;
 
-    if ( !cpu_has_xsaves && !cpu_has_xsavec )
+    if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
     {
         memcpy(dest, xsave, size);
         return;
@@ -206,7 +206,7 @@  void compress_xsave_states(struct vcpu *v, const void *src, unsigned int size)
     u64 xstate_bv = ((const struct xsave_struct *)src)->xsave_hdr.xstate_bv;
     u64 valid;
 
-    if ( !cpu_has_xsaves && !cpu_has_xsavec )
+    if ( !(xsave->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED) )
     {
         memcpy(xsave, src, size);
         return;
@@ -251,13 +251,11 @@  void xsave(struct vcpu *v, uint64_t mask)
     uint32_t lmask = mask;
     unsigned int fip_width = v->domain->arch.x87_fip_width;
 #define XSAVE(pfx) \
-        alternative_io_3(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
+        alternative_io_2(".byte " pfx "0x0f,0xae,0x27\n", /* xsave */ \
                          ".byte " pfx "0x0f,0xae,0x37\n", /* xsaveopt */ \
                          X86_FEATURE_XSAVEOPT, \
-                         ".byte " pfx "0x0f,0xc7,0x27\n", /* xsavec */ \
-                         X86_FEATURE_XSAVEC, \
                          ".byte " pfx "0x0f,0xc7,0x2f\n", /* xsaves */ \
-                         X86_FEATURE_XSAVES, \
+                         X86_FEATURE_XSAVE_COMPACT, \
                          "=m" (*ptr), \
                          "a" (lmask), "d" (hmask), "D" (ptr))
 
@@ -274,7 +272,7 @@  void xsave(struct vcpu *v, uint64_t mask)
         typeof(ptr->fpu_sse.fip.sel) fcs = ptr->fpu_sse.fip.sel;
         typeof(ptr->fpu_sse.fdp.sel) fds = ptr->fpu_sse.fdp.sel;
 
-        if ( cpu_has_xsaveopt || cpu_has_xsaves )
+        if ( cpu_has_xsaveopt || using_xsave_compact )
         {
             /*
              * XSAVEOPT/XSAVES may not write the FPU portion even when the
@@ -300,7 +298,7 @@  void xsave(struct vcpu *v, uint64_t mask)
              (!(ptr->fpu_sse.fsw & 0x0080) &&
               boot_cpu_data.x86_vendor == X86_VENDOR_AMD) )
         {
-            if ( cpu_has_xsaveopt || cpu_has_xsaves )
+            if ( cpu_has_xsaveopt || using_xsave_compact )
             {
                 ptr->fpu_sse.fip.sel = fcs;
                 ptr->fpu_sse.fdp.sel = fds;
@@ -370,7 +368,7 @@  void xrstor(struct vcpu *v, uint64_t mask)
                        "   .previous\n" \
                        _ASM_EXTABLE(1b, 2b), \
                        ".byte " pfx "0x0f,0xc7,0x1f\n", \
-                       X86_FEATURE_XSAVES, \
+                       X86_FEATURE_XSAVE_COMPACT, \
                        ASM_OUTPUT2([mem] "+m" (*ptr), [faults] "+g" (faults)), \
                        [lmask] "a" (lmask), [hmask] "d" (hmask), \
                        [ptr] "D" (ptr))
@@ -409,7 +407,7 @@  void xrstor(struct vcpu *v, uint64_t mask)
                   ((mask & XSTATE_YMM) &&
                    !(ptr->xsave_hdr.xcomp_bv & XSTATE_COMPACTION_ENABLED))) )
                 ptr->fpu_sse.mxcsr &= mxcsr_mask;
-            if ( cpu_has_xsaves || cpu_has_xsavec )
+            if ( using_xsave_compact )
             {
                 ptr->xsave_hdr.xcomp_bv &= this_cpu(xcr0) | this_cpu(xss);
                 ptr->xsave_hdr.xstate_bv &= ptr->xsave_hdr.xcomp_bv;
@@ -426,7 +424,7 @@  void xrstor(struct vcpu *v, uint64_t mask)
         case 2: /* Stage 2: Reset all state. */
             ptr->fpu_sse.mxcsr = MXCSR_DEFAULT;
             ptr->xsave_hdr.xstate_bv = 0;
-            ptr->xsave_hdr.xcomp_bv = cpu_has_xsaves
+            ptr->xsave_hdr.xcomp_bv = using_xsave_compact
                                       ? XSTATE_COMPACTION_ENABLED : 0;
             continue;
         }
@@ -575,7 +573,7 @@  void xstate_init(struct cpuinfo_x86 *c)
 
     if ( setup_xstate_features(bsp) && bsp )
         BUG();
-    if ( bsp && (cpu_has_xsaves || cpu_has_xsavec) )
+    if ( bsp && using_xsave_compact )
         setup_xstate_comp();
 }
 
diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
index 07ba368..cf247f0 100644
--- a/xen/include/asm-x86/cpufeature.h
+++ b/xen/include/asm-x86/cpufeature.h
@@ -76,6 +76,7 @@ 
 #define X86_FEATURE_CPUID_FAULTING (3*32+14) /* cpuid faulting */
 #define X86_FEATURE_CLFLUSH_MONITOR (3*32+15) /* clflush reqd with monitor */
 #define X86_FEATURE_APERFMPERF   (3*32+16) /* APERFMPERF */
+#define X86_FEATURE_XSAVE_COMPACT (3*32 +17) /* use compact xsave area */
 
 /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
 #define X86_FEATURE_XMM3	(4*32+ 0) /* Streaming SIMD Extensions-3 */
@@ -224,6 +225,7 @@ 
 #define cpu_has_xsavec		boot_cpu_has(X86_FEATURE_XSAVEC)
 #define cpu_has_xgetbv1		boot_cpu_has(X86_FEATURE_XGETBV1)
 #define cpu_has_xsaves		boot_cpu_has(X86_FEATURE_XSAVES)
+#define using_xsave_compact	boot_cpu_has(X86_FEATURE_XSAVE_COMPACT)
 
 enum _cache_type {
     CACHE_TYPE_NULL = 0,