[05/14] x86/xstate: Map/unmap xsave area in xstate_set_init() and handle_xsetbv()

Message ID 20241028154932.6797-6-alejandro.vallejo@cloud.com (mailing list archive)
State Superseded
Series x86: Address Space Isolation FPU preparations

Commit Message

Alejandro Vallejo Oct. 28, 2024, 3:49 p.m. UTC
No functional change.

Signed-off-by: Alejandro Vallejo <alejandro.vallejo@cloud.com>
---
 xen/arch/x86/xstate.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Comments

Jan Beulich Oct. 29, 2024, 8:26 a.m. UTC | #1
On 28.10.2024 16:49, Alejandro Vallejo wrote:
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
>  
>          clts();
>          if ( curr->fpu_dirtied )
> -            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
> +        {
> +            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
> +
> +            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
> +            vcpu_unmap_xsave_area(curr, xsave_area);
> +        }

Since it's curr that we're dealing with, is this largely a cosmetic change? I.e.
there's not going to be any actual map/unmap operation in that case? Otherwise
I'd be inclined to say that an actual map/unmap is pretty high overhead for a
mere store of a 32-bit value.

Jan
Alejandro Vallejo Oct. 29, 2024, 1 p.m. UTC | #2
On Tue Oct 29, 2024 at 8:26 AM GMT, Jan Beulich wrote:
> On 28.10.2024 16:49, Alejandro Vallejo wrote:
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
> >  
> >          clts();
> >          if ( curr->fpu_dirtied )
> > -            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
> > +        {
> > +            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
> > +
> > +            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
> > +            vcpu_unmap_xsave_area(curr, xsave_area);
> > +        }
>
> Since it's curr that we're dealing with, is this largely a cosmetic change? I.e.
> there's not going to be any actual map/unmap operation in that case? Otherwise
> I'd be inclined to say that an actual map/unmap is pretty high overhead for a
> mere store of a 32-bit value.
>
> Jan

Somewhat.

See the follow-up reply to patch2 with something resembling what I expect the
wrappers to have. In short, yes, I expect "current" to not require
mapping/unmapping; but I still would rather see those sites using the same
wrappers for auditability. After we settle on a particular interface, we can
let the implementation details creep out if that happens to be clearer, but
it's IMO easier to work this way for the time being until those details
crystallise.

Cheers,
Alejandro
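
For readers without the patch 2 follow-up to hand, the behaviour described above can be sketched roughly as follows. This is a minimal, self-contained model and not the code from the series: the reduced `struct vcpu` and `struct xsave_struct`, the `is_current` flag (standing in for a `v == current` check) and the `nr_map_ops` counter are all hypothetical, and the wrappers are written as plain functions, which may not match the real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced stand-ins for the Xen types. */
struct xsave_struct {
    uint32_t mxcsr;
};

struct vcpu {
    struct xsave_struct *xsave_area; /* models v->arch.xsave_area */
    bool is_current;                 /* models a "v == current" check */
};

/* Counts slow-path mappings so the fast path can be verified (illustrative). */
static unsigned int nr_map_ops;

static struct xsave_struct *vcpu_map_xsave_area(struct vcpu *v)
{
    /* Fast path: the running vCPU's area is assumed to be reachable as-is. */
    if ( v->is_current )
        return v->xsave_area;

    /* Slow path: a real implementation would create a transient mapping. */
    nr_map_ops++;
    return v->xsave_area;
}

static void vcpu_unmap_xsave_area(struct vcpu *v, struct xsave_struct *area)
{
    /* The caller must hand back exactly what the map call returned. */
    assert(area == v->xsave_area);

    /* Only the slow path would have a mapping to tear down here. */
}
```

Under these assumptions a call site operating on the running vCPU pays only for a comparison, while still going through the same pair of wrappers as every other site.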
Jan Beulich Oct. 29, 2024, 1:31 p.m. UTC | #3
On 29.10.2024 14:00, Alejandro Vallejo wrote:
> On Tue Oct 29, 2024 at 8:26 AM GMT, Jan Beulich wrote:
>> On 28.10.2024 16:49, Alejandro Vallejo wrote:
>>> --- a/xen/arch/x86/xstate.c
>>> +++ b/xen/arch/x86/xstate.c
>>> @@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
>>>  
>>>          clts();
>>>          if ( curr->fpu_dirtied )
>>> -            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
>>> +        {
>>> +            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
>>> +
>>> +            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
>>> +            vcpu_unmap_xsave_area(curr, xsave_area);
>>> +        }
>>
>> Since it's curr that we're dealing with, is this largely a cosmetic change? I.e.
>> there's not going to be any actual map/unmap operation in that case? Otherwise
>> I'd be inclined to say that an actual map/unmap is pretty high overhead for a
>> mere store of a 32-bit value.
> 
> Somewhat.
> 
> See the follow-up reply to patch2 with something resembling what I expect the
> wrappers to have. In short, yes, I expect "current" to not require
> mapping/unmapping; but I still would rather see those sites using the same
> wrappers for auditability. After we settle on a particular interface, we can
> let the implementation details creep out if that happens to be clearer, but
> it's IMO easier to work this way for the time being until those details
> crystallise.

Sure. As expressed in a later reply on the same topic, what I'm after are brief
comments indicating that despite the function names involved, no actual mapping
operations will be carried out in these cases, thus addressing concerns about
the overhead involved.

Jan
Alejandro Vallejo Oct. 29, 2024, 2:14 p.m. UTC | #4
On Tue Oct 29, 2024 at 1:31 PM GMT, Jan Beulich wrote:
> On 29.10.2024 14:00, Alejandro Vallejo wrote:
> > On Tue Oct 29, 2024 at 8:26 AM GMT, Jan Beulich wrote:
> >> On 28.10.2024 16:49, Alejandro Vallejo wrote:
> >>> --- a/xen/arch/x86/xstate.c
> >>> +++ b/xen/arch/x86/xstate.c
> >>> @@ -993,7 +993,12 @@ int handle_xsetbv(u32 index, u64 new_bv)
> >>>  
> >>>          clts();
> >>>          if ( curr->fpu_dirtied )
> >>> -            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
> >>> +        {
> >>> +            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
> >>> +
> >>> +            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
> >>> +            vcpu_unmap_xsave_area(curr, xsave_area);
> >>> +        }
> >>
> >> Since it's curr that we're dealing with, is this largely a cosmetic change? I.e.
> >> there's not going to be any actual map/unmap operation in that case? Otherwise
> >> I'd be inclined to say that an actual map/unmap is pretty high overhead for a
> >> mere store of a 32-bit value.
> > 
> > Somewhat.
> > 
> > See the follow-up reply to patch2 with something resembling what I expect the
> > wrappers to have. In short, yes, I expect "current" to not require
> > mapping/unmapping; but I still would rather see those sites using the same
> > wrappers for auditability. After we settle on a particular interface, we can
> > let the implementation details creep out if that happens to be clearer, but
> > it's IMO easier to work this way for the time being until those details
> > crystallise.
>
> Sure. As expressed in a later reply on the same topic, what I'm after are brief
> comments indicating that despite the function names involved, no actual mapping
> operations will be carried out in these cases, thus addressing concerns about
> the overhead involved.
>
> Jan

Right, I can add those to the sites using exclusively "current". That's no
problem.

Cheers,
Alejandro
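
As an illustration of the kind of comment agreed on above, a call site restricted to the running vCPU might look something like the sketch below. It is a standalone model rather than the eventual patch: the reduced types, the trivial wrapper bodies, the `save_mxcsr()` helper and `read_mxcsr()` (a stand-in for the `stmxcsr` instruction that returns the architectural reset value 0x1f80) are all assumptions made so the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced stand-ins for the Xen types. */
struct fpu_sse {
    uint32_t mxcsr;
};

struct xsave_struct {
    struct fpu_sse fpu_sse;
};

struct vcpu {
    struct xsave_struct *xsave_area; /* models v->arch.xsave_area */
};

/* Trivial wrappers: this model only ever deals with the running vCPU. */
static struct xsave_struct *vcpu_map_xsave_area(struct vcpu *v)
{
    return v->xsave_area;
}

static void vcpu_unmap_xsave_area(struct vcpu *v, struct xsave_struct *area)
{
    assert(area == v->xsave_area);
}

/* Stand-in for "stmxcsr"; 0x1f80 is the MXCSR reset value. */
static uint32_t read_mxcsr(void)
{
    return 0x1f80;
}

static void save_mxcsr(struct vcpu *curr)
{
    /*
     * curr is the running vCPU: despite the function names, no actual
     * map/unmap operation is expected to be carried out here.
     */
    struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);

    xsave_area->fpu_sse.mxcsr = read_mxcsr();
    vcpu_unmap_xsave_area(curr, xsave_area);
}
```

The comment sits directly above the map call so that a reader auditing the wrappers sees the overhead question answered at the point where it arises.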

Patch

diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
index af9e345a7ace..60e752a245ca 100644
--- a/xen/arch/x86/xstate.c
+++ b/xen/arch/x86/xstate.c
@@ -993,7 +993,12 @@  int handle_xsetbv(u32 index, u64 new_bv)
 
         clts();
         if ( curr->fpu_dirtied )
-            asm ( "stmxcsr %0" : "=m" (curr->arch.xsave_area->fpu_sse.mxcsr) );
+        {
+            struct xsave_struct *xsave_area = vcpu_map_xsave_area(curr);
+
+            asm ( "stmxcsr %0" : "=m" (xsave_area->fpu_sse.mxcsr) );
+            vcpu_unmap_xsave_area(curr, xsave_area);
+        }
         else if ( xstate_all(curr) )
         {
             /* See the comment in i387.c:vcpu_restore_fpu_eager(). */
@@ -1048,7 +1053,7 @@  void xstate_set_init(uint64_t mask)
     unsigned long cr0 = read_cr0();
     unsigned long xcr0 = this_cpu(xcr0);
     struct vcpu *v = idle_vcpu[smp_processor_id()];
-    struct xsave_struct *xstate = v->arch.xsave_area;
+    struct xsave_struct *xstate;
 
     if ( ~xfeature_mask & mask )
     {
@@ -1061,8 +1066,10 @@  void xstate_set_init(uint64_t mask)
 
     clts();
 
+    xstate = vcpu_map_xsave_area(v);
     memset(&xstate->xsave_hdr, 0, sizeof(xstate->xsave_hdr));
     xrstor(v, mask);
+    vcpu_unmap_xsave_area(v, xstate);
 
     if ( cr0 & X86_CR0_TS )
         write_cr0(cr0);