diff mbox series

[1/2] x86/msi: passthrough all MSI-X vector ctrl writes to device model

Message ID 20221114192100.1539267-1-marmarek@invisiblethingslab.com (mailing list archive)
State New, archived
Series [1/2] x86/msi: passthrough all MSI-X vector ctrl writes to device model

Commit Message

Marek Marczykowski-Górecki Nov. 14, 2022, 7:20 p.m. UTC
QEMU needs to know whether clearing the maskbit of a vector really
clears it, or whether the bit was already clear before. Currently Xen
forwards only writes clearing that bit to the device model, not those
setting it, so QEMU cannot tell the difference. Because of that, QEMU
works around this by checking the hardware state via /dev/mem, but
that isn't the proper approach.

Give all necessary information to QEMU by passing all ctrl writes,
including masking a vector.

Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
---
 xen/arch/x86/hvm/vmsi.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Jan Beulich Nov. 15, 2022, 9:36 a.m. UTC | #1
On 14.11.2022 20:20, Marek Marczykowski-Górecki wrote:
> QEMU needs to know whether clearing maskbit of a vector is really
> clearing, or was already cleared before. Currently Xen sends only
> clearing that bit to the device model, but not setting it, so QEMU
> cannot detect it.

Except for qword writes, as it looks. Furthermore, even clearing
requests aren't sent if address/data are unchanged. If you agree,
please add this here in some form, for a complete picture.

> Because of that, QEMU is working this around by
> checking via /dev/mem, but that isn't the proper approach.
> 
> Give all necessary information to QEMU by passing all ctrl writes,
> including masking a vector.

Can we perhaps still avoid sending dword writes which don't change
the mask bit?

> --- a/xen/arch/x86/hvm/vmsi.c
> +++ b/xen/arch/x86/hvm/vmsi.c
> @@ -271,7 +271,8 @@ out:
>  }
>  
>  static int msixtbl_write(struct vcpu *v, unsigned long address,
> -                         unsigned int len, unsigned long val)
> +                         unsigned int len, unsigned long val,
> +                         bool completion)
>  {

I'd like to propose an alternative approach without an extra parameter:
Have msix_write_completion() pass 0 for "len" and move the initial
check

    if ( (len != 4 && len != 8) || (address & (len - 1)) )
        return r;

into _msixtbl_write(). Then ...

> @@ -343,7 +344,7 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>  
>  unlock:
>      spin_unlock_irqrestore(&desc->lock, flags);
> -    if ( len == 4 )
> +    if ( len == 4 && completion )
>          r = X86EMUL_OKAY;

... this could simply be "if ( !len )", seeing that even with your
approach it could simply be "if ( completion )".

Jan
Marek Marczykowski-Górecki Nov. 15, 2022, 11:37 a.m. UTC | #2
On Tue, Nov 15, 2022 at 10:36:32AM +0100, Jan Beulich wrote:
> On 14.11.2022 20:20, Marek Marczykowski-Górecki wrote:
> > QEMU needs to know whether clearing maskbit of a vector is really
> > clearing, or was already cleared before. Currently Xen sends only
> > clearing that bit to the device model, but not setting it, so QEMU
> > cannot detect it.
> 
> Except for qword writes as it looks. Furthermore even clearing
> requests aren't sent if address/data are unchanged. If you agree,
> please add this here in some form for having a complete picture.

Ok.

> > Because of that, QEMU is working this around by
> > checking via /dev/mem, but that isn't the proper approach.
> > 
> > Give all necessary information to QEMU by passing all ctrl writes,
> > including masking a vector.
> 
> Can we perhaps still avoid sending dword writes which don't change
> the mask bit?

Is it worth it? I don't think such writes are common (which I
confirmed by observing a debug log - every single maskbit write Linux
did changed the value). Besides, the old value isn't readily available
here.

> > --- a/xen/arch/x86/hvm/vmsi.c
> > +++ b/xen/arch/x86/hvm/vmsi.c
> > @@ -271,7 +271,8 @@ out:
> >  }
> >  
> >  static int msixtbl_write(struct vcpu *v, unsigned long address,
> > -                         unsigned int len, unsigned long val)
> > +                         unsigned int len, unsigned long val,
> > +                         bool completion)
> >  {
> 
> I'd like to propose an alternative approach without an extra parameter:
> Have msix_write_completion() pass 0 for "len" and move the initial
> check
> 
>     if ( (len != 4 && len != 8) || (address & (len - 1)) )
>         return r;
> 
> into _msixtbl_write(). Then ...
> 
> > @@ -343,7 +344,7 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
> >  
> >  unlock:
> >      spin_unlock_irqrestore(&desc->lock, flags);
> > -    if ( len == 4 )
> > +    if ( len == 4 && completion )
> >          r = X86EMUL_OKAY;
> 
> ... this could simply be "if ( !len )", seeing that even with your
> approach it could simply be "if ( completion )".

I find such use of a magic len=0 confusing. It would change the
meaning of "len" from "write length" to "write length, unless it's 0 -
then the write length is 4 and the call came from
msix_write_completion()". Is there any real value in avoiding the
extra parameter?
Jan Beulich Nov. 15, 2022, 1:54 p.m. UTC | #3
On 15.11.2022 12:37, Marek Marczykowski-Górecki wrote:
> On Tue, Nov 15, 2022 at 10:36:32AM +0100, Jan Beulich wrote:
>> On 14.11.2022 20:20, Marek Marczykowski-Górecki wrote:
>>> Give all necessary information to QEMU by passing all ctrl writes,
>>> including masking a vector.
>>
>> Can we perhaps still avoid sending dword writes which don't change
>> the mask bit?
> 
> Is it worth it? I don't think such writes are common (which I confirm
> observing debug log - every single write to maskbit Linux did was
> changing the value). The old value isn't readily available here.

For one, a second aspect would be Windows behavior. As you've seen in
the hypervisor code, somebody back at the time even thought
accelerating reads was useful. I'm going from that rather than knowing
for sure that such an optimization would help anywhere.

>>> --- a/xen/arch/x86/hvm/vmsi.c
>>> +++ b/xen/arch/x86/hvm/vmsi.c
>>> @@ -271,7 +271,8 @@ out:
>>>  }
>>>  
>>>  static int msixtbl_write(struct vcpu *v, unsigned long address,
>>> -                         unsigned int len, unsigned long val)
>>> +                         unsigned int len, unsigned long val,
>>> +                         bool completion)
>>>  {
>>
>> I'd like to propose an alternative approach without an extra parameter:
>> Have msix_write_completion() pass 0 for "len" and move the initial
>> check
>>
>>     if ( (len != 4 && len != 8) || (address & (len - 1)) )
>>         return r;
>>
>> into _msixtbl_write(). Then ...
>>
>>> @@ -343,7 +344,7 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
>>>  
>>>  unlock:
>>>      spin_unlock_irqrestore(&desc->lock, flags);
>>> -    if ( len == 4 )
>>> +    if ( len == 4 && completion )
>>>          r = X86EMUL_OKAY;
>>
>> ... this could simply be "if ( !len )", seeing that even with your
>> approach it could simply be "if ( completion )".
> 
> I find such usage of magic len=0 confusing. It would change the meaning
> of "len" from "write length" to "write length, unless it's 0 - then
> write length is 4 and it's called from msix_write_completion.
> Is there any real value from avoiding extra parameter?

Perhaps a matter of taste, but to me redundant parameters are odd at
times as well - often I end up wondering in such cases why an extra
parameter was introduced when things could easily be done with what
was already there. In the specific case here there's also the further
aspect of you moving the function across the boundary of all arguments
fitting in the registers available for parameter passing (which of
course only matters if the compiler decides not to inline the function
at all call sites).

Jan

Patch

diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 75f92885dc5e..ba4cf46dfe91 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -271,7 +271,8 @@  out:
 }
 
 static int msixtbl_write(struct vcpu *v, unsigned long address,
-                         unsigned int len, unsigned long val)
+                         unsigned int len, unsigned long val,
+                         bool completion)
 {
     unsigned long offset;
     struct msixtbl_entry *entry;
@@ -343,7 +344,7 @@  static int msixtbl_write(struct vcpu *v, unsigned long address,
 
 unlock:
     spin_unlock_irqrestore(&desc->lock, flags);
-    if ( len == 4 )
+    if ( len == 4 && completion )
         r = X86EMUL_OKAY;
 
 out:
@@ -355,7 +356,7 @@  static int cf_check _msixtbl_write(
     const struct hvm_io_handler *handler, uint64_t address, uint32_t len,
     uint64_t val)
 {
-    return msixtbl_write(current, address, len, val);
+    return msixtbl_write(current, address, len, val, false);
 }
 
 static bool cf_check msixtbl_range(
@@ -633,7 +634,7 @@  void msix_write_completion(struct vcpu *v)
         return;
 
     v->arch.hvm.hvm_io.msix_unmask_address = 0;
-    if ( msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY )
+    if ( msixtbl_write(v, ctrl_address, 4, 0, true) != X86EMUL_OKAY )
         gdprintk(XENLOG_WARNING, "MSI-X write completion failure\n");
 }