diff mbox series

xen/vpci: allow BAR write if value is the same

Message ID 20231023163615.693462-1-stewart.hildebrand@amd.com (mailing list archive)
State New, archived
Headers show
Series xen/vpci: allow BAR write if value is the same | expand

Commit Message

Stewart Hildebrand Oct. 23, 2023, 4:36 p.m. UTC
During xl pci-assignable-add, pciback may reset the device while memory decoding
(PCI_COMMAND_MEMORY) is enabled. After device reset, memory decoding may be
disabled in hardware, and the BARs may be zeroed/reset in hardware. However, Xen
vPCI still thinks memory decoding is enabled, and BARs will remain mapped in
p2m. In other words, memory decoding may become disabled and BARs reset in
hardware, bypassing the respective vPCI command and BAR register handlers.
Subsequently, when pciback attempts to restore state to the device, including
BARs, it happens to write the BARs before writing the command register.
Restoring/writing the BARs silently fails because Xen vPCI mistakenly thinks
memory decoding is enabled.

Fix the BAR write by allowing it to succeed if the value written is the same as
the Xen vPCI stored value. pciback will subsequently restore the command
register and the state of the BARs and memory decoding bit will then be in sync
between hardware and vPCI again.

While here, remove a nearby stray newline.

Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
---
Do we need similar handling in rom_write()?

We may consider additionally checking during bar_write() if the memory decoding
state has become out of sync between hardware and vPCI. During bar_write(), we
would check the device's memory decoding bit, compare it to our vPCI state, and
invoke modify_bars() if needed. Please let me know your thoughts.

I considered additionally checking if the hardware and vPCI memory decoding
state are out of sync in new cmd_read()/bar_read() handlers, and calling
modify_bars() if needed. However, I decided not to do this because it would
impose an unnecessary implication on the in-progress vPCI series with the
rwlock: calling modify_bars() would require holding the lock in write/exclusive
mode, whereas in vPCI read handlers we would only hold the lock in read mode.

I have only observed the inconsistency after device reset when pciback (i.e.
dom0/hardware domain) is restoring the state to the device. Since pciback will
also restore the command register, the state will be back in sync after pciback
is finished restoring the state.
---
 xen/drivers/vpci/header.c | 14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)


base-commit: bad1ac345b1910b820b8a703ad1b9f66412ea844

Comments

Jan Beulich Oct. 24, 2023, 7:09 a.m. UTC | #1
On 23.10.2023 18:36, Stewart Hildebrand wrote:
> During xl pci-assignable-add, pciback may reset the device while memory decoding
> (PCI_COMMAND_MEMORY) is enabled. After device reset, memory decoding may be
> disabled in hardware, and the BARs may be zeroed/reset in hardware. However, Xen
> vPCI still thinks memory decoding is enabled, and BARs will remain mapped in
> p2m. In other words, memory decoding may become disabled and BARs reset in
> hardware, bypassing the respective vPCI command and BAR register handlers.
> Subsequently, when pciback attempts to restore state to the device, including
> BARs, it happens to write the BARs before writing the command register.
> Restoring/writing the BARs silently fails because Xen vPCI mistakenly thinks
> memory decoding is enabled.
> 
> Fix the BAR write by allowing it to succeed if the value written is the same as
> the Xen vPCI stored value. pciback will subsequently restore the command
> register and the state of the BARs and memory decoding bit will then be in sync
> between hardware and vPCI again.
> 
> While here, remove a nearby stray newline.
> 
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> ---
> Do we need similar handling in rom_write()?

I think so, if we are to go this route. However, ...

> We may consider additionally checking during bar_write() if the memory decoding
> state has become out of sync between hardware and vPCI. During bar_write(), we
> would check the device's memory decoding bit, compare it to our vPCI state, and
> invoke modify_bars() if needed. Please let me know your thoughts.

... iirc we discussed earlier on that doing resets in Dom0 wants
communicating to Xen. Any way of resetting by a DomU would likely need
intercepting. This way the described situation can be avoided altogether.
We may go further and uniformly intercept resets, carrying them out on
behalf of Dom0 as well. The main issue is, as with any config-space-
based interaction with a device, that there may be device specific ways
of resetting.

Jan
Roger Pau Monné Oct. 24, 2023, 7:39 a.m. UTC | #2
On Tue, Oct 24, 2023 at 09:09:45AM +0200, Jan Beulich wrote:
> On 23.10.2023 18:36, Stewart Hildebrand wrote:
> > During xl pci-assignable-add, pciback may reset the device while memory decoding
> > (PCI_COMMAND_MEMORY) is enabled. After device reset, memory decoding may be
> > disabled in hardware, and the BARs may be zeroed/reset in hardware. However, Xen
> > vPCI still thinks memory decoding is enabled, and BARs will remain mapped in
> > p2m. In other words, memory decoding may become disabled and BARs reset in
> > hardware, bypassing the respective vPCI command and BAR register handlers.
> > Subsequently, when pciback attempts to restore state to the device, including
> > BARs, it happens to write the BARs before writing the command register.
> > Restoring/writing the BARs silently fails because Xen vPCI mistakenly thinks
> > memory decoding is enabled.
> > 
> > Fix the BAR write by allowing it to succeed if the value written is the same as
> > the Xen vPCI stored value. pciback will subsequently restore the command
> > register and the state of the BARs and memory decoding bit will then be in sync
> > between hardware and vPCI again.
> > 
> > While here, remove a nearby stray newline.
> > 
> > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> > ---
> > Do we need similar handling in rom_write()?
> 
> I think so, if we are to go this route. However, ...
> 
> > We may consider additionally checking during bar_write() if the memory decoding
> > state has become out of sync between hardware and vPCI. During bar_write(), we
> > would check the device's memory decoding bit, compare it to our vPCI state, and
> > invoke modify_bars() if needed. Please let me know your thoughts.
> 
> ... iirc we discussed earlier on that doing resets in Dom0 wants
> communicating to Xen. Any way of resetting by a DomU would likely need
> intercepting. This way the described situation can be avoided altogether.
> We may go further and uniformly intercept resets, carrying them out on
> behalf of Dom0 as well. The main issue is, as with any config-space-
> based interaction with a device, that there may be device specific ways
> of resetting.

I agree with Jan, the plan as I recall was to introduce a new
hypercall to signal Xen of when a device has been reset, I can't
however find that conversation right now.  It would be nice to trap
device reset attempts, but there are some device specific reset
methods that would be too much special casing to accommodate sadly.

The fix here is just papering over the issue, as if the device has
been reset and Xen is not aware of it all the vPCI cached state is out
of date, so it's not only BARs addresses that are stale, but also
possibly MSI and MSI-X.

Thanks, Roger.
Stewart Hildebrand Oct. 24, 2023, 2:31 p.m. UTC | #3
On 10/24/23 03:39, Roger Pau Monné wrote:
> On Tue, Oct 24, 2023 at 09:09:45AM +0200, Jan Beulich wrote:
>> On 23.10.2023 18:36, Stewart Hildebrand wrote:
>>> During xl pci-assignable-add, pciback may reset the device while memory decoding
>>> (PCI_COMMAND_MEMORY) is enabled. After device reset, memory decoding may be
>>> disabled in hardware, and the BARs may be zeroed/reset in hardware. However, Xen
>>> vPCI still thinks memory decoding is enabled, and BARs will remain mapped in
>>> p2m. In other words, memory decoding may become disabled and BARs reset in
>>> hardware, bypassing the respective vPCI command and BAR register handlers.
>>> Subsequently, when pciback attempts to restore state to the device, including
>>> BARs, it happens to write the BARs before writing the command register.
>>> Restoring/writing the BARs silently fails because Xen vPCI mistakenly thinks
>>> memory decoding is enabled.
>>>
>>> Fix the BAR write by allowing it to succeed if the value written is the same as
>>> the Xen vPCI stored value. pciback will subsequently restore the command
>>> register and the state of the BARs and memory decoding bit will then be in sync
>>> between hardware and vPCI again.
>>>
>>> While here, remove a nearby stray newline.
>>>
>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
>>> ---
>>> Do we need similar handling in rom_write()?
>>
>> I think so, if we are to go this route. However, ...
>>
>>> We may consider additionally checking during bar_write() if the memory decoding
>>> state has become out of sync between hardware and vPCI. During bar_write(), we
>>> would check the device's memory decoding bit, compare it to our vPCI state, and
>>> invoke modify_bars() if needed. Please let me know your thoughts.
>>
>> ... iirc we discussed earlier on that doing resets in Dom0 wants
>> communicating to Xen. Any way of resetting by a DomU would likely need
>> intercepting. This way the described situation can be avoided altogether.
>> We may go further and uniformly intercept resets, carrying them out on
>> behalf of Dom0 as well. The main issue is, as with any config-space-
>> based interaction with a device, that there may be device specific ways
>> of resetting.
> 
> I agree with Jan, the plan as I recall was to introduce a new
> hypercall to signal Xen of when a device has been reset, I can't
> however find that conversation right now.  It would be nice to trap
> device reset attempts, but there are some device specific reset
> methods that would be too much special casing to accommodate sadly.
> 
> The fix here is just papering over the issue, as if the device has
> been reset and Xen is not aware of it all the vPCI cached state is out
> of date, so it's not only BARs addresses that are stale, but also
> possibly MSI and MSI-X.
> 
> Thanks, Roger.

Ah, right, this makes sense. Sorry I missed that. I agree vPCI needs to know when the device has been reset and handle the state accordingly.

Found the thread: https://lists.xenproject.org/archives/html/xen-devel/2023-03/msg01687.html
diff mbox series

Patch

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 767c1ba718d7..446ecf539e89 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -430,19 +430,17 @@  static void cf_check bar_write(
 
     /*
      * Xen only cares whether the BAR is mapped into the p2m, so allow BAR
-     * writes as long as the BAR is not mapped into the p2m.
+     * writes as long as the BAR is not mapped into the p2m. If the value
+     * written is the current one allow the write regardless to ensure
+     * consistent state between hardware and vPCI.
      */
-    if ( bar->enabled )
+    if ( bar->enabled && val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) )
     {
-        /* If the value written is the current one avoid printing a warning. */
-        if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) )
-            gprintk(XENLOG_WARNING,
-                    "%pp: ignored BAR %zu write while mapped\n",
-                    &pdev->sbdf, bar - pdev->vpci->header.bars + hi);
+        gprintk(XENLOG_WARNING, "%pp: ignored BAR %zu write while mapped\n",
+                &pdev->sbdf, bar - pdev->vpci->header.bars + hi);
         return;
     }
 
-
     /*
      * Update the cached address, so that when memory decoding is enabled
      * Xen can map the BAR into the guest p2m.