Message ID | 20231023163615.693462-1-stewart.hildebrand@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xen/vpci: allow BAR write if value is the same | expand |
On 23.10.2023 18:36, Stewart Hildebrand wrote: > During xl pci-assignable-add, pciback may reset the device while memory decoding > (PCI_COMMAND_MEMORY) is enabled. After device reset, memory decoding may be > disabled in hardware, and the BARs may be zeroed/reset in hardware. However, Xen > vPCI still thinks memory decoding is enabled, and BARs will remain mapped in > p2m. In other words, memory decoding may become disabled and BARs reset in > hardware, bypassing the respective vPCI command and BAR register handlers. > Subsequently, when pciback attempts to restore state to the device, including > BARs, it happens to write the BARs before writing the command register. > Restoring/writing the BARs silently fails because Xen vPCI mistakenly thinks > memory decoding is enabled. > > Fix the BAR write by allowing it to succeed if the value written is the same as > the Xen vPCI stored value. pciback will subsequently restore the command > register and the state of the BARs and memory decoding bit will then be in sync > between hardware and vPCI again. > > While here, remove a nearby stray newline. > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > --- > Do we need similar handling in rom_write()? I think so, if we are to go this route. However, ... > We may consider additionally checking during bar_write() if the memory decoding > state has become out of sync between hardware and vPCI. During bar_write(), we > would check the device's memory decoding bit, compare it to our vPCI state, and > invoke modify_bars() if needed. Please let me know your thoughts. ... iirc we discussed earlier on that doing resets in Dom0 wants communicating to Xen. Any way of resetting by a DomU would likely need intercepting. This way the described situation can be avoided altogether. We may go further and uniformly intercept resets, carrying them out on behalf of Dom0 as well. The main issue is, as with any config-space- based interaction with a device, that there may be device specific ways of resetting. Jan
On Tue, Oct 24, 2023 at 09:09:45AM +0200, Jan Beulich wrote: > On 23.10.2023 18:36, Stewart Hildebrand wrote: > > During xl pci-assignable-add, pciback may reset the device while memory decoding > > (PCI_COMMAND_MEMORY) is enabled. After device reset, memory decoding may be > > disabled in hardware, and the BARs may be zeroed/reset in hardware. However, Xen > > vPCI still thinks memory decoding is enabled, and BARs will remain mapped in > > p2m. In other words, memory decoding may become disabled and BARs reset in > > hardware, bypassing the respective vPCI command and BAR register handlers. > > Subsequently, when pciback attempts to restore state to the device, including > > BARs, it happens to write the BARs before writing the command register. > > Restoring/writing the BARs silently fails because Xen vPCI mistakenly thinks > > memory decoding is enabled. > > > > Fix the BAR write by allowing it to succeed if the value written is the same as > > the Xen vPCI stored value. pciback will subsequently restore the command > > register and the state of the BARs and memory decoding bit will then be in sync > > between hardware and vPCI again. > > > > While here, remove a nearby stray newline. > > > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > > --- > > Do we need similar handling in rom_write()? > > I think so, if we are to go this route. However, ... > > > We may consider additionally checking during bar_write() if the memory decoding > > state has become out of sync between hardware and vPCI. During bar_write(), we > > would check the device's memory decoding bit, compare it to our vPCI state, and > > invoke modify_bars() if needed. Please let me know your thoughts. > > ... iirc we discussed earlier on that doing resets in Dom0 wants > communicating to Xen. Any way of resetting by a DomU would likely need > intercepting. This way the described situation can be avoided altogether. > We may go further and uniformly intercept resets, carrying them out on > behalf of Dom0 as well. The main issue is, as with any config-space- > based interaction with a device, that there may be device specific ways > of resetting. I agree with Jan, the plan as I recall was to introduce a new hypercall to signal Xen of when a device has been reset, I can't however find that conversation right now. It would be nice to trap device reset attempts, but there are some device specific reset methods that would be too much special casing to accommodate sadly. The fix here is just papering over the issue, as if the device has been reset and Xen is not aware of it all the vPCI cached state is out of date, so it's not only BARs addresses that are stale, but also possibly MSI and MSI-X. Thanks, Roger.
On 10/24/23 03:39, Roger Pau Monné wrote: > On Tue, Oct 24, 2023 at 09:09:45AM +0200, Jan Beulich wrote: >> On 23.10.2023 18:36, Stewart Hildebrand wrote: >>> During xl pci-assignable-add, pciback may reset the device while memory decoding >>> (PCI_COMMAND_MEMORY) is enabled. After device reset, memory decoding may be >>> disabled in hardware, and the BARs may be zeroed/reset in hardware. However, Xen >>> vPCI still thinks memory decoding is enabled, and BARs will remain mapped in >>> p2m. In other words, memory decoding may become disabled and BARs reset in >>> hardware, bypassing the respective vPCI command and BAR register handlers. >>> Subsequently, when pciback attempts to restore state to the device, including >>> BARs, it happens to write the BARs before writing the command register. >>> Restoring/writing the BARs silently fails because Xen vPCI mistakenly thinks >>> memory decoding is enabled. >>> >>> Fix the BAR write by allowing it to succeed if the value written is the same as >>> the Xen vPCI stored value. pciback will subsequently restore the command >>> register and the state of the BARs and memory decoding bit will then be in sync >>> between hardware and vPCI again. >>> >>> While here, remove a nearby stray newline. >>> >>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> >>> --- >>> Do we need similar handling in rom_write()? >> >> I think so, if we are to go this route. However, ... >> >>> We may consider additionally checking during bar_write() if the memory decoding >>> state has become out of sync between hardware and vPCI. During bar_write(), we >>> would check the device's memory decoding bit, compare it to our vPCI state, and >>> invoke modify_bars() if needed. Please let me know your thoughts. >> >> ... iirc we discussed earlier on that doing resets in Dom0 wants >> communicating to Xen. Any way of resetting by a DomU would likely need >> intercepting. This way the described situation can be avoided altogether. >> We may go further and uniformly intercept resets, carrying them out on >> behalf of Dom0 as well. The main issue is, as with any config-space- >> based interaction with a device, that there may be device specific ways >> of resetting. > > I agree with Jan, the plan as I recall was to introduce a new > hypercall to signal Xen of when a device has been reset, I can't > however find that conversation right now. It would be nice to trap > device reset attempts, but there are some device specific reset > methods that would be too much special casing to accommodate sadly. > > The fix here is just papering over the issue, as if the device has > been reset and Xen is not aware of it all the vPCI cached state is out > of date, so it's not only BARs addresses that are stale, but also > possibly MSI and MSI-X. > > Thanks, Roger. Ah, right, this makes sense. Sorry I missed that. I agree vPCI needs to know when the device has been reset and handle the state accordingly. Found the thread: https://lists.xenproject.org/archives/html/xen-devel/2023-03/msg01687.html
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c index 767c1ba718d7..446ecf539e89 100644 --- a/xen/drivers/vpci/header.c +++ b/xen/drivers/vpci/header.c @@ -430,19 +430,17 @@ static void cf_check bar_write( /* * Xen only cares whether the BAR is mapped into the p2m, so allow BAR - * writes as long as the BAR is not mapped into the p2m. + * writes as long as the BAR is not mapped into the p2m. If the value + * written is the current one allow the write regardless to ensure + * consistent state between hardware and vPCI. */ - if ( bar->enabled ) + if ( bar->enabled && val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) ) { - /* If the value written is the current one avoid printing a warning. */ - if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) ) - gprintk(XENLOG_WARNING, - "%pp: ignored BAR %zu write while mapped\n", - &pdev->sbdf, bar - pdev->vpci->header.bars + hi); + gprintk(XENLOG_WARNING, "%pp: ignored BAR %zu write while mapped\n", + &pdev->sbdf, bar - pdev->vpci->header.bars + hi); return; } - /* * Update the cached address, so that when memory decoding is enabled * Xen can map the BAR into the guest p2m.
During xl pci-assignable-add, pciback may reset the device while memory decoding (PCI_COMMAND_MEMORY) is enabled. After device reset, memory decoding may be disabled in hardware, and the BARs may be zeroed/reset in hardware. However, Xen vPCI still thinks memory decoding is enabled, and BARs will remain mapped in p2m. In other words, memory decoding may become disabled and BARs reset in hardware, bypassing the respective vPCI command and BAR register handlers. Subsequently, when pciback attempts to restore state to the device, including BARs, it happens to write the BARs before writing the command register. Restoring/writing the BARs silently fails because Xen vPCI mistakenly thinks memory decoding is enabled. Fix the BAR write by allowing it to succeed if the value written is the same as the Xen vPCI stored value. pciback will subsequently restore the command register and the state of the BARs and memory decoding bit will then be in sync between hardware and vPCI again. While here, remove a nearby stray newline. Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> --- Do we need similar handling in rom_write()? We may consider additionally checking during bar_write() if the memory decoding state has become out of sync between hardware and vPCI. During bar_write(), we would check the device's memory decoding bit, compare it to our vPCI state, and invoke modify_bars() if needed. Please let me know your thoughts. I considered additionally checking if the hardware and vPCI memory decoding state are out of sync in new cmd_read()/bar_read() handlers, and calling modify_bars() if needed. However, I decided not to do this because it would impose an unnecessary implication on the in-progress vPCI series with the rwlock: calling modify_bars() would require holding the lock in write/exclusive mode, whereas in vPCI read handlers we would only hold the lock in read mode. I have only observed the inconsistency after device reset when pciback (i.e. dom0/hardware domain) is restoring the state to the device. Since pciback will also restore the command register, the state will be back in sync after pciback is finished restoring the state. --- xen/drivers/vpci/header.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-) base-commit: bad1ac345b1910b820b8a703ad1b9f66412ea844