Message ID | 20221020094649.28667-6-roger.pau@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | (v)pci: fixes related to memory decoding handling | expand |
On 20.10.2022 11:46, Roger Pau Monne wrote: > Commit 75cc460a1b added checks to ensure the position of the BARs from > PCI devices don't overlap with regions defined on the memory map. > When there's a collision memory decoding is left disabled for the > device, assuming that dom0 will reposition the BAR if necessary and > enable memory decoding. > > While this would be the case for devices being used by dom0, devices > being used by the firmware itself that have no driver would usually be > left with memory decoding disabled by dom0 if that's the state dom0 > found them in, and thus firmware trying to make use of them will not > function correctly. > > The initial intent of 75cc460a1b was to prevent vPCI from creating > MMIO mappings on the dom0 p2m over regions that would otherwise > already have mappings established. It's my view now that we likely > went too far with 75cc460a1b, and Xen disabling memory decoding of > devices (as buggy as they might be) is harmful, and reduces the set of > hardware on which Xen works. > > This commits reverts most of 75cc460a1b, and instead adds checks to > vPCI in order to prevent misplaced BARs from being added to the > hardware domain p2m. Which makes me wonder: How do things work then? Dom0 then still can't access the BAR address range, can it? Plus with this adjustment, is ... > Signaling on whether BARs are mapped is tracked > in the vpci structure, so that misplaced BARs are not mapped, and thus > Xen won't attempt to unmap them when memory decoding is disabled. > > This restores the behavior of Xen for PV dom0 to the state it was > previous to 75cc460a1b, while also introducing a more contained fix > for the vPCI BAR mapping issues. ... this (in particular "restores the behavior") a valid description of this change? > Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned') > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > --- > AT Citrix we have a system with a device with the following BARs: > > BAR [0xfe010, 0xfe010] -> in a EfiMemoryMappedIO region > BAR [0, 0x1fff] -> not positioned, outside host bridge window > > And memory decoding enabled by the firmware. With the current code > (or any of the previous fix proposals), Xen would still disable memory > decoding for the device, and the system will freeze when attempting to > set EFI vars. Isn't the latter (BAR at address 0) yet another problem? I have to admit that I'm uncertain in how far it is a good idea to try to make Xen look to work on such a system ... Jan
On Mon, Oct 24, 2022 at 01:19:22PM +0200, Jan Beulich wrote: > On 20.10.2022 11:46, Roger Pau Monne wrote: > > Commit 75cc460a1b added checks to ensure the position of the BARs from > > PCI devices don't overlap with regions defined on the memory map. > > When there's a collision memory decoding is left disabled for the > > device, assuming that dom0 will reposition the BAR if necessary and > > enable memory decoding. > > > > While this would be the case for devices being used by dom0, devices > > being used by the firmware itself that have no driver would usually be > > left with memory decoding disabled by dom0 if that's the state dom0 > > found them in, and thus firmware trying to make use of them will not > > function correctly. > > > > The initial intent of 75cc460a1b was to prevent vPCI from creating > > MMIO mappings on the dom0 p2m over regions that would otherwise > > already have mappings established. It's my view now that we likely > > went too far with 75cc460a1b, and Xen disabling memory decoding of > > devices (as buggy as they might be) is harmful, and reduces the set of > > hardware on which Xen works. > > > > This commits reverts most of 75cc460a1b, and instead adds checks to > > vPCI in order to prevent misplaced BARs from being added to the > > hardware domain p2m. > > Which makes me wonder: How do things work then? Dom0 then still can't > access the BAR address range, can it? It does allow access on some situations where the previous arrangement didn't work because it wholesale disabled memory decoding for the device. So if it's only one BAR that's misplaced the rest will still get added to the dom0 p2m and be accessible, because memory decoding won't be turned off for the device. > Plus with this adjustment, is > ... > > > Signaling on whether BARs are mapped is tracked > > in the vpci structure, so that misplaced BARs are not mapped, and thus > > Xen won't attempt to unmap them when memory decoding is disabled. > > > > This restores the behavior of Xen for PV dom0 to the state it was > > previous to 75cc460a1b, while also introducing a more contained fix > > for the vPCI BAR mapping issues. > > ... this (in particular "restores the behavior") a valid description > of this change? Yes, it restores the previous behavior for PV dom0, as memory decoding is no longer turned off for any devices regardless of where the BARs are positioned. > > Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned') > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > > --- > > AT Citrix we have a system with a device with the following BARs: > > > > BAR [0xfe010, 0xfe010] -> in a EfiMemoryMappedIO region > > BAR [0, 0x1fff] -> not positioned, outside host bridge window > > > > And memory decoding enabled by the firmware. With the current code > > (or any of the previous fix proposals), Xen would still disable memory > > decoding for the device, and the system will freeze when attempting to > > set EFI vars. > > Isn't the latter (BAR at address 0) yet another problem? It's a BAR that hasn't been positioned by the firmware AFAICT. Which is a bug in the firmware but shouldn't prevent Xen from booting. In the above system address 0 is outside of the PCI host bridge window, so even if we mapped the BAR and memory decoding for the device was enabled accessing such BAR wouldn't work. > I have to admit > that I'm uncertain in how far it is a good idea to try to make Xen look > to work on such a system ... PV dom0 works on a system like the above prior to c/s 75cc460a1b, so I would consider 75cc460a1b to be a regression for PV dom0 setups. Thanks, Roger.
On 24.10.2022 14:45, Roger Pau Monné wrote: > On Mon, Oct 24, 2022 at 01:19:22PM +0200, Jan Beulich wrote: >> On 20.10.2022 11:46, Roger Pau Monne wrote: >>> Commit 75cc460a1b added checks to ensure the position of the BARs from >>> PCI devices don't overlap with regions defined on the memory map. >>> When there's a collision memory decoding is left disabled for the >>> device, assuming that dom0 will reposition the BAR if necessary and >>> enable memory decoding. >>> >>> While this would be the case for devices being used by dom0, devices >>> being used by the firmware itself that have no driver would usually be >>> left with memory decoding disabled by dom0 if that's the state dom0 >>> found them in, and thus firmware trying to make use of them will not >>> function correctly. >>> >>> The initial intent of 75cc460a1b was to prevent vPCI from creating >>> MMIO mappings on the dom0 p2m over regions that would otherwise >>> already have mappings established. It's my view now that we likely >>> went too far with 75cc460a1b, and Xen disabling memory decoding of >>> devices (as buggy as they might be) is harmful, and reduces the set of >>> hardware on which Xen works. >>> >>> This commits reverts most of 75cc460a1b, and instead adds checks to >>> vPCI in order to prevent misplaced BARs from being added to the >>> hardware domain p2m. >> >> Which makes me wonder: How do things work then? Dom0 then still can't >> access the BAR address range, can it? > > It does allow access on some situations where the previous arrangement > didn't work because it wholesale disabled memory decoding for the > device. > > So if it's only one BAR that's misplaced the rest will still get added > to the dom0 p2m and be accessible, because memory decoding won't be > turned off for the device. Right - without a per-BAR disable there can only be all or nothing. In the end if things work with this adjustment, the problem BAR cannot really be in use aiui. I wonder what you would propose we do if on another system such a BAR is actually in use. >> Plus with this adjustment, is >> ... >> >>> Signaling on whether BARs are mapped is tracked >>> in the vpci structure, so that misplaced BARs are not mapped, and thus >>> Xen won't attempt to unmap them when memory decoding is disabled. >>> >>> This restores the behavior of Xen for PV dom0 to the state it was >>> previous to 75cc460a1b, while also introducing a more contained fix >>> for the vPCI BAR mapping issues. >> >> ... this (in particular "restores the behavior") a valid description >> of this change? > > Yes, it restores the previous behavior for PV dom0, as memory decoding > is no longer turned off for any devices regardless of where the BARs > are positioned. It restores one aspect of behavior but then puts in place another restriction. >>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned') >>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >>> --- >>> AT Citrix we have a system with a device with the following BARs: >>> >>> BAR [0xfe010, 0xfe010] -> in a EfiMemoryMappedIO region >>> BAR [0, 0x1fff] -> not positioned, outside host bridge window >>> >>> And memory decoding enabled by the firmware. With the current code >>> (or any of the previous fix proposals), Xen would still disable memory >>> decoding for the device, and the system will freeze when attempting to >>> set EFI vars. >> >> Isn't the latter (BAR at address 0) yet another problem? > > It's a BAR that hasn't been positioned by the firmware AFAICT. Which > is a bug in the firmware but shouldn't prevent Xen from booting. > > In the above system address 0 is outside of the PCI host bridge > window, so even if we mapped the BAR and memory decoding for the > device was enabled accessing such BAR wouldn't work. It's mere luck I would say that in this case the BAR is outside the bridge's window. What if this was a device integrated in the root complex? >> I have to admit >> that I'm uncertain in how far it is a good idea to try to make Xen look >> to work on such a system ... > > PV dom0 works on a system like the above prior to c/s 75cc460a1b, so I > would consider 75cc460a1b to be a regression for PV dom0 setups. Agreed, in a way it is a regression. In another way it is deliberate behavior to not accept bogus configurations. The difficulty is to find a reasonable balance between allowing Xen to work in such cases and guarding Xen from suffering follow-on issues resulting from such misconfiguration. After all if this system later was impacted by the bad BAR(s), connecting the misbehavior to the root cause might end up quite a bit more difficult. Jan
On Mon, Oct 24, 2022 at 03:59:18PM +0200, Jan Beulich wrote: > On 24.10.2022 14:45, Roger Pau Monné wrote: > > On Mon, Oct 24, 2022 at 01:19:22PM +0200, Jan Beulich wrote: > >> On 20.10.2022 11:46, Roger Pau Monne wrote: > >>> Commit 75cc460a1b added checks to ensure the position of the BARs from > >>> PCI devices don't overlap with regions defined on the memory map. > >>> When there's a collision memory decoding is left disabled for the > >>> device, assuming that dom0 will reposition the BAR if necessary and > >>> enable memory decoding. > >>> > >>> While this would be the case for devices being used by dom0, devices > >>> being used by the firmware itself that have no driver would usually be > >>> left with memory decoding disabled by dom0 if that's the state dom0 > >>> found them in, and thus firmware trying to make use of them will not > >>> function correctly. > >>> > >>> The initial intent of 75cc460a1b was to prevent vPCI from creating > >>> MMIO mappings on the dom0 p2m over regions that would otherwise > >>> already have mappings established. It's my view now that we likely > >>> went too far with 75cc460a1b, and Xen disabling memory decoding of > >>> devices (as buggy as they might be) is harmful, and reduces the set of > >>> hardware on which Xen works. > >>> > >>> This commits reverts most of 75cc460a1b, and instead adds checks to > >>> vPCI in order to prevent misplaced BARs from being added to the > >>> hardware domain p2m. > >> > >> Which makes me wonder: How do things work then? Dom0 then still can't > >> access the BAR address range, can it? > > > > It does allow access on some situations where the previous arrangement > > didn't work because it wholesale disabled memory decoding for the > > device. > > > > So if it's only one BAR that's misplaced the rest will still get added > > to the dom0 p2m and be accessible, because memory decoding won't be > > turned off for the device. > > Right - without a per-BAR disable there can only be all or nothing. In > the end if things work with this adjustment, the problem BAR cannot > really be in use aiui. I wonder what you would propose we do if on > another system such a BAR is actually in use. dom0 would have to change the position of the BAR to a suitable place and then use it. Linux dom0 does already reposition bogus BARs of devices. > >> Plus with this adjustment, is > >> ... > >> > >>> Signaling on whether BARs are mapped is tracked > >>> in the vpci structure, so that misplaced BARs are not mapped, and thus > >>> Xen won't attempt to unmap them when memory decoding is disabled. > >>> > >>> This restores the behavior of Xen for PV dom0 to the state it was > >>> previous to 75cc460a1b, while also introducing a more contained fix > >>> for the vPCI BAR mapping issues. > >> > >> ... this (in particular "restores the behavior") a valid description > >> of this change? > > > > Yes, it restores the previous behavior for PV dom0, as memory decoding > > is no longer turned off for any devices regardless of where the BARs > > are positioned. > > It restores one aspect of behavior but then puts in place another > restriction. I assume the other restriction is about moving the check to vPCI code rather than disabling memory decoding? It restores the behavior for PV dom0, and keeps a more 'contained' fix for PVH dom0. > > >>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned') > >>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > >>> --- > >>> AT Citrix we have a system with a device with the following BARs: > >>> > >>> BAR [0xfe010, 0xfe010] -> in a EfiMemoryMappedIO region > >>> BAR [0, 0x1fff] -> not positioned, outside host bridge window > >>> > >>> And memory decoding enabled by the firmware. With the current code > >>> (or any of the previous fix proposals), Xen would still disable memory > >>> decoding for the device, and the system will freeze when attempting to > >>> set EFI vars. > >> > >> Isn't the latter (BAR at address 0) yet another problem? > > > > It's a BAR that hasn't been positioned by the firmware AFAICT. Which > > is a bug in the firmware but shouldn't prevent Xen from booting. > > > > In the above system address 0 is outside of the PCI host bridge > > window, so even if we mapped the BAR and memory decoding for the > > device was enabled accessing such BAR wouldn't work. > > It's mere luck I would say that in this case the BAR is outside the > bridge's window. What if this was a device integrated in the root > complex? I would expect dom0 to reposition the BAR, but doesn't a root complex also have a set of windows in decodes accesses from (as listed in ACPI _CRS method for the device), and hence still need BARs to be positioned at certain ranges in order to be accessible? > >> I have to admit > >> that I'm uncertain in how far it is a good idea to try to make Xen look > >> to work on such a system ... > > > > PV dom0 works on a system like the above prior to c/s 75cc460a1b, so I > > would consider 75cc460a1b to be a regression for PV dom0 setups. > > Agreed, in a way it is a regression. In another way it is deliberate > behavior to not accept bogus configurations. The difficulty is to > find a reasonable balance between allowing Xen to work in such cases > and guarding Xen from suffering follow-on issues resulting from such > misconfiguration. After all if this system later was impacted by the > bad BAR(s), connecting the misbehavior to the root cause might end > up quite a bit more difficult. IMO we should strive to boot (almost?) everywhere Linux (or your chosen dom0 OS) also boots, since that's what users expect. I would assume if the system was impacted by the bad BARs, it would also affect the dom0 OS when booting natively on such platform. What we do right now with memory decoding already leads to a very difficult to diagnose issue, as on the above example calling an UEFI runtime method completely freezes the box (no debug keys, no watchdog worked). So I think leaving the system PCI devices as-is and letting dom0 deal with the conflicts is likely a better option than playing with the memory decoding bits. Thanks, Roger.
On 24.10.2022 17:45, Roger Pau Monné wrote: > On Mon, Oct 24, 2022 at 03:59:18PM +0200, Jan Beulich wrote: >> On 24.10.2022 14:45, Roger Pau Monné wrote: >>> On Mon, Oct 24, 2022 at 01:19:22PM +0200, Jan Beulich wrote: >>>> On 20.10.2022 11:46, Roger Pau Monne wrote: >>>>> Commit 75cc460a1b added checks to ensure the position of the BARs from >>>>> PCI devices don't overlap with regions defined on the memory map. >>>>> When there's a collision memory decoding is left disabled for the >>>>> device, assuming that dom0 will reposition the BAR if necessary and >>>>> enable memory decoding. >>>>> >>>>> While this would be the case for devices being used by dom0, devices >>>>> being used by the firmware itself that have no driver would usually be >>>>> left with memory decoding disabled by dom0 if that's the state dom0 >>>>> found them in, and thus firmware trying to make use of them will not >>>>> function correctly. >>>>> >>>>> The initial intent of 75cc460a1b was to prevent vPCI from creating >>>>> MMIO mappings on the dom0 p2m over regions that would otherwise >>>>> already have mappings established. It's my view now that we likely >>>>> went too far with 75cc460a1b, and Xen disabling memory decoding of >>>>> devices (as buggy as they might be) is harmful, and reduces the set of >>>>> hardware on which Xen works. >>>>> >>>>> This commits reverts most of 75cc460a1b, and instead adds checks to >>>>> vPCI in order to prevent misplaced BARs from being added to the >>>>> hardware domain p2m. >>>> >>>> Which makes me wonder: How do things work then? Dom0 then still can't >>>> access the BAR address range, can it? >>> >>> It does allow access on some situations where the previous arrangement >>> didn't work because it wholesale disabled memory decoding for the >>> device. >>> >>> So if it's only one BAR that's misplaced the rest will still get added >>> to the dom0 p2m and be accessible, because memory decoding won't be >>> turned off for the device. >> >> Right - without a per-BAR disable there can only be all or nothing. In >> the end if things work with this adjustment, the problem BAR cannot >> really be in use aiui. I wonder what you would propose we do if on >> another system such a BAR is actually in use. > > dom0 would have to change the position of the BAR to a suitable place > and then use it. Linux dom0 does already reposition bogus BARs of > devices. Yet that still can't realistically work if the firmware expects the BAR at the address recorded in the EFI memory map entry. >>>> Plus with this adjustment, is >>>> ... >>>> >>>>> Signaling on whether BARs are mapped is tracked >>>>> in the vpci structure, so that misplaced BARs are not mapped, and thus >>>>> Xen won't attempt to unmap them when memory decoding is disabled. >>>>> >>>>> This restores the behavior of Xen for PV dom0 to the state it was >>>>> previous to 75cc460a1b, while also introducing a more contained fix >>>>> for the vPCI BAR mapping issues. >>>> >>>> ... this (in particular "restores the behavior") a valid description >>>> of this change? >>> >>> Yes, it restores the previous behavior for PV dom0, as memory decoding >>> is no longer turned off for any devices regardless of where the BARs >>> are positioned. >> >> It restores one aspect of behavior but then puts in place another >> restriction. > > I assume the other restriction is about moving the check to vPCI code > rather than disabling memory decoding? > > It restores the behavior for PV dom0, and keeps a more 'contained' fix > for PVH dom0. Ouch, I'm sorry: I didn't pay attention to the "restore" applying to PV (the similarity of the acronyms made me read "PVH" despite it being "PV"). >>>>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned') >>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >>>>> --- >>>>> AT Citrix we have a system with a device with the following BARs: >>>>> >>>>> BAR [0xfe010, 0xfe010] -> in a EfiMemoryMappedIO region >>>>> BAR [0, 0x1fff] -> not positioned, outside host bridge window >>>>> >>>>> And memory decoding enabled by the firmware. With the current code >>>>> (or any of the previous fix proposals), Xen would still disable memory >>>>> decoding for the device, and the system will freeze when attempting to >>>>> set EFI vars. >>>> >>>> Isn't the latter (BAR at address 0) yet another problem? >>> >>> It's a BAR that hasn't been positioned by the firmware AFAICT. Which >>> is a bug in the firmware but shouldn't prevent Xen from booting. >>> >>> In the above system address 0 is outside of the PCI host bridge >>> window, so even if we mapped the BAR and memory decoding for the >>> device was enabled accessing such BAR wouldn't work. >> >> It's mere luck I would say that in this case the BAR is outside the >> bridge's window. What if this was a device integrated in the root >> complex? > > I would expect dom0 to reposition the BAR, but doesn't a root complex > also have a set of windows in decodes accesses from (as listed in ACPI > _CRS method for the device), and hence still need BARs to be > positioned at certain ranges in order to be accessible? Possibly; I guess I haven't learned enough of how this works at the root complex. Yet still an unassigned BAR might end up overlapping a valid window. >>>> I have to admit >>>> that I'm uncertain in how far it is a good idea to try to make Xen look >>>> to work on such a system ... >>> >>> PV dom0 works on a system like the above prior to c/s 75cc460a1b, so I >>> would consider 75cc460a1b to be a regression for PV dom0 setups. >> >> Agreed, in a way it is a regression. In another way it is deliberate >> behavior to not accept bogus configurations. The difficulty is to >> find a reasonable balance between allowing Xen to work in such cases >> and guarding Xen from suffering follow-on issues resulting from such >> misconfiguration. After all if this system later was impacted by the >> bad BAR(s), connecting the misbehavior to the root cause might end >> up quite a bit more difficult. > > IMO we should strive to boot (almost?) everywhere Linux (or your > chosen dom0 OS) also boots, since that's what users expect. > > I would assume if the system was impacted by the bad BARs, it would > also affect the dom0 OS when booting natively on such platform. > > What we do right now with memory decoding already leads to a very > difficult to diagnose issue, as on the above example calling an UEFI > runtime method completely freezes the box (no debug keys, no watchdog > worked). > > So I think leaving the system PCI devices as-is and letting dom0 deal > with the conflicts is likely a better option than playing with the > memory decoding bits. Maybe. None of the workarounds really feel very good. Jan
On Mon, Oct 24, 2022 at 05:56:25PM +0200, Jan Beulich wrote: > On 24.10.2022 17:45, Roger Pau Monné wrote: > > On Mon, Oct 24, 2022 at 03:59:18PM +0200, Jan Beulich wrote: > >> On 24.10.2022 14:45, Roger Pau Monné wrote: > >>> On Mon, Oct 24, 2022 at 01:19:22PM +0200, Jan Beulich wrote: > >>>> On 20.10.2022 11:46, Roger Pau Monne wrote: > >>>>> Commit 75cc460a1b added checks to ensure the position of the BARs from > >>>>> PCI devices don't overlap with regions defined on the memory map. > >>>>> When there's a collision memory decoding is left disabled for the > >>>>> device, assuming that dom0 will reposition the BAR if necessary and > >>>>> enable memory decoding. > >>>>> > >>>>> While this would be the case for devices being used by dom0, devices > >>>>> being used by the firmware itself that have no driver would usually be > >>>>> left with memory decoding disabled by dom0 if that's the state dom0 > >>>>> found them in, and thus firmware trying to make use of them will not > >>>>> function correctly. > >>>>> > >>>>> The initial intent of 75cc460a1b was to prevent vPCI from creating > >>>>> MMIO mappings on the dom0 p2m over regions that would otherwise > >>>>> already have mappings established. It's my view now that we likely > >>>>> went too far with 75cc460a1b, and Xen disabling memory decoding of > >>>>> devices (as buggy as they might be) is harmful, and reduces the set of > >>>>> hardware on which Xen works. > >>>>> > >>>>> This commits reverts most of 75cc460a1b, and instead adds checks to > >>>>> vPCI in order to prevent misplaced BARs from being added to the > >>>>> hardware domain p2m. > >>>> > >>>> Which makes me wonder: How do things work then? Dom0 then still can't > >>>> access the BAR address range, can it? > >>> > >>> It does allow access on some situations where the previous arrangement > >>> didn't work because it wholesale disabled memory decoding for the > >>> device. > >>> > >>> So if it's only one BAR that's misplaced the rest will still get added > >>> to the dom0 p2m and be accessible, because memory decoding won't be > >>> turned off for the device. > >> > >> Right - without a per-BAR disable there can only be all or nothing. In > >> the end if things work with this adjustment, the problem BAR cannot > >> really be in use aiui. I wonder what you would propose we do if on > >> another system such a BAR is actually in use. > > > > dom0 would have to change the position of the BAR to a suitable place > > and then use it. Linux dom0 does already reposition bogus BARs of > > devices. > > Yet that still can't realistically work if the firmware expects the > BAR at the address recorded in the EFI memory map entry. I was thinking about the BAR at address 0, rather than the BAR in the EfiMemoryMappedIO region. dom0 OS would need to avoid moving it, but that would also apply when running natively on the platform. The behavior when running as a dom0 won't change vs the native behavior, which is what we should aim for. > >>>>> Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned') > >>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > >>>>> --- > >>>>> AT Citrix we have a system with a device with the following BARs: > >>>>> > >>>>> BAR [0xfe010, 0xfe010] -> in a EfiMemoryMappedIO region > >>>>> BAR [0, 0x1fff] -> not positioned, outside host bridge window > >>>>> > >>>>> And memory decoding enabled by the firmware. With the current code > >>>>> (or any of the previous fix proposals), Xen would still disable memory > >>>>> decoding for the device, and the system will freeze when attempting to > >>>>> set EFI vars. > >>>> > >>>> Isn't the latter (BAR at address 0) yet another problem? > >>> > >>> It's a BAR that hasn't been positioned by the firmware AFAICT. Which > >>> is a bug in the firmware but shouldn't prevent Xen from booting. > >>> > >>> In the above system address 0 is outside of the PCI host bridge > >>> window, so even if we mapped the BAR and memory decoding for the > >>> device was enabled accessing such BAR wouldn't work. > >> > >> It's mere luck I would say that in this case the BAR is outside the > >> bridge's window. What if this was a device integrated in the root > >> complex? > > > > I would expect dom0 to reposition the BAR, but doesn't a root complex > > also have a set of windows in decodes accesses from (as listed in ACPI > > _CRS method for the device), and hence still need BARs to be > > positioned at certain ranges in order to be accessible? > > Possibly; I guess I haven't learned enough of how this works at the > root complex. Yet still an unassigned BAR might end up overlapping a > valid window. Right, but if the BAR overlaps a valid window it could be seen as correctly positioned, and in any case that would be for dom0 to deal with. What we care about is BARs no overlapping regions on the memory map, so that we can setup a valid p2m for dom0. > >>>> I have to admit > >>>> that I'm uncertain in how far it is a good idea to try to make Xen look > >>>> to work on such a system ... > >>> > >>> PV dom0 works on a system like the above prior to c/s 75cc460a1b, so I > >>> would consider 75cc460a1b to be a regression for PV dom0 setups. > >> > >> Agreed, in a way it is a regression. In another way it is deliberate > >> behavior to not accept bogus configurations. The difficulty is to > >> find a reasonable balance between allowing Xen to work in such cases > >> and guarding Xen from suffering follow-on issues resulting from such > >> misconfiguration. After all if this system later was impacted by the > >> bad BAR(s), connecting the misbehavior to the root cause might end > >> up quite a bit more difficult. > > > > IMO we should strive to boot (almost?) everywhere Linux (or your > > chosen dom0 OS) also boots, since that's what users expect. > > > > I would assume if the system was impacted by the bad BARs, it would > > also affect the dom0 OS when booting natively on such platform. > > > > What we do right now with memory decoding already leads to a very > > difficult to diagnose issue, as on the above example calling an UEFI > > runtime method completely freezes the box (no debug keys, no watchdog > > worked). > > > > So I think leaving the system PCI devices as-is and letting dom0 deal > > with the conflicts is likely a better option than playing with the > > memory decoding bits. > > Maybe. None of the workarounds really feel very good. Hence this last suggestion which limits the workarounds to PVH dom0 only, thus limiting the interaction of Xen with PCI devices as much as possible. I think it's an appropriate compromise between being able to boot as PVH dom0 and not playing with the PCI device memory decoding bits. Thanks, Roger.
diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c index 149f68bb6e..b42acb8d7c 100644 --- a/xen/drivers/passthrough/pci.c +++ b/xen/drivers/passthrough/pci.c @@ -233,9 +233,6 @@ static void check_pdev(const struct pci_dev *pdev) PCI_STATUS_REC_TARGET_ABORT | PCI_STATUS_REC_MASTER_ABORT | \ PCI_STATUS_SIG_SYSTEM_ERROR | PCI_STATUS_DETECTED_PARITY) u16 val; - unsigned int nbars = 0, rom_pos = 0, i; - static const char warn[] = XENLOG_WARNING - "%pp disabled: %sBAR [%#lx, %#lx] overlaps with memory map\n"; if ( command_mask ) { @@ -254,8 +251,6 @@ static void check_pdev(const struct pci_dev *pdev) switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f ) { case PCI_HEADER_TYPE_BRIDGE: - nbars = PCI_HEADER_BRIDGE_NR_BARS; - rom_pos = PCI_ROM_ADDRESS1; if ( !bridge_ctl_mask ) break; val = pci_conf_read16(pdev->sbdf, PCI_BRIDGE_CONTROL); @@ -272,75 +267,11 @@ static void check_pdev(const struct pci_dev *pdev) } break; - case PCI_HEADER_TYPE_NORMAL: - nbars = PCI_HEADER_NORMAL_NR_BARS; - rom_pos = PCI_ROM_ADDRESS; - break; - case PCI_HEADER_TYPE_CARDBUS: /* TODO */ break; } #undef PCI_STATUS_CHECK - - /* Check if BARs overlap with other memory regions. */ - val = pci_conf_read16(pdev->sbdf, PCI_COMMAND); - if ( !(val & PCI_COMMAND_MEMORY) || pdev->ignore_bars ) - return; - - pci_conf_write16(pdev->sbdf, PCI_COMMAND, val & ~PCI_COMMAND_MEMORY); - for ( i = 0; i < nbars; ) - { - uint64_t addr, size; - unsigned int reg = PCI_BASE_ADDRESS_0 + i * 4; - int rc = 1; - - if ( (pci_conf_read32(pdev->sbdf, reg) & PCI_BASE_ADDRESS_SPACE) != - PCI_BASE_ADDRESS_SPACE_MEMORY ) - goto next; - - rc = pci_size_mem_bar(pdev->sbdf, reg, &addr, &size, - (i == nbars - 1) ? PCI_BAR_LAST : 0); - if ( rc < 0 ) - /* Unable to size, better leave memory decoding disabled. */ - return; - if ( size && !pci_check_bar(pdev, maddr_to_mfn(addr), - maddr_to_mfn(addr + size - 1)) ) - { - /* - * Return without enabling memory decoding if BAR position is not - * in IO suitable memory. Let the hardware domain re-position the - * BAR. - */ - printk(warn, - &pdev->sbdf, "", PFN_DOWN(addr), PFN_DOWN(addr + size - 1)); - return; - } - - next: - ASSERT(rc > 0); - i += rc; - } - - if ( rom_pos && - (pci_conf_read32(pdev->sbdf, rom_pos) & PCI_ROM_ADDRESS_ENABLE) ) - { - uint64_t addr, size; - int rc = pci_size_mem_bar(pdev->sbdf, rom_pos, &addr, &size, - PCI_BAR_ROM); - - if ( rc < 0 ) - return; - if ( size && !pci_check_bar(pdev, maddr_to_mfn(addr), - maddr_to_mfn(addr + size - 1)) ) - { - printk(warn, &pdev->sbdf, "ROM ", PFN_DOWN(addr), - PFN_DOWN(addr + size - 1)); - return; - } - } - - pci_conf_write16(pdev->sbdf, PCI_COMMAND, val); } static void apply_quirks(struct pci_dev *pdev) diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c index eb9219a49a..4d7f8f4a30 100644 --- a/xen/drivers/vpci/header.c +++ b/xen/drivers/vpci/header.c @@ -121,7 +121,9 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd, } if ( !rom_only && - (bar->type != VPCI_BAR_ROM || header->rom_enabled) ) + (bar->type != VPCI_BAR_ROM || header->rom_enabled) && + pci_check_bar(pdev, _mfn(PFN_DOWN(bar->addr)), + _mfn(PFN_DOWN(bar->addr + bar->size - 1))) ) bar->enabled = map; } @@ -234,9 +236,25 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only) if ( !MAPPABLE_BAR(bar) || (rom_only ? bar->type != VPCI_BAR_ROM - : (bar->type == VPCI_BAR_ROM && !header->rom_enabled)) ) + : (bar->type == VPCI_BAR_ROM && !header->rom_enabled)) || + /* Skip BARs already in the requested state. */ + bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) ) continue; + /* + * Only do BAR position checks for the hardware domain, for guests it's + * assumed that the hardware domain has changed the position of any + * problematic BARs. + */ + if ( is_hardware_domain(pdev->domain) && + !pci_check_bar(pdev, _mfn(start), _mfn(end)) ) + { + printk(XENLOG_G_WARNING + "%pp: not mapping BAR [%lx, %lx] invalid position\n", + &pdev->sbdf, start, end); + continue; + } + rc = rangeset_add_range(mem, start, end); if ( rc ) {
Commit 75cc460a1b added checks to ensure the position of the BARs from PCI devices don't overlap with regions defined on the memory map. When there's a collision memory decoding is left disabled for the device, assuming that dom0 will reposition the BAR if necessary and enable memory decoding. While this would be the case for devices being used by dom0, devices being used by the firmware itself that have no driver would usually be left with memory decoding disabled by dom0 if that's the state dom0 found them in, and thus firmware trying to make use of them will not function correctly. The initial intent of 75cc460a1b was to prevent vPCI from creating MMIO mappings on the dom0 p2m over regions that would otherwise already have mappings established. It's my view now that we likely went too far with 75cc460a1b, and Xen disabling memory decoding of devices (as buggy as they might be) is harmful, and reduces the set of hardware on which Xen works. This commits reverts most of 75cc460a1b, and instead adds checks to vPCI in order to prevent misplaced BARs from being added to the hardware domain p2m. Signaling on whether BARs are mapped is tracked in the vpci structure, so that misplaced BARs are not mapped, and thus Xen won't attempt to unmap them when memory decoding is disabled. This restores the behavior of Xen for PV dom0 to the state it was previous to 75cc460a1b, while also introducing a more contained fix for the vPCI BAR mapping issues. Fixes: 75cc460a1b ('xen/pci: detect when BARs are not suitably positioned') Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- AT Citrix we have a system with a device with the following BARs: BAR [0xfe010, 0xfe010] -> in a EfiMemoryMappedIO region BAR [0, 0x1fff] -> not positioned, outside host bridge window And memory decoding enabled by the firmware. With the current code (or any of the previous fix proposals), Xen would still disable memory decoding for the device, and the system will freeze when attempting to set EFI vars. I'm afraid the best solution to avoid regressions caused by 75cc460a1b is the proposal here. --- xen/drivers/passthrough/pci.c | 69 ----------------------------------- xen/drivers/vpci/header.c | 22 ++++++++++- 2 files changed, 20 insertions(+), 71 deletions(-)