| Message ID | 20180312164426.68b3c422@w520.home (mailing list archive) |
|---|---|
| State | New, archived |
On 13/3/18 9:44 am, Alex Williamson wrote:
> On Fri, 9 Mar 2018 12:17:36 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
>> Commit 2170dd04316e ("vfio-pci: Mask INTx if a device is not capabable
>> of enabling it") is causing 'Masking broken INTx support' messages
>> every time when a PCI without INTx support is enabled. This message is
>> intended to appear when a device known for broken INTx is being enabled.
>> However the message also appears on IOV virtual functions where INTx
>> cannot possibly be enabled so saying that the "broken" support is masked
>> is not correct.
>>
>> This changes the message to only appear when the device advertises INTx
>> (via PCI_INTERRUPT_PIN != 0) but does not implement PCI 2.3 INTx masking.
>>
>> Fixes: 2170dd04316e ("vfio-pci: Mask INTx if a device is not capabable of enabling it")
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>  drivers/vfio/pci/vfio_pci.c | 8 +++++++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>> index b0f7594..7f2ec47 100644
>> --- a/drivers/vfio/pci/vfio_pci.c
>> +++ b/drivers/vfio/pci/vfio_pci.c
>> @@ -245,7 +245,13 @@ static int vfio_pci_enable(struct vfio_pci_device *vdev)
>>
>>  	if (likely(!nointxmask)) {
>>  		if (vfio_pci_nointx(pdev)) {
>> -			dev_info(&pdev->dev, "Masking broken INTx support\n");
>> +			u8 pin = 0;
>> +
>> +			pci_read_config_byte(vdev->pdev, PCI_INTERRUPT_PIN,
>> +					&pin);
>> +			if (pin)
>> +				dev_info(&pdev->dev,
>> +						"Masking broken INTx support\n");
>>  			vdev->nointx = true;
>>  			pci_intx(pdev, 0);
>>  		} else
>
> Why don't we fix it at the location of the bug rather than after the
> fact? I don't see any reason to invoke the nointx handling for devices
> that don't actually support INTx. Thanks,

We can do that too, this just means that we will keep doing v2.3 masking
tests and possibly other things for SRIOV VFs when we do not need to, as
pdev->irq == 0 anyway. Not a big deal though (it was not a problem before);
are you going to post this as a patch, or do you want me to do it? Thanks,

> Alex
>
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index ad18ed266dc0..34d463a0b4a5 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -192,6 +192,8 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev);
>   */
>  static bool vfio_pci_nointx(struct pci_dev *pdev)
>  {
> +	u8 pin;
> +
>  	switch (pdev->vendor) {
>  	case PCI_VENDOR_ID_INTEL:
>  		switch (pdev->device) {
> @@ -207,7 +209,9 @@ static bool vfio_pci_nointx(struct pci_dev *pdev)
>  		}
>  	}
>
> -	if (!pdev->irq)
> +	pci_read_config_byte(pdev, PCI_INTERRUPT_PIN, &pin);
> +
> +	if (pin && !pdev->irq)
>  		return true;
>
>  	return false;
On Tue, 13 Mar 2018 13:38:46 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 13/3/18 9:44 am, Alex Williamson wrote:
> > On Fri, 9 Mar 2018 12:17:36 +1100
> > Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> >
> > [...]
> >
> > Why don't we fix it at the location of the bug rather than after the
> > fact? I don't see any reason to invoke the nointx handling for devices
> > that don't actually support INTx. Thanks,
>
> We can do that too, this just means that we will keep doing v2.3 masking
> tests and possibly other things for SRIOV VFs when we do not need to as
> dev-irq==0 anyway. Not a big deal though (was not a problem before), are
> you going to post this as a patch or you want me to do this? Thanks,

I tend to prefer keeping nointx as a special case, whereas I don't
consider a device that simply doesn't support INTx, such as a VF, a
special case. The user of the device can detect that the device doesn't
support INTx, and we don't need to worry about the device triggering it.
The nointx case is trying to emulate that in software. Despite the
commit subject of 2170dd04316e, I think the intent of that commit is to
invoke the nointx behavior if the device supports INTx but something in
the chipset/platform/whatever has made it unconfigurable.

Now that you mention the v2.3 stuff, I further question the validity of
the original patch. The nointx handling depends on DisINTx support being
only half broken: we can mask it, but not detect a pending interrupt.
OTOH, pdev->irq being NULL precludes our ability to configure an irq
handler, but what does it imply about the device's ability to cause
chaos by asserting the interrupt line regardless? The nointx case makes
quite a few pci_intx(pdev, 0) calls to stop the device from pulling
INTx, which is why I don't want to use it for devices not supporting
INTx. Should we only be invoking the nointx path if a) the device
supports INTx (pin reg), b) pdev->irq is NULL, and c) the device passes
the v2.3 masking test (which is cached, not retested now, btw)? Thanks,

Alex
On 15/3/18 2:02 pm, Alex Williamson wrote:
> On Tue, 13 Mar 2018 13:38:46 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
> [...]
>
> Should we only be invoking the nointx path
> if a) the device supports intx (pin reg), b) pdev->irq is null, and c)
> the device passes the v2.3 masking test (which is cached, not retested
> now, btw)?

Yes we should, and yes, I missed that last bit about caching the result
of the test, which is performed anyway in the common PCI code. Thanks,
On Mon, 19 Mar 2018 01:05:27 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 15/3/18 2:02 pm, Alex Williamson wrote:
> > [...]
> > Should we only be invoking the nointx path
> > if a) the device supports intx (pin reg), b) pdev->irq is null, and c)
> > the device passes the v2.3 masking test (which is cached, not retested
> > now, btw)?
>
> Yes we should, and yes I missed that last bit about caching the result of
> the test which is performed anyway in the common pci code. Thanks,

I spent a while considering whether we should include c) in the patch I
posted, but I couldn't get past the question of what we'd do with the
device if it doesn't pass the INTx masking test. We'd want to emulate
the INTx pin register and report no INTx irq info, as the nointx path
does, but then what? If writing DisINTx doesn't do anything, we need to
decide whether lack of INTx routing means the device can harmlessly
pull INTx, or, if pulling INTx is not harmless, we'd need to ban the
device from being assigned to the user. Since we don't know of any
problems, and it seems reasonable to assume that if there's no INTx
routing we can tug the line all we want (because if someone is
listening, it's a platform bug that it's not routed), I left it alone
in my follow-up patch. Let me know if you have other thoughts. Thanks,

Alex
On 19/3/18 1:15 pm, Alex Williamson wrote:
> On Mon, 19 Mar 2018 01:05:27 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
> [...]
>
> I spent a while considering whether we should include c) in the patch I
> posted, but I couldn't get past the question of what we'd do with the
> device if it doesn't pass the INTx masking test. We'd want to emulate
> the INTx pin register and report no INTx irq info, as the nointx path
> does, then what? If writing DisINTx doesn't do anything, we need to
> decide whether lack of INTx routing means the device can harmlessly
> pull INTx, or if pulling INTx is not harmless, then we'd need to ban
> the device from being assigned to the user. Since we don't know of any
> problems and it seems sort of reasonable to assume that if there's no
> INTx routing that we can tug it all we want (because if someone is
> listening, it's a platform bug that it's not routed), I left it alone
> in my follow-up patch. Let me know if you have other thoughts. Thanks,

Well, pulling INTx is nasty anyway, as a device may die afterwards and
the specific INTx line (or a latch in a PCIe bridge) will be blocked for
the devices sharing the same line, and DisINTx does not change that. Our
other hypervisor (pHyp) simply prohibits INTx in order to avoid dealing
with this (which is why 2170dd04316e was made). So not enforcing nointx
for DisINTx-incapable devices does not seem much worse than enabling
INTx for passthrough in the first place.
On Thu, 22 Mar 2018 15:02:09 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 19/3/18 1:15 pm, Alex Williamson wrote:
> > [...]
> > Since we don't know of any
> > problems and it seems sort of reasonable to assume that if there's no
> > INTx routing that we can tug it all we want (because if someone is
> > listening, it's a platform bug that it's not routed), I left it alone
> > in my follow-up patch. Let me know if you have other thoughts. Thanks,
>
> Well, pulling INTx is nasty anyway as a device may die afterwards and the
> specific INTx line (or a latch in PCIe bridge) will be blocked for the
> devices sharing the same line, and DisINTx does not change that.

"... a device may die afterwards"? What does that mean? In normal
operation, INTx fires, DisINTx gets set, the interrupt is forwarded to
the user, the user services and unmasks the interrupt, rinse and
repeat. If the device somehow gets into a locked state where it's
pulling INTx and disregarding DisINTx, a) this is broken hardware, and
b) the spurious interrupt handler will mask the interrupt and switch to
polling, penalizing everyone sharing that line. I don't know that
there's a model where we can factor catastrophic hardware failure into
the equation.

> Our other
> hypervisor (pHyp) simply prohibits INTx in order to avoid dealing with this
> (this is why 2170dd04316e was made). So not enforcing nointx for
> DisINTx-incapable devices does not seem much worse than enabling INTx for
> pass through itself.

Too. Many. Negatives. If a device shares an INTx line with other
devices, then it certainly needs to support DisINTx in order to play
nicely with userspace drivers; we require exclusive interrupts for
devices that do not support DisINTx. If the hypervisor is choosing to
avoid dealing with that problem by not providing routing for INTx
(effectively a paravirt "please don't use this interface"), then it
seems like it's not fit to support userspace drivers: it's depending on
the cooperation of the user. We can't prevent the user from making the
card pull INTx, we can only discourage them from using it, which is not
sufficient. Therefore, if we really do need to block such devices (no
INTx routing, no DisINTx support) from using vfio, my patch is
insufficient, and I think the best approach at this stage is to revert
2170dd04316e and rehash a solution for pHyp. Thanks,

Alex
On 22/3/18 3:40 pm, Alex Williamson wrote:
> On Thu, 22 Mar 2018 15:02:09 +1100
> Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
>
> [...]
>
>> Well, pulling INTx is nasty anyway as a device may die afterwards and the
>> specific INTx line (or a latch in PCIe bridge) will be blocked for the
>> devices sharing the same line, and DisINTx does not change that.
>
> "... a device may die afterwards"? What does that mean?

I meant the device could be blocked by EEH after raising INTx.

> In normal
> operation, INTx fires, DisINTx gets set, the interrupt is forwarded to
> the user, the user services and unmasks the interrupt, rinse and
> repeat.
>
> [...]
>
> Therefore if we really do need to block use of such
> devices (not INTx routing, no DisINTx support) from using vfio, my patch
> is insufficient and I think the best approach at this stage is to revert
> 2170dd04316e and you can rehash a solution for pHyp. Thanks,

No, I do not want to block INTx, and I am happy with "[PATCH v2]
vfio/pci: Quiet broken INTx whining when INTx is unsupported by device".
I am missing the point here, sorry: no-INTx-routing + no-DisINTx +
no-MSI(X) I understand why we would want to block, but if a device
cannot do INTx nicely yet can still do MSI(X), why would we want to
block it?
On Thu, 22 Mar 2018 18:32:56 +1100
Alexey Kardashevskiy <aik@ozlabs.ru> wrote:

> On 22/3/18 3:40 pm, Alex Williamson wrote:
> > [...]
> > Therefore if we really do need to block use of such
> > devices (not INTx routing, no DisINTx support) from using vfio, my patch
> > is insufficient and I think the best approach at this stage is to revert
> > 2170dd04316e and you can rehash a solution for pHyp. Thanks,
>
> No, I do not want to block INTx and I am happy with "[PATCH v2] vfio/pci:
> Quiet broken INTx whining when INTx is unsupported by device". I am missing
> the point here, sorry, no-INTx-routing + no-DisINTx + no-MSI(X) - I
> understand why we would want to block these but if a device cannot do INTx
> nicely but still can do MSI(X) - why would we want to block it?

Because "cannot do INTx nicely" means one of:

a) the device can assert INTx all it wants and nobody cares (i.e. it's
   not connected or routed)

b) the device's INTx is connected or routed at the pHyp level, the
   device can cause badness by asserting INTx, and we are only advising
   the user not to use INTx

The assumption in my version of the patch is a), but you seem to have
clarified that the situation is really b), in which case we have no
solution for devices which do not support DisINTx; they should be
disallowed from use through vfio. The user has the ability to cause the
device to assert INTx regardless of whether we report a pin register,
report INTx irq info, or support it through SET_IRQ. It cannot be
masked at the device, and the pHyp hypervisor has not provided routing,
so it cannot be masked at an interrupt controller, and apparently we
cannot presume that asserting INTx is harmless, as I'd hoped. What
prevents a user from exploiting INTx on a device that does not support
DisINTx when running on pHyp? Thanks,

Alex
On Thu, 22 Mar 2018 21:09:09 +0000
Casey Leedom <leedom@chelsio.com> wrote:

> | From: Alex Williamson <alex.williamson@redhat.com>
> | Sent: Thursday, March 22, 2018 6:13 AM
> | ...
> | Because "cannot do INTx nicely" means one of:
> |
> | a) device can assert INTx all it wants and nobody cares (ie. it's not
> | connected or routed)
> |
> | b) device INTx is connected or routed at the pHyp level, the device can
> | cause badness by asserting INTx, we are only advising the user not to
> | use INTx
> |
> | [...] What prevents a user from exploiting INTx on a device that does
> | not support DisINTx when running on pHyp? Thanks,
>
> I must remind everyone that Single Root I/O Virtualization Virtual Functions
> may _NOT_ implement Legacy Pin INTx Interrupts as per the SR-IOV
> specification. I.e. VFs cannot and must not implement INTx ...

That's not what's under debate; VFs are fine, they cannot assert INTx,
and the original commit was never intended to affect them. The question
concerns this pHyp platform, where we might be assigning a traditional
PCIe endpoint: the device does support INTx, but the hypervisor (which
we're running on) is not exposing INTx routing, so we have an INTx pin
but no pdev->irq, and we need to guess whether the device asserting
INTx can cause bad things. If we think the device supports DisINTx, we
can limit its ability to assert INTx; if not, the user can make the
device assert INTx regardless of what we expose to it.

> ... and they
> were the entire original motivation for allowing PCI Devices to be attached
> to Virtual Machines ...

That's debatable. Obviously aspects of VFs make them much preferred for
assignment, but device assignment would be here with or without them,
and we get to deal with supporting all flavors of devices. Thanks,

Alex
diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index ad18ed266dc0..34d463a0b4a5 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -192,6 +192,8 @@ static void vfio_pci_disable(struct vfio_pci_device *vdev);
  */
 static bool vfio_pci_nointx(struct pci_dev *pdev)
 {
+	u8 pin;
+
 	switch (pdev->vendor) {
 	case PCI_VENDOR_ID_INTEL:
 		switch (pdev->device) {
@@ -207,7 +209,9 @@ static bool vfio_pci_nointx(struct pci_dev *pdev)
 		}
 	}
 
-	if (!pdev->irq)
+	pci_read_config_byte(pdev, PCI_INTERRUPT_PIN, &pin);
+
+	if (pin && !pdev->irq)
 		return true;
 
 	return false;