Message ID | 1406868871-350-1-git-send-email-gwshan@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
On 8/1/2014 7:54 AM, Gavin Shan wrote: > The VFIO driver is routing LSI interrupts by capturing, masking, > and then delivering. When passing though Mallanox adapters from > host to guest, interrupt storm was reported from host and guest. > That's because we can't mask the LSI interrupt with help of PCI > command register. Hi Gavin, What is the problem with masking the interrupts with the PCI command register? I'm asking because I want to understand in which devices we have the problem, and if it could be fixed by firmware guys. What are the implications of having the quirk? > > [root@ncc-1701 ~]# lspci | grep Mellanox > 0001:05:00.0 Ethernet controller: Mellanox Technologies MT27500 \ > Family [ConnectX-3] > 0005:01:00.0 Ethernet controller: Mellanox Technologies MT26448 \ > [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0) > > The patch marks broken INTx masking for Mellanox devices so that > the VFIO driver will always mask the interrupt from interrupt > controller side to avoid interrupt storm. > > Cc: Amir Vadai <amirv@mellanox.com> > Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> > --- > drivers/pci/quirks.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c > index d0f6926..8c2b96f 100644 > --- a/drivers/pci/quirks.c > +++ b/drivers/pci/quirks.c > @@ -2977,6 +2977,10 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CHELSIO, 0x0030, > quirk_broken_intx_masking); > DECLARE_PCI_FIXUP_HEADER(0x1814, 0x0601, /* Ralink RT2800 802.11n PCI */ > quirk_broken_intx_masking); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x1003, > + quirk_broken_intx_masking); > +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x6750, > + quirk_broken_intx_masking); I still don't understand what exactly is the problem, but I assume that there are other Mellanox devices that suffer from it. 
> /* > * Realtek RTL8169 PCI Gigabit Ethernet Controller (rev 10) > * Subsystem: Realtek RTL8169/8110 Family PCI Gigabit Ethernet NIC > Thanks, Amir -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Aug 03, 2014 at 10:51:08AM +0300, Amir Vadai wrote: >On 8/1/2014 7:54 AM, Gavin Shan wrote: >> The VFIO driver is routing LSI interrupts by capturing, masking, >> and then delivering. When passing though Mallanox adapters from >> host to guest, interrupt storm was reported from host and guest. >> That's because we can't mask the LSI interrupt with help of PCI >> command register. >Hi Gavin, > >What is the problem with masking the interrupts with the PCI command >register? I'm asking because I want to understand in which devices we >have the problem, and if it could be fixed by firmware guys. >What are the implications of having the quirk? > The way to mask the interrupt through PCI command register isn't taking effect on IBM power platform. So we have to have the quirk so that the interrupt could be masked from interrupt controller side with function disable_irq_nosync(). If the interrupt can't be masked properly, we detect interrupt storm reported from host/guest when passing through those devices via VFIO without suprise. >> >> [root@ncc-1701 ~]# lspci | grep Mellanox >> 0001:05:00.0 Ethernet controller: Mellanox Technologies MT27500 \ >> Family [ConnectX-3] >> 0005:01:00.0 Ethernet controller: Mellanox Technologies MT26448 \ >> [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0) >> >> The patch marks broken INTx masking for Mellanox devices so that >> the VFIO driver will always mask the interrupt from interrupt >> controller side to avoid interrupt storm. 
>> >> Cc: Amir Vadai <amirv@mellanox.com> >> Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> >> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> >> --- >> drivers/pci/quirks.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c >> index d0f6926..8c2b96f 100644 >> --- a/drivers/pci/quirks.c >> +++ b/drivers/pci/quirks.c >> @@ -2977,6 +2977,10 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CHELSIO, 0x0030, >> quirk_broken_intx_masking); >> DECLARE_PCI_FIXUP_HEADER(0x1814, 0x0601, /* Ralink RT2800 802.11n PCI */ >> quirk_broken_intx_masking); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x1003, >> + quirk_broken_intx_masking); >> +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x6750, >> + quirk_broken_intx_masking); >I still don't understand what exactly is the problem, but I assume that >there are other Mellanox devices that suffer from it. > Yeah, that's possible. I didn't have chance to test other Mellanox devices except the above two. >> /* >> * Realtek RTL8169 PCI Gigabit Ethernet Controller (rev 10) >> * Subsystem: Realtek RTL8169/8110 Family PCI Gigabit Ethernet NIC >> > >Thanks, >Amir > Thanks, Gavin -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>>What is the problem with masking the interrupts with the PCI command
>>register? I'm asking because I want to understand in which devices we
>>have the problem, and if it could be fixed by firmware guys.
>>What are the implications of having the quirk?
>>
>The way to mask the interrupt through PCI command register isn't taking effect on IBM power platform. So we have to have the
>quirk so that the interrupt could be masked from interrupt controller side with function disable_irq_nosync().
>
>If the interrupt can't be masked properly, we detect interrupt storm reported from host/guest when passing through those devices
>via VFIO without suprise.

Hi Gavin,
Does it have any effect on performance. Also, can you tell in which cases interrupts need to be masked?
On Sun, Aug 03, 2014 at 08:57:39AM +0000, Eli Cohen wrote:
>>>
>>>What is the problem with masking the interrupts with the PCI command
>>>register? I'm asking because I want to understand in which devices we
>>>have the problem, and if it could be fixed by firmware guys.
>>>What are the implications of having the quirk?
>>>
>
>>The way to mask the interrupt through PCI command register isn't taking effect on IBM power platform. So we have to have the
>quirk so that the interrupt could be masked from interrupt controller side with function disable_irq_nosync().
>>
>>If the interrupt can't be masked properly, we detect interrupt storm reported from host/guest when passing through those devices
>via VFIO without suprise.
>
>Hi Gavin,
>Does it have any effect on performance. Also, can you tell in which cases interrupts need to be masked?
>

Eli, more code needed to be run for masking the LSI from interrupt controller
side than from PCI command register.

I was passing through Mellanox devices from host to guest with VFIO, and I
designated to use LSI in the guest side. More details could be found in
drivers/vfio/pci/vfio_pci_intrs.c::vfio_intx_handler()

Thanks,
Gavin
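For readers following the thread, the two masking paths being compared can be modelled in a few lines of user-space C. This is only a sketch under stated assumptions, not the kernel's vfio_intx_handler(): the struct, enum, and function names (`model_dev`, `model_ret`, `model_intx_handler`) are hypothetical, and the `irq_line_disabled` flag merely stands in for the effect of disable_irq_nosync(). Only the two bit values follow the PCI specification (INTx Disable is bit 10 of the command register, Interrupt Status is bit 3 of the status register).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values as defined by the PCI specification. */
#define PCI_COMMAND_INTX_DISABLE 0x400 /* command register, bit 10 */
#define PCI_STATUS_INTERRUPT     0x08  /* status register, bit 3 */

/* Hypothetical model of an assigned device; not a kernel structure. */
struct model_dev {
	uint16_t command;         /* modelled PCI command register */
	uint16_t status;          /* modelled PCI status register */
	bool broken_intx_masking; /* what the quirk would set */
	bool irq_line_disabled;   /* stands in for disable_irq_nosync() */
	bool masked;              /* handler bookkeeping */
};

enum model_ret { MODEL_IRQ_NONE, MODEL_IRQ_HANDLED };

/*
 * Model of the choice made when an INTx fires for an assigned device:
 * mask at the interrupt controller when device-level masking is unusable,
 * otherwise mask through the command register if this device is asserting.
 */
static enum model_ret model_intx_handler(struct model_dev *dev)
{
	assert(dev != NULL);

	if (dev->broken_intx_masking) {
		/* Whole line is silenced, so it cannot be shared. */
		dev->irq_line_disabled = true;
		dev->masked = true;
		return MODEL_IRQ_HANDLED;
	}

	if (!dev->masked && (dev->status & PCI_STATUS_INTERRUPT)) {
		/* Only this device is silenced; the line may be shared. */
		dev->command |= PCI_COMMAND_INTX_DISABLE;
		dev->masked = true;
		return MODEL_IRQ_HANDLED;
	}

	return MODEL_IRQ_NONE; /* not ours, or already masked */
}
```

In this model a quirked device never has its command register touched; the interrupt line itself is turned off instead, which is why the line cannot be shared with other devices.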
On Mon, 2014-08-04 at 00:30 +1000, Gavin Shan wrote: > On Sun, Aug 03, 2014 at 08:57:39AM +0000, Eli Cohen wrote: > >>> > >>>What is the problem with masking the interrupts with the PCI command > >>>register? I'm asking because I want to understand in which devices we > >>>have the problem, and if it could be fixed by firmware guys. > >>>What are the implications of having the quirk? > >>> > > > >>The way to mask the interrupt through PCI command register isn't taking effect on IBM power platform. So we have to have the >quirk so that the interrupt could be masked from interrupt controller side with function disable_irq_nosync(). > >> > >>If the interrupt can't be masked properly, we detect interrupt storm reported from host/guest when passing through those devices >via VFIO without suprise. > > > >Hi Gavin, > >Does it have any effect on performance. Also, can you tell in which cases interrupts need to be masked? > > > > Eli, more code needed to be run for masking the LSI from interrupt controller > side than from PCI command register. > > I was passing through Mellanox devices from host to guest with VFIO, and I > designated to use LSI in the guest side. More details could be found in > drivers/vfio/pci/vfio_pci_intrs.c::vfio_intx_handler() INTx is relatively high overhead already for device assignment since the interrupt is level triggered and needs to be masked on the host while the guest is processing it. The more important restriction imposed by marking broken INTx masking is that the device needs an exclusive interrupt line in order to be assigned to a guest. That may be common practice on IBM power, but on x86 it can make it much harder to configure the system for this use case. Thanks, Alex -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Aug 03, 2014 at 09:08:06AM -0600, Alex Williamson wrote: >On Mon, 2014-08-04 at 00:30 +1000, Gavin Shan wrote: >> On Sun, Aug 03, 2014 at 08:57:39AM +0000, Eli Cohen wrote: >> >>> >> >>>What is the problem with masking the interrupts with the PCI command >> >>>register? I'm asking because I want to understand in which devices we >> >>>have the problem, and if it could be fixed by firmware guys. >> >>>What are the implications of having the quirk? >> >>> >> > >> >>The way to mask the interrupt through PCI command register isn't taking effect on IBM power platform. So we have to have the >quirk so that the interrupt could be masked from interrupt controller side with function disable_irq_nosync(). >> >> >> >>If the interrupt can't be masked properly, we detect interrupt storm reported from host/guest when passing through those devices >via VFIO without suprise. >> > >> >Hi Gavin, >> >Does it have any effect on performance. Also, can you tell in which cases interrupts need to be masked? >> > >> >> Eli, more code needed to be run for masking the LSI from interrupt controller >> side than from PCI command register. >> >> I was passing through Mellanox devices from host to guest with VFIO, and I >> designated to use LSI in the guest side. More details could be found in >> drivers/vfio/pci/vfio_pci_intrs.c::vfio_intx_handler() > >INTx is relatively high overhead already for device assignment since the >interrupt is level triggered and needs to be masked on the host while >the guest is processing it. The more important restriction imposed by >marking broken INTx masking is that the device needs an exclusive >interrupt line in order to be assigned to a guest. That may be common >practice on IBM power, but on x86 it can make it much harder to >configure the system for this use case. Thanks, > Power platform has the similar situation: Each PCI controller has 4 LSIs shared by all child devices attached to the PHB. 
It would be racy if one LSI is shared by 2 or more devices. So masking LSI with
PCI command register is the preferred mechanism. Unfortunately, it doesn't work
on those 2 Mellanox devices. With the quirk, it's workable at least.

Thanks,
Gavin

>Alex
>
>
On Mon, Aug 04, 2014 at 10:34:38AM +1000, Gavin Shan wrote: >On Sun, Aug 03, 2014 at 09:08:06AM -0600, Alex Williamson wrote: >>On Mon, 2014-08-04 at 00:30 +1000, Gavin Shan wrote: >>> On Sun, Aug 03, 2014 at 08:57:39AM +0000, Eli Cohen wrote: >>> >>> >>> >>>What is the problem with masking the interrupts with the PCI command >>> >>>register? I'm asking because I want to understand in which devices we >>> >>>have the problem, and if it could be fixed by firmware guys. >>> >>>What are the implications of having the quirk? >>> >>> >>> > >>> >>The way to mask the interrupt through PCI command register isn't taking effect on IBM power platform. So we have to have the >quirk so that the interrupt could be masked from interrupt controller side with function disable_irq_nosync(). >>> >> >>> >>If the interrupt can't be masked properly, we detect interrupt storm reported from host/guest when passing through those devices >via VFIO without suprise. >>> > >>> >Hi Gavin, >>> >Does it have any effect on performance. Also, can you tell in which cases interrupts need to be masked? >>> > >>> >>> Eli, more code needed to be run for masking the LSI from interrupt controller >>> side than from PCI command register. >>> >>> I was passing through Mellanox devices from host to guest with VFIO, and I >>> designated to use LSI in the guest side. More details could be found in >>> drivers/vfio/pci/vfio_pci_intrs.c::vfio_intx_handler() >> >>INTx is relatively high overhead already for device assignment since the >>interrupt is level triggered and needs to be masked on the host while >>the guest is processing it. The more important restriction imposed by >>marking broken INTx masking is that the device needs an exclusive >>interrupt line in order to be assigned to a guest. That may be common >>practice on IBM power, but on x86 it can make it much harder to >>configure the system for this use case. 
Thanks,
>>
>
>Power platform has the similar situation: Each PCI controller has 4
>LSIs shared by all child devices attached to the PHB. It would be
>racy if one LSI is shared by 2 or more devices. So masking LSI with
>PCI command register is the preferred mechanism. Unfortunately, it
>doesn't work on those 2 Mellanox devices. With the quirk, it's workable
>at least.
>

ping, Any more comments on this? :-)

Thanks,
Gavin
On Fri, Aug 01, 2014 at 02:54:31PM +1000, Gavin Shan wrote:
>The VFIO driver is routing LSI interrupts by capturing, masking,
>and then delivering. When passing though Mallanox adapters from

one typo Mellanox

Others, looks good to me.

>host to guest, interrupt storm was reported from host and guest.
>That's because we can't mask the LSI interrupt with help of PCI
>command register.
>
>[root@ncc-1701 ~]# lspci | grep Mellanox
>0001:05:00.0 Ethernet controller: Mellanox Technologies MT27500 \
>             Family [ConnectX-3]
>0005:01:00.0 Ethernet controller: Mellanox Technologies MT26448 \
>             [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)
>
>The patch marks broken INTx masking for Mellanox devices so that
>the VFIO driver will always mask the interrupt from interrupt
>controller side to avoid interrupt storm.
>
>Cc: Amir Vadai <amirv@mellanox.com>
>Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
>---
> drivers/pci/quirks.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
>diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>index d0f6926..8c2b96f 100644
>--- a/drivers/pci/quirks.c
>+++ b/drivers/pci/quirks.c
>@@ -2977,6 +2977,10 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CHELSIO, 0x0030,
> 			quirk_broken_intx_masking);
> DECLARE_PCI_FIXUP_HEADER(0x1814, 0x0601, /* Ralink RT2800 802.11n PCI */
> 			quirk_broken_intx_masking);
>+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x1003,
>+			quirk_broken_intx_masking);
>+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x6750,
>+			quirk_broken_intx_masking);
> /*
>  * Realtek RTL8169 PCI Gigabit Ethernet Controller (rev 10)
>  * Subsystem: Realtek RTL8169/8110 Family PCI Gigabit Ethernet NIC
>--
>1.8.3.2
>
>--
>To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Aug 11, 2014 at 09:48:54PM +0800, Wei Yang wrote: >On Fri, Aug 01, 2014 at 02:54:31PM +1000, Gavin Shan wrote: >>The VFIO driver is routing LSI interrupts by capturing, masking, >>and then delivering. When passing though Mallanox adapters from > >one typo Mellanox > >Others, looks good to me. > Thanks, Richard. I'll correct it in next revision. However, I'm still waiting for comments or ACK from Alex/Eli/Amir. Thanks, Gavin >>host to guest, interrupt storm was reported from host and guest. >>That's because we can't mask the LSI interrupt with help of PCI >>command register. >> >>[root@ncc-1701 ~]# lspci | grep Mellanox >>0001:05:00.0 Ethernet controller: Mellanox Technologies MT27500 \ >> Family [ConnectX-3] >>0005:01:00.0 Ethernet controller: Mellanox Technologies MT26448 \ >> [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0) >> >>The patch marks broken INTx masking for Mellanox devices so that >>the VFIO driver will always mask the interrupt from interrupt >>controller side to avoid interrupt storm. 
>> >>Cc: Amir Vadai <amirv@mellanox.com> >>Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> >>Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> >>--- >> drivers/pci/quirks.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >>diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c >>index d0f6926..8c2b96f 100644 >>--- a/drivers/pci/quirks.c >>+++ b/drivers/pci/quirks.c >>@@ -2977,6 +2977,10 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CHELSIO, 0x0030, >> quirk_broken_intx_masking); >> DECLARE_PCI_FIXUP_HEADER(0x1814, 0x0601, /* Ralink RT2800 802.11n PCI */ >> quirk_broken_intx_masking); >>+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x1003, >>+ quirk_broken_intx_masking); >>+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x6750, >>+ quirk_broken_intx_masking); >> /* >> * Realtek RTL8169 PCI Gigabit Ethernet Controller (rev 10) >> * Subsystem: Realtek RTL8169/8110 Family PCI Gigabit Ethernet NIC >>-- >>1.8.3.2 >> >>-- >>To unsubscribe from this list: send the line "unsubscribe linux-pci" in >>the body of a message to majordomo@vger.kernel.org >>More majordomo info at http://vger.kernel.org/majordomo-info.html > >-- >Richard Yang >Help you, Help me -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 8/12/2014 6:52 AM, Gavin Shan wrote:
> On Mon, Aug 11, 2014 at 09:48:54PM +0800, Wei Yang wrote:
>> >On Fri, Aug 01, 2014 at 02:54:31PM +1000, Gavin Shan wrote:
>>> >>The VFIO driver is routing LSI interrupts by capturing, masking,
>>> >>and then delivering. When passing though Mallanox adapters from
>> >
>> >one typo Mellanox
>> >
>> >Others, looks good to me.
>> >
> Thanks, Richard. I'll correct it in next revision. However, I'm
> still waiting for comments or ACK from Alex/Eli/Amir.
>
> Thanks,
> Gavin
>

Hi Gavin,

Sorry for the delay - wanted to understand the exact limitations with
our Hardware/Firmware guys.

Please change the device to PCI_ANY_ID since all Mellanox devices
currently need the quirk.
The issue will be fixed in the next GA firmwares.

Thanks,
Amir
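To illustrate what switching the device ID to PCI_ANY_ID would change, here is a small user-space C model of wildcard matching in a fixup table. It is a sketch under assumptions rather than the kernel's fixup code: `model_fixup`, `model_pci_dev`, `model_fixup_matches`, `model_apply_fixups`, and `MODEL_PCI_ANY_ID` are hypothetical names, and the wildcard is modelled as a 16-bit all-ones value. The vendor ID 0x15b3 is Mellanox's PCI vendor ID, and 0x1003 and 0x6750 are the device IDs from the patch. The revised entry would presumably pass PCI_ANY_ID as the device argument of DECLARE_PCI_FIXUP_HEADER, but the thread does not show that revision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PCI_ANY_ID          0xffffu /* modelled wildcard ID */
#define MODEL_VENDOR_ID_MELLANOX  0x15b3u /* Mellanox PCI vendor ID */

/* Hypothetical fixup table entry: which devices a quirk applies to. */
struct model_fixup {
	uint16_t vendor;
	uint16_t device;
};

/* Hypothetical device record carrying the flag the quirk would set. */
struct model_pci_dev {
	uint16_t vendor;
	uint16_t device;
	bool broken_intx_masking;
};

/* An entry matches when each ID is either equal or the wildcard. */
static bool model_fixup_matches(const struct model_fixup *f,
				const struct model_pci_dev *dev)
{
	return (f->vendor == dev->vendor || f->vendor == MODEL_PCI_ANY_ID) &&
	       (f->device == dev->device || f->device == MODEL_PCI_ANY_ID);
}

/* Walk the table and mark the device if any entry matches it. */
static void model_apply_fixups(const struct model_fixup *table, size_t n,
			       struct model_pci_dev *dev)
{
	assert(table != NULL && dev != NULL);

	for (size_t i = 0; i < n; i++)
		if (model_fixup_matches(&table[i], dev))
			dev->broken_intx_masking = true;
}
```

With a table holding only the two device IDs from the patch, a third Mellanox device would be left unmarked; a single wildcard entry for the vendor covers it without listing every ID.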
On Tue, Aug 12, 2014 at 10:51:57AM +0300, Amir Vadai wrote:
>On 8/12/2014 6:52 AM, Gavin Shan wrote:
>> On Mon, Aug 11, 2014 at 09:48:54PM +0800, Wei Yang wrote:
>>> >On Fri, Aug 01, 2014 at 02:54:31PM +1000, Gavin Shan wrote:
>>>> >>The VFIO driver is routing LSI interrupts by capturing, masking,
>>>> >>and then delivering. When passing though Mallanox adapters from
>>> >
>>> >one typo Mellanox
>>> >
>>> >Others, looks good to me.
>>> >
>> Thanks, Richard. I'll correct it in next revision. However, I'm
>> still waiting for comments or ACK from Alex/Eli/Amir.
>>
>> Thanks,
>> Gavin
>>
>
>Hi Gavin,
>
>Sorry for the delay - wanted to understand the exact limitations with
>our Hardware/Firmware guys.
>
>Please change the device to PCI_ANY_ID since all Mellanox devices
>currently need the quirk.
>The issue will be fixed in the next GA firmwares.
>

Amir, thanks for confirming. I'll update in the next revision accordingly.
Please ack the next revision if possible.

Thanks,
Gavin

>Thanks,
>Amir
>
On 8/12/2014 11:57 AM, Gavin Shan wrote:
> On Tue, Aug 12, 2014 at 10:51:57AM +0300, Amir Vadai wrote:
>> On 8/12/2014 6:52 AM, Gavin Shan wrote:
[..]
>
> Amir, thanks for confirming. I'll update in the next revision accordingly.
> Please ack the next revision if possible.

Sure.

Amir
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d0f6926..8c2b96f 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2977,6 +2977,10 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_CHELSIO, 0x0030,
 			quirk_broken_intx_masking);
 DECLARE_PCI_FIXUP_HEADER(0x1814, 0x0601, /* Ralink RT2800 802.11n PCI */
 			quirk_broken_intx_masking);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x1003,
+			quirk_broken_intx_masking);
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MELLANOX, 0x6750,
+			quirk_broken_intx_masking);
 /*
  * Realtek RTL8169 PCI Gigabit Ethernet Controller (rev 10)
  * Subsystem: Realtek RTL8169/8110 Family PCI Gigabit Ethernet NIC
The VFIO driver is routing LSI interrupts by capturing, masking,
and then delivering. When passing though Mallanox adapters from
host to guest, interrupt storm was reported from host and guest.
That's because we can't mask the LSI interrupt with help of PCI
command register.

[root@ncc-1701 ~]# lspci | grep Mellanox
0001:05:00.0 Ethernet controller: Mellanox Technologies MT27500 \
             Family [ConnectX-3]
0005:01:00.0 Ethernet controller: Mellanox Technologies MT26448 \
             [ConnectX EN 10GigE, PCIe 2.0 5GT/s] (rev b0)

The patch marks broken INTx masking for Mellanox devices so that
the VFIO driver will always mask the interrupt from interrupt
controller side to avoid interrupt storm.

Cc: Amir Vadai <amirv@mellanox.com>
Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 drivers/pci/quirks.c | 4 ++++
 1 file changed, 4 insertions(+)