Message ID | 20250204053758.6025-2-feng.tang@linux.alibaba.com (mailing list archive) |
---|---|
State | New |
Delegated to: | Bjorn Helgaas |
Headers | show |
Series | [1/2] PCI/portdrv: Add necessary delay for disabling hotplug events | expand |
On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote: > There was a irq storm bug when testing "pci=nomsi" case, and the root > cause is: 'nomsi' will disable MSI and let devices and root ports use > legacy INTX inerrupt, and likely make several devices/ports share one > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug > interrupts, and actually asserts the command-complete interrupt. > As MSI is disabled, ACPI initialization code will not enumerate root > port's PCIE hotplug capability, and pciehp service driver wont' be > enabled for the root port to handle that interrupt, later on when it is > shared and enabled by other device driver like NVME or NIC, the "nobody > care irq storm" happens. > > So disable the pcie hotplug CCIE/HPIE interrupt in early boot phase when > MSI is not enbaled. So I think this issue should go away if disabling the interrupt by portdrv is no longer conditional on (pcie_ports_native || host->native_pcie_hotplug) like I've just proposed here: https://lore.kernel.org/r/Z6HYuBDP6uvE1Sf4@wunner.de/ ... in which case this patch won't be necessary. Can you confirm that? You can split the change I've proposed into two patches if you like. Thanks, Lukas
On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote: > There was a irq storm bug when testing "pci=nomsi" case, and the root > cause is: 'nomsi' will disable MSI and let devices and root ports use > legacy INTX inerrupt, and likely make several devices/ports share one > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug > interrupts, and actually asserts the command-complete interrupt. > As MSI is disabled, ACPI initialization code will not enumerate root > port's PCIE hotplug capability, and pciehp service driver wont' be > enabled for the root port to handle that interrupt, later on when it is > shared and enabled by other device driver like NVME or NIC, the "nobody > care irq storm" happens. Is there a section in the PCI Firmware Spec which says ACPI doesn't enumerate the hotplug capability if MSI is disabled? If so, it should be referenced in the commit message. If not, I'm wondering if it's safe to fiddle with the Slot Control register given the platform hasn't granted OSPM control of it. Of course if this is spec-defined behavior in the nomsi case, we could make the write to the Slot Control register conditional on that. But if this turns out to be platform-specific behavior, we can't deal with it in generic PCI code but only in a quirk. Thanks, Lukas
Hi Lukas, On Tue, Feb 04, 2025 at 10:23:45AM +0100, Lukas Wunner wrote: > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote: > > There was a irq storm bug when testing "pci=nomsi" case, and the root > > cause is: 'nomsi' will disable MSI and let devices and root ports use > > legacy INTX inerrupt, and likely make several devices/ports share one > > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug > > interrupts, and actually asserts the command-complete interrupt. > > As MSI is disabled, ACPI initialization code will not enumerate root > > port's PCIE hotplug capability, and pciehp service driver wont' be > > enabled for the root port to handle that interrupt, later on when it is > > shared and enabled by other device driver like NVME or NIC, the "nobody > > care irq storm" happens. > > Is there a section in the PCI Firmware Spec which says ACPI doesn't > enumerate the hotplug capability if MSI is disabled? No, I didn't get it from spec, but found the logic by code reading during debugging the irq storm issue. The related code is about: #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \ | OSC_PCI_ASPM_SUPPORT \ | OSC_PCI_CLOCK_PM_SUPPORT \ | OSC_PCI_MSI_SUPPORT) acpi_pci_root_add negotiate_os_control calculate_support if (pci_msi_enabled()) support |= OSC_PCI_MSI_SUPPORT; decode_osc_support os_control_query_checks if ((support & ACPI_PCIE_REQ_SUPPORT) != ACPI_PCIE_REQ_SUPPORT) return false acpi_pci_osc_control_set And later in get_port_device_capability(), the pciehp service bit won't be set, and driver is not loaded. Thanks, Feng > If so, it should be referenced in the commit message. > > If not, I'm wondering if it's safe to fiddle with the Slot Control > register given the platform hasn't granted OSPM control of it. > > Of course if this is spec-defined behavior in the nomsi case, > we could make the write to the Slot Control register conditional > on that. But if this turns out to be platform-specific behavior, > we can't deal with it in generic PCI code but only in a quirk. > > Thanks, > > Lukas
On Tue, Feb 04, 2025 at 10:14:10AM +0100, Lukas Wunner wrote: > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote: > > There was a irq storm bug when testing "pci=nomsi" case, and the root > > cause is: 'nomsi' will disable MSI and let devices and root ports use > > legacy INTX inerrupt, and likely make several devices/ports share one > > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug > > interrupts, and actually asserts the command-complete interrupt. > > As MSI is disabled, ACPI initialization code will not enumerate root > > port's PCIE hotplug capability, and pciehp service driver wont' be > > enabled for the root port to handle that interrupt, later on when it is > > shared and enabled by other device driver like NVME or NIC, the "nobody > > care irq storm" happens. > > > > So disable the pcie hotplug CCIE/HPIE interrupt in early boot phase when > > MSI is not enbaled. > > So I think this issue should go away if disabling the interrupt > by portdrv is no longer conditional on > > (pcie_ports_native || host->native_pcie_hotplug) > > like I've just proposed here: > > https://lore.kernel.org/r/Z6HYuBDP6uvE1Sf4@wunner.de/ > > ... in which case this patch won't be necessary. Can you confirm that? Thanks for the suggestion! I will try to get the platform for test, and report back. As for the change, + if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE)) + pcie_capability_clear_word(dev, PCI_EXP_SLTCTL, + PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE); The CONFIG_HOTPLUG_PCI_PCIE is always enabled on our platform and many distros, I guess the check needs to be removed, which sees the 1 second waiting again, and need the waiting logic in 1/2 patch? Thanks, Feng > > You can split the change I've proposed into two patches if you like. > > Thanks, > > Lukas
On Wed, Feb 05, 2025 at 02:31:56PM +0800, Feng Tang wrote: > On Tue, Feb 04, 2025 at 10:14:10AM +0100, Lukas Wunner wrote: > > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote: > > > There was a irq storm bug when testing "pci=nomsi" case, and the root > > > cause is: 'nomsi' will disable MSI and let devices and root ports use > > > legacy INTX inerrupt, and likely make several devices/ports share one > > > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug > > > interrupts, and actually asserts the command-complete interrupt. > > > As MSI is disabled, ACPI initialization code will not enumerate root > > > port's PCIE hotplug capability, and pciehp service driver wont' be > > > enabled for the root port to handle that interrupt, later on when it is > > > shared and enabled by other device driver like NVME or NIC, the "nobody > > > care irq storm" happens. > > > > > > So disable the pcie hotplug CCIE/HPIE interrupt in early boot phase when > > > MSI is not enbaled. > > > > So I think this issue should go away if disabling the interrupt > > by portdrv is no longer conditional on > > > > (pcie_ports_native || host->native_pcie_hotplug) > > > > like I've just proposed here: > > > > https://lore.kernel.org/r/Z6HYuBDP6uvE1Sf4@wunner.de/ > > > > ... in which case this patch won't be necessary. Can you confirm that? > > Thanks for the suggestion! I will try to get the platform for test, > and report back. I haven't got the platform, but I recalled something, that disabling HP interrupts inside get_port_device_capability()/portdrv_probe() got called after the nvme_probe(), so it may still cause the irq storm due to: * pcie root port's hotplug interrupt asserted * the interrupt is shared with NVME and other device * those device drivers enable the interrupt line early before portdrv's probe() That's why we tried to put the disabling early in PCI initialization code. Thanks, Feng > As for the change, > + if (!IS_ENABLED(CONFIG_HOTPLUG_PCI_PCIE)) > + pcie_capability_clear_word(dev, PCI_EXP_SLTCTL, > + PCI_EXP_SLTCTL_CCIE | PCI_EXP_SLTCTL_HPIE); > > The CONFIG_HOTPLUG_PCI_PCIE is always enabled on our platform and many > distros, I guess the check needs to be removed, which sees the 1 second > waiting again, and need the waiting logic in 1/2 patch? > > Thanks, > Feng
[to += Rafael, start of thread is here: https://lore.kernel.org/all/Z6HcoUB3i51bzQDs@wunner.de/ ] Hi Rafael, On Wed, Feb 05, 2025 at 11:58:04AM +0800, Feng Tang wrote: > On Tue, Feb 04, 2025 at 10:23:45AM +0100, Lukas Wunner wrote: > > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote: > > > There was a irq storm bug when testing "pci=nomsi" case, and the root > > > cause is: 'nomsi' will disable MSI and let devices and root ports use > > > legacy INTX inerrupt, and likely make several devices/ports share one > > > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug > > > interrupts, and actually asserts the command-complete interrupt. > > > As MSI is disabled, ACPI initialization code will not enumerate root > > > port's PCIE hotplug capability, and pciehp service driver wont' be > > > enabled for the root port to handle that interrupt, later on when it is > > > shared and enabled by other device driver like NVME or NIC, the "nobody > > > care irq storm" happens. > > > > Is there a section in the PCI Firmware Spec which says ACPI doesn't > > enumerate the hotplug capability if MSI is disabled? > > No, I didn't get it from spec, but found the logic by code reading > during debugging the irq storm issue. The related code is about: > > #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \ > | OSC_PCI_ASPM_SUPPORT \ > | OSC_PCI_CLOCK_PM_SUPPORT \ > | OSC_PCI_MSI_SUPPORT) Commit 415e12b23792 ("PCI/ACPI: Request _OSC control once for each root bridge (v3)") contains a change which doesn't seem to be explained in the commit message: If the user passes "pci=nomsi" on the command line, Linux doesn't request hotplug control (or any other control) from the platform. So ACPI always remains responsible for hotplug in the "pci=nomsi" case. The commit sought to fix a cpu hog issue: https://bugzilla.kernel.org/show_bug.cgi?id=29722 It's unclear to me if that bug was fixed by requesting _OSC only once, as the commit message suggests, or if the addition of OSC_MSI_SUPPORT to ACPI_PCIE_REQ_SUPPORT fixed the issue. Since the latter is not mentioned in the commit message, it seems plausible to assume that the OSC_MSI_SUPPORT change was unintentional. In any case it doesn't seem to make sense to not request any control in the "pci=nomsi" case. It's also worth noting that the behavior is different on Apple machines as they use a fixed _OSC set even for "pci=nomsi". I'm wondering if OSC_PCI_MSI_SUPPORT should simply be removed from ACPI_PCIE_REQ_SUPPORT, but I'm worried that it may cause reappearance of the cpu hog issue. Thoughts? Thanks, Lukas
On Thu, Feb 06, 2025 at 07:21:47AM +0100, Lukas Wunner wrote: > [to += Rafael, start of thread is here: > https://lore.kernel.org/all/Z6HcoUB3i51bzQDs@wunner.de/ > ] > > Hi Rafael, > > On Wed, Feb 05, 2025 at 11:58:04AM +0800, Feng Tang wrote: > > On Tue, Feb 04, 2025 at 10:23:45AM +0100, Lukas Wunner wrote: > > > On Tue, Feb 04, 2025 at 01:37:58PM +0800, Feng Tang wrote: > > > > There was a irq storm bug when testing "pci=nomsi" case, and the root > > > > cause is: 'nomsi' will disable MSI and let devices and root ports use > > > > legacy INTX inerrupt, and likely make several devices/ports share one > > > > interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug > > > > interrupts, and actually asserts the command-complete interrupt. > > > > As MSI is disabled, ACPI initialization code will not enumerate root > > > > port's PCIE hotplug capability, and pciehp service driver wont' be > > > > enabled for the root port to handle that interrupt, later on when it is > > > > shared and enabled by other device driver like NVME or NIC, the "nobody > > > > care irq storm" happens. > > > > > > Is there a section in the PCI Firmware Spec which says ACPI doesn't > > > enumerate the hotplug capability if MSI is disabled? > > > > No, I didn't get it from spec, but found the logic by code reading > > during debugging the irq storm issue. The related code is about: > > > > #define ACPI_PCIE_REQ_SUPPORT (OSC_PCI_EXT_CONFIG_SUPPORT \ > > | OSC_PCI_ASPM_SUPPORT \ > > | OSC_PCI_CLOCK_PM_SUPPORT \ > > | OSC_PCI_MSI_SUPPORT) > > Commit 415e12b23792 ("PCI/ACPI: Request _OSC control once for each root > bridge (v3)") contains a change which doesn't seem to be explained in > the commit message: > > If the user passes "pci=nomsi" on the command line, Linux doesn't > request hotplug control (or any other control) from the platform. > So ACPI always remains responsible for hotplug in the "pci=nomsi" > case. > > The commit sought to fix a cpu hog issue: > > https://bugzilla.kernel.org/show_bug.cgi?id=29722 > > It's unclear to me if that bug was fixed by requesting _OSC only once, > as the commit message suggests, or if the addition of OSC_MSI_SUPPORT > to ACPI_PCIE_REQ_SUPPORT fixed the issue. > > Since the latter is not mentioned in the commit message, > it seems plausible to assume that the OSC_MSI_SUPPORT change > was unintentional. > > In any case it doesn't seem to make sense to not request any > control in the "pci=nomsi" case. > > It's also worth noting that the behavior is different on > Apple machines as they use a fixed _OSC set even for "pci=nomsi". > > I'm wondering if OSC_PCI_MSI_SUPPORT should simply be removed > from ACPI_PCIE_REQ_SUPPORT, but I'm worried that it may cause > reappearance of the cpu hog issue. Hi Lukas, I tried to remove OSC_PCI_MSI_SUPPORT from ACPI_PCIE_REQ_SUPPORT, but after negotiate_os_control(), the 'PCIeHotplug' control is still disabled in the control capability after ACPI query_osc, run_osc routines (I haven't figured out why yet), thus the pciehp severvice driver can't be loader. Thanks, Feng > Thoughts? > > Thanks, > > Lukas
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c index b6536ed599c3..10d72156da9a 100644 --- a/drivers/pci/probe.c +++ b/drivers/pci/probe.c @@ -1664,6 +1664,15 @@ void set_pcie_hotplug_bridge(struct pci_dev *pdev) pcie_capability_read_dword(pdev, PCI_EXP_SLTCAP, ®32); if (reg32 & PCI_EXP_SLTCAP_HPC) pdev->is_hotplug_bridge = 1; + + /* + * When MSI is disabled, root port will use legacy INTX, and likely + * share INTX interrupt line with other devices like NIC/NVME. There + * was real world issue that the CCIE IRQ is asserted afer boot, but + * will not be handled well and cause IRQ storm. So disable it early. + */ + if (!pci_msi_enabled()) + pcie_disable_hp_interrupts_early(pdev); } static void set_pcie_thunderbolt(struct pci_dev *dev)
There was a irq storm bug when testing "pci=nomsi" case, and the root cause is: 'nomsi' will disable MSI and let devices and root ports use legacy INTX inerrupt, and likely make several devices/ports share one interrupt. In the failure case, BIOS doesn't disable the PCIE hotplug interrupts, and actually asserts the command-complete interrupt. As MSI is disabled, ACPI initialization code will not enumerate root port's PCIE hotplug capability, and pciehp service driver wont' be enabled for the root port to handle that interrupt, later on when it is shared and enabled by other device driver like NVME or NIC, the "nobody care irq storm" happens. So disable the pcie hotplug CCIE/HPIE interrupt in early boot phase when MSI is not enbaled. Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com> --- drivers/pci/probe.c | 9 +++++++++ 1 file changed, 9 insertions(+)