Message ID | 1448027966-21610-1-git-send-email-grygorii.strashko@ti.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
Hi Grygorii, On Fri, Nov 20, 2015 at 03:59:26PM +0200, Grygorii Strashko wrote: > On -RT, TI DRA7 PCIe driver always produces below backtrace when the > first PCI interrupt is triggered: > > ------------[ cut here ]------------ > WARNING: CPU: 1 PID: 82 at kernel/irq/handle.c:150 handle_irq_event_percpu+0x14c/0x174() > irq 460 handler irq_default_primary_handler+0x0/0x14 enabled interrupts > Modules linked in: ahci_platform(+) ti_vip(+) libahci_platform xhci_pci(+) c_can_platform(+) > libahci xhci_hcd ti_vpe c_can libata v4l2_mem2mem can_dev gpio_keys leds_gpio > snd_soc_simple_card usbcore spi_ti_qspi videobuf2_core extcon_usb_gpio omap_wdt > ti_vpdma videobuf2_dma_contig ti_soc_thermal videobuf2_memops dwc3_omap extcon > rtc_omap ov1063x v4l2_common videodev media snd_soc_tlv320aic3x omap_rng rng_core omap_remoteproc > CPU: 1 PID: 82 Comm: irq/26-dra7-pci Not tainted 4.1.10-rt10-02638-g6486d7e-dirty #1 > Hardware name: Generic DRA74X (Flattened Device Tree) > Backtrace: > [<c0013060>] (dump_backtrace) from [<c0013280>] (show_stack+0x18/0x1c) > r7:c07acca0 r6:00000000 r5:c09034e4 r4:00000000 > [<c0013268>] (show_stack) from [<c061ef14>] (dump_stack+0x90/0xac) > [<c061ee84>] (dump_stack) from [<c00436e4>] (warn_slowpath_common+0x7c/0xb8) > r7:c07acca0 r6:00000096 r5:00000009 r4:ee4d3e38 > [<c0043668>] (warn_slowpath_common) from [<c0043758>] (warn_slowpath_fmt+0x38/0x40) > r8:ee3fcc00 r7:000001cc r6:00000000 r5:00000002 r4:c07acc78 > [<c0043724>] (warn_slowpath_fmt) from [<c0085fd0>] (handle_irq_event_percpu+0x14c/0x174) > r3:000001cc r2:c07acc78 > r4:ecfcdd80 > [<c0085e84>] (handle_irq_event_percpu) from [<c008607c>] (handle_irq_event+0x84/0xb8) > r10:00000001 r9:ee4b59c0 r8:ee1d0900 r7:00000001 r6:00000000 r5:ecfcdd80 > r4:ee3fcc00 > [<c0085ff8>] (handle_irq_event) from [<c0089240>] (handle_simple_irq+0x90/0x118) > r5:ee4a9020 r4:ee3fcc00 > [<c00891b0>] (handle_simple_irq) from [<c0085514>] (generic_handle_irq+0x30/0x44) > r5:ee4a9020 r4:000001cc > [<c00854e4>] (generic_handle_irq) from [<c034a7d4>] (dra7xx_pcie_msi_irq_handler+0x7c/0x8c) > r5:ee4a9020 r4:00000001 > [<c034a758>] (dra7xx_pcie_msi_irq_handler) from [<c0086f08>] (irq_forced_thread_fn+0x28/0x5c) > r5:ee1d0900 r4:ee4b59c0 > [<c0086ee0>] (irq_forced_thread_fn) from [<c00871dc>] (irq_thread+0x128/0x204) > r7:00000001 r6:00000000 r5:ee4d2000 r4:ee4b59e4 > [<c00870b4>] (irq_thread) from [<c005e2fc>] (kthread+0xd4/0xec) > r10:00000000 r9:00000000 r8:00000000 r7:c00870b4 r6:ee4b59c0 r5:ee4b5a00 > r4:00000000 > [<c005e228>] (kthread) from [<c000faa8>] (ret_from_fork+0x14/0x2c) > r7:00000000 r6:00000000 r5:c005e228 r4:ee4b5a00 > ---[ end trace 0000000000000002 ]--- The backtrace might be OK (maybe slightly overkill), but all the stack addresses are certainly irrelevant and distracting. We only need enough to recognize the problem. I don't think the modules list is relevant either. > This backtrace is triggered because dra7xx_pcie_msi_irq_handler() > forced to be threaded by default on -RT but, in the same time, it's > IRQ dispatcher and calls generic_handle_irq(), which leads to > handle_simple_irq() call. The handle_simple_irq() expected to be > called with IRQ disabled. > > The same issue will also happen if kernel will boot with "threadirqs" > cmdline parameter (CONFIG_IRQ_FORCED_THREADING). > > As discussed in [1], the current DRA7 PCIe hw configuration supports > 32 interrupts, which cannot change because it's hardwired in silicon, > this is a single status read and assuming that only a few (most of the > time it will be exactly ONE) of those interrupts are pending at the > same time is pretty much a sane assumption. And recommended fix for > this issue is to request dra7xx_pcie_msi IRQ with IRQF_NO_THREAD flag. > > [1] https://lkml.org/lkml/2015/11/3/660 > > Cc: Thomas Gleixner <tglx@linutronix.de> > Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> > --- > Changes in v2: > - comment in code truncated > Link on v1: > https://lkml.org/lkml/2015/11/5/593 > drivers/pci/host/pci-dra7xx.c | 13 ++++++++++++- > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c > index 8c36880..0415192 100644 > --- a/drivers/pci/host/pci-dra7xx.c > +++ b/drivers/pci/host/pci-dra7xx.c > @@ -301,8 +301,19 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx, > return -EINVAL; > } > > + /* > + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD > + * On -RT and if kernel is booting with "threadirqs" cmd line parameter > + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, > + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), > + * which, in turn, will be resolved to handle_simple_irq() call. > + * The handle_simple_irq() expected to be called with IRQ disabled, as > + * result kernle will display warning: > + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". > + */ > ret = devm_request_irq(&pdev->dev, pp->irq, > - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, > + dra7xx_pcie_msi_irq_handler, > + IRQF_SHARED | IRQF_NO_THREAD, > "dra7-pcie-msi", pp); There's similar code in exynos_add_pcie_port(), imx6_add_pcie_port(), and spear13xx_add_pcie_port(). Do they need similar changes? If not, why not? I see your discussion about DRA7 hardware design, but my impression is that this problem affects anybody who calls dw_handle_msi_irq() from a handler registered with IRQF_SHARED. Bjorn > if (ret) { > dev_err(&pdev->dev, "failed to request irq\n"); > -- > 2.6.3 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-pci" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/04/2015 08:46 PM, Bjorn Helgaas wrote: > Hi Grygorii, > > On Fri, Nov 20, 2015 at 03:59:26PM +0200, Grygorii Strashko wrote: >> On -RT, TI DRA7 PCIe driver always produces below backtrace when the >> first PCI interrupt is triggered: >> >> ------------[ cut here ]------------ >> WARNING: CPU: 1 PID: 82 at kernel/irq/handle.c:150 handle_irq_event_percpu+0x14c/0x174() >> irq 460 handler irq_default_primary_handler+0x0/0x14 enabled interrupts [...] >> [<c00891b0>] (handle_simple_irq) from [<c0085514>] (generic_handle_irq+0x30/0x44) >> r5:ee4a9020 r4:000001cc >> [<c00854e4>] (generic_handle_irq) from [<c034a7d4>] (dra7xx_pcie_msi_irq_handler+0x7c/0x8c) >> r5:ee4a9020 r4:00000001 >> [<c034a758>] (dra7xx_pcie_msi_irq_handler) from [<c0086f08>] (irq_forced_thread_fn+0x28/0x5c) >> r5:ee1d0900 r4:ee4b59c0 >> [<c0086ee0>] (irq_forced_thread_fn) from [<c00871dc>] (irq_thread+0x128/0x204) >> r7:00000001 r6:00000000 r5:ee4d2000 r4:ee4b59e4 >> [<c00870b4>] (irq_thread) from [<c005e2fc>] (kthread+0xd4/0xec) >> r10:00000000 r9:00000000 r8:00000000 r7:c00870b4 r6:ee4b59c0 r5:ee4b5a00 >> r4:00000000 >> [<c005e228>] (kthread) from [<c000faa8>] (ret_from_fork+0x14/0x2c) >> r7:00000000 r6:00000000 r5:c005e228 r4:ee4b5a00 >> ---[ end trace 0000000000000002 ]--- > > The backtrace might be OK (maybe slightly overkill), but all the > stack addresses are certainly irrelevant and distracting. We only > need enough to recognize the problem. I don't think the modules list > is relevant either. I'll take this into account. Is this blocker for this patch now? > >> This backtrace is triggered because dra7xx_pcie_msi_irq_handler() >> forced to be threaded by default on -RT but, in the same time, it's >> IRQ dispatcher and calls generic_handle_irq(), which leads to >> handle_simple_irq() call. The handle_simple_irq() expected to be >> called with IRQ disabled. >> >> The same issue will also happen if kernel will boot with "threadirqs" >> cmdline parameter (CONFIG_IRQ_FORCED_THREADING). >> >> As discussed in [1], the current DRA7 PCIe hw configuration supports >> 32 interrupts, which cannot change because it's hardwired in silicon, >> this is a single status read and assuming that only a few (most of the >> time it will be exactly ONE) of those interrupts are pending at the >> same time is pretty much a sane assumption. And recommended fix for >> this issue is to request dra7xx_pcie_msi IRQ with IRQF_NO_THREAD flag. >> >> [1] https://lkml.org/lkml/2015/11/3/660 >> >> Cc: Thomas Gleixner <tglx@linutronix.de> >> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> >> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> >> --- >> Changes in v2: >> - comment in code truncated >> Link on v1: >> https://lkml.org/lkml/2015/11/5/593 >> drivers/pci/host/pci-dra7xx.c | 13 ++++++++++++- >> 1 file changed, 12 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c >> index 8c36880..0415192 100644 >> --- a/drivers/pci/host/pci-dra7xx.c >> +++ b/drivers/pci/host/pci-dra7xx.c >> @@ -301,8 +301,19 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx, >> return -EINVAL; >> } >> >> + /* >> + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD >> + * On -RT and if kernel is booting with "threadirqs" cmd line parameter >> + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, >> + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), >> + * which, in turn, will be resolved to handle_simple_irq() call. >> + * The handle_simple_irq() expected to be called with IRQ disabled, as >> + * result kernle will display warning: >> + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". >> + */ >> ret = devm_request_irq(&pdev->dev, pp->irq, >> - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, >> + dra7xx_pcie_msi_irq_handler, >> + IRQF_SHARED | IRQF_NO_THREAD, >> "dra7-pcie-msi", pp); > > There's similar code in exynos_add_pcie_port(), imx6_add_pcie_port(), > and spear13xx_add_pcie_port(). Do they need similar changes? If not, > why not? > > I see your discussion about DRA7 hardware design, but my impression is > that this problem affects anybody who calls dw_handle_msi_irq() from a > handler registered with IRQF_SHARED. Issue fixed by this patch is not related to IRQF_SHARED. It will happen on -RT or if kernel will boot with "threadirqs" cmd line parameter - in both cases these PCI IRQ handlers will be forced to be threaded and, as result, generic_handle_irq() will produce above backtrace. Personally, I don't have strong opinion about "should similar change be applied to other PCI drivers or not?" And I think, that owners of those driver should make such decision.
[+cc Jingoo (exynos), Richard, Lucas (imx6), Pratyush (spear13xx)] On Fri, Dec 04, 2015 at 11:22:50PM +0200, Grygorii Strashko wrote: > On 12/04/2015 08:46 PM, Bjorn Helgaas wrote: > > Hi Grygorii, > > > > On Fri, Nov 20, 2015 at 03:59:26PM +0200, Grygorii Strashko wrote: > >> On -RT, TI DRA7 PCIe driver always produces below backtrace when the > >> first PCI interrupt is triggered: > >> > >> ------------[ cut here ]------------ > >> WARNING: CPU: 1 PID: 82 at kernel/irq/handle.c:150 handle_irq_event_percpu+0x14c/0x174() > >> irq 460 handler irq_default_primary_handler+0x0/0x14 enabled interrupts > [...] > >> [<c00891b0>] (handle_simple_irq) from [<c0085514>] (generic_handle_irq+0x30/0x44) > >> r5:ee4a9020 r4:000001cc > >> [<c00854e4>] (generic_handle_irq) from [<c034a7d4>] (dra7xx_pcie_msi_irq_handler+0x7c/0x8c) > >> r5:ee4a9020 r4:00000001 > >> [<c034a758>] (dra7xx_pcie_msi_irq_handler) from [<c0086f08>] (irq_forced_thread_fn+0x28/0x5c) > >> r5:ee1d0900 r4:ee4b59c0 > >> [<c0086ee0>] (irq_forced_thread_fn) from [<c00871dc>] (irq_thread+0x128/0x204) > >> r7:00000001 r6:00000000 r5:ee4d2000 r4:ee4b59e4 > >> [<c00870b4>] (irq_thread) from [<c005e2fc>] (kthread+0xd4/0xec) > >> r10:00000000 r9:00000000 r8:00000000 r7:c00870b4 r6:ee4b59c0 r5:ee4b5a00 > >> r4:00000000 > >> [<c005e228>] (kthread) from [<c000faa8>] (ret_from_fork+0x14/0x2c) > >> r7:00000000 r6:00000000 r5:c005e228 r4:ee4b5a00 > >> ---[ end trace 0000000000000002 ]--- > > > > The backtrace might be OK (maybe slightly overkill), but all the > > stack addresses are certainly irrelevant and distracting. We only > > need enough to recognize the problem. I don't think the modules list > > is relevant either. > > I'll take this into account. Is this blocker for this patch now? If you repost this, you can fix it then. If I end up taking it as-is, I'll fix it myself. > >> This backtrace is triggered because dra7xx_pcie_msi_irq_handler() > >> forced to be threaded by default on -RT but, in the same time, it's > >> IRQ dispatcher and calls generic_handle_irq(), which leads to > >> handle_simple_irq() call. The handle_simple_irq() expected to be > >> called with IRQ disabled. > >> > >> The same issue will also happen if kernel will boot with "threadirqs" > >> cmdline parameter (CONFIG_IRQ_FORCED_THREADING). > >> > >> As discussed in [1], the current DRA7 PCIe hw configuration supports > >> 32 interrupts, which cannot change because it's hardwired in silicon, > >> this is a single status read and assuming that only a few (most of the > >> time it will be exactly ONE) of those interrupts are pending at the > >> same time is pretty much a sane assumption. And recommended fix for > >> this issue is to request dra7xx_pcie_msi IRQ with IRQF_NO_THREAD flag. > >> > >> [1] https://lkml.org/lkml/2015/11/3/660 > >> > >> Cc: Thomas Gleixner <tglx@linutronix.de> > >> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> > >> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> > >> --- > >> Changes in v2: > >> - comment in code truncated > >> Link on v1: > >> https://lkml.org/lkml/2015/11/5/593 > >> drivers/pci/host/pci-dra7xx.c | 13 ++++++++++++- > >> 1 file changed, 12 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c > >> index 8c36880..0415192 100644 > >> --- a/drivers/pci/host/pci-dra7xx.c > >> +++ b/drivers/pci/host/pci-dra7xx.c > >> @@ -301,8 +301,19 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx, > >> return -EINVAL; > >> } > >> > >> + /* > >> + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD > >> + * On -RT and if kernel is booting with "threadirqs" cmd line parameter > >> + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, > >> + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), > >> + * which, in turn, will be resolved to handle_simple_irq() call. > >> + * The handle_simple_irq() expected to be called with IRQ disabled, as > >> + * result kernle will display warning: > >> + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". > >> + */ > >> ret = devm_request_irq(&pdev->dev, pp->irq, > >> - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, > >> + dra7xx_pcie_msi_irq_handler, > >> + IRQF_SHARED | IRQF_NO_THREAD, > >> "dra7-pcie-msi", pp); > > > > There's similar code in exynos_add_pcie_port(), imx6_add_pcie_port(), > > and spear13xx_add_pcie_port(). Do they need similar changes? If not, > > why not? > > > > I see your discussion about DRA7 hardware design, but my impression is > > that this problem affects anybody who calls dw_handle_msi_irq() from a > > handler registered with IRQF_SHARED. > > Issue fixed by this patch is not related to IRQF_SHARED. > It will happen on -RT or if kernel will boot with "threadirqs" cmd line parameter > - in both cases these PCI IRQ handlers will be forced to be threaded and, > as result, generic_handle_irq() will produce above backtrace. > > Personally, I don't have strong opinion about "should similar change be applied > to other PCI drivers or not?" And I think, that owners of those driver should > make such decision. If the same issue affects several drivers, I'd like to see them all handle it the same way. Otherwise, somebody coming along later will wonder why they're different, and there won't be a good answer. I cc'd the other maintainers to see what they think. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
* Bjorn Helgaas | 2015-12-04 12:46:19 [-0600]: >The backtrace might be OK (maybe slightly overkill), but all the >stack addresses are certainly irrelevant and distracting. We only >need enough to recognize the problem. I don't think the modules list >is relevant either. I would shorten it to the bare minimum. Also the patch description itself could be truncated to the required bits… >> diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c >> index 8c36880..0415192 100644 >> --- a/drivers/pci/host/pci-dra7xx.c >> +++ b/drivers/pci/host/pci-dra7xx.c >> @@ -301,8 +301,19 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx, >> return -EINVAL; >> } >> >> + /* >> + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD >> + * On -RT and if kernel is booting with "threadirqs" cmd line parameter >> + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, >> + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), >> + * which, in turn, will be resolved to handle_simple_irq() call. >> + * The handle_simple_irq() expected to be called with IRQ disabled, as >> + * result kernle will display warning: >> + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". >> + */ …not to mention this piece. d7ce4377494a ("powerpc/fsl_msi: mark the msi cascade handler IRQF_NO_THREAD") fixes the same bug in arch/ppc so they bypassed you fixing it. >> ret = devm_request_irq(&pdev->dev, pp->irq, >> - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, >> + dra7xx_pcie_msi_irq_handler, >> + IRQF_SHARED | IRQF_NO_THREAD, >> "dra7-pcie-msi", pp); > >There's similar code in exynos_add_pcie_port(), imx6_add_pcie_port(), >and spear13xx_add_pcie_port(). Do they need similar changes? If not, >why not? You are right. The request for the handler exynos_pcie_msi_irq_handler(), imx6_pcie_msi_handler() and spear13xx_pcie_irq_handler() needs same treatment. Additionally we have: arch/mips/pci/msi-octeon.c: if (request_irq(OCTEON_IRQ_PCI_MSI0, octeon_msi_interrupt0, arch/sparc/kernel/pci_msi.c: err = request_irq(irq, sparc64_msiq_interrupt, 0, drivers/pci/host/pci-tegra.c: err = request_irq(msi->irq, tegra_pcie_msi_irq, 0, drivers/pci/host/pcie-rcar.c: err = devm_request_irq(&pdev->dev, msi->irq1, rcar_pcie_msi_irq, drivers/pci/host/pcie-rcar.c: err = devm_request_irq(&pdev->dev, msi->irq2, rcar_pcie_msi_irq, drivers/pci/host/pcie-xilinx.c: err = devm_request_irq(dev, port->irq, xilinx_pcie_intr_handler, which require the same kind of fix… >I see your discussion about DRA7 hardware design, but my impression is >that this problem affects anybody who calls dw_handle_msi_irq() from a >handler registered with IRQF_SHARED. … brecause all of them invoke generic_handle_irq() from the requsted handler. generic_handle_irq grabs raw_locks and this needs to run in raw-irq context. IRQF_SHARED could probably go away. The IRQ is mostlikely exclusive assigned in each SoC for MSI interrupt demux. >Bjorn Sebastian -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Bjorn, Am Montag, den 07.12.2015, 21:33 -0600 schrieb Bjorn Helgaas: > [+cc Jingoo (exynos), Richard, Lucas (imx6), Pratyush (spear13xx)] > > On Fri, Dec 04, 2015 at 11:22:50PM +0200, Grygorii Strashko wrote: > > On 12/04/2015 08:46 PM, Bjorn Helgaas wrote: > > > Hi Grygorii, [...] > > >> > > >> + /* > > >> + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD > > >> + * On -RT and if kernel is booting with "threadirqs" cmd line parameter > > >> + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, > > >> + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), > > >> + * which, in turn, will be resolved to handle_simple_irq() call. > > >> + * The handle_simple_irq() expected to be called with IRQ disabled, as > > >> + * result kernle will display warning: > > >> + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". > > >> + */ > > >> ret = devm_request_irq(&pdev->dev, pp->irq, > > >> - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, > > >> + dra7xx_pcie_msi_irq_handler, > > >> + IRQF_SHARED | IRQF_NO_THREAD, > > >> "dra7-pcie-msi", pp); > > > > > > There's similar code in exynos_add_pcie_port(), imx6_add_pcie_port(), > > > and spear13xx_add_pcie_port(). Do they need similar changes? If not, > > > why not? > > > > > > I see your discussion about DRA7 hardware design, but my impression is > > > that this problem affects anybody who calls dw_handle_msi_irq() from a > > > handler registered with IRQF_SHARED. > > > > Issue fixed by this patch is not related to IRQF_SHARED. > > It will happen on -RT or if kernel will boot with "threadirqs" cmd line parameter > > - in both cases these PCI IRQ handlers will be forced to be threaded and, > > as result, generic_handle_irq() will produce above backtrace. > > > > Personally, I don't have strong opinion about "should similar change be applied > > to other PCI drivers or not?" And I think, that owners of those driver should > > make such decision. > > If the same issue affects several drivers, I'd like to see them all > handle it the same way. Otherwise, somebody coming along later will > wonder why they're different, and there won't be a good answer. > > I cc'd the other maintainers to see what they think. > Yes, this should absolutely be changed in all drivers, based on the DW core. There were some patches for this already, but apparently they have not been applied, because some review feedback wasn't taken care of. I have a patch titled "PCI: designware: Mark the msi cascade handler IRQF_NO_THREAD" in my inbox that did this, and I still think it's the right thing to do. Regards, Lucas
On Tue, Dec 8, 2015 at 2:53 PM, Lucas Stach <l.stach@pengutronix.de> wrote: > Hi Bjorn, > > Am Montag, den 07.12.2015, 21:33 -0600 schrieb Bjorn Helgaas: >> [+cc Jingoo (exynos), Richard, Lucas (imx6), Pratyush (spear13xx)] >> >> On Fri, Dec 04, 2015 at 11:22:50PM +0200, Grygorii Strashko wrote: >> > On 12/04/2015 08:46 PM, Bjorn Helgaas wrote: >> > > Hi Grygorii, > [...] >> > >> >> > >> + /* >> > >> + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD >> > >> + * On -RT and if kernel is booting with "threadirqs" cmd line parameter >> > >> + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, >> > >> + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), >> > >> + * which, in turn, will be resolved to handle_simple_irq() call. >> > >> + * The handle_simple_irq() expected to be called with IRQ disabled, as >> > >> + * result kernle will display warning: >> > >> + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". >> > >> + */ >> > >> ret = devm_request_irq(&pdev->dev, pp->irq, >> > >> - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, >> > >> + dra7xx_pcie_msi_irq_handler, >> > >> + IRQF_SHARED | IRQF_NO_THREAD, >> > >> "dra7-pcie-msi", pp); >> > > >> > > There's similar code in exynos_add_pcie_port(), imx6_add_pcie_port(), >> > > and spear13xx_add_pcie_port(). Do they need similar changes? If not, >> > > why not? >> > > >> > > I see your discussion about DRA7 hardware design, but my impression is >> > > that this problem affects anybody who calls dw_handle_msi_irq() from a >> > > handler registered with IRQF_SHARED. >> > >> > Issue fixed by this patch is not related to IRQF_SHARED. >> > It will happen on -RT or if kernel will boot with "threadirqs" cmd line parameter >> > - in both cases these PCI IRQ handlers will be forced to be threaded and, >> > as result, generic_handle_irq() will produce above backtrace. >> > >> > Personally, I don't have strong opinion about "should similar change be applied >> > to other PCI drivers or not?" And I think, that owners of those driver should >> > make such decision. >> >> If the same issue affects several drivers, I'd like to see them all >> handle it the same way. Otherwise, somebody coming along later will >> wonder why they're different, and there won't be a good answer. >> >> I cc'd the other maintainers to see what they think. >> I too agree with the change for SPEAr13xx as well. ~Pratyush -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Dec 08, 2015 at 10:05:56AM +0100, Sebastian Andrzej Siewior wrote: > * Bjorn Helgaas | 2015-12-04 12:46:19 [-0600]: > > >The backtrace might be OK (maybe slightly overkill), but all the > >stack addresses are certainly irrelevant and distracting. We only > >need enough to recognize the problem. I don't think the modules list > >is relevant either. > > I would shorten it to the bare minimum. Also the patch description > itself could be truncated to the required bits… > > >> diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c > >> index 8c36880..0415192 100644 > >> --- a/drivers/pci/host/pci-dra7xx.c > >> +++ b/drivers/pci/host/pci-dra7xx.c > >> @@ -301,8 +301,19 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx, > >> return -EINVAL; > >> } > >> > >> + /* > >> + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD > >> + * On -RT and if kernel is booting with "threadirqs" cmd line parameter > >> + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, > >> + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), > >> + * which, in turn, will be resolved to handle_simple_irq() call. > >> + * The handle_simple_irq() expected to be called with IRQ disabled, as > >> + * result kernle will display warning: > >> + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". > >> + */ > > …not to mention this piece. d7ce4377494a ("powerpc/fsl_msi: mark the msi > cascade handler IRQF_NO_THREAD") fixes the same bug in arch/ppc so they > bypassed you fixing it. > > >> ret = devm_request_irq(&pdev->dev, pp->irq, > >> - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, > >> + dra7xx_pcie_msi_irq_handler, > >> + IRQF_SHARED | IRQF_NO_THREAD, > >> "dra7-pcie-msi", pp); > > > >There's similar code in exynos_add_pcie_port(), imx6_add_pcie_port(), > >and spear13xx_add_pcie_port(). Do they need similar changes? If not, > >why not? > > You are right. The request for the handler exynos_pcie_msi_irq_handler(), > imx6_pcie_msi_handler() and spear13xx_pcie_irq_handler() needs same > treatment. > Additionally we have: > > arch/mips/pci/msi-octeon.c: if (request_irq(OCTEON_IRQ_PCI_MSI0, octeon_msi_interrupt0, > arch/sparc/kernel/pci_msi.c: err = request_irq(irq, sparc64_msiq_interrupt, 0, > drivers/pci/host/pci-tegra.c: err = request_irq(msi->irq, tegra_pcie_msi_irq, 0, > drivers/pci/host/pcie-rcar.c: err = devm_request_irq(&pdev->dev, msi->irq1, rcar_pcie_msi_irq, > drivers/pci/host/pcie-rcar.c: err = devm_request_irq(&pdev->dev, msi->irq2, rcar_pcie_msi_irq, > drivers/pci/host/pcie-xilinx.c: err = devm_request_irq(dev, port->irq, xilinx_pcie_intr_handler, > > which require the same kind of fix… > > >I see your discussion about DRA7 hardware design, but my impression is > >that this problem affects anybody who calls dw_handle_msi_irq() from a > >handler registered with IRQF_SHARED. > > … brecause all of them invoke generic_handle_irq() from the requsted > handler. generic_handle_irq grabs raw_locks and this needs to run in > raw-irq context. > IRQF_SHARED could probably go away. The IRQ is mostlikely exclusive > assigned in each SoC for MSI interrupt demux. It sounds like we should fix all these places at once. If you (Grygorii) work up a patch that does them all, with a more generic changelog, then we can solicit testing and acks. Bjorn -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 12/09/2015 05:24 PM, Bjorn Helgaas wrote: > On Tue, Dec 08, 2015 at 10:05:56AM +0100, Sebastian Andrzej Siewior wrote: >> * Bjorn Helgaas | 2015-12-04 12:46:19 [-0600]: >> >>> The backtrace might be OK (maybe slightly overkill), but all the >>> stack addresses are certainly irrelevant and distracting. We only >>> need enough to recognize the problem. I don't think the modules list >>> is relevant either. >> >> I would shorten it to the bare minimum. Also the patch description >> itself could be truncated to the required bits… >> >>>> diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c >>>> index 8c36880..0415192 100644 >>>> --- a/drivers/pci/host/pci-dra7xx.c >>>> +++ b/drivers/pci/host/pci-dra7xx.c >>>> @@ -301,8 +301,19 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx, >>>> return -EINVAL; >>>> } >>>> >>>> + /* >>>> + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD >>>> + * On -RT and if kernel is booting with "threadirqs" cmd line parameter >>>> + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, >>>> + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), >>>> + * which, in turn, will be resolved to handle_simple_irq() call. >>>> + * The handle_simple_irq() expected to be called with IRQ disabled, as >>>> + * result kernle will display warning: >>>> + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". >>>> + */ >> >> …not to mention this piece. d7ce4377494a ("powerpc/fsl_msi: mark the msi >> cascade handler IRQF_NO_THREAD") fixes the same bug in arch/ppc so they >> bypassed you fixing it. >> >>>> ret = devm_request_irq(&pdev->dev, pp->irq, >>>> - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, >>>> + dra7xx_pcie_msi_irq_handler, >>>> + IRQF_SHARED | IRQF_NO_THREAD, >>>> "dra7-pcie-msi", pp); >>> >>> There's similar code in exynos_add_pcie_port(), imx6_add_pcie_port(), >>> and spear13xx_add_pcie_port(). Do they need similar changes? If not, >>> why not? >> >> You are right. The request for the handler exynos_pcie_msi_irq_handler(), >> imx6_pcie_msi_handler() and spear13xx_pcie_irq_handler() needs same >> treatment. >> Additionally we have: >> >> arch/mips/pci/msi-octeon.c: if (request_irq(OCTEON_IRQ_PCI_MSI0, octeon_msi_interrupt0, >> arch/sparc/kernel/pci_msi.c: err = request_irq(irq, sparc64_msiq_interrupt, 0, >> drivers/pci/host/pci-tegra.c: err = request_irq(msi->irq, tegra_pcie_msi_irq, 0, >> drivers/pci/host/pcie-rcar.c: err = devm_request_irq(&pdev->dev, msi->irq1, rcar_pcie_msi_irq, >> drivers/pci/host/pcie-rcar.c: err = devm_request_irq(&pdev->dev, msi->irq2, rcar_pcie_msi_irq, >> drivers/pci/host/pcie-xilinx.c: err = devm_request_irq(dev, port->irq, xilinx_pcie_intr_handler, >> >> which require the same kind of fix… >> >>> I see your discussion about DRA7 hardware design, but my impression is >>> that this problem affects anybody who calls dw_handle_msi_irq() from a >>> handler registered with IRQF_SHARED. >> >> … brecause all of them invoke generic_handle_irq() from the requsted >> handler. generic_handle_irq grabs raw_locks and this needs to run in >> raw-irq context. >> IRQF_SHARED could probably go away. The IRQ is mostlikely exclusive >> assigned in each SoC for MSI interrupt demux. > > It sounds like we should fix all these places at once. If you > (Grygorii) work up a patch that does them all, with a more generic > changelog, then we can solicit testing and acks. Sure, will do: - change will be applied for files listed in this thread only - IRQF_SHARED will be left unchanged - commit message will be made more generic - i prefer to keep comment in code for dra7 as is. Is it ok?
diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c index 8c36880..0415192 100644 --- a/drivers/pci/host/pci-dra7xx.c +++ b/drivers/pci/host/pci-dra7xx.c @@ -301,8 +301,19 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx, return -EINVAL; } + /* + * Mark dra7xx_pcie_msi IRQ as IRQF_NO_THREAD + * On -RT and if kernel is booting with "threadirqs" cmd line parameter + * the dra7xx_pcie_msi_irq_handler() will be forced threaded but, + * in the same time, it's IRQ dispatcher and calls generic_handle_irq(), + * which, in turn, will be resolved to handle_simple_irq() call. + * The handle_simple_irq() expected to be called with IRQ disabled, as + * result kernle will display warning: + * "irq XXX handler YYY+0x0/0x14 enabled interrupts". + */ ret = devm_request_irq(&pdev->dev, pp->irq, - dra7xx_pcie_msi_irq_handler, IRQF_SHARED, + dra7xx_pcie_msi_irq_handler, + IRQF_SHARED | IRQF_NO_THREAD, "dra7-pcie-msi", pp); if (ret) { dev_err(&pdev->dev, "failed to request irq\n");
On -RT, TI DRA7 PCIe driver always produces below backtrace when the first PCI interrupt is triggered: ------------[ cut here ]------------ WARNING: CPU: 1 PID: 82 at kernel/irq/handle.c:150 handle_irq_event_percpu+0x14c/0x174() irq 460 handler irq_default_primary_handler+0x0/0x14 enabled interrupts Modules linked in: ahci_platform(+) ti_vip(+) libahci_platform xhci_pci(+) c_can_platform(+) libahci xhci_hcd ti_vpe c_can libata v4l2_mem2mem can_dev gpio_keys leds_gpio snd_soc_simple_card usbcore spi_ti_qspi videobuf2_core extcon_usb_gpio omap_wdt ti_vpdma videobuf2_dma_contig ti_soc_thermal videobuf2_memops dwc3_omap extcon rtc_omap ov1063x v4l2_common videodev media snd_soc_tlv320aic3x omap_rng rng_core omap_remoteproc CPU: 1 PID: 82 Comm: irq/26-dra7-pci Not tainted 4.1.10-rt10-02638-g6486d7e-dirty #1 Hardware name: Generic DRA74X (Flattened Device Tree) Backtrace: [<c0013060>] (dump_backtrace) from [<c0013280>] (show_stack+0x18/0x1c) r7:c07acca0 r6:00000000 r5:c09034e4 r4:00000000 [<c0013268>] (show_stack) from [<c061ef14>] (dump_stack+0x90/0xac) [<c061ee84>] (dump_stack) from [<c00436e4>] (warn_slowpath_common+0x7c/0xb8) r7:c07acca0 r6:00000096 r5:00000009 r4:ee4d3e38 [<c0043668>] (warn_slowpath_common) from [<c0043758>] (warn_slowpath_fmt+0x38/0x40) r8:ee3fcc00 r7:000001cc r6:00000000 r5:00000002 r4:c07acc78 [<c0043724>] (warn_slowpath_fmt) from [<c0085fd0>] (handle_irq_event_percpu+0x14c/0x174) r3:000001cc r2:c07acc78 r4:ecfcdd80 [<c0085e84>] (handle_irq_event_percpu) from [<c008607c>] (handle_irq_event+0x84/0xb8) r10:00000001 r9:ee4b59c0 r8:ee1d0900 r7:00000001 r6:00000000 r5:ecfcdd80 r4:ee3fcc00 [<c0085ff8>] (handle_irq_event) from [<c0089240>] (handle_simple_irq+0x90/0x118) r5:ee4a9020 r4:ee3fcc00 [<c00891b0>] (handle_simple_irq) from [<c0085514>] (generic_handle_irq+0x30/0x44) r5:ee4a9020 r4:000001cc [<c00854e4>] (generic_handle_irq) from [<c034a7d4>] (dra7xx_pcie_msi_irq_handler+0x7c/0x8c) r5:ee4a9020 r4:00000001 [<c034a758>] (dra7xx_pcie_msi_irq_handler) from [<c0086f08>] (irq_forced_thread_fn+0x28/0x5c) r5:ee1d0900 r4:ee4b59c0 [<c0086ee0>] (irq_forced_thread_fn) from [<c00871dc>] (irq_thread+0x128/0x204) r7:00000001 r6:00000000 r5:ee4d2000 r4:ee4b59e4 [<c00870b4>] (irq_thread) from [<c005e2fc>] (kthread+0xd4/0xec) r10:00000000 r9:00000000 r8:00000000 r7:c00870b4 r6:ee4b59c0 r5:ee4b5a00 r4:00000000 [<c005e228>] (kthread) from [<c000faa8>] (ret_from_fork+0x14/0x2c) r7:00000000 r6:00000000 r5:c005e228 r4:ee4b5a00 ---[ end trace 0000000000000002 ]--- This backtrace is triggered because dra7xx_pcie_msi_irq_handler() forced to be threaded by default on -RT but, in the same time, it's IRQ dispatcher and calls generic_handle_irq(), which leads to handle_simple_irq() call. The handle_simple_irq() expected to be called with IRQ disabled. The same issue will also happen if kernel will boot with "threadirqs" cmdline parameter (CONFIG_IRQ_FORCED_THREADING). As discussed in [1], the current DRA7 PCIe hw configuration supports 32 interrupts, which cannot change because it's hardwired in silicon, this is a single status read and assuming that only a few (most of the time it will be exactly ONE) of those interrupts are pending at the same time is pretty much a sane assumption. And recommended fix for this issue is to request dra7xx_pcie_msi IRQ with IRQF_NO_THREAD flag. [1] https://lkml.org/lkml/2015/11/3/660 Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> --- Changes in v2: - comment in code truncated Link on v1: https://lkml.org/lkml/2015/11/5/593 drivers/pci/host/pci-dra7xx.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)