Message ID | 1446748990-27722-1-git-send-email-grygorii.strashko@ti.com (mailing list archive) |
---|---
State | New, archived |
Delegated to: | Bjorn Helgaas |
On 11/05/2015 07:43 PM, Grygorii Strashko wrote:
> diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c
> index 6589e7e..31460f4 100644
> --- a/drivers/pci/host/pci-dra7xx.c
> +++ b/drivers/pci/host/pci-dra7xx.c
> @@ -302,8 +302,30 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx,
>  		return -EINVAL;
>  	}
>
> +	/*
> +	 * Mark the dra7xx_pcie_msi IRQ as IRQF_NO_THREAD.
> +	 * On -RT, or if the kernel boots with the "threadirqs" cmdline parameter,
> +	 * dra7xx_pcie_msi_irq_handler() will be forced threaded, but at the
> +	 * same time it is an IRQ dispatcher and calls generic_handle_irq(),
> +	 * which resolves to a handle_simple_irq() call. handle_simple_irq()
> +	 * expects to be called with IRQs disabled, so the kernel will
> +	 * display the warning:
> +	 * "irq XXX handler YYY+0x0/0x14 enabled interrupts".
> +	 *
> +	 * The current DRA7 PCIe hw configuration supports 32 interrupts, which
> +	 * cannot change because it is hardwired in silicon, and we can assume
> +	 * that only a few (most of the time exactly ONE) of those interrupts
> +	 * are pending at the same time. So it is sane to deal with the above
> +	 * issue by marking the dra7xx_pcie_msi IRQ as IRQF_NO_THREAD.
> +	 * If some platform utilizes a lot of MSI IRQs and suffers from -RT
> +	 * latency degradation because of that, IRQF_NO_THREAD can be removed
> +	 * and the "Warning Annihilation" W/A can be applied [1].
> +	 *
> +	 * [1] https://lkml.org/lkml/2015/11/2/578

I don't think this novel is required. Also, a reference to a patch which
was not accepted is not a smart idea (since people might think they will
improve their -RT latency by applying it without thinking).

Do you have any *real* numbers where you can say this patch does
better/worse than what you suggested earlier? I'm simply asking because
each gpio-irq multiplexing driver does the same thing.

> +	 */
>  	ret = devm_request_irq(&pdev->dev, pp->irq,
> -			       dra7xx_pcie_msi_irq_handler, IRQF_SHARED,
> +			       dra7xx_pcie_msi_irq_handler,
> +			       IRQF_SHARED | IRQF_NO_THREAD,
>  			       "dra7-pcie-msi", pp);
>  	if (ret) {
>  		dev_err(&pdev->dev, "failed to request irq\n");

Sebastian
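(For context: the dispatch pattern under discussion looks roughly like
the sketch below. This is an illustration only - my_msi_port,
my_msi_dispatch() and MSI_STATUS_REG are invented names, not the
driver's actual code. The handler is registered via devm_request_irq(),
so forced threading runs it with hardware interrupts enabled, yet it
demultiplexes by calling generic_handle_irq() for every pending status
bit, which is what trips the warning:)

#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/irqdomain.h>
#include <linux/io.h>

#define MSI_STATUS_REG	0x54		/* hypothetical register offset */

struct my_msi_port {
	void __iomem *base;		/* controller registers */
	struct irq_domain *domain;	/* one virq per MSI vector */
};

static irqreturn_t my_msi_dispatch(int irq, void *arg)
{
	struct my_msi_port *pp = arg;
	u32 status;

	/* A single register read reports all 32 possible MSI vectors. */
	status = readl(pp->base + MSI_STATUS_REG);

	while (status) {
		unsigned int bit = ffs(status) - 1;

		/* Resolves to handle_simple_irq(), which expects IRQs off. */
		generic_handle_irq(irq_find_mapping(pp->domain, bit));
		status &= ~BIT(bit);
	}

	return IRQ_HANDLED;
}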
Hi Sebastian,

On 11/06/2015 10:53 AM, Sebastian Andrzej Siewior wrote:
> On 11/05/2015 07:43 PM, Grygorii Strashko wrote:
>> diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c
>> index 6589e7e..31460f4 100644
>> --- a/drivers/pci/host/pci-dra7xx.c
>> +++ b/drivers/pci/host/pci-dra7xx.c
>> @@ -302,8 +302,30 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx,
>>  		return -EINVAL;
>>  	}
>>
>> +	/*
>> +	 * Mark the dra7xx_pcie_msi IRQ as IRQF_NO_THREAD.
>> +	 * On -RT, or if the kernel boots with the "threadirqs" cmdline parameter,
>> +	 * dra7xx_pcie_msi_irq_handler() will be forced threaded, but at the
>> +	 * same time it is an IRQ dispatcher and calls generic_handle_irq(),
>> +	 * which resolves to a handle_simple_irq() call. handle_simple_irq()
>> +	 * expects to be called with IRQs disabled, so the kernel will
>> +	 * display the warning:
>> +	 * "irq XXX handler YYY+0x0/0x14 enabled interrupts".
>> +	 *
>> +	 * The current DRA7 PCIe hw configuration supports 32 interrupts, which
>> +	 * cannot change because it is hardwired in silicon, and we can assume
>> +	 * that only a few (most of the time exactly ONE) of those interrupts
>> +	 * are pending at the same time. So it is sane to deal with the above
>> +	 * issue by marking the dra7xx_pcie_msi IRQ as IRQF_NO_THREAD.

I can remove the text below.

>> +	 * If some platform utilizes a lot of MSI IRQs and suffers from -RT
>> +	 * latency degradation because of that, IRQF_NO_THREAD can be removed
>> +	 * and the "Warning Annihilation" W/A can be applied [1].
>> +	 *
>> +	 * [1] https://lkml.org/lkml/2015/11/2/578
>
> I don't think this novel is required. Also, a reference to a patch which
> was not accepted is not a smart idea (since people might think they will
> improve their -RT latency by applying it without thinking).

Agree. Honestly, I don't like either of these "solutions"/W/As... :(
but I have nothing better in mind.

> Do you have any *real* numbers where you can say this patch does
> better/worse than what you suggested earlier? I'm simply asking because
> each gpio-irq multiplexing driver does the same thing.

Indeed. And not only gpio :( Below you can find a list of files which
contain "generic_handle_irq" and do not contain "irq_set_chained_handler"
(of course, this list is inaccurate). [Addendum 1]

Regarding *real* numbers: I've tested it only on a TI dra7-evm + a PCIe
USB3 expander card which generates a legacy IRQ every 1 ms, and didn't
observe latency changes. I'm not sure I'll be able to test any scenario
soon (next week) which comes close to the potential "worst" case. Plus,
I found it an acceptable assumption that only a few PCI IRQs might be
pending at the same time (at least for the reference kernel code).
That's why I gave up without a fight :(

It seems I need to explain why I'm trying to avoid using IRQF_NO_THREAD
by all means (I'm sorry if the below sounds naive, and maybe I'm
overestimating/demonizing the problem - my -RT experience is not really
good [yet :)]).

- IRQF_NO_THREAD is the first option considered for this kind of issue.
  But: there are now ~60 occurrences of IRQF_NO_THREAD in the kernel
  tree - most of them in arch code, and only 6 drivers (drivers/*) use
  it [Addendum 2]. During the past year I've found only two threads
  related to IRQF_NO_THREAD and, in both cases, IRQF_NO_THREAD was added
  for arch-specific IRQs which can't be threaded
  (https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-November/122659.html,
  https://lkml.org/lkml/2015/4/21/404).

- ARM UP system: TI's am437xx SoCs, for example.
  Here everything starts from drivers/irqchip/irq-gic.c -> gic_handle_irq():

static void __exception_irq_entry gic_handle_irq(struct pt_regs *regs)
{
	u32 irqstat, irqnr;
	struct gic_chip_data *gic = &gic_data[0];
	void __iomem *cpu_base = gic_data_cpu_base(gic);

	do {
		irqstat = readl_relaxed(cpu_base + GIC_CPU_INTACK);
		irqnr = irqstat & GICC_IAR_INT_ID_MASK;

		if (likely(irqnr > 15 && irqnr < 1021)) {
			if (static_key_true(&supports_deactivate))
				writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
			handle_domain_irq(gic->domain, irqnr, regs);
			|- struct pt_regs *old_regs = set_irq_regs(regs);
		*	|- irq_enter();
			|- generic_handle_irq(irq);

			[1] //*** -RT: all IRQs are threaded (except SPI, timers, ...) ***
			    |- handle_fasteoi_irq()
			       |- irq_default_primary_handler()
			       |- __irq_wake_thread(desc, action);

			[2] //*** chained irqX ***
			    chained_irq_handler
			    |- chained_irq_enter(chip, desc);
			    |- handle IRQX.1
			       |- generic_handle_irq(IRQX.1);
			       ...
			    |- handle IRQX.2
			    |- handle IRQX.n
			    |- chained_irq_exit(chip, desc);

			[3] //*** IRQF_NO_THREAD irqY ***
			    |- handle_fasteoi_irq()
			       |- driverY_hw_irq_handler()
			          |- handle IRQY.1
			             |- generic_handle_irq(IRQY.1);
			             ...
			          |- handle IRQY.2
			          |- handle IRQY.n

		*	|- irq_exit();
			|- set_irq_regs(old_regs);
			continue;
		}
		if (irqnr < 16) {
			writel_relaxed(irqstat, cpu_base + GIC_CPU_EOI);
			if (static_key_true(&supports_deactivate))
				writel_relaxed(irqstat, cpu_base +
					       GIC_CPU_DEACTIVATE);
#ifdef CONFIG_SMP
			handle_IPI(irqnr, regs);
#endif
			continue;
		}
		break;
	} while (1);
}

  The GIC IRQ handler gic_handle_irq() may process more than one IRQ
  without leaving HW IRQ mode (during my experiments I saw up to 6 IRQs
  processed in one cycle). As a result, it was concluded that, with the
  current HW/SW and all IRQs forced threaded [1], it is potentially
  possible to predict system behavior, because gic_handle_irq() will do
  the same things for most processed IRQs. But once there are chained [2]
  or IRQF_NO_THREAD [3] IRQs - complete unpredictability.

  So the goal was selected to have all PPI IRQs (forced) threaded. And if
  someone requires faster IRQ handling, IRQF_NO_THREAD can always be
  added, but it will be a custom solution then.

  I'd appreciate your comments - if the above problem is not a problem,
  good - IRQF_NO_THREAD forever!

Thanks.
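(For reference, the chained variant [2] above - the pattern gpio-irq
multiplexing drivers typically use - looks roughly like this sketch;
my_chip, my_demux_handler and IRQ_STATUS are illustrative names, not
real driver code, and the signatures follow the 4.1-era API. A chained
handler is installed with irq_set_chained_handler() instead of
request_irq(), so it is never threaded and always runs in hard-IRQ
context:)

#include <linux/irq.h>
#include <linux/irqchip/chained_irq.h>
#include <linux/irqdomain.h>
#include <linux/io.h>

#define IRQ_STATUS	0x10		/* hypothetical register offset */

struct my_chip {
	void __iomem *base;
	struct irq_domain *domain;
};

static void my_demux_handler(unsigned int irq, struct irq_desc *desc)
{
	struct irq_chip *chip = irq_desc_get_chip(desc);
	struct my_chip *mc = irq_desc_get_handler_data(desc);
	u32 pending;

	chained_irq_enter(chip, desc);	/* ack/mask the parent IRQ */

	pending = readl(mc->base + IRQ_STATUS);
	while (pending) {
		unsigned int bit = ffs(pending) - 1;

		generic_handle_irq(irq_find_mapping(mc->domain, bit));
		pending &= ~BIT(bit);
	}

	chained_irq_exit(chip, desc);	/* unmask/eoi the parent IRQ */
}

	/* At probe time: */
	irq_set_handler_data(parent_irq, mc);
	irq_set_chained_handler(parent_irq, my_demux_handler);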
On 11/06/2015 08:59 PM, Grygorii Strashko wrote:
> Hi Sebastian,

Hi Grygorii,

> - IRQF_NO_THREAD is the first option considered for this kind of issue.
>   But: there are now ~60 occurrences of IRQF_NO_THREAD in the kernel
>   tree - most of them in arch code, and only 6 drivers (drivers/*) use
>   it [Addendum 2]. During the past year I've found only two threads
>   related to IRQF_NO_THREAD and, in both cases, IRQF_NO_THREAD was added
>   for arch-specific IRQs which can't be threaded
>   (https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-November/122659.html,
>   https://lkml.org/lkml/2015/4/21/404).

That powerpc patch you reference is doing the same thing you are doing
here.

> - ARM UP system: TI's am437xx SoCs, for example.
>   Here everything starts from drivers/irqchip/irq-gic.c -> gic_handle_irq()
>
>   The GIC IRQ handler gic_handle_irq() may process more than one IRQ
>   without leaving HW IRQ mode (during my experiments I saw up to 6 IRQs
>   processed in one cycle).

Not only the GIC. But then, what good does it do if it leaves and
returns immediately back to the routine?

>   As a result, it was concluded that, with the current HW/SW and all
>   IRQs forced threaded [1], it is potentially possible to predict
>   system behavior, because gic_handle_irq() will do the same things for
>   most processed IRQs. But once there are chained [2] or
>   IRQF_NO_THREAD [3] IRQs - complete unpredictability.

I would not go as far as "complete unpredictability". What you do (or
should do) is test the system for a longer period of time with different
behavior in order to estimate the worst case.
You can't predict the system anyway since it is way too complex. Just
try something that ensures that cyclictest is no longer cache hot and
see what happens then.

>   So the goal was selected to have all PPI IRQs (forced) threaded. And
>   if someone requires faster IRQ handling, IRQF_NO_THREAD can always be
>   added, but it will be a custom solution then.
>
>   I'd appreciate your comments - if the above problem is not a problem,
>   good - IRQF_NO_THREAD forever!

Yes, we try to avoid IRQF_NO_THREAD under all circumstances. However,
it is required for low-level arch code. This includes basically
everything that is using raw locks, which includes interrupt controllers
(the "real" ones like the GIC, or cascading ones like MSI or GPIO).
Here it is simple - you have a cascading MSI interrupt controller and
as such it should be marked IRQF_NO_THREAD.
The latency spikes in the worst case are not huge, as explained earlier:
the only thing your cascading controller is allowed to do is mark an
interrupt as pending (which, with threaded interrupts, is just a task
wakeup).
And this is not a -RT-only problem: it is broken in vanilla Linux with
threaded interrupts as well.

Btw: there is an exception to this rule (as always): if you are on a
slow/blocking bus then you don't do IRQF_NO_THREAD. A slow bus would be
a GPIO controller behind an I2C/SPI controller which also acts as an
interrupt controller. Here you use a threaded handler +
handle_nested_irq().

> Thanks.

Sebastian
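(To illustrate the slow-bus exception above, a rough sketch of a GPIO
expander behind I2C acting as an interrupt controller; my_expander,
my_exp_irq and REG_IRQ_STATUS are invented names. The demux handler must
sleep to talk I2C, so it is requested as a threaded handler and
dispatches the child interrupts with handle_nested_irq():)

#include <linux/i2c.h>
#include <linux/interrupt.h>
#include <linux/irq.h>
#include <linux/irqdomain.h>

#define REG_IRQ_STATUS	0x02		/* hypothetical expander register */

struct my_expander {
	struct i2c_client *client;
	struct irq_domain *domain;
};

static irqreturn_t my_exp_irq(int irq, void *data)
{
	struct my_expander *exp = data;
	unsigned long pending;
	unsigned int bit;
	int status;

	/* Sleeps: reads the pending-IRQ register over I2C. */
	status = i2c_smbus_read_word_data(exp->client, REG_IRQ_STATUS);
	if (status < 0)
		return IRQ_NONE;

	pending = status;
	for_each_set_bit(bit, &pending, 16)
		handle_nested_irq(irq_find_mapping(exp->domain, bit));

	return IRQ_HANDLED;
}

	/* At probe time: no primary handler, thread only. */
	ret = request_threaded_irq(client->irq, NULL, my_exp_irq,
				   IRQF_ONESHOT, "my-expander", exp);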
On 11/12/2015 11:19 AM, Sebastian Andrzej Siewior wrote:
> On 11/06/2015 08:59 PM, Grygorii Strashko wrote:
>> Hi Sebastian,
>
> Hi Grygorii,
>
>> - IRQF_NO_THREAD is the first option considered for this kind of issue.
>>   But: there are now ~60 occurrences of IRQF_NO_THREAD in the kernel
>>   tree - most of them in arch code, and only 6 drivers (drivers/*) use
>>   it [Addendum 2]. During the past year I've found only two threads
>>   related to IRQF_NO_THREAD and, in both cases, IRQF_NO_THREAD was
>>   added for arch-specific IRQs which can't be threaded
>>   (https://lists.ozlabs.org/pipermail/linuxppc-dev/2014-November/122659.html,
>>   https://lkml.org/lkml/2015/4/21/404).
>
> That powerpc patch you reference is doing the same thing you are doing
> here.

Probably. I don't know this hw, so my assumption was based on the
commit descriptions.

>> - ARM UP system: TI's am437xx SoCs, for example.
>>   Here everything starts from drivers/irqchip/irq-gic.c -> gic_handle_irq()
>>
>>   The GIC IRQ handler gic_handle_irq() may process more than one IRQ
>>   without leaving HW IRQ mode (during my experiments I saw up to 6
>>   IRQs processed in one cycle).
>
> Not only the GIC. But then, what good does it do if it leaves and
> returns immediately back to the routine?
>
>>   As a result, it was concluded that, with the current HW/SW and all
>>   IRQs forced threaded [1], it is potentially possible to predict
>>   system behavior, because gic_handle_irq() will do the same things
>>   for most processed IRQs. But once there are chained [2] or
>>   IRQF_NO_THREAD [3] IRQs - complete unpredictability.
>
> I would not go as far as "complete unpredictability". What you do (or
> should do) is test the system for a longer period of time with
> different behavior in order to estimate the worst case.
> You can't predict the system anyway since it is way too complex. Just
> try something that ensures that cyclictest is no longer cache hot and
> see what happens then.

I understand that. That's the current plan, and work is in progress.
The nearest target is to get rid of all -RT specific backtraces and
ensure the TI -RT kernel supports the same functionality as non-RT.
The next step is to try to optimize.

>>   So the goal was selected to have all PPI IRQs (forced) threaded. And
>>   if someone requires faster IRQ handling, IRQF_NO_THREAD can always
>>   be added, but it will be a custom solution then.
>>
>>   I'd appreciate your comments - if the above problem is not a
>>   problem, good - IRQF_NO_THREAD forever!
>
> Yes, we try to avoid IRQF_NO_THREAD under all circumstances. However,
> it is required for low-level arch code. This includes basically
> everything that is using raw locks, which includes interrupt
> controllers (the "real" ones like the GIC, or cascading ones like MSI
> or GPIO).
> Here it is simple - you have a cascading MSI interrupt controller and
> as such it should be marked IRQF_NO_THREAD.
> The latency spikes in the worst case are not huge, as explained
> earlier: the only thing your cascading controller is allowed to do is
> mark an interrupt as pending (which, with threaded interrupts, is just
> a task wakeup).
> And this is not a -RT-only problem: it is broken in vanilla Linux with
> threaded interrupts as well.

Ok. I've got it. IRQF_NO_THREAD will be the solution for the reference
code and for issues like this. I understand that each real, -RT based
solution is unique and needs to be specifically tuned, so if someone has
a problem with IRQF_NO_THREAD, it can easily be removed and replaced
with some sort of custom hacks/improvements.
Thanks a lot. I'll apply your earlier review comments and re-send.
diff --git a/drivers/pci/host/pci-dra7xx.c b/drivers/pci/host/pci-dra7xx.c
index 6589e7e..31460f4 100644
--- a/drivers/pci/host/pci-dra7xx.c
+++ b/drivers/pci/host/pci-dra7xx.c
@@ -302,8 +302,30 @@ static int __init dra7xx_add_pcie_port(struct dra7xx_pcie *dra7xx,
 		return -EINVAL;
 	}
 
+	/*
+	 * Mark the dra7xx_pcie_msi IRQ as IRQF_NO_THREAD.
+	 * On -RT, or if the kernel boots with the "threadirqs" cmdline parameter,
+	 * dra7xx_pcie_msi_irq_handler() will be forced threaded, but at the
+	 * same time it is an IRQ dispatcher and calls generic_handle_irq(),
+	 * which resolves to a handle_simple_irq() call. handle_simple_irq()
+	 * expects to be called with IRQs disabled, so the kernel will
+	 * display the warning:
+	 * "irq XXX handler YYY+0x0/0x14 enabled interrupts".
+	 *
+	 * The current DRA7 PCIe hw configuration supports 32 interrupts, which
+	 * cannot change because it is hardwired in silicon, and we can assume
+	 * that only a few (most of the time exactly ONE) of those interrupts
+	 * are pending at the same time. So it is sane to deal with the above
+	 * issue by marking the dra7xx_pcie_msi IRQ as IRQF_NO_THREAD.
+	 * If some platform utilizes a lot of MSI IRQs and suffers from -RT
+	 * latency degradation because of that, IRQF_NO_THREAD can be removed
+	 * and the "Warning Annihilation" W/A can be applied [1].
+	 *
+	 * [1] https://lkml.org/lkml/2015/11/2/578
+	 */
 	ret = devm_request_irq(&pdev->dev, pp->irq,
-			       dra7xx_pcie_msi_irq_handler, IRQF_SHARED,
+			       dra7xx_pcie_msi_irq_handler,
+			       IRQF_SHARED | IRQF_NO_THREAD,
 			       "dra7-pcie-msi", pp);
 	if (ret) {
 		dev_err(&pdev->dev, "failed to request irq\n");
On -RT, the TI DRA7 PCIe driver always produces the backtrace below when
the first PCI interrupt is triggered:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 82 at kernel/irq/handle.c:150 handle_irq_event_percpu+0x14c/0x174()
irq 460 handler irq_default_primary_handler+0x0/0x14 enabled interrupts
Modules linked in: ahci_platform(+) ti_vip(+) libahci_platform xhci_pci(+)
 c_can_platform(+) libahci xhci_hcd ti_vpe c_can libata v4l2_mem2mem
 can_dev gpio_keys leds_gpio snd_soc_simple_card usbcore spi_ti_qspi
 videobuf2_core extcon_usb_gpio omap_wdt ti_vpdma videobuf2_dma_contig
 ti_soc_thermal videobuf2_memops dwc3_omap extcon rtc_omap ov1063x
 v4l2_common videodev media snd_soc_tlv320aic3x omap_rng rng_core
 omap_remoteproc
CPU: 1 PID: 82 Comm: irq/26-dra7-pci Not tainted 4.1.10-rt10-02638-g6486d7e-dirty #1
Hardware name: Generic DRA74X (Flattened Device Tree)
Backtrace:
[<c0013060>] (dump_backtrace) from [<c0013280>] (show_stack+0x18/0x1c)
 r7:c07acca0 r6:00000000 r5:c09034e4 r4:00000000
[<c0013268>] (show_stack) from [<c061ef14>] (dump_stack+0x90/0xac)
[<c061ee84>] (dump_stack) from [<c00436e4>] (warn_slowpath_common+0x7c/0xb8)
 r7:c07acca0 r6:00000096 r5:00000009 r4:ee4d3e38
[<c0043668>] (warn_slowpath_common) from [<c0043758>] (warn_slowpath_fmt+0x38/0x40)
 r8:ee3fcc00 r7:000001cc r6:00000000 r5:00000002 r4:c07acc78
[<c0043724>] (warn_slowpath_fmt) from [<c0085fd0>] (handle_irq_event_percpu+0x14c/0x174)
 r3:000001cc r2:c07acc78 r4:ecfcdd80
[<c0085e84>] (handle_irq_event_percpu) from [<c008607c>] (handle_irq_event+0x84/0xb8)
 r10:00000001 r9:ee4b59c0 r8:ee1d0900 r7:00000001 r6:00000000 r5:ecfcdd80
 r4:ee3fcc00
[<c0085ff8>] (handle_irq_event) from [<c0089240>] (handle_simple_irq+0x90/0x118)
 r5:ee4a9020 r4:ee3fcc00
[<c00891b0>] (handle_simple_irq) from [<c0085514>] (generic_handle_irq+0x30/0x44)
 r5:ee4a9020 r4:000001cc
[<c00854e4>] (generic_handle_irq) from [<c034a7d4>] (dra7xx_pcie_msi_irq_handler+0x7c/0x8c)
 r5:ee4a9020 r4:00000001
[<c034a758>] (dra7xx_pcie_msi_irq_handler) from [<c0086f08>] (irq_forced_thread_fn+0x28/0x5c)
 r5:ee1d0900 r4:ee4b59c0
[<c0086ee0>] (irq_forced_thread_fn) from [<c00871dc>] (irq_thread+0x128/0x204)
 r7:00000001 r6:00000000 r5:ee4d2000 r4:ee4b59e4
[<c00870b4>] (irq_thread) from [<c005e2fc>] (kthread+0xd4/0xec)
 r10:00000000 r9:00000000 r8:00000000 r7:c00870b4 r6:ee4b59c0 r5:ee4b5a00
 r4:00000000
[<c005e228>] (kthread) from [<c000faa8>] (ret_from_fork+0x14/0x2c)
 r7:00000000 r6:00000000 r5:c005e228 r4:ee4b5a00
---[ end trace 0000000000000002 ]---

This backtrace is triggered because dra7xx_pcie_msi_irq_handler() is
forced threaded by default on -RT but, at the same time, it is an IRQ
dispatcher and calls generic_handle_irq(), which leads to a
handle_simple_irq() call. handle_simple_irq() expects to be called with
IRQs disabled. The same issue will also happen if the kernel boots with
the "threadirqs" cmdline parameter (CONFIG_IRQ_FORCED_THREADING).

As discussed in [1], the current DRA7 PCIe hw configuration supports 32
interrupts, which cannot change because it is hardwired in silicon; the
dispatch is a single status read, and assuming that only a few (most of
the time exactly ONE) of those interrupts are pending at the same time
is a pretty sane assumption. The recommended fix for this issue is to
request the dra7xx_pcie_msi IRQ with the IRQF_NO_THREAD flag.
[1] https://lkml.org/lkml/2015/11/3/660

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
---
 drivers/pci/host/pci-dra7xx.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)
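(The warning in the backtrace above comes from the check in
handle_irq_event_percpu() in kernel/irq/handle.c; the excerpt below is
from a v4.1-era kernel, lightly trimmed, and may not match other
versions exactly. When the forced-threaded dispatcher calls
generic_handle_irq() from thread context, the child's primary handler
runs with interrupts enabled and this check fires:)

	do {
		irqreturn_t res;

		trace_irq_handler_entry(irq, action);
		res = action->handler(irq, action->dev_id);
		trace_irq_handler_exit(irq, action, res);

		/* Fires if the handler ran with interrupts enabled. */
		if (WARN_ONCE(!irqs_disabled(),
			      "irq %u handler %pF enabled interrupts\n",
			      irq, action->handler))
			local_irq_disable();
		...
		action = action->next;
	} while (action);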