diff mbox series

genirq: Deduplicate WARN_ON_ONCE() in generic_handle_domain_irq()

Message ID c3caf60bfa78e5fdbdf483096b7174da65d1813a.1652168866.git.lukas@wunner.de (mailing list archive)
State New, archived
Headers show
Series genirq: Deduplicate WARN_ON_ONCE() in generic_handle_domain_irq() | expand

Commit Message

Lukas Wunner May 10, 2022, 7:56 a.m. UTC
Since commit 0953fb263714 ("irq: remove handle_domain_{irq,nmi}()"),
generic_handle_domain_irq() warns if called outside hardirq context,
even though the function calls down to handle_irq_desc(), which warns
about the same.

Moreover the newly added warning is false positive if the interrupt
originates from any other irqchip than x86 APIC or arm GIC/GICv3.
Those are the only ones for which handle_enforce_irqctx() returns true.
Per commit c16816acd086 ("genirq: Add protection against unsafe usage of
generic_handle_irq()"):

 "In general calling generic_handle_irq() with interrupts disabled from
  non interrupt context is harmless.  [But for] interrupt controllers
  like the x86 trainwrecks this is outright dangerous as it might
  corrupt state if an interrupt affinity change is pending."

An example for irqchips where the warning is false positive are
USB-attached GPIO controllers such as drivers/gpio/gpio-dln2.c:
USB gadgets are incapable of directly signaling an interrupt because
they cannot initiate a bus transaction by themselves.  All communication
on the bus is initiated by the host controller, which polls a gadget's
Interrupt Endpoint in regular intervals.  If an interrupt is pending,
that information is passed up the stack in softirq context, from which
a hardirq is synthesized via generic_handle_domain_irq().

Deduplicate the warning to eliminate false positives and speed up IRQ
handling.

Fixes: 0953fb263714 ("irq: remove handle_domain_{irq,nmi}()")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org # v5.16+
Cc: Mark Rutland <mark.rutland@arm.com>
---
 kernel/irq/irqdesc.c | 1 -
 1 file changed, 1 deletion(-)

Comments

Sebastian Andrzej Siewior May 16, 2022, 6:29 a.m. UTC | #1
On 2022-05-10 09:56:05 [+0200], Lukas Wunner wrote:
> An example for irqchips where the warning is false positive are
> USB-attached GPIO controllers such as drivers/gpio/gpio-dln2.c:

They are not false positives becauseā€¦

> USB gadgets are incapable of directly signaling an interrupt because
> they cannot initiate a bus transaction by themselves.  All communication
> on the bus is initiated by the host controller, which polls a gadget's
> Interrupt Endpoint in regular intervals.  If an interrupt is pending,
> that information is passed up the stack in softirq context, from which
> a hardirq is synthesized via generic_handle_domain_irq().

they tell you that the context is wrong. From looking at gpio-dln2
this is called from USB URB's callback which is softirq. In the end
dln2_gpio_event() is invoked while dln2_dev::event_cb_lock is acquired.
That lock is acquired by disabling interrupts which is what gets the
locking right for generic_handle_domain_irq(). If that lock lifted to
spin_lock_bh() (because it is always in urb's calback context and all
HCDs complete in one context unlike now) then this breaks. And
PREEMPT_RT is broken already. Therefore, last week, I've been promoting
generic_handle_domain_irq_safe()
   https://lkml.kernel.org/r/YnkfWFzvusFFktSt@linutronix.de

and sadly I missed dln2. Please let me know if you have more users
similar to dln2. I will add those to my list once upstream buys that
interface.

Sebastian
Thomas Gleixner May 16, 2022, 6:53 a.m. UTC | #2
On Mon, May 16 2022 at 08:29, Sebastian Andrzej Siewior wrote:

> On 2022-05-10 09:56:05 [+0200], Lukas Wunner wrote:
>> An example for irqchips where the warning is false positive are
>> USB-attached GPIO controllers such as drivers/gpio/gpio-dln2.c:
>
> They are not false positives becauseā€¦
>
>> USB gadgets are incapable of directly signaling an interrupt because
>> they cannot initiate a bus transaction by themselves.  All communication
>> on the bus is initiated by the host controller, which polls a gadget's
>> Interrupt Endpoint in regular intervals.  If an interrupt is pending,
>> that information is passed up the stack in softirq context, from which
>> a hardirq is synthesized via generic_handle_domain_irq().
>
> they tell you that the context is wrong.

Why? These handlers can be called from any context, really. Yes. They
need to be called with interrupts disabled, but that's it. The warning
is checking hard interrupt context unconditionally.

> From looking at gpio-dln2 this is called from USB URB's callback which
> is softirq. In the end dln2_gpio_event() is invoked while
> dln2_dev::event_cb_lock is acquired.  That lock is acquired by
> disabling interrupts which is what gets the locking right for
> generic_handle_domain_irq(). If that lock lifted to spin_lock_bh()
> (because it is always in urb's calback context and all HCDs complete
> in one context unlike now) then this breaks.

Yes, but that's a different problem.

> And PREEMPT_RT is broken already. Therefore, last week, I've been
> promoting generic_handle_domain_irq_safe()
> https://lkml.kernel.org/r/YnkfWFzvusFFktSt@linutronix.de

Well, that's just a wrapper which adds the local_irq_save(), so it's not
any different from having the local_irq_save() at the callsite, unless
I'm missing something.

Thanks,

        tglx
Sebastian Andrzej Siewior May 16, 2022, 7:02 a.m. UTC | #3
On 2022-05-16 08:53:29 [+0200], Thomas Gleixner wrote:
> > they tell you that the context is wrong.
> 
> Why? These handlers can be called from any context, really. Yes. They
> need to be called with interrupts disabled, but that's it. The warning
> is checking hard interrupt context unconditionally.

correct. If the context is wrong, the interrupts are usually not
disabled.

> > From looking at gpio-dln2 this is called from USB URB's callback which
> > is softirq. In the end dln2_gpio_event() is invoked while
> > dln2_dev::event_cb_lock is acquired.  That lock is acquired by
> > disabling interrupts which is what gets the locking right for
> > generic_handle_domain_irq(). If that lock lifted to spin_lock_bh()
> > (because it is always in urb's calback context and all HCDs complete
> > in one context unlike now) then this breaks.
> 
> Yes, but that's a different problem.
> 
> > And PREEMPT_RT is broken already. Therefore, last week, I've been
> > promoting generic_handle_domain_irq_safe()
> > https://lkml.kernel.org/r/YnkfWFzvusFFktSt@linutronix.de
> 
> Well, that's just a wrapper which adds the local_irq_save(), so it's not
> any different from having the local_irq_save() at the callsite, unless
> I'm missing something.

I haven't seen a local_irq_save() at the callsite. If it is, then it is
not any different.

> Thanks,
> 
>         tglx

Sebastian
diff mbox series

Patch

diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 939d21c..da3e495 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -699,7 +699,6 @@  int generic_handle_irq_safe(unsigned int irq)
  */
 int generic_handle_domain_irq(struct irq_domain *domain, unsigned int hwirq)
 {
-	WARN_ON_ONCE(!in_hardirq());
 	return handle_irq_desc(irq_resolve_mapping(domain, hwirq));
 }
 EXPORT_SYMBOL_GPL(generic_handle_domain_irq);